- Teledyne (Goleta, CA)
- …and defense, factory automation, air and water quality environmental monitoring, electronics design and development, oceanographic research, deepwater oil and ... readouts, focal plane arrays and Dewars. Research and report on the measured performance and yield of these products. Work closely with product development to ensure…
- NVIDIA (Santa Clara, CA)
- …aspects of large scale Observability & Telemetry collection platform with a focus on performance at scale, real time monitoring, logging and alerting + Engage in ... planning while keeping an eye on capacity, latency and performance. SRE is also a mindset and a set ... Maintain services once they are live by measuring and monitoring availability, latency and overall system health + Scale…
- NVIDIA (Santa Clara, CA)
- …operational and reliability aspects of large scale Kubernetes clusters with focus on performance at scale, real time monitoring, logging and alerting + Engage ... planning while keeping an eye on capacity, latency and performance. SRE is also a mindset and a set ... Maintain services once they are live by measuring and monitoring availability, latency and overall system health. + Scale…
- Teledyne (Rancho Cordova, CA)
- …and defense, factory automation, air and water quality environmental monitoring, electronics design and development, oceanographic research, deepwater oil and ... the following. Other duties may be assigned. + Responsible for the design, performance, and manufacturability of assigned products. + Learn and use a suite of…
- Kratos Defense & Security Solutions, Inc. (Roseville, CA)
- …its expertise in developing, delivering, integrating, and supporting high-performance, cost-effective, jet-powered Unmanned Aerial Systems (Targets and Tactical). ... of engineering personnel to ensure successful overall project and organizational performance. + Ensures maximum productivity and cost-efficiency of assigned team. +…
- GovCIO (Sacramento, CA)
- …Linux and Windows servers. + Perform routine systems administration, monitoring, and maintenance (e.g., version upgrades, patching, account management, routine ... standards, policies, and other relevant guidelines. + Support the management and monitoring of numerous AWS-hosted features and services and maintain backups and…
- Amazon (Sunnyvale, CA)
- …communication patterns for distributed AI training workloads * Develop comprehensive performance monitoring, metrics collection, and benchmarking tools for ... that powers the world's largest AI training clusters. We're developing high-performance RDMA and RoCE solutions that enable distributed training of…
- Rubrik (Palo Alto, CA)
- …In addition, we provide common infrastructure services encompassing cluster health monitoring, system management, cluster operations and data migration. All of these ... infrastructure level + Design and develop infrastructure services for system monitoring, detecting faults, and automatically self-healing the distributed systems +…
- Amazon (Cupertino, CA)
- …inspire us to never stop embracing our uniqueness. We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll ... issues - Understand and implement security and operational best-practices - Develop monitoring, alerting, and metric collection at scale - Contribute to and lead…
- NVIDIA (Santa Clara, CA)
- …for enterprise readiness of NVIDIA Server platforms. + Designing and developing performance optimized active monitoring BMC solutions using DMTF Standards ... Arm architecture. + Hands on work with bringing up of BMC firmware, performance analysis and coding various manageability features for NVIDIA's Server platforms +…