- Walmart (Sunnyvale, CA)
- …Operations (LLMOps):** Architect the systems and processes for robust application monitoring, CI/CD, and telemetry. Define novel testing strategies and quality ... Elevate:** Act as a force multiplier for the engineering organization. Mentor senior engineers, lead architectural reviews, and cultivate a culture of deep technical…
- General Atomics (San Diego, CA)
- …and satellites, missile defense, power and energy, and process and monitoring applications for defense, industrial, and commercial customers worldwide. This software ... Provided **US Citizenship Required?** Yes **Clearance Required?** Desired **Clearance Level** Senior (8+ years) **Workstyle** Onsite General Atomics is committed to…
- NVIDIA (Santa Clara, CA)
- …communicate new features and solutions. + Work on System and Device Monitoring/Management Tools for our Compute Professional Solutions products and contribute to ... NVIDIA's extensive suite of Device Monitoring libraries and tools! What we need to see:...hardware and software interfaces. + Experience working with device monitoring tools is a plus + Strong English written…
- Zoom (San Jose, CA)
- …availability and performance optimization; + Operate and maintain an in-house monitoring system to proactively identify and resolve system and application issues; ... enhancing operational efficiency and reliability; + Manage deployment and continuous monitoring of the async search platform, ensuring scalability and responsiveness…
- Walmart (Sunnyvale, CA)
- …to the roadmap of Walmart's core machine learning capabilities. + Create monitoring dashboards; perform latency tuning of deep learning models, scaling solutions to ... compare models, features, and hyperparameters; utilize A/B testing and continuous monitoring to validate and adjust models. + Possess excellent communication skills…
- LiveRamp (San Francisco, CA)
- …with Engineering teams** + **Set up and maintain Infrastructure & Product Reliability monitoring and alerting** + **Maintain and enhance CI/CD Tooling and Terraform ... Containers and public clouds (GCP or AWS)** + **Experience with deployment and monitoring of highly scalable products.** + **Hands-on experience with FinOps and…
- NVIDIA (Santa Clara, CA)
- …& Telemetry collection platform with a focus on performance at scale, real-time monitoring, logging and alerting + Engage in and improve the whole lifecycle of ... launch reviews + Maintain services once they are live by measuring and monitoring availability, latency and overall system health + Scale systems sustainably through…
- NVIDIA (Santa Clara, CA)
- …of large-scale Kubernetes clusters with focus on performance at scale, real-time monitoring, logging and alerting + Engage in and improve the whole lifecycle of ... reviews. + Maintain services once they are live by measuring and monitoring availability, latency and overall system health. + Scale systems sustainably through…
- Teledyne (Rancho Cordova, CA)
- …and defense, factory automation, air and water quality environmental monitoring, electronics design and development, oceanographic research, deepwater oil and ... and defense, factory automation, air and water quality environmental monitoring, electronics design and development, oceanographic research, energy, medical imaging…
- NVIDIA (Santa Clara, CA)
- …with Kubernetes including cluster operations, operator development, node health monitoring and working with GPU resource scheduling. We welcome out-of-the-box ... software related to scheduling GPU resources on Kubernetes. + Implementing monitoring and health management capabilities that enable industry leading reliability,…