- pony.ai (Fremont, CA)
- …globally. Pony.ai went public on NASDAQ in November 2024. Responsibilities: As a (Senior) Kubernetes Engineer, you will: + Design, operate, and optimize ... security policies, and operational guidelines. + Contribute to observability and SRE practices to ensure reliability at scale (SLOs, incident reviews, metrics-driven… more
- Walmart (Sunnyvale, CA)
- …high-performance checkout services running in Edge and Cloud. As a Site Reliability Engineer in the CPC Team, you will work with L2, Other dependent Applications, ... you'll do + Incident triage, Escalation and Resolution: Triage site-impacting production issues by quantifying impact, severity and urgency, analyzing systems for… more
- Oracle (Sacramento, CA)
- …it a world class engineering center with the focus on excellence. As a Senior Principal Site Reliability DevOps Engineer, you will be responsible for defining ... person who loves a challenge? Solve the complex puzzles you've been dreaming of as our Engineer. If you have a passion for innovation in tech, we want you on our… more
- TP-Link North America, Inc. (Irvine, CA)
- …simpler, smarter, and more reliable connectivity. We're looking for a passionate and experienced Senior Site Reliability Engineer to join our team and play a ... the Development and DevOps team. + Analyze and resolve production risks caused by insufficient resources, such as node...and tools + Help to mentor and train less senior members of the team + Ability to be… more
- Oracle (Sacramento, CA)
- …Oracle Cloud Infrastructure (OCI) is seeking a technically advanced and hands-on Senior Software Engineer to drive the design, implementation, and optimization ... utilization, network bottlenecks, and implement corrective actions at scale. + Production Delivery: Validate and deploy new features and enhancements across OCI's… more
- Oracle (Sacramento, CA)
- …person who loves a challenge? Solve the complex puzzles you've been dreaming of as our Engineer. If you have a passion for innovation in tech, we want you on our ... engineering center with the focus on excellence. As a Site Reliability DevOps Engineer, you will be responsible for defining and deploying key services with deep… more
- NVIDIA (Santa Clara, CA)
- …platform upon which every new AI-powered application is built. We are seeking a Senior Software Engineer focused on container and cloud infrastructure. You will ... dependency management, and artifact/registry topology. + Collaborate across research, backend, SRE, and product teams to ensure day-0 availability of new models.… more
- NVIDIA (Santa Clara, CA)
- …platform upon which every new AI-powered application is built. We are seeking a Senior Software Engineer focused on container and cloud infrastructure. You will ... management, and artifact/registry topology. + Collaborate across research, backend, SRE, and product teams to ensure day-0 availability of...What we need to see: + 10+ years building production software with a strong focus on containers and… more
- Confluent (Sacramento, CA)
- …One Team. One Data Streaming Platform. **About the Role:** We are seeking a Senior Software Engineer II to architect, build, and operate services that are ... while also ensuring these systems are reliable, observable, and resilient in production. You'll work across engineering, security, compliance, and platform orgs to… more
- NVIDIA (Santa Clara, CA)
- NVIDIA is looking for an outstanding, passionate, and talented Senior AI Infrastructure Engineer to join our DGX Cloud group. This engineering role will design, ... build and maintain large scale production systems with high efficiency and availability using the...cloud enabling technologies like Kubernetes and OpenStack. DGX Cloud SRE at NVIDIA ensures that our internal and external… more