- MongoDB (San Francisco, CA)
- We are looking for an experienced Senior or Staff Engineer for our SRE, InfraSec team, to guide the security of our cloud-based infrastructure. As a Staff SRE ... that reinforce the platform's security posture. This is an SRE team, which means you can expect a highly...on security work, with ideally 2+ years in a senior or staff engineering role Security Mindset: + A… more
- Google (Sunnyvale, CA)
- Senior Director, Platform Operations, GDC. Google; Sunnyvale, CA, USA; Kirkland, WA, USA; +2 more; +1 more. Director ... in-store hardware and create dynamic, modern applications. As the Senior Director, you will establish and maintain the core...GDC. You will lead the dedicated Site Reliability Engineering (SRE) team within GDC, be responsible for the operational… more
- Zscaler (San Jose, CA)
- …and agility with a cloud-first strategy. We're seeking a highly skilled and experienced SRE Platform Engineer to join our SRE Cloud Platform Engineering Team. ... patching, scaling, and infrastructure management + Building portals for SRE dashboards, service level indicators/agreements (SLI/SLO/SLA), and metrics that support… more
- NVIDIA (Santa Clara, CA)
- Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... open source cloud enabling technologies like Kubernetes and OpenStack. SRE at NVIDIA ensures that our internal and external...while keeping an eye on capacity, latency and performance. SRE is also a mindset and a set of… more
- General Motors (Mountain View, CA)
- …both opportunity and complexity. At General Motors, our Site Reliability Engineering (SRE) organization is built on software engineering principles. We design and ... are looking for strong Software Engineers to join our SRE team as hands-on Individual Contributors. In this role,...operational toil at scale. As a Software Engineer in SRE, you will work across the full lifecycle of… more
- NVIDIA (Santa Clara, CA)
- …the world's most powerful GPU systems. Join our top team and apply your SRE and software engineering skills to craft robust, user-friendly platforms for seamless ML ... reproducibility and scalability across large-scale, distributed GPU clusters. + Apply SRE principles to diagnose, troubleshoot, and resolve complex system issues… more
- MongoDB (San Francisco, CA)
- **The Team** Platform Engineering is the department within SRE that is responsible for a range of critical infrastructure and operational functions that support the ... Overview** We are seeking a talented Site Reliability Engineer (SRE) with a strong networking background to join the...secure and efficient communication between our services. As an SRE on the Fabric team, you will leverage your… more
- PagerDuty (San Francisco, CA)
- …world, all in a flexible, award-winning workplace. PagerDuty is seeking a Senior Product Manager, Incident Analysis to join our talented, customer-focused Incident ... Management team! As Senior Product Manager, you will report to the Director...products trusted by some of the world's top DevOps, SRE, and digital operations teams. The ideal candidate thrives… more
- ServiceNow, Inc. (Santa Clara, CA)
- …AI technologies that unlock new work experiences in the future. **As a Senior Staff Machine Learning Engineer you will:** + Contribute to the design, development ... well, and remain reliable. + Contribute to the continuous improvement of the SRE practice by turning operational use cases into requirements for software tooling. +… more
- pony.ai (Fremont, CA)
- …Pony.ai went public on NASDAQ in November 2024. Responsibilities As a (Senior) Kubernetes Engineer, you will: + Design, operate, and optimize Kubernetes clusters ... security policies, and operational guidelines. + Contribute to observability and SRE practices to ensure reliability at scale (SLOs, incident reviews, metrics-driven… more