- ServiceNow, Inc. (Pleasanton, CA)
- …next-generation analytics to support ServiceNow's Cloud and AI growth. As our Senior Staff DevOps Engineer for Cloud Analytics & FinOps Engineering Platform, you ... SLIs/SLOs/SLAs for data platform services with error budget management, establish SRE practices including toil reduction and capacity planning, and create… more
- Google (Mountain View, CA)
- …systems is a true strategy, and a good one. Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build ... and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google's services, both our internally critical and...while keeping an ever-watchful eye on capacity and performance. SRE is also a mindset and a set of… more
- LiveRamp (San Francisco, CA)
- …to build and maintain products operational documentation and setting up product SRE practices** + **Experience working with real-time and NoSQL Databases such as ... and rightsize Kubernetes containers.** + **Work in close collaboration with SRE team members and Engineering organizations based in California, Paris, Nantong,… more
- NVIDIA (Santa Clara, CA)
- Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... open source cloud enabling technologies like Kubernetes and OpenStack. SRE at NVIDIA ensures that our internal and external...while keeping an eye on capacity, latency and performance. SRE is also a mindset and a set of… more
- Celonis (Redwood City, CA)
- …The team applies advanced software engineering and Site Reliability Engineering (SRE) principles to drive system reliability, scalability, and operational excellence ... for a fleet of 80+ FedRAMP-compliant microservices running on Kubernetes, applying SRE principles to drive observability, automation, and incident prevention. + Own… more
- House of Blues (CA)
- Job Summary: JOB DESCRIPTION - Senior AI-Driven Platform Automation Engineer Location: Remote, US Division: Ticketmaster US Line Manager: Director, Software ... at scale. THE JOB We are looking for a Senior AI-Driven Platform Automation Engineer to join our high-impact Platform Automation and SRE group within the Core Concerts division at Ticketmaster.… more
- NVIDIA (Santa Clara, CA)
- …and scalability across global public and private clouds. + Implement SRE fundamentals, including incident management, monitoring, and performance optimization, while ... or related field, or equivalent experience with 8+ years in Software Development, SRE, or Production Engineering. + Proficiency in Python and at least one other… more
- Palo Alto Networks (Santa Clara, CA)
- …in data engineering, cloud infrastructure, and a strong background in DevOps, SRE, or system engineering. The ideal candidate will be responsible for designing, ... for our data platforms (e.g., Airflow, Spark clusters), applying SRE and DevOps best practices for performance, reliability, and...infrastructure. + Must have proven experience in a DevOps, SRE, or System Engineering role, with hands-on expertise in… more
- NVIDIA (Santa Clara, CA)
- GeForce Now is looking for a Manager, Network Site Reliability Engineer (SRE) to enhance our network infrastructure and operations. We are looking for a leader who ... ensuring a smooth user experience. The position focuses on managing Network SRE to streamline network operations, minimize manual tasks, and achieve service level… more
- General Motors (Mountain View, CA)
- …where we live and deliver a better future for generations to come. In this SRE SW Engineer role, you will develop and maintain key elements of the infrastructure ... us and let's innovate! **What You'll Do** + Implement scalable, reliable, secure SRE and Observability platform to monitor health of our production system and… more