  • Senior Staff Site Reliability

    Palo Alto Networks (Santa Clara, CA)
    …actionable insights into our systems' performance and health. **Your Impact** As a Senior Staff SRE with the Cortex Observability team, you will: + Cloud Expertise: ... influence the operability of the product and ensure the reliability and availability of our services **Your Experience** +...DevOps/SRE Expertise: 5+ years of experience as a DevOps/SRE engineer with a passion for technology and a strong… more
    Palo Alto Networks (07/15/25)
  • Senior Site Reliability

    NVIDIA (Santa Clara, CA)
    Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... internal and external facing GPU cloud services run maximum reliability and uptime as promised to the users and...be doing: + Design, implement and support operational and reliability aspects of large scale Observability & Telemetry collection… more
    NVIDIA (08/02/25)
  • Senior Site Reliability

    NVIDIA (Santa Clara, CA)
    Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... internal and external facing GPU cloud services run maximum reliability and uptime as promised to the users and...be doing: + Design, implement and support operational and reliability aspects of large scale Kubernetes clusters with focus… more
    NVIDIA (08/01/25)
  • Staff Site Reliability Engineer

    ServiceNow, Inc. (San Diego, CA)
    It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ... engineers who are tasked with maintaining and developing the reliability, scalability and performance of the ServiceNow cloud infrastructure....as a company and the SRE role. As an Engineer on the SRE team you will: + Provide… more
    ServiceNow, Inc. (07/15/25)
  • Senior Systems Reliability

    The Walt Disney Company (Sacramento, CA)
    …high availability, and clear observability + Maintain and improve the reliability of services and infrastructure + Troubleshoot and resolve performance and ... reliability issues across the stack, including cloud resources +...working with observability tools for optimal performance and 24/7 reliability (e.g. DataDog, New Relic, Grafana)** + Data center,… more
    The Walt Disney Company (08/08/25)
  • Senior Site Reliability

    Rubrik (Palo Alto, CA)
    …and services with the objective of achieving and exceeding availability and reliability goals * Manage and streamline monitoring systems to enhance observability and ... visibility * Perform Production Readiness Assessments of new services to identify reliability needs and surface potential gaps * Develop and maintain documentation… more
    Rubrik (08/07/25)
  • Senior Site Reliability

    LiveRamp (San Francisco, CA)
    …issues with Engineering teams** + **Set up and maintain Infrastructure & Product Reliability monitoring and alerting** + **Maintain and enhance CI/CD Tooling and ... Terraform scripts in support of the mission in close collaboration with DevOps team** + **Maintain and enhance Engineering Operational Documentation for supported products.** + **Provide expertise to build and maintain products operational documentation and… more
    LiveRamp (08/07/25)
  • Staff Site Reliability Engineer

    MongoDB (San Francisco, CA)
    …to build next-generation, AI-powered applications. We are looking for an experienced Staff Engineer for our SRE, InfraSec team, to guide the security of our ... with a strong focus on security work, with ideally 2+ years in a senior or staff engineering role Security Mindset: + A comprehensive understanding of all facets… more
    MongoDB (08/08/25)
  • Site Reliability Engineer

    Insight Global (Santa Clara, CA)
    …fast-paced Infrastructure, Planning and Processes organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-paced crew ... that develops and maintains sophisticated internal cloud provisioning products. The team works with various other business units such as Graphics Processors, Mobile Processors, Deep Learning, Artificial Intelligence and Driverless Cars to cater to their… more
    Insight Global (08/01/25)
  • Senior Manager, Network Site…

    NVIDIA (Santa Clara, CA)
    GeForce Now is looking for a Manager, Network Site Reliability Engineer (SRE) to enhance our network infrastructure and operations. We are looking for a leader ... be doing: + Cultivate a top-performing team of Network Site Reliability Engineers through encouraging a culture of collaboration, accountability, and technical… more
    NVIDIA (08/08/25)