• Customer Reliability Engineer

    Cisco (San Jose, CA)
    …One, Datadog, GitLab, Google, and many more. **Your Impact** As a Technical Consulting Engineer, you are the tip of the spear in interacting with our customers. Our ... CRE team adapts the best practices of Site Reliability Engineering (SRE) and applies them to our customers....a proactive approach vs a reactive approach to customer reliability and you will use existing data to help… more
    Cisco (12/06/25)
  • Staff Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... Cloud. SRE at NVIDIA ensures that our internal and external facing services run with maximum reliability and uptime as promised to the users and at the same time enabling… more
    NVIDIA (11/01/25)
  • Senior Site Reliability Engineer

    ServiceNow, Inc. (San Diego, CA)
    It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ... engineers who are tasked with maintaining and developing the reliability, scalability and performance of the ServiceNow infrastructure. The...as a company and the SRE role. **As an Engineer on the SRE team you will:** + Provide… more
    ServiceNow, Inc. (11/18/25)
  • Sr. Site Reliability Engineer

    Amazon (Culver City, CA)
    …all levels. Our Infrastructure Engineering team is looking for Sr Site Reliability Engineers to build, deploy, operate, and sustain our critical infrastructure and ... systems in AWS. The team will operationalize the stability and reliability of these systems and discover innovative ways to scale and operate them reliably as we… more
    Amazon (12/09/25)
  • Senior Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... internal and external facing GPU cloud services run with maximum reliability and uptime as promised to the users and...be doing: + Design, implement and support operational and reliability aspects of large scale Observability & Telemetry collection… more
    NVIDIA (12/19/25)
  • Senior Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... internal and external facing GPU cloud services run with maximum reliability and uptime as promised to the users and...be doing: + Design, implement and support operational and reliability aspects of large scale Kubernetes clusters with focus… more
    NVIDIA (11/05/25)
  • Sr Staff Site Reliability Engineer

    Palo Alto Networks (Santa Clara, CA)
    …Citizen or Green Card holder.** **Your Career** We are seeking development-heavy Site Reliability Engineers (SREs) who are passionate about bringing new ideas to all ... ensure applications align with infrastructure requirements, focusing on scalability and reliability + Collaborate with PMs to deliver compliances (SOC2, FedRAMP,… more
    Palo Alto Networks (12/12/25)
  • Undergrad Site Reliability Engineer

    Oracle (Sacramento, CA)
    …will be joining the OCSC (Oracle Cloud Service Centre) as an SRD (site reliability developer). Your job role will be helping Oracle ensure the availability of cloud ... experiencing both development and operations. As a Cloud Service Centre Site Reliability Developer Intern you will be involved with: **Operations** + Administer… more
    Oracle (11/25/25)
  • Principal Network Reliability

    Oracle (Sacramento, CA)
    **Job Description** The mission of our Network Reliability Engineering team is to provide exceptional network reliability and automation services that enable our ... customers to drive operational excellence in OCI networks at scale. By focusing on both reactive and proactive functions, we aim to minimize downtime, quickly resolve incidents, and continuously enhance network performance through automation, advanced… more
    Oracle (12/01/25)
  • Principal Staff Site Reliability

    NVIDIA (Santa Clara, CA)
    …NTP/PTP, DHCP, and LDAP. This includes building for performance and reliability at global scale, covering automation, monitoring, high availability, capacity ... architectures and identify opportunities for containerization to improve scalability, reliability, and efficiency. + Strong analytical skills with the ability… more
    NVIDIA (11/20/25)