  • Senior Site Reliability

    NVIDIA (Santa Clara, CA)
    …once they are live by measuring and monitoring availability, latency and overall system health. + Scale systems sustainably through mechanisms like automation, ... time enabling developers to make changes to the existing system through careful preparation and planning while keeping an... systems by pushing for changes that improve reliability and velocity + Practice sustainable incident response and… more
    NVIDIA (10/02/25)
  • Senior Site Reliability

    Coinbase (Sacramento, CA)
    …improvements. * Educate, mentor and hold accountable the engineering team to improve the reliability of our systems and make reliability a core value ... platform - and with it, the future global financial system. To achieve our mission, we're seeking a very ... you'll be doing (i.e., job duties): * Improve observability, reliability and availability by defining and measuring key metrics… more
    Coinbase (08/09/25)
  • Senior Site Reliability

    Rubrik (Sacramento, CA)
    … and services with the objective of achieving and exceeding availability and reliability goals * Manage and streamline monitoring systems to enhance ... enable teams at Rubrik to develop secure software and protect data and systems with appropriate security controls. Information Security also develops systems to… more
    Rubrik (08/20/25)
  • Principal Staff Site Reliability

    NVIDIA (Santa Clara, CA)
    …NTP/PTP, DHCP, and LDAP. This includes building for performance and reliability at global scale, covering automation, monitoring, high availability, capacity ... efficiency of services and drive efficiency with software and hardware optimizations (SR-IOV/DPU) + Experience with Technologies like eBPF and XDP for Observability… more
    NVIDIA (08/21/25)
  • Senior Site Reliability

    LiveRamp (San Francisco, CA)
    …issues with Engineering teams + Set up and maintain Infrastructure & Product Reliability monitoring and alerting + Maintain and enhance CI/CD Tooling and ... DynamoDB + Optimize the performance and cost of the systems and rightsize Kubernetes containers. + Work in close ... code, and automate routine tasks + Experience with securing systems in a public cloud environment + Understands how… more
    LiveRamp (08/07/25)
  • Site Reliability Engineer

    Insight Global (Santa Clara, CA)
    …fast-paced Infrastructure, Planning and Processes organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-paced crew ... and Driverless Cars to cater to their infrastructure & systems needs. As an SRE, you'll also be working ... Science, Information Technology, or related field, or equivalent experience. - System admin and Windows admin experience in an on… more
    Insight Global (09/09/25)
  • Sr. Reliability Engineering…

    Celestica (San Jose, CA)
    …with a background in the medical, telecommunication, or defense sectors. + Certified Reliability Engineer (CRE) certification is preferred. + Expertise in Design ... the company's strategic direction. **Overview:** We are seeking a Sr. Reliability Engineering Consultant to join our ... and driving continuous improvements. A strong background in the reliability of complex electronic systems and their… more
    Celestica (09/18/25)
  • Reliability Engineering Manager

    Teledyne (El Segundo, CA)
    …issues to senior leadership. **Supervisory Responsibilities** Directly manage the Reliability Department Staff: Reliability Engineer(s) and ... data. + Manage the Failure Review and Corrective Action System (FRACAS) and ensure timely resolution of reliability ... related problems. + Must communicate concise program status to senior management. + Must be able to communicate and… more
    Teledyne (09/23/25)
  • Senior Manager, Network Site…

    NVIDIA (Santa Clara, CA)
    GeForce Now is looking for a Manager, Network Site Reliability Engineer (SRE) to enhance our network infrastructure and operations. We are looking for a leader ... be doing: + Cultivate a top-performing team of Network Site Reliability Engineers through encouraging a culture of collaboration, accountability, and technical… more
    NVIDIA (08/08/25)
  • Distinguished Software Engineer

    LinkedIn (Mountain View, CA)
    …architectural transformations at internet-scale companies + Deep knowledge of systems reliability, observability frameworks, and fault-tolerant architecture ... in Sunnyvale, CA or San Francisco, CA. **Responsibilities** + Serve as a senior technical leader driving the long-term reliability and observability strategy… more
    LinkedIn (09/24/25)