  • Senior Site Reliability Engineer…

    NVIDIA (Santa Clara, CA)
    …+ Design, implement and support operational and reliability aspects of large scale Observability & Telemetry collection platform with a focus on performance at ... Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to ... Production + 8+ years experience delivering foundational infrastructure and observability platforms. + Experience in one or more of…
    NVIDIA (12/19/25)
  • Senior Product Manager…

    NVIDIA (Santa Clara, CA)
    observability (DCGM, NVML, etc.) and integration into large‑scale telemetry systems. + Deep knowledge of AI/ML infrastructure, high‑performance computing (HPC), ... right in the center of this revolution. Resiliency and Observability are key to delivering customer value and exhilarating ... GPU hardware, network, and software stack, along with the telemetry signals that reveal them, and how they correlate…
    NVIDIA (01/06/26)
  • Principal Observability Customer Success…

    Amazon (San Francisco, CA)
    …to build innovative solutions for their most complex challenges. Today, AWS's observability services are critical for customers running modern applications at scale. ... The insights provided by AWS' full stack observability solutions help detect, investigate, and remediate problems faster, and coupled with AI and ML, proactively…
    Amazon (12/17/25)
  • Senior Product Manager

    PagerDuty (San Francisco, CA)
    …experience. + Demonstrated fluency with data analysis or analytics products (telemetry, observability, post-incident review). + Proven success shipping features ... in a flexible, award-winning workplace. PagerDuty is seeking a Senior Product Manager, Incident Analysis to join our talented, ... products trusted by some of the world's top DevOps, SRE, and digital operations teams. The ideal candidate thrives…
    PagerDuty (12/16/25)
  • Senior System Software Engineer, Firmware

    NVIDIA (Santa Clara, CA)
    …infrastructure for bare metal provisioning, testing and bringup. + Knowledge of SRE principles (observability, SLOs, logging, etc.). + Strong experience in ... We are looking for an outstanding architect for a Senior System Engineer role for system bringup and datacenter ... such as TPM, TXT, and SecureBoot. + Exposure to telemetry catalog and observability stack. + Exposure…
    NVIDIA (01/10/26)
  • Senior Software Engineer - Platform…

    Confluent (Sacramento, CA)
    …One Team. One Data Streaming Platform. **About the Role:** We are seeking a Senior Software Engineer II to architect, build, and operate services that are core to ... services (authentication, authorization, identity, secrets management, policy enforcement, security telemetry pipelines, etc.), while also ensuring these systems are…
    Confluent (12/12/25)
  • Senior Manager, Software Engineering…

    NVIDIA (Santa Clara, CA)
    …external partners to facilitate product adoption + Track metrics and make telemetry-based informed decisions with stakeholder alignment + Expand the visibility of ... communicate and collaborate with cross-functional teams such as Product, Research, SRE, security, sales, marketing, PLC, security teams. + Strong understanding of…
    NVIDIA (01/10/26)
  • Senior Staff Engineer

    Nutanix (San Jose, CA)
    …systems, high availability, and multi-site replication design. + Experience with **observability, telemetry, and AIOps** for large-scale platforms. Additional: + ... Proven ability to work across cross-functional engineering, product, and SRE teams. + Excellent system design documentation and architecture diagramming skills. +…
    Nutanix (12/18/25)
  • DDOS Software Engineer

    Oracle (Sacramento, CA)
    …OCI's edge. - Contribute to scalable data and control planes (policy, signaling, telemetry, orchestration) with a focus on resiliency and fault isolation. - Help ... policy) with OCI networking, DNS, and edge services under guidance from senior engineers. - Participate in operational readiness: support SLO/SLA tracking, on-call…
    Oracle (01/09/26)
  • Infrastructure Operations Lead - Cloud…

    Humana (Sacramento, CA)
    …healthcare systems and compliance frameworks (HIPAA, HITRUST) + Proficiency with observability and telemetry platforms (e.g., Splunk, Dynatrace, SolarWinds) and ... functions around Cloud compliance, metrics/reporting and cost optimization + Provide senior level expertise on decisions and priorities regarding the enterprises…
    Humana (12/13/25)