  • Sr. Staff Software Development Engineer

    Zscaler (San Jose, CA)
    …agility with a cloud-first strategy. We're looking for an experienced Senior Staff Development Engineer to join our team. This role is hybrid and based in our San ... and overall architecture + Experience with graph databases such as Neo4j, alongside observability tools like Prometheus, Grafana, and logging systems such as the ELK…
    Zscaler (09/25/25)
  • Senior Software Engineer

    Identiv (Santa Ana, CA)
    …standards such as FIDO2, and Zero Trust Architecture. Experience with monitoring and observability tools (e.g., OpenTelemetry, Prometheus, Grafana, ELK Stack) ... and take delight in the journey to solutions. Position Summary As a Senior Software Engineer, you will play a key role on a focused engineering team driving the…
    Identiv (09/12/25)
  • Senior Systems Software Engineer

    NVIDIA (Santa Clara, CA)
    …the center of this revolution. We are seeking a motivated Senior Systems Software Engineer to join our AV Infrastructure organization and become a key driver in ... to support AV software builds, large-scale simulation testing, and real-time observability. + Innovate developer tooling and automation frameworks to mitigate…
    NVIDIA (09/11/25)
  • Forward Deployed Engineer, AI Accelerator

    NVIDIA (Santa Clara, CA)
    NVIDIA is seeking a Forward Deployed Engineer to join our AI Accelerator team, working directly with strategic customers to implement and optimize pioneering AI ... and optimize large-scale model training and inference workloads, implement monitoring solutions, and resolve scaling challenges + Integration Development: Build…
    NVIDIA (09/02/25)
  • Base Command Manager Engineer - Nvis NPI

    NVIDIA (Santa Clara, CA)
    …and monitoring tools (e.g., Prometheus, Grafana, DCGM, and similar observability stacks). + Outstanding written and verbal communication skills, with the ability ... the world. We are seeking a dedicated Base Command Manager (BCM) Engineer to support product deployments/escalations and collaborate with Engineering and our Field…
    NVIDIA (08/24/25)
  • Principal Staff Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    …building for performance and reliability at global scale, covering automation, monitoring, high availability, capacity planning, and lifecycle management. + Define ... optimizations (SR-IOV/DPU) + Experience with technologies like eBPF and XDP for observability and DDoS mitigation + Collect and review system data for capacity and…
    NVIDIA (11/20/25)
  • Staff Software Engineer - Notification…

    General Motors (Mountain View, CA)
    …lifecycle software development, from design and implementation to deployment and monitoring. + Provide technical leadership across multiple teams, ensuring alignment ... data flow across services. + Champion automation in testing, deployment, and monitoring to improve development velocity and system reliability. + Guide incident…
    General Motors (11/15/25)
  • Senior Software Systems Engineer, Release…

    General Motors (Sunnyvale, CA)
    …reliability or stability regressions. + **Integrate data pipelines** for continuous monitoring of release health, including automated collection of test, simulation, ... or equivalent). + Prior experience implementing **ELT/ETL pipelines** for quality monitoring, reliability, or release metrics. + Solid understanding of **system…
    General Motors (10/28/25)
  • Principal Hardware Engineer - Hardware…

    Cadence Design Systems, Inc. (San Jose, CA)
    …platform and processes to improve operations. Key Responsibilities: + Implement monitoring framework to improve infrastructure reliability, observability, and ... alerts. + Identify and implement automation opportunities to reduce manual work and accelerate delivery. + Drive technical decisions on architecture, automation, and tooling. + Develop processes to track and scale key metrics for reliability,…
    Cadence Design Systems, Inc. (10/08/25)
  • Principal Site Reliability Engineer (Prisma…

    Palo Alto Networks (Santa Clara, CA)
    …robust and performant. This includes automation, architecture, performance, observability, troubleshooting, security, and reliability. Our Infrastructure Platform ... and automation frameworks**, championing **Infrastructure as Code (IaC)** and **Monitoring as Code (MaC)** principles. + **Automate robust deployments** and…
    Palo Alto Networks (10/07/25)