  • Senior AI Observability

    NVIDIA (Santa Clara, CA)
    NVIDIA's AI Infrastructure organization is seeking a Senior AI Observability Engineer to help architect and implement distributed observability systems ... productivity of AI and HPC workloads. You will develop, deploy, and operate observability solutions for multiple compute clusters around the world. What You'll Be… more
    NVIDIA (07/22/25)
  • Senior Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    …Design, implement and support operational and reliability aspects of large scale Observability & Telemetry collection platform with a focus on performance at scale, ... system in Production + 8+ years experience delivering foundational infrastructure and observability platforms. + Experience in one or more of the following: Python,… more
    NVIDIA (08/02/25)
  • Senior Staff Engineer, Server…

    MongoDB (Palo Alto, CA)
    …trust MongoDB to build next-generation, AI-powered applications. The Networking & Observability Team builds infrastructure for low-overhead observability and ... or building core components for data processing systems + Familiarity with observability ecosystem and best practice + Excellent verbal and written technical… more
    MongoDB (06/17/25)
  • Senior Staff Site Reliability…

    Palo Alto Networks (Santa Clara, CA)
    …and actionable insights into our systems' performance and health. **Your Impact** As a Senior Staff SRE with the Cortex Observability team, you will: + Cloud ... including the design, implementation, and continuous enhancement of our comprehensive observability systems. To meet the opportunities that such a role provides,… more
    Palo Alto Networks (07/15/25)
  • Senior Software Engineer - Site…

    General Motors (Mountain View, CA)
    …live and deliver a better future for generations to come. In this SRE SW Engineer role, you will develop and maintain key elements of the infrastructure health and ... let's innovate! **What You'll Do** + Implement scalable, reliable, secure SRE and Observability platform to monitor health of our production system and provide a… more
    General Motors (07/13/25)
  • Senior Software Engineer, AI…

    LinkedIn (Mountain View, CA)
    …to optimize their models and deliver the best performance possible. As a Senior Software Engineer, you will have first-hand opportunities to advance one ... performance optimizations across billions of user queries Model Training Infrastructure: As an engineer on the AI Training Infra team, you will play a crucial role… more
    LinkedIn (08/08/25)
  • Senior Site Reliability Engineer

    Palo Alto Networks (Santa Clara, CA)
    …including the design, implementation, and continuous enhancement of our comprehensive observability systems. To meet the opportunities that such a role provides, ... you will have a deep knowledge of modern observability and monitoring tools and practices, having managed high...our systems' performance and health. **Your Impact** As a Senior SRE with the Cortex Cloud Security Posture Management… more
    Palo Alto Networks (08/08/25)
  • Senior Distributed Golang Software…

    Cisco (San Jose, CA)
    Senior Distributed Golang Software Engineer, Isovalent Tetragon Team (US) Apply (https://jobs.cisco.com/jobs/Login?projectId=1444334) + Location: Offsite, San ... open-source software and enterprise solutions solving networking, security, and observability needs for modern cloud native infrastructure. The flagship technology,… more
    Cisco (08/15/25)
  • Senior Software Engineer

    Coinbase (Sacramento, CA)
    …to ensure company-wide system reliability and less customer impact. As a *Senior Software Engineer* you will help to promote reliability culture across ... a daily basis. *What you'll be doing (i.e., job duties):* * Improve observability, reliability and availability by defining and measuring key metrics * Build… more
    Coinbase (08/09/25)
  • Senior AI Engineer (LLM Core)

    Capital One (San Francisco, CA)
    Senior AI Engineer (LLM Core) **Overview:** At Capital One, we are creating responsible and reliable AI systems, changing banking for good. For years, Capital ... agreed upon number of hours to be regularly worked. Cambridge, MA: $158,600 - $181,000 for Senior AI Engineer McLean, VA: $158,600 - $181,000 for Senior AI … more
    Capital One (08/20/25)