• Software Engineer, Machine Learning…

    DoorDash (San Francisco, CA)
    …excited about this opportunity because you will + Build a world-class ML platform where models are developed, trained, and deployed seamlessly + Work closely with ... Data Scientists and Product Engineers to evolve the ML platform as per their use cases + You will...of DoorDash's business. + Improve the reliability, scalability, and observability of our training and inference infrastructure. We're excited… more
    DoorDash (07/09/25)
  • Senior Site Reliability Engineer - FedRAMP

    Rubrik (Sacramento, CA)
    …and reliability goals * Manage and streamline monitoring systems to enhance observability and enable proactive identification of issues. * Coordinate and manage ... data management best practices and strong experience in any logging and/or SIEM platform * Experience with Vault, Terraform, Puppet, Jenkins and Github * Proficiency… more
    Rubrik (08/20/25)
  • Staff Software Engineer - Compute…

    LinkedIn (Mountain View, CA)
    …Specifically, the LinkedIn Kubernetes Infrastructure team provides an on-premises Kubernetes platform for the entire company. The team provides capability to ... systems with security and compliance in mind. + You will improve the observability and understandability of various systems with a focus on improving developer… more
    LinkedIn (09/24/25)
  • Principal Site Reliability Engineer (Prisma…

    Palo Alto Networks (Santa Clara, CA)
    …robust and performant. This includes automation, architecture, performance, observability, troubleshooting, security, and reliability. Our Infrastructure Platform ... stack includes Terraform, Kubernetes, GitLab CI/CD, GitOps, Prometheus, Grafana, Loki, Docker, GCP, Backstage, MySQL, PagerDuty, FireHydrant, Python, Bash, Java, NodeJS and Go. **Your Impact** + Design, build, and operate reliable, secure Cloud infrastructure… more
    Palo Alto Networks (09/06/25)
  • Principal Staff Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    …optimizations (SR-IOV/DPU) + Experience with Technologies like eBPF and XDP for Observability & DDoS mitigation + Collect and review system data for capacity and ... field, or equivalent experience + 15+ years of proven experience in compute platform engineering with a focus on automation. + Experience in designing and deploying… more
    NVIDIA (08/21/25)
  • Senior Site Reliability Engineer

    LiveRamp (San Francisco, CA)
    **LiveRamp is the data collaboration platform of choice for the world's most innovative companies. A groundbreaking leader in consumer privacy, data ethics, and ... Go programming language.** + **Experience with SRE best practices, working knowledge of observability principles is a big plus** + **Ability to lead and mentor other… more
    LiveRamp (08/07/25)
  • Senior DGX Cloud Software Engineer

    NVIDIA (Santa Clara, CA)
    …compute infrastructure and codify reliability best-practices in the broader DGX Cloud platform ecosystem. What you'll be doing: + Design, build, and run cloud ... internal facing service level objectives and error budgets as part of our overall observability strategy. + Eliminate toil or automate it where the ROI of building… more
    NVIDIA (07/26/25)
  • Sr. Customer Success Engineer

    Dynatrace (Mountain View, CA)
    …you will love being a Dynatracer** + Dynatrace is a leader in unified observability and security. + We provide a culture of excellence with competitive compensation ... other leading partners worldwide to create strategic alliances. + The Dynatrace platform uses cutting-edge technologies, including our own Davis hypermodal AI, to… more
    Dynatrace (07/26/25)
  • Principal Hardware Engineer - Hardware…

    Cadence Design Systems, Inc. (San Jose, CA)
    …Cadence's Hardware Emulation Cloud to develop scalable and secure monitoring platform and processes to improve operations. Key Responsibilities: + Implement ... monitoring framework to improve infrastructure reliability, observability, and alerts. + Identifying and implementing automation opportunities to reduce manual work… more
    Cadence Design Systems, Inc. (07/10/25)
  • Hybrid Multi Cloud Network Architect

    Rubrik (Palo Alto, CA)
    …you're passionate about making an impact and shaping the future of cloud and platform infrastructure-we'd love to hear from you. **About the Role:** We are seeking ... environment (GCP, AWS, Azure, and OCI). As part of our Global Infrastructure & Platform Services organization, you will serve as the subject matter expert (SME) and… more
    Rubrik (08/14/25)