• Staff Software Engineer

    DataRobot (San Francisco, CA)
    …(e.g., Python, Java, Go, C++, or equivalent). + Strong understanding of observability and monitoring: metrics, tracing, logging; and instrumentation of services. ... their business - today and in the future. As a Staff Software Engineer focused on Application Scalability & Performance, you will lead the design, implementation,…
    DataRobot (10/18/25)
  • Senior Staff Infrastructure Operations…

    Zscaler (San Jose, CA)
    …operations, DevOps, or site reliability roles + Demonstrated expertise in system observability, including log analysis and metrics monitoring using tools like ... strategy. We are seeking an experienced Senior Staff Infrastructure Operations Engineer to join our team. This critical role involves designing, implementing,…
    Zscaler (10/17/25)
  • Senior MLOps Engineer

    NVIDIA (Santa Clara, CA)
    …ML to support data preparation, model training, validation, deployment, and monitoring. + Develop observability frameworks to monitor performance, utilization, ... NVIDIA is seeking a Senior MLOps Engineer to help design and scale the infrastructure that powers our AI research and product development. In this role, you will…
    NVIDIA (10/03/25)
  • Site Reliability Engineer - Platform

    Coinbase (Sacramento, CA)
    …systems capable of handling high throughput and low latency * Experience with observability and monitoring systems such as Kibana, Datadog, etc. * Familiarity ... projects within the context of strong support and mentorship. * Improve observability, reliability and availability by defining and measuring key metrics * Build…
    Coinbase (11/14/25)
  • Full Stack Software Engineer (Starlink)

    SpaceX (Hawthorne, CA)
    SpaceX was founded under the belief that a future where humanity is out exploring the stars is ... of enabling human life on Mars. FULL STACK SOFTWARE ENGINEER (STARLINK) At SpaceX we're leveraging our experience in ... in both Angular.js and Next.js/React) + Focus on continuous monitoring and alerting to foster data-driven business decisions and…
    SpaceX (10/30/25)
  • Distinguished Software Engineer (Data…

    Palo Alto Networks (Santa Clara, CA)
    …across the organization + **Operational Health**: Define and implement advanced observability, monitoring, and alerting strategies to ensure the end-to-end ... Summary** At Palo Alto Networks, we are redefining cybersecurity. As a Distinguished Engineer on the Enterprise DLP team, you will be the foremost technical leader…
    Palo Alto Networks (11/14/25)
  • Lead DevOps Engineer

    Lumen (Sacramento, CA)
    …with change management and production deployment standards. 3. Reliability and Monitoring *Establish observability practices (metrics, logging, tracing) and ... the world and shape the future. **The Role** The Lead DevOps Engineer is responsible for designing, implementing, and maintaining scalable, secure, and automated…
    Lumen (11/13/25)
  • Senior DevOps Engineer

    Eliassen Group (Sacramento, CA)
    …CI/CD processes, and integrating new tools into the DevOps environment. + **Monitoring and observability:** Implement and manage advanced monitoring, ... **Senior DevOps Engineer** **Anywhere** **Type:** Contract **Category:** Engineer **Industry:** Financial Services **Workplace Type:** Remote **Reference ID:**…
    Eliassen Group (10/24/25)
  • Principal Software Engineer - AI Systems

    Walmart (Sunnyvale, CA)
    …at scale. + Ensure models and agents are production-ready with strong observability, monitoring, and performance optimization. **2. Architecture & Scalability** ... integration**. + Familiarity with resilience engineering: disaster recovery, failover, monitoring, and high availability. + Exposure to **multi-modal AI** (text,…
    Walmart (10/11/25)
  • Principal Site Reliability Engineer (TDP)

    Palo Alto Networks (Santa Clara, CA)
    …infrastructure and is one of the largest GCP customers. As a Principal Site Reliability Engineer for the TDP team, you will be part of a team supporting the services ... running on this infrastructure. This includes automation, architecture, performance, observability, troubleshooting, security, and reliability. Our Infrastructure Platform stack…
    Palo Alto Networks (11/04/25)