• Senior Software Engineer - Release…

    Celonis (Los Angeles, CA)
    …experience. **The Role:** **Why This Role Matters:** As a Senior Software Engineer (Release Infrastructure), you'll take ownership of core systems that impact ... controlled, reliable, progressive rollouts. + **Scale reliability**: Enhance observability, incident response, and deployment safety across Celonis's software…
    Celonis (07/18/25)
  • Sr. IT Operations Engineer

    SpaceX (Hawthorne, CA)
    …approve complex changes, and enforce post‑implementation reviews. + Architect and optimize monitoring and observability strategy across on‑prem, cloud, and SaaS ... Sr. IT Operations Engineer Hawthorne, CA Apply SpaceX was founded under...corporate ITSM (Information Technology Service Management), incident, change, and monitoring practices. You'll own high‑visibility incidents, mentor peers, and…
    SpaceX (09/02/25)
  • Staff Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    …infrastructure as code (Terraform, Pulumi, Ansible, CloudFormation). + Proficient in observability and monitoring tools (Prometheus, Grafana, ELK Stack, ... production environments. We are seeking a deeply skilled Staff Site Reliability Engineer (SRE) to advance our enterprise security initiatives around identity and…
    NVIDIA (09/30/25)
  • Senior Software Engineer, GraphQL…

    Wayfair (Mountain View, CA)
    Senior Software Engineer, GraphQL Platforms Candidates for this position are preferred to be based in Mountain View, CA and will be expected to comply with their ... and ensure services are built to industry best practices, including observability, architectural patterns, and inter-team dependency mechanisms (like SLOs) +…
    Wayfair (09/27/25)
  • Senior Software Engineer - Infrastructure

    Amazon (San Francisco, CA)
    …analytics, reporting, and ML workloads. - Ensure data quality, reliability, and observability through monitoring, logging, and automated alerting. - Work closely ... interactive video and we are seeking a Data Infrastructure Engineer to design, build, and maintain the systems that...Strong hands-on expertise with AWS cloud - Background in observability and monitoring - Knowledge of data…
    Amazon (09/13/25)
  • Senior Software Engineer - Inference as…

    NVIDIA (Santa Clara, CA)
    …you can make a lasting impact on the world. We are seeking a Senior Software Engineer to join our Software Infrastructure Team in Santa Clara, CA. This team is at ... high availability of inference services. + Implement APIs for model deployment, monitoring, and management for a seamless user experience. + Collaborate with…
    NVIDIA (08/23/25)
  • Principal Software Engineer - Inference…

    NVIDIA (Santa Clara, CA)
    …you can make a lasting impact on the world. We are seeking a Principal Software Engineer to join our Software Infrastructure Team in Santa Clara, CA. This team is at ... of inference services. + Define and implement APIs for model deployment, monitoring, and management for a seamless user experience. + Optimize system performance…
    NVIDIA (08/21/25)
  • Senior Site Reliability Engineer

    Coinbase (Sacramento, CA)
    …systems capable of handling high throughput and low latency * Experience with observability and monitoring systems such as Kibana, Datadog, etc. * Familiarity ... in Q3 2023. *What you'll be doing (i.e., job duties):* * Improve observability, reliability and availability by defining and measuring key metrics * Build automation…
    Coinbase (08/09/25)
  • Principal Engineer Data Engineering - US…

    Anywhere Real Estate (San Francisco, CA)
    …+ 5+ years' experience managing production data platforms. + 5+ years' experience building observability (Monitoring & Alerting) using tools such as Datadog and ... and AA & AB Operations. We're seeking a Principal Engineer to join our Data Platform Team. In this...Snowflake ETLs. + Production Support and enhancements to the observability of the Data Platform. **Team Leadership and Mentorship:**…
    Anywhere Real Estate (10/01/25)
  • Senior Manager, Machine Learning Engineer

    Cisco (San Jose, CA)
    Senior Manager, Machine Learning Engineer - ML Ops Apply (https://jobs.cisco.com/jobs/Login?projectId=1448871) + Location: San Jose, California, US + Area of ... Design and oversee scalable LLMOps pipelines including fine-tuning, evaluation, deployment, monitoring, and optimization of large language models. Work closely with…
    Cisco (09/18/25)