  • Lead Systems Engineer - (Hybrid)

    CareFirst (Reston, VA)
    …+ Service Mesh: Istio, AWS App Mesh, OpenShift Service Mesh etc. + Platform Monitoring, Observability, & Performance Tools: Data Dog, Cloud Watch etc. + DevOps ... of total system. Defines system support requirements to include monitoring, capacity, staffing and patching/updating. Analyzes and resolves program support…
    CareFirst (12/05/25)
  • (USA) Senior, Software Engineer

    Walmart (Bentonville, AR)
    …and Orchestration using Airflow and Kubernetes. + Hands-on experience with application monitoring and observability using Grafana, Prometheus, and Splunk to ... ML/Genai applications incorporating Data Pipelines, Model Training, Inferencing, Versioning and Monitoring. + Experience in cloud platforms such as GCP and Azure.…
    Walmart (11/23/25)
  • Principal Staff Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    …building for performance and reliability at global scale, covering automation, monitoring, high availability, capacity planning, and lifecycle management. + Define ... optimizations (SR-IOV/DPU) + Experience with Technologies like eBPF and XDP for Observability & DDoS mitigation + Collect and review system data for capacity and…
    NVIDIA (11/20/25)
  • Staff, Software Engineer - MLE, People.AI

    Walmart (Bentonville, AR)
    …Engineers, Product Managers, and MLOps/DevOps teams to streamline model deployment, monitoring, and lifecycle management. **What You Will Bring** + **Expert-level ... , including data ingestion, feature engineering, model training, deployment, and monitoring. + A solid understanding of **core machine learning principles**,…
    Walmart (11/14/25)
  • Senior Escalation Support Engineer

    Cisco (Austin, TX)
    …Cloud. ThousandEyes is also a foundational component of Cisco's growing Full-Stack Observability ("FSO") business. About The Role We're all familiar with the ... is tasked with empowering our customers with ThousandEyes to ease their performance monitoring pains. If you enjoy variety in job responsibilities, this is the job…
    Cisco (11/13/25)
  • Software Engineer III, Fulfillment…

    Wayfair (Boston, MA)
    …databases. + Proven ability to architect for performance, scalability, and observability in complex, distributed service environments. + Hands-on experience with ... modern software development practices, including CI/CD, test automation, and monitoring. + Experience collaborating cross-functionally with product management, operations,…
    Wayfair (11/05/25)
  • Senior Software Systems Engineer, Release…

    General Motors (Sunnyvale, CA)
    …reliability or stability regressions. + **Integrate data pipelines** for continuous monitoring of release health, including automated collection of test, simulation, ... or equivalent). + Prior experience implementing **ELT/ETL pipelines** for quality monitoring, reliability, or release metrics. + Solid understanding of **system…
    General Motors (10/28/25)
  • Engineer, Software

    Publicis Groupe (Chicago, IL)
    …ingestion, transformation, cleaning, and preprocessing. + Working understanding of observability, monitoring, and performance tuning for production systems; ... experience building CI/CD pipelines and contributing to DevOps practices. + Knowledge of modern architectural styles (DDD, event-driven, ports and adapters, microservices) and standard processes for scalable, maintainable systems. + Strong foundation in…
    Publicis Groupe (01/11/26)
  • Distinguished Engineer, GPU Fleet…

    NVIDIA (Santa Clara, CA)
    …lead the development of DGX Cloud strategy for GPU fleet lifecycle, health, observability and utilization monitoring, and remediation. You will define and drive ... the technical strategy across multiple environments (bare metal, cloud service provider, and neoclouds). Including defining and developing the auto-remediation strategies to detect, fix, validate, and restore-to-service critical systems. You will work with…
    NVIDIA (01/10/26)
  • Software Engineer II (SAP Technologies)

    Microsoft Corporation (Redmond, WA)
    …tech - Leverage SAP BTP, Azure, Kyma/Kubernetes, AI/ML, DevOps pipelines, and modern observability tools at massive scale. + Drive impact fast - Operate in an ... for production-like testing **Reliability & Supportability** + Act as DRI for monitoring system degradation and downtime on simple problems + Follow playbook to…
    Microsoft Corporation (01/10/26)