• Staff Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... and deployment, open source cloud enabling technologies like Kubernetes and Public Cloud. SRE at NVIDIA ensures that our internal and external facing services run… more
    NVIDIA (11/01/25)
  • Staff, Software Engineer - DevOps

    Walmart (Sunnyvale, CA)
    …from fraudulent activity across our global transaction platforms. Our DevOps and SRE engineers ensure these systems are reliable, scalable, and secure, enabling ... and safe transactions at massive scale. **What You'll Do:** + Architect, implement, and optimize CI/CD pipelines and infrastructure-as-code solutions tailored for… more
    Walmart (12/05/25)
  • Site Reliability Engineer

    Nightwing (Sterling, VA)
    …position is CONTINGENT upon contract award** The Site Reliability Engineer (SRE) collaboratively works closely with the contract leadership, Platform teams, and ... new software or new features to production as quickly as possible. The SRE executes and analyzes manual IT operations/admin tasks (log analysis, performance tuning,… more
    Nightwing (12/26/25)
  • Sr. Manager - Enterprise Cloud & DevOps

    United Airlines (Chicago, IL)
    …integration and delivery (CI/CD) capabilities, and site reliability engineering (SRE) practices to support a secure, high-performing, and resilient technology ... + Lead and develop a high-performing team of platform, DevOps, and SRE engineers, fostering a culture of continuous improvement, collaboration, and technical… more
    United Airlines (12/03/25)
  • Senior ML Platform Engineer - Lepton

    NVIDIA (Santa Clara, CA)
    …era of machine learning innovation. In this role, you will architect, build, and scale our high-performance ML infrastructure using modern Infrastructure-as-Code ... the world's most powerful GPU systems. Join our top team and apply your SRE and software engineering skills to craft robust, user-friendly platforms for seamless ML… more
    NVIDIA (11/04/25)
  • Manager - Microsoft Infrastructure Consultant…

    Huron Consulting Group (Chicago, IL)
    …expanded leadership across AWS and GCP environments. You will architect secure landing zones, implement Infrastructure-as-Code and Policy-as-Code patterns, automate ... , **Bicep**, and **Azure Verified Modules (AVM)**. + Architect identity, networking, private connectivity, secrets management, backup/DR, and workload hosting… more
    Huron Consulting Group (12/09/25)
  • Head of Next-Generation Managed Services

    EPAM Systems (NY)
    …implementing AI-first strategies while leading multiple strategic client accounts. You'll architect next-generation managed services that integrate DevOps, SRE ... **Responsibilities** + Develop transformative managed services approaches integrating DevOps, SRE, and AI-powered automation + Design non-linear commercial models… more
    EPAM Systems (12/19/25)
  • Staff, Site Reliability Engineer

    Walmart (Bentonville, AR)
    …customer service platforms are resilient, scalable, and lightning-fast. You'll architect reliability frameworks, drive automation across incident response and ... observability, and collaborate with engineering and product teams to embed SRE principles into every layer of the stack. This role offers the excitement of solving… more
    Walmart (11/25/25)
  • Sr Principal Software Developer

    Oracle (Reston, VA)
    **Job Description** **Architect Operational Processes**: Design and implement scalable and automated operational processes for incident management, change ... cause analysis for critical issues. **Capacity and Performance Management**: Architect and implement systems to monitor, predict, and optimize infrastructure… more
    Oracle (12/31/25)
  • Engineering Manager - AI DevOps

    NVIDIA (Santa Clara, CA)
    …infrastructure, test automation (SDET), and Infrastructure as Code (IaC) + Architect and implement scalable test automation strategies for AI inference workloads, ... effectively. + Attain operational proficiency encompassing 24x7 on-call rotations, SRE methodologies, automated monitoring, and self-repairing systems to guarantee… more
    NVIDIA (12/26/25)