- NVIDIA (Santa Clara, CA)
- Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... and deployment, open source cloud enabling technologies like Kubernetes and Public Cloud. SRE at NVIDIA ensures that our internal and external facing services run…
- Walmart (Sunnyvale, CA)
- …from fraudulent activity across our global transaction platforms. Our DevOps and SRE engineers ensure these systems are reliable, scalable, and secure, enabling ... and safe transactions at massive scale. **What You'll Do:** + Architect, implement, and optimize CI/CD pipelines and infrastructure-as-code solutions tailored for…
- Nightwing (Sterling, VA)
- …position is CONTINGENT upon contract award** The Site Reliability Engineer (SRE) collaboratively works closely with the contract leadership, Platform teams, and ... new software or new features to production as quickly as possible. The SRE executes and analyzes manual IT operations/admin tasks (log analysis, performance tuning,…
- United Airlines (Chicago, IL)
- …integration and delivery (CI/CD) capabilities, and site reliability engineering (SRE) practices to support a secure, high-performing, and resilient technology ... + Lead and develop a high-performing team of platform, DevOps, and SRE engineers, fostering a culture of continuous improvement, collaboration, and technical…
- NVIDIA (Santa Clara, CA)
- …era of machine learning innovation. In this role, you will architect, build, and scale our high-performance ML infrastructure using modern Infrastructure-as-Code ... the world's most powerful GPU systems. Join our top team and apply your SRE and software engineering skills to craft robust, user-friendly platforms for seamless ML…
- Huron Consulting Group (Chicago, IL)
- …expanded leadership across AWS and GCP environments. You will architect secure landing zones, implement Infrastructure-as-Code and Policy-as-Code patterns, automate ..., **Bicep**, and **Azure Verified Modules (AVM)**. + Architect identity, networking, private connectivity, secrets management, backup/DR, and workload hosting…
- EPAM Systems (NY)
- …implementing AI-first strategies while leading multiple strategic client accounts. You'll architect next-generation managed services that integrate DevOps, SRE ... **Responsibilities** + Develop transformative managed services approaches integrating DevOps, SRE, and AI-powered automation + Design non-linear commercial models…
- Walmart (Bentonville, AR)
- …customer service platforms are resilient, scalable, and lightning-fast. You'll architect reliability frameworks, drive automation across incident response and ... observability, and collaborate with engineering and product teams to embed SRE principles into every layer of the stack. This role offers the excitement of solving…
- Oracle (Reston, VA)
- **Job Description** **Architect Operational Processes**: Design and implement scalable and automated operational processes for incident management, change ... cause analysis for critical issues. **Capacity and Performance Management**: Architect and implement systems to monitor, predict, and optimize infrastructure…
- NVIDIA (Santa Clara, CA)
- …infrastructure, test automation (SDET), and Infrastructure as Code (IaC) + Architect and implement scalable test automation strategies for AI inference workloads, ... effectively. + Attain operational proficiency encompassing 24x7 on-call rotations, SRE methodologies, automated monitoring, and self-repairing systems to guarantee…