• Lead Infrastructure Engineer…

    Truist (Atlanta, GA)
    …description: We are seeking a highly skilled and forward-thinking lead observability engineer to architect, implement, and evolve enterprise-grade observability ... technologies including Prometheus, Grafana, Jaeger, and commercial APM solutions. You will lead the strategy for metrics, traces, and synthetic monitoring - enabling…
    Truist (07/15/25)
  • Lead DevOps Engineer

    Wolters Kluwer (Kennesaw, GA)
    …requiring 8 days per month at an approved Wolters Kluwer location. As a Lead DevOps Engineer, you will be responsible for leading the design and implementation of ... and partake in pilots/POC/technology evaluations. + Conduct work activities using SRE principles, such as resilience, metrics, capacity planning, toil, Incident…
    Wolters Kluwer (07/29/25)
  • Lead Director - Observability Engineering

    CVS Health (Atlanta, GA)
    …to ensure our systems remain stable and performant as we scale. The Lead Director of Observability Engineering is a critical leadership role within Solutions ... Leading Software Development teams developing and managing applications for IT operations, SRE, logging and/or observability, with at least 5 years in a leadership…
    CVS Health (07/09/25)
  • Cloud Site Reliability Engineer

    Ford Motor Company (Atlanta, GA)
    …for you. Ford is seeking an experienced and passionate Site Reliability Engineer (SRE) to join our team in developing, enhancing, and expanding our global monitoring ... maintainability of our critical cloud services. You'll be at the intersection of SRE and Software Development, building and driving the adoption of our global…
    Ford Motor Company (08/01/25)
  • Lead Infrastructure engineer - OpenShift…

    US Bank (Atlanta, GA)
    …resilient systems. DevOps Mindset: Experience working in DevOps or SRE environments, with a focus on continuous integration, continuous delivery, and ... infrastructure automation. Using tools like ArgoCD, Argo Rollouts, and Gitlab. Monitoring & Logging: Exposure to observability tools and practices for monitoring, logging, and alerting in distributed systems using Prometheus, Grafana, and DataDog.…
    US Bank (08/02/25)
  • Senior Software Engineer - Site Reliability…

    General Motors (Roswell, GA)
    …where we live and deliver a better future for generations to come. In this SRE SW Engineer role, you will develop and maintain key elements of the infrastructure ... innovate! What You'll Do: + Implement scalable, reliable, secure SRE and Observability platform to monitor health of our... + You have a story to tell how you lead and influence cross-organization effort to improve uptime to…
    General Motors (07/13/25)
  • Director, Platform Engineering - Cloud Operations

    Cargill (Atlanta, GA)
    …Job Purpose and Impact: The Director, Cloud Platform Operations will build and lead Cargill's Cloud Infrastructure team. The team will enhance reliability and the ... define the strategy, build the team, and mature the Operations and Platform SRE capability in close partnership with our Cloud Platform, Data Platform and…
    Cargill (08/14/25)
  • Staff Software Engineer - Site Reliability…

    General Motors (Roswell, GA)
    …per week. The Role: The Software Engineering Site Reliability Engineer (SRE) is a Software Engineer responsible for ensuring the reliability, ... Additional Job Description. What You'll Do: + Implement scalable, reliable, secure SRE and Observability platform to monitor health of our production system…
    General Motors (06/24/25)
  • Director, Next-Generation Managed Services

    EPAM Systems (Atlanta, GA)
    …client accounts. You'll architect next-generation managed services that integrate DevOps, SRE principles, and intelligent automation to transform how we deliver ... Responsibilities: + Develop transformative managed services approaches integrating DevOps, SRE, and AI-powered automation + Design non-linear commercial models…
    EPAM Systems (08/21/25)
  • Principal Customer Experience Engineer

    Microsoft Corporation (Atlanta, GA)
    …(KPIs) in the current platforms. + Work collaboratively across CRE to utilize SRE practices to improve our systems, processes, and platform reliability. this ... the Customer Reliability Engineering (CRE) team to support Site Reliability Engineering (SRE) practices. The goal is to enhance the reliability, scalability, and…
    Microsoft Corporation (08/14/25)