- Truist (Atlanta, GA)
- …description:** We are seeking a highly skilled and forward-thinking lead observability engineer to architect, implement, and evolve enterprise-grade observability ... technologies including Prometheus, Grafana, Jaeger, and commercial APM solutions. You will lead the strategy for metrics, traces, and synthetic monitoring - enabling… more
- Wolters Kluwer (Kennesaw, GA)
- …requiring 8 days per month at an approved Wolters Kluwer location._** As a Lead DevOps Engineer, you will be responsible for leading the design and implementation of ... and partake in pilots/POC/technology evaluations. + Conduct work activities using SRE principles, such as resilience, metrics, capacity planning, toil, Incident… more
- CVS Health (Atlanta, GA)
- …to ensure our systems remain stable and performant as we scale. The Lead Director of Observability Engineering is a critical leadership role within Solutions ... Leading Software Development teams developing and managing applications for IT operations, SRE, logging and/or observability, with at least 5 years in a leadership… more
- Ford Motor Company (Atlanta, GA)
- …for you. Ford is seeking an experienced and passionate Site Reliability Engineer (SRE) to join our team in developing, enhancing, and expanding our global monitoring ... maintainability of our critical cloud services. You'll be at the intersection of SRE and Software Development, building and driving the adoption of our global… more
- US Bank (Atlanta, GA)
- …resilient systems.** **DevOps Mindset:** Experience working in DevOps or SRE environments, with a focus on continuous integration, continuous delivery, and ... infrastructure automation. Using tools like ArgoCD, Argo Rollouts, and GitLab. **Monitoring & Logging:** Exposure to observability tools and practices for monitoring, logging, and alerting in distributed systems using Prometheus, Grafana, and Datadog.… more
- General Motors (Roswell, GA)
- …where we live and deliver a better future for generations to come. In this SRE SW Engineer role, you will develop and maintain key elements of the infrastructure ... innovate! **What You'll Do** + Implement scalable, reliable, secure SRE and Observability platform to monitor health of our...+ You have a story to tell how you lead and influence cross-organization effort to improve uptime to… more
- Cargill (Atlanta, GA)
- …**Job Purpose and Impact** The Director, Cloud Platform Operations will build and lead Cargill's Cloud Infrastructure team. The team will enhance reliability and the ... define the strategy, build the team, and mature the Operations and Platform SRE capability in close partnership with our Cloud Platform, Data Platform and… more
- General Motors (Roswell, GA)
- …per week._ **_The Role:_** The Software Engineering Site Reliability Engineer (SRE) is a Software Engineer responsible for ensuring the reliability, ... **Additional Job Description** **What You'll Do** + Implement scalable, reliable, secure SRE and Observability platform to monitor health of our production system… more
- EPAM Systems (Atlanta, GA)
- …client accounts. You'll architect next-generation managed services that integrate DevOps, SRE principles, and intelligent automation to transform how we deliver ... **Responsibilities** + Develop transformative managed services approaches integrating DevOps, SRE, and AI-powered automation + Design non-linear commercial models… more
- Microsoft Corporation (Atlanta, GA)
- …(KPIs) in the current platforms. + **Work collaboratively across CRE to utilize SRE practices to improve our systems, processes, and platform reliability**. this ... the Customer Reliability Engineering (CRE) team to support Site Reliability Engineering (SRE) practices. The goal is to enhance the reliability, scalability, and… more