Staff Observability Operations Engineer
- CVS Health (Hartford, CT)
At CVS Health, we’re building a world of health around every consumer and surrounding ourselves with dedicated colleagues who are passionate about transforming health care.
As the nation’s leading health solutions company, we reach millions of Americans through our local presence, digital channels and more than 300,000 purpose-driven colleagues – caring for people where, when and how they choose in a way that is uniquely more connected, more convenient and more compassionate. And we do it all with heart, each and every day.
Company Overview:
CVS Health is a premier health innovation company driven by a clear purpose: helping people on their path to better health. We are pioneering a new approach to total health by making quality care more affordable, accessible, simple, and seamless. We are transforming health care by expanding access to innovative health solutions and leading the way with a patient-centric approach.
Position Summary:
We are currently seeking several experienced and highly skilled Staff Observability Operations Engineers with a strong background in Site Reliability Engineering (SRE), modern observability practices, and the management and implementation of modern observability and event management platforms. These roles are crucial in overseeing and optimizing our observability platform to ensure seamless and efficient operations. Responsibilities include deploying observability solutions, management and administration of observability and event management platforms, handling release management, system upgrades, patching, integrations, managing customer issues and requests, and troubleshooting incidents. Additionally, the roles involve continuous planning to enhance platform performance to support scalability and complexity. Successful candidates will play a key role in ensuring our observability infrastructure meets the current and future needs of CVS Health’s dynamic environment.
Key Responsibilities:
**Deployment and Implementation:** Deploy and implement modern observability solutions to meet organizational needs. Ensure successful integration of observability, event management, and notification tools and technologies within the existing environment. Work with partners to migrate legacy monitoring to modern solutions. Work with the observability engineering team to address new requirements as they arise, either by leveraging existing solutions or by developing new ones.
**Platform Management:** Manage and administer observability and event management platforms. Lead system upgrades, patching, and maintenance activities to ensure optimal performance and security.
**Release Management:** Coordinate and manage release cycles for observability platforms. Ensure smooth and timely releases with minimal disruption to services.
**Incident/Request Management:** Troubleshoot and resolve incidents related to observability platforms. Manage escalated customer issues and requests, ensuring timely and effective resolution. Document incident remediation activities to enable resolution by L1/L2 MSP partners; automate remediation activities where possible.
**Performance Optimization:** Continuously monitor and enhance platform performance to support scalability and complexity. Utilize telemetry data to automate performance optimization and capacity planning.
**Collaboration and Communication:** Collaborate with cross-functional infrastructure, application, and business stakeholders to ensure observability solutions align with the broader IT strategy and infrastructure requirements. Communicate effectively with team members, management, and other stakeholders.
**Continuous Improvement:** Identify opportunities for process optimization and efficiency gains. Stay current with industry trends and best practices to continuously improve observability operations.
**Customer Focus:** Ensure high levels of customer satisfaction by effectively managing customer relationships. Provide excellent customer service and support for observability solutions.
**Compliance and Security:** Ensure observability platforms comply with organizational policies and security standards. Implement tools and processes to detect and remediate configuration drift and security risks.
**Documentation and Reporting:** Maintain comprehensive documentation of observability platform configurations, processes, and procedures. Generate and analyze reports on platform performance and capacity.
**Training and Mentoring:** Provide training and mentoring to junior engineers, team members, and our MSPs. Share knowledge and best practices to enhance the overall capability of the team.
Required Skills and Qualifications:
Technical Expertise:
+ 7+ years of experience in IT operations, with significant responsibilities in system monitoring, performance tuning, and troubleshooting enterprise applications.
+ 5+ years in a Site Reliability Engineering (SRE) role deploying and managing modern observability solutions.
+ 5+ years managing and implementing observability and event management platforms (e.g., AppDynamics, Splunk, Prometheus, Grafana).
+ Experience developing and administering ServiceNow ITOM event management solutions, ensuring seamless integration with observability tools.
+ Experience deploying and managing service reliability platforms (e.g., xMatters, OpsGenie, PagerDuty), configuring incident notifications, incident command workflows, and automating incident remediation workflows.
+ Experience with and deep knowledge of cloud environments, cloud monitoring platforms, and container orchestration tools (e.g., AWS/CloudTrail, Azure/Monitor, GCP/GCM, Kubernetes, OpenShift).
+ Proficiency in Python and other scripting and automation tools such as Ansible, PowerShell, and Bash for automation and configuration. Experience with and passion for deploying things “as code”.
Solution Implementation and Platform Management:
+ Hands-on experience deploying, managing, and administering observability platforms.
+ Hands-on experience leading, coordinating, and performing migration of application, platform, and infrastructure observability solutions (e.g., full-stack APM, RUM, Session Replay, Server, Storage, Network, Database, NLB, etc.) from legacy tools to modern platforms.
+ Hands-on experience performing system upgrades, patching, and integrations to ensure platform stability and security.
+ Experience developing and implementing monitoring and logging standards for infrastructure, platforms, and applications.
+ Experience building and instrumenting dashboards to deliver technical and business process insights leveraging standard observability/BI platforms (e.g., AppDynamics, Grafana, Tableau, PowerBI).
+ Experience establishing and implementing event correlation policies and related rules to enrich event data, increase the signal-to-noise ratio for events, and reduce MTTD and MTTR.
Incident and Problem Resolution:
+ Excellent problem-solving skills, with the ability to handle multiple tasks, prioritize effectively, and work under pressure.
+ Proven ability to troubleshoot and resolve complex technical issues related to observability platforms.
+ Experience managing customer issues and requests, providing timely and effective solutions.
Performance Monitoring and Optimization:
+ Experience monitoring platform performance and implementing enhancements to support scalability and complexity.
+ Experience leveraging telemetry data to automate performance optimization and capacity planning.
+ Proficiency in scripting and automation tools (e.g., Ansible, PowerShell, Bash, Python) and data formats (YAML, XML, JSON) to automate deployment, configuration, and instrumentation.
Release and Configuration Management:
+ Experience coordinating and managing release cycles for observability platforms.
+ Knowledge of best practices in release management to ensure smooth and timely deployments.
+ Experience configuring and leveraging source code management tools and workflows to manage and deploy Monitoring as Code.
Collaboration and Communication:
+ Excellent communication skills, both verbal and written.
+ Ability to collaborate effectively with cross-functional teams and stakeholders.
+ Strong interpersonal skills, with the ability to engage effectively with both technical teams and business stakeholders.
Continuous Improvement:
+ Commitment to continuous improvement and staying current with industry trends and best practices.
+ Ability to identify opportunities for process optimization and efficiency gains.
Customer Focus:
+ Strong customer service orientation with the ability to manage customer relationships effectively.
+ Experience in providing excellent customer service and support for observability solutions.
Compliance and Security:
+ Knowledge of compliance and security standards related to observability platforms.
+ Ability to implement tools and processes to detect and remediate configuration drift and security risks.
+ Experience managing operational data and systems access to ensure compliance with internal and external audit and regulatory requirements.
Documentation and Reporting:
+ Proficiency maintaining comprehensive documentation of observability platform configurations, processes, and procedures.
+ Ability to generate and analyze reports on platform performance, incidents, and customer requests.
Preferred Certifications
+ ITIL 4 Practitioner: Monitoring and Event Management
+ DevOps Institute Observability Foundation
+ DevOps Institute Site Reliability Engineering Foundation or Practitioner
+ ServiceNow CIS-Event Management Implementer
+ ServiceNow Certified Application Developer
+ xMatters Integrator
Education
+ Bachelor's degree from an accredited university or equivalent work experience (HS diploma + 4 years relevant experience)
BUSINESS OVERVIEW
Bring your heart to CVS Health. Every one of us at CVS Health shares a single, clear purpose: Bringing our heart to every moment of your health. This purpose guides our commitment to deliver enhanced human-centric health care for a rapidly changing world. Anchored in our brand, with heart at its center, our purpose sends a personal message that how we deliver our services is just as important as what we deliver. Our Heart At Work Behaviors™ support this purpose. We want everyone who works at CVS Health to feel empowered by the role they play in transforming our culture and accelerating our ability to innovate and deliver solutions to make health care more personal, convenient and affordable. We strive to promote and sustain a culture of diversity, inclusion and belonging every day. CVS Health is an affirmative action employer, and is an equal opportunity employer, as are the physician-owned businesses for which CVS Health provides management services. We do not discriminate in recruiting, hiring, promotion, or any other personnel action based on race, ethnicity, color, national origin, sex/gender, sexual orientation, gender identity or expression, religion, age, disability, protected veteran status, or any other characteristic protected by applicable federal, state, or local law. We proudly support and encourage people with military experience (active, veterans, reservists and National Guard) as well as military spouses to apply for CVS Health job opportunities.
Join CVS Health as a Staff Observability Operations Engineer and contribute to our mission of driving health care innovation and delivering cutting-edge health solutions.
Pay Range
The typical pay range for this role is:
$130,295.00 - $260,590.00
This pay range represents the base hourly rate or base annual full-time salary for all positions in the job grade within which this position falls. The actual base salary offer will depend on a variety of factors including experience, education, geography and other relevant factors. This position is eligible for a CVS Health bonus, commission or short-term incentive program in addition to the base pay range listed above. This position also includes an award target in the company’s equity award program.
Our people fuel our future. Our teams reflect the customers, patients, members and communities we serve and we are committed to fostering a workplace where every colleague feels valued and that they belong.
Great benefits for great people
We take pride in our comprehensive and competitive mix of pay and benefits – investing in the physical, emotional and financial wellness of our colleagues and their families to help them be the healthiest they can be. In addition to our competitive wages, our great benefits include:
+ **Affordable medical plan options,** a **401(k) plan** (including matching company contributions), and an **employee stock purchase plan**.
+ **No-cost programs for all colleagues** including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching.
+ **Benefit solutions that address the different needs and preferences of our colleagues** including paid time off, flexible work schedules, family leave, dependent care resources, colleague assistance programs, tuition assistance, retiree medical access and many other benefits depending on eligibility.
For more information, visit https://jobs.cvshealth.com/us/en/benefits
We anticipate the application window for this opening will close on: 12/31/2025
Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state and local laws.