Director - Enterprise Site Reliability…
- United Airlines (Chicago, IL)
Achieving our goals starts with supporting yours. Grow your career, access top-tier health and wellness benefits, build lasting connections with your team and our customers, and travel the world using our extensive route network.
Come join us to create what’s next. Let’s define tomorrow, together.
Description
United's Digital Technology team comprises many talented individuals working together with cutting-edge technology to build the best airline in the history of aviation. Our team designs, develops and maintains massively scaling technology solutions brought to life with innovative architectures, data analytics, and digital solutions.
Job overview and responsibilities
The Director of Enterprise Site Reliability and Enablement is responsible for driving operational excellence through data-driven insights, real-time dashboarding, and reliability initiatives. This role ensures IT systems operate efficiently, with a strong focus on incident response, observability, and Mean Time to Resolve (MTTR) improvement. The ideal candidate will have a deep understanding of IT operations, data analytics, and performance monitoring tools to proactively enhance service reliability and decision-making.
Operational Leadership:
+ Provide clear executive decision-making and priority management; coach the team and build its confidence in making sound business decisions using technology and analytical thinking; proactively plan for, communicate, and mitigate risks across stakeholders
+ Achieve operational excellence and a superior user experience by building a high-performing team that meets and exceeds its goals and objectives
+ Drive continuous improvement of infrastructure, tools, and processes, working with cross-functional teams in support of campaigns and projects, analytics, reporting, and business intelligence
+ Work seamlessly with other Digital Technology and business unit leaders to architect and build best-in-class solutions and experiences
Operational Data Analytics & Dashboarding:
+ Develop and manage real-time dashboards to visualize IT performance, system health, and reliability metrics
+ Leverage data analytics to identify trends, detect anomalies, and drive continuous improvements in IT operations
+ Standardize reporting processes for IT operations KPIs, including MTTR, uptime, SLAs, and incident volume
+ Implement AI/ML-driven analytics to predict and prevent IT failures before they impact business operations
Reliability & Incident Management:
+ Lead initiatives to improve IT system reliability, reducing downtime and optimizing service performance
+ Drive MTTR improvement strategies by enhancing incident response processes, automation, and root cause analysis (RCA)
+ Implement observability solutions to provide end-to-end visibility across infrastructure, applications, and services
+ Collaborate with engineering and DevOps teams to optimize system performance and availability
IT Operations Strategy & Continuous Improvement:
+ Define and execute strategies for operational resilience, ensuring high availability and performance
+ Introduce process automation and AIOps solutions to enhance IT efficiency and reduce manual effort
+ Align IT operations with business objectives, ensuring proactive issue resolution and continuous service optimization
Collaboration & Leadership:
+ Work cross-functionally with infrastructure, DevOps, security, and business teams to drive operational excellence
+ Present data-driven insights to executive leadership, influencing IT strategy and decision-making
+ Foster a culture of accountability, innovation, and continuous learning within IT operations
Qualifications
What’s needed to succeed (Minimum Qualifications):
+ Bachelor's degree in Computer Science, Information Systems, or related field
+ 10+ years of experience in IT operations, data analytics, observability, or related fields
+ ITIL v4 Certification
+ Site Reliability Engineering (SRE) Certification
+ Certified Kubernetes Administrator (CKA)
+ AWS/Azure/GCP Cloud Certifications
+ Data Analytics Certifications (e.g., Google Data Analytics, Microsoft Certified: Power BI Data Analyst)
+ Proven ability to maintain a high level of client service
+ Expertise in IT performance monitoring tools (e.g., Splunk, Grafana, Datadog, New Relic, Dynatrace)
+ Strong knowledge of incident management, ITIL best practices, and service reliability engineering (SRE) principles
+ Hands-on experience with data visualization platforms (e.g., Power BI, Tableau, Looker)
+ Proven track record of reducing MTTR and improving system reliability through data-driven initiatives
The base pay range for this role is $155,895.00 to $212,410.00.
The base salary range/hourly rate listed is dependent on job-related, non-discriminatory factors such as experience, education, and skills. This position is also eligible for bonus and/or long-term incentive compensation awards.
You may be eligible for the following competitive benefits: medical, dental, vision, life, accident & disability, parental leave, employee assistance program, commuter, paid holidays, paid time off, 401(k) and flight privileges.
United Airlines is an equal opportunity employer. United Airlines recruits, employs, trains, compensates and promotes regardless of race, religion, color, national origin, gender identity, sexual orientation, physical ability, age, veteran status and other protected status as required by applicable law. Equal Opportunity Employer - Minorities/Women/Veterans/Disabled/LGBT.
We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process and to perform crucial job functions. Please contact [email protected] to request accommodation.