Sr SRE (Ansible)
- Insight Global (Bellevue, WA)
Job Description
We're building automated disaster recovery failover solutions to ensure high availability across this enterprise telecom company's critical infrastructure. This is a fast-paced, high-impact role where you'll design and implement failover automation spanning applications, databases, and network layers.
We're looking for quick learners who deliver fast and leverage AI-assisted development to accelerate outcomes. You’ll work closely with database teams, application owners, and network engineers to build a robust automation framework that supports multi-database failover, network rerouting, and application-level resilience. This person will assist in the effort to create a push-button failover system that enables real-time disaster recovery across the company's critical applications. You will help create a dashboard-driven automation suite that empowers teams to manage failovers with confidence, reduce toil, and improve customer experience during outages.
Responsibilities
• Build Ansible playbooks and GitLab CI/CD pipelines for automated failover workflows, eventually migrating to Ansible Automation Platform (AAP) as the failover orchestration layer
• Independently onboard applications into the failover framework—gather requirements, understand architecture, and implement with minimal handholding from app teams
• Automate database failover (Oracle, MongoDB, PostgreSQL, MSSQL) and messaging systems
• Integrate with CyberArk, HashiCorp Vault, and F5 load balancers (GTM/LTM)
• Create ServiceNow change automation and observability dashboards
• Proactively engage application owners and drive conversations to unblock delivery
• Design and implement observability solutions—build monitoring dashboards, alerting, and health-check mechanisms to provide real-time visibility into failover readiness and execution
• Recommend and establish best practices—evaluate current processes, identify gaps, and propose improvements for failover patterns, automation standards, and operational runbooks
• Document everything—create clear, comprehensive technical documentation, architecture diagrams, runbooks, and onboarding guides that enable team scalability and knowledge transfer
• Build reusable automation frameworks—develop modular, maintainable automation components that can be extended across applications and environments
You Are
• Self-driven – You take ownership, find answers yourself, and don't wait to be told what to do next
• Fast learner – You ramp quickly on new tools and ecosystems with minimal guidance
• Independent operator – You can engage app teams directly, extract what you need, and fill gaps through your own research
• Delivery-focused – You ship iteratively and thrive in ambiguity
• Relationship builder – You build trust with stakeholders and drive conversations forward
• Strong communicator – You document well and proactively flag blockers
• Continuous improver – You don't just execute; you identify what's broken and propose better ways of doing things
• Knowledge sharer – You believe documentation is a first-class deliverable, not an afterthought
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to [email protected]. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Skills and Requirements
5+ years in SRE, DevOps, or Infrastructure Automation
Ansible & GitLab CI/CD expertise
Python/Bash scripting; strong YAML skills
AWS and Kubernetes experience
Familiarity with secret management (CyberArk, Vault)
Experience using AI coding tools (Claude, Copilot, ChatGPT) to accelerate delivery
Strong documentation skills: ability to translate complex systems into clear, actionable guides
F5 GTM/LTM and network failover experience
Chaos engineering background
ServiceNow automation experience
Telecom or large enterprise environment experience
Experience with observability platforms (Splunk, Dynatrace, Grafana, Prometheus)
Track record of establishing automation standards and best practices in enterprise environments