Site Reliability Engineer
TEKsystems (McLean, VA)
Description
Site Reliability & Observability Engineer (SRE)
Overview
We are seeking a Site Reliability Engineer (SRE) with a strong foundation in full-stack observability to ensure high availability, scalability, and reliability across modern distributed systems. This role combines deep expertise in infrastructure automation, observability platforms, and data-driven reliability engineering to deliver predictable, resilient, and self-healing environments at enterprise scale.
Core Responsibilities
• Design, implement, and maintain SRE principles focused on achieving five-nines (99.999%) availability across mission-critical systems.
• Build and operate cloud-native infrastructure leveraging Kubernetes (K8s) and leading public cloud platforms (AWS, Azure, GCP).
• Architect and deploy observability frameworks using tools like Datadog, Splunk, New Relic, AppDynamics, Instana, and Catchpoint.
• Develop automation pipelines for CI/CD, infrastructure provisioning, and configuration management using Terraform, Ansible, and Blue/Green deployments.
• Integrate and optimize data and messaging systems such as Kafka, RabbitMQ, SQS, and Redis.
• Perform proactive capacity planning, incident analysis, and system tuning to drive continuous improvement.
• Collaborate with security, development, and operations teams to embed observability and reliability into every layer of the stack.
• Develop Python scripts and automation frameworks for monitoring, data collection, and self-remediation workflows.
• Contribute to evolving SRE practices, runbooks, and post-incident reviews to foster a blameless, learning-oriented culture.
Required Skills & Experience
Reliability & Availability
• SRE design principles, resiliency engineering, fault isolation, predictive failure analysis.
• Experience delivering and maintaining five-nines (99.999%) availability SLAs in production.
Cloud & Container Platforms
• Hands-on expertise with Kubernetes (K8s) clusters in production.
• Strong understanding of public cloud ecosystems (AWS, Azure, GCP).
Observability & Monitoring
• Advanced proficiency in Datadog, Splunk, New Relic, AppDynamics, Instana, or Catchpoint.
• Experience implementing end-to-end observability (metrics, logs, traces, alerts, dashboards).
• Familiarity with AIOps and predictive monitoring approaches.
Data & Messaging Systems
• Expertise in Kafka, RabbitMQ, SQS, and Redis.
• Understanding of OLTP systems and distributed data architectures.
Automation & CI/CD
• Deep experience with Terraform, Ansible, and automated deployment pipelines.
• Exposure to Blue/Green or Canary deployments for zero-downtime releases.
Security & Infrastructure
• Proficiency with Linux, networking, and firewall configuration.
• Strong understanding of network observability, API gateways, and TLS/SSL.
Languages & Scripting
• Proficiency in Python or equivalent scripting language for automation and data analysis.
Preferred Qualifications
• Prior experience with multi-cloud architectures or hybrid deployments.
• Knowledge of site reliability metrics (SLOs, SLIs, error budgets).
• Familiarity with incident response automation and chaos engineering.
• Background in software engineering with focus on scalable backend systems.
Skills
Python, Linux, CI/CD, Jenkins, Terraform, Automation, Ansible
Additional Skills & Qualifications
You’ll be at the intersection of infrastructure, data, and reliability, ensuring our systems are observable, automated, and self-correcting — the foundation for true cloud-native excellence.
Experience Level
Expert Level
Pay and Benefits
The pay range for this position is $65.00 - $75.00/hr.
Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to specific elections, plan, or program terms. If eligible, the benefits available for this temporary role may include the following:
• Medical, dental & vision
• Critical Illness, Accident, and Hospital
• 401(k) Retirement Plan – Pre-tax and Roth post-tax contributions available
• Life Insurance (Voluntary Life & AD&D for the employee and dependents)
• Short and long-term disability
• Health Spending Account (HSA)
• Transportation benefits
• Employee Assistance Program
• Time Off/Leave (PTO, Vacation or Sick Leave)
Workplace Type
This is a hybrid position in McLean, VA.
Application Deadline
This position is anticipated to close on Oct 9, 2025.
About TEKsystems:
We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe and Asia. As an industry leader in Full-Stack Technology Services, Talent Services, and real-world application, we work with progressive leaders to drive change. That's the power of true partnership. TEKsystems is an Allegis Group company.
The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information or any characteristic protected by law.
About TEKsystems and TEKsystems Global Services
We’re a leading provider of business and technology services. We accelerate business transformation for our customers. Our expertise in strategy, design, execution and operations unlocks business value through a range of solutions. We’re a team of 80,000 strong, working with over 6,000 customers, including 80% of the Fortune 500 across North America, Europe and Asia, who partner with us for our scale, full-stack capabilities and speed. We’re strategic thinkers, hands-on collaborators, helping customers capitalize on change and master the momentum of technology. We’re building tomorrow by delivering business outcomes and making positive impacts in our global communities. TEKsystems and TEKsystems Global Services are Allegis Group companies. Learn more at TEKsystems.com.