SRE/MLOps Engineer
- SAIC (VA)
Description
We are seeking a versatile **SRE/MLOps Engineer with DevSecOps expertise** to design, automate, and operate secure, scalable, and repeatable **model deployment workflows** across the AI/ML Common Services environment. This role bridges **infrastructure reliability, CI/CD automation, and model operations**, enabling IRS mission teams to move from experimentation to production with confidence.
The engineer will not only support **ML lifecycle operations** (Databricks, MLflow, AWS SageMaker/Bedrock) but also bring **DevSecOps rigor** to ensure compliance, monitoring, and infrastructure-as-code are embedded in every step. By partnering with Infrastructure, Security, and Architecture teams, this role ensures the AAP environment is **resilient, automated, and compliance-ready** at enterprise scale.
Key Responsibilities:
+ Enable **secure, scalable, and repeatable** deployment workflows for both ML models and supporting infrastructure.
+ Build and maintain **runtime environments, service accounts, and orchestration logic** for Databricks, MLflow, and AWS AI services.
+ Implement and maintain **CI/CD pipelines** (Bitbucket, Bamboo, Jenkins, or equivalent) for code, data, and model deployments.
+ Apply **DevSecOps practices** by integrating security scans, compliance checks, and audit logging into deployment pipelines.
+ Collaborate with **Infrastructure DSO** and **Solutions Architect** to integrate Terraform-based IaC for consistent, automated provisioning.
+ Implement **observability, alerting, and logging** (CloudWatch, Datadog, Prometheus) to monitor both application and ML workloads.
+ Align infrastructure with ML lifecycle needs, including staging, promotion, rollback, retraining, and compliance-aware tracking.
+ Develop **automation templates, reusable workflows, and guardrails** to accelerate onboarding of mission team models while ensuring security.
+ Contribute to **incident response, performance tuning, and reliability engineering** across ML and non-ML workloads.
Qualifications
Required Qualifications:
+ Bachelor’s or master’s degree in computer science, data engineering, or a related technical discipline.
+ 5+ years of experience in **Site Reliability Engineering, DevOps, or MLOps** with production-grade systems.
+ Must be a U.S. Citizen with the ability to obtain and maintain a Public Trust security clearance.
+ Hands-on experience with **Databricks, MLflow, or AWS SageMaker/Bedrock** for ML model lifecycle operations.
+ Strong proficiency in **Terraform, CI/CD pipelines**, and container orchestration (Docker, Kubernetes).
+ Experience implementing **security automation** (e.g., IaC scanning, container security, SAST/DAST tools) within CI/CD workflows.
+ Solid understanding of **observability stacks** (logs, metrics, tracing) and operational best practices.
Desired Skills:
+ Active IRS clearance highly desired.
+ Experience in **federal or regulated environments** with security, audit, and compliance requirements (FedRAMP, NIST 800-53).
+ Knowledge of **Trustworthy AI monitoring** (bias detection, drift monitoring, explainability).
+ Familiarity with **Unity Catalog, Delta Lake, and data pipeline orchestration** in Databricks.
+ Hands-on experience with **Zero Trust security models** and secure boundary implementations.
+ Relevant certifications such as:
    + **Databricks Certified Machine Learning Professional.**
    + **AWS DevOps Engineer – Professional.**
    + **Certified Kubernetes Administrator (CKA).**
    + **Security+ or equivalent security cert.**
Target salary range: $120,001 - $160,000. The estimate displayed represents the typical salary range for this position based on experience and other factors.
REQNUMBER: 2508971
SAIC is a premier technology integrator, solving our nation's most complex modernization and systems engineering challenges across the defense, space, federal civilian, and intelligence markets. Our robust portfolio of offerings includes high-end solutions in systems engineering and integration; enterprise IT, including cloud services; cyber; software; advanced analytics and simulation; and training. We are a team of 23,000 strong, driven by mission, united by purpose, and inspired by opportunity. Headquartered in Reston, Virginia, SAIC has annual revenues of approximately $6.5 billion. For more information, visit saic.com. For information on the benefits SAIC offers, see Working at SAIC. EOE AA M/F/Vet/Disability