AWS Site Reliability Engineer
- Insight Global (Reston, VA)
Job Description
We are seeking an experienced and motivated AWS Cloud Site Reliability Engineer (SRE) to join our dynamic team. As an AWS Site Reliability Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our cloud infrastructure on Amazon Web Services (AWS). The ideal candidate will have a strong background in AWS services, a deep understanding of infrastructure as code, and a passion for implementing best practices in site reliability engineering. The AWS Site Reliability Engineer (SRE) will collaborate closely with cross-functional teams, including development, quality assurance, and operations, to ensure seamless software releases and continuous improvement of our release processes.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to [email protected]. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Skills and Requirements
-Secret clearance or higher
-AWS SME
-Kubernetes SME (EKS)
-Troubleshooting in EKS
-Check the health
-Measure levels
-Manage security
-Helm Charts
-Create templates for deployments (PROD vs Non PROD)
-Why do you use?
-Configuration management
-Tagging for all AWS resources
-Naming standard
-Optional tags
-Database tags
-Env tags (PROD or Non PROD)
-Splunk – for security logging
-Datadog – for monitoring and analytics (lots of Datadog)
-Systemic monitoring
-SRE background mixed with SW Eng
-Feature flags
-DORA metrics
-Progressive delivery
-Java Spring Boot