Site Reliability Engineer II
- Insight Global (Irving, TX)
Job Description
The Site Reliability Engineering (SRE) team provides leadership, direction, and accountability for
building and running large-scale software systems. As a Site Reliability Engineer II, you will identify
and deliver automation solutions designed to ensure high availability and resiliency using your
expertise in software development, complexity analysis, and scalable system design. Strong
collaboration skills will be required to work closely with other engineering teams to ensure
services/systems are highly stable and performant, meeting the expectations of our business partners
and end users.
- Partner with the architecture and development teams on how to make applications highly available, reliable, and performant at global scale
- Collaborate with the architecture team to ensure reliability factors are accounted for in business features and enablers
- Guide development teams in understanding established service level objectives and consequences, and implementing appropriate SLIs to support the objectives
- Collaborate with development team members to swarm, troubleshoot, and resolve problems
- Guide ad-hoc teams to brainstorm solutions and build implementation plans based on the Root Cause Analysis of production issues
- Design and build automated solutions to optimize application/service/platform uptime with minimal human intervention
- Be available for an on-call rotation to participate in troubleshooting and communication efforts outside of normal business hours
- Implement and help create standards and best practices, and mentor other team members to drive adoption across development teams
We are a company committed to creating inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity employer that believes everyone matters. Qualified candidates will receive consideration for employment opportunities without regard to race, religion, sex, age, marital status, national origin, sexual orientation, citizenship status, disability, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please submit a request through the Human Resources Request Form (https://airtable.com/app21VjYyxLDIX0ez/shrOg4IQS1J6dRiMo). The EEOC "Know Your Rights" Poster is available here (https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf).
To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/
Skills and Requirements
- Expert in defining, implementing, and evaluating Service Level Objectives (SLO) and Service Level Indicators (SLI), and associated consequences
- Software development expertise in two or more high-level programming and scripting languages
- Experience in evolutionary database design, query performance analysis, and indexing as a cornerstone for delivering scalable, performant products and services
- Experience in designing, building, and optimizing automated pipelines with automated testing and automated security controls
- Experience in performing Root Cause Analysis and Problem Management
- Experience working in Agile Scrum teams with demonstrated success leading improvements (getting better/faster/happier)
- Help establish and maintain a culture of learning through the development and sharing of skills, knowledge, processes, and tools; combat traditional silos that create "us and them" environments
- A driving passion for finding solutions to hard problems at scale and operationalizing them
- Exceptional critical thinking and communication skills, with a passion for leveraging documentation as a tool for constant improvement
- 3-5 years of experience in software development and test automation required
- 3-5 years of web development experience strongly preferred
- Bachelor's degree in related field or equivalent experience required
- Master's degree in related field preferred

Pipeline Automation: Azure DevOps (YAML, ARM), Terraform, Jenkins, Chef, Octopus Deploy
Code Scanning: SonarQube, Checkmarx
Source Code repos: Git
Containerization: Azure Kubernetes Service, Kubernetes (open source), Docker
High level programming languages: Java, C# (.NET MVC and .NET Core), Go
Scripting: PowerShell, Bash
Database: Oracle, Microsoft SQL Server, NoSQL (e.g., CosmosDB)
Test Automation: Xamarin.UITest, SpecFlow, DevTest, Selenium, Test Data Manager, Postman, Maven, TestNG, JMeter
Operating systems: Windows, Linux
Cloud Platforms: Azure
Metrics and Monitoring: Splunk
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal employment opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment without regard to race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request to [email protected].