Site Reliability Engineer - Azure
Insight Global (Chandler, AZ)
Job Description
The resource will join a team responsible for the reliability and support of container platforms (OpenShift) on-premises and in external clouds (Microsoft Azure, AWS, Google). Responsibilities include monitoring and troubleshooting alerts and incidents on these platforms, handling any required Incident and Problem Management, and supporting application onboarding and troubleshooting throughout the lifecycle. The role requires weekend on-call coverage and shift coverage as part of a 24x7 Global Ops team.
The resource will liaise regularly with teammates and shift leads and, as part of support duties, will routinely interact with platform clients and vendors.
BS/MS degree in Computer Science or a related technical field involving systems, or equivalent practical experience.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to [email protected]. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Skills and Requirements
5+ years of hands-on experience supporting Kubernetes/OpenShift/RKE/EKS container platforms.
Experience with Python, Ansible, Golang, and shell scripting.
Kubernetes/OpenShift/Terraform certifications are a plus.
Strong experience in major services related to Compute, Storage, Network, and Security.
Experience with monitoring tools like Prometheus and Dynatrace, as well as cloud-native tools like Azure Monitor and Log Analytics.
Strong understanding of, and background working with, a complex IAM infrastructure, including Active Directory, Azure AD Connect, Azure AD, and Ping Identity or other SSO solutions.
Advanced knowledge of Linux OS, DNS, DHCP, Kerberos, and Windows Authentication.
Experience with CI/CD tools (Git, Jenkins) and the GitOps model.
Excellent understanding of Linux/Windows operating systems administration.
Experience in container security and vulnerability remediation.
Systematic problem-solving approach, sense of ownership, and drive.
Ability to juggle competing priorities and adapt to changes in project scope.
Excellent interpersonal, organizational, and communication (written, verbal, and presentation) skills are a must.
Proven ability to work independently with minimal supervision and as part of a team with direct responsibilities.
Experience in OpenShift, RKE, and CSP Kubernetes services such as AKS and EKS.
Experience in Terraform, ArgoCD, Tekton, and Knative technologies.
Experience in agile deployment methodologies (GitOps).
Knowledge of various container runtimes.
Familiarity with the operator deployment pattern.
Experience working in a highly available multi-datacenter environment.
Experience working with monitoring tools such as Prometheus, Splunk, Dynatrace, Sysdig, or similar tools.
Understanding of cost management, inventory management, and the FinOps model.