-
Site Reliability Engineer
- Insight Global (Atlanta, GA)
-
Job Description
• Provide consulting services for improved system stability, availability, performance, and reliability.
• Assist in determining the impact of operational issues and provide input into their resolution via data extraction and quantification.
• Work through day-to-day support issues, ensure effective and timely resolution of issues in the production environment, and troubleshoot customer-impacting issues.
• Support multiple applications, specifically those running on Kubernetes/Gloo/AWS/Apigee/PCF/GCP/Java-based systems in an enterprise environment.
• Support Gloo running on Kubernetes, Apigee OPDK and SaaS, Grafana, Prometheus, Cassandra, Postgres, and Spring Boot or Java-based applications running on Kubernetes, PCF, and Java application servers.
• Apply GitOps principles to manage infrastructure and application configurations.
• Implement monitoring and create complex alerts and dashboards for production systems.
• Provide capacity and tuning analysis for Apigee and Java applications hosted on Linux and container platforms.
• Be available to provide 24x7 on-call support on a rotating basis with other team members.
• Lead efforts in troubleshooting, recovery, and root cause investigation.
• Perform analysis of user requirements and problems to automate or improve systems and review system capabilities, workflow, and scheduling limitations.
• Follow and develop detailed work plans, schedules, project estimates, resource plans, and status reports.
• Facilitate HA (High Availability) / DR (Disaster Recovery) exercises to ensure that the team is fully prepared for any event.
• Lead root cause analysis sessions to understand what causes issues in production, and produce an RCA report with solutions that will prevent them from happening in the future.
• Ensure documentation is created and remains up to date for any related work.
• Strong understanding of UNIX operating systems and any scripting language.
• Forecast and plan for a rapidly growing environment.
• Evaluate new software product and service solutions.
We are a company committed to creating inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity employer that believes everyone matters. Qualified candidates will receive consideration for employment opportunities without regard to race, religion, sex, age, marital status, national origin, sexual orientation, citizenship status, disability, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to [email protected]. The EEOC "Know Your Rights" Poster is available here (https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf).
To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/ .
Skills and Requirements
• Expertise in analyzing and troubleshooting large-scale distributed systems.
• Strong experience with Kubernetes container orchestration, Gloo, AWS, and the Apigee API gateway.
• Experience with REST, SOAP, and GraphQL API support.
• Experience with tools such as Git, GitLab, Docker, Postman, Splunk, AppDynamics, Imperva WAF, and CI/CD tools.
• Good experience with GitOps processes, performance measurement and tuning, capacity planning and management, contingency planning, and disaster recovery.
• Good understanding and strong experience with Unix/Linux operating systems.
• Ability to debug, optimize code, and automate routine tasks.
• Systematic problem-solving approach coupled with effective communication skills.
• Strong scripting knowledge and experience.
• Good understanding of networking, routing, and TLS/SSL.
Master's degree in Information Technology, Computer Science, Computer Information Systems, Computer Applications, a related field, or its equivalent and 3 years of relevant work experience.
Bachelor's degree in Information Technology, Computer Science, Computer Information Systems, Computer Applications, a related field, or its equivalent and 5 years of relevant work experience.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal employment opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment without regard to race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request to [email protected].
-