Alerted.org | Alerted.org - Powering better job alerts

Site Reliability Engineer

Insight Global (Plymouth, PA)

Job Description

We’re looking for a Site Reliability Engineer to help keep our applications running smoothly, reliably, and efficiently. You’ll work behind the scenes to monitor performance, automate operations, and support large-scale systems across cloud platforms like AWS and GCP. If you enjoy solving problems, improving systems, and working with modern tools like Terraform, Kubernetes, and Dynatrace, this role is for you.

This is a hybrid role with 4 days onsite per week. Location options are Boca Raton, FL; Blue Bell, PA; or Irving, TX.

We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to [email protected]. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.

Skills and Requirements

5+ years of experience in Site Reliability Engineering or similar roles.

Experience supporting production applications.

Strong background in monitoring, incident response, troubleshooting, and system performance tuning, including leading post-mortem analysis to prevent future problems.

Ability to automate infrastructure and operations using Terraform, Ansible, and languages such as Python or Java.

Experience managing and optimizing Kubernetes clusters for scalable deployments.

Experience supporting cloud infrastructure across AWS and GCP, ensuring systems are secure, stable, and cost-efficient.

Experience managing distributed systems and dynamic cloud infrastructure.

Comfortable working in complex environments and solving ambiguous problems.

Exposure to post-mortem analysis and reliability-focused engineering practices.

Strong communication skills and ability to work cross-functionally.

Experience with Dynatrace, Prometheus, or similar observability tools.

Familiarity with CI/CD pipelines and automation best practices.
