"Alerted.org


  • Site Reliability Engineer

    Cognizant (Arizona City, AZ)



    About the role

     

    As a **Site Reliability Engineer**, you will make an impact by designing and implementing advanced observability solutions tailored for distributed edge computing environments. You will be a valued member of the **Technology & Engineering** team and collaborate closely with infrastructure, application, and DevOps teams to ensure system reliability across remote facilities and centralized platforms.

    In this role, you will:

    + Design and implement observability frameworks for edge environments, including monitoring, logging, tracing, and metrics collection

    + Define and maintain SLIs, SLOs, and business KPIs to measure and improve system reliability

    + Build dashboards, visualizations, and alerting systems for real-time insights and incident response

    + Implement distributed tracing and log aggregation to troubleshoot complex edge issues

    + Collaborate with engineering teams to embed observability best practices in resource-constrained environments

    + Drive proactive issue detection and resolution, reducing MTTD and MTTR across distributed systems

    + Lead incident postmortems and implement observability-driven improvements

    + Develop automation tools and scripts to enhance observability pipelines

    + Optimize data storage and querying strategies for performance and scalability

    + Stay current with emerging observability tools and trends, especially for edge computing

     

    Work model: On-site

     

    This is an onsite position requiring presence at a Cognizant or client location in Arizona City, Arizona and/or Scottsdale, Arizona. We strive to provide flexibility wherever possible and support a healthy work-life balance through our wellbeing programs.

     

    The working arrangements for this role are accurate as of the date of posting. This may change based on the project you’re engaged in, as well as business and client requirements. Rest assured, we will always be clear about role expectations.

     

    Applicants may be required to attend interviews in person or by video conference. In addition, candidates may be required to present their current state- or government-issued ID during each interview.

    What you need to have to be considered:

    + 8+ years of overall experience with the technologies

    + 3–5 years of experience in service reliability/operations for large-scale hybrid environments

    + 3–5 years of experience in automation scripting and dashboard development for performance monitoring

    + 2–4 years of experience with programming languages such as Go, Python, Java, or Rust

    + Working knowledge of databases like Oracle, SQL Server, Redis, ClickHouse, PostgreSQL, MongoDB, or time-series databases

    + At least 2 years of experience with cloud platforms and containerization (GCP, AWS, Azure, Rancher, OpenShift)

    + Experience maintaining containerized apps in GKE/RKE/AKS environments

    + Hands-on experience implementing observability using OpenTelemetry (OTEL)

    + Experience with GraphQL frameworks (Apollo, Prisma, Hasura)

    + Strong understanding of networking protocols (TCP/IP, HTTP, DNS, Load Balancing, Service Mesh)

    These will help you stand out:

    + Proven experience managing 24/7 high-availability platforms for critical applications

    + Familiarity with monitoring tools like Splunk, AppDynamics, Grafana/Prometheus, Dynatrace

    + Experience with CI/CD tools and platforms (Rally, Confluence, etc.)

    + Hands-on experience with Redis and in-memory caching solutions

    + Strong debugging skills across integrated platforms and API gateways

    + Experience with GCS, Cloud SQL, Spanner, and Firestore

    + Background in enterprise-level infrastructure and operations

    + Expertise in Linux/Windows administration and distributed systems

    + Experience monitoring and troubleshooting HashiCorp Vault environments

    + Working knowledge of Vertex AI, Gen AI, and BigQuery

    Benefits:

    Cognizant offers the following benefits for this position, subject to applicable eligibility requirements:

    + Medical/Dental/Vision/Life Insurance

    + Paid holidays plus Paid Time Off

    + 401(k) plan and contributions

    + Long-term/Short-term Disability

    + Paid Parental Leave

    + Employee Stock Purchase Plan

     

    Cognizant is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to sex, gender identity, sexual orientation, race, color, religion, national origin, disability, protected Veteran status, age, or any other characteristic protected by law.

     





© 2025 Alerted.org