• Senior Azure DevOps Engineer

    EPAM Systems (San Jose, CA)
    …balancer, DNS, etc. + Understand the concepts of Site Reliability Engineering (SRE) to maximize automation, reduce waste, increase scale, and apply systemic thinking ... + Ability to express ideas effectively in individual and group situations (including non-verbal communication), adjusting language or terminology to the characteristics and needs of the audience + Ability to listen effectively to others and give constructive…
    EPAM Systems (08/21/25)
  • Senior Software Engineer

    Microsoft Corporation (Mountain View, CA)
    …quality. + 1+ year(s) of experience applying site-reliability engineering (SRE) practices, including monitoring, incident response, and improving system resilience. ... Software Engineering IC4 - The typical base pay range for this role across the US is USD $119,800 - $234,700 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and…
    Microsoft Corporation (08/21/25)
  • Senior Network Operations Engineer - Layer…

    ServiceNow, Inc. (Santa Clara, CA)
    …(e.g., Azure, AWS, GCP). + Partner with the Site Reliability Engineering (SRE) team to improve operational processes and reliability. + Review, consult, and ... prepare for planned changes and releases to the production environment. + Create and maintain detailed documentation of infrastructure, automation, and standard operating procedures. + Provide feedback to infrastructure architects and contribute to design…
    ServiceNow, Inc. (08/21/25)
  • Senior Product Manager - Observability…

    NVIDIA (Santa Clara, CA)
    …is built. From healthcare research applications to autonomous vehicles, or voice-recognition systems, there is a need to simplify and deliver predictability ... propose novel approaches and shape new proof‑of‑concepts. + Bridge development, SRE, and partner teams. Facilitate clear communication, triage emergent issues…
    NVIDIA (08/19/25)
  • Senior Observability Customer Success…

    Amazon (Mountain View, CA)
    …serverless) - Experience with DevOps practices and tools - Knowledge of SRE principles and practices About the team Diverse Experiences AWS values diverse ... experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't…
    Amazon (08/19/25)
  • Senior DevOps Engineer

    NVIDIA (Santa Clara, CA)
    …+ BS (or equivalent experience) with 5+ years of professional experience in DevOps, SRE, or Build/Release engineering roles at similar scale. + Fluent in Python for ... scripting, tooling, and automation + Deep hands‑on experience in CI/CD, virtualization, and container orchestration. Usage of tools like GitLab CI/CD, Jenkins, CircleCI, Docker, Kubernetes is required. Ways to stand out from the crowd: + Proven understanding…
    NVIDIA (08/16/25)
  • Senior Linux Performance Engineer…

    ServiceNow, Inc. (Santa Clara, CA)
    …perspective. Our engineers are responsible for restoring database/application services, guiding SRE and CS operations on any database-related issues, working with ... development on database defects and migrations, and strategizing the scaling of the ServiceNow platform. The ideal candidate for this position is a software engineer with a strong background in database technologies, performance analysis of databases and RHEL,…
    ServiceNow, Inc. (08/13/25)
  • Senior Site Reliability Engineer - Identity…

    Coinbase (Sacramento, CA)
    …Coinbase is hiring! We are looking for an experienced Site Reliability Engineer (SRE) to join the IT Operations Corporate Engineering team to build and scale ... our identity and access management tooling. A successful candidate will have demonstrated previous success in similar role(s) in rapidly growing, security-first environments. The right person is passionate about infrastructure as code, open source tooling,…
    Coinbase (08/09/25)
  • Senior Software Engineer - Bare Metal…

    NVIDIA (Santa Clara, CA)
    …of secure communication protocols (mutual-TLS, IPsec, or similar). + Knowledge of SRE principles (observability, SLOs, logging, etc.) Ways to stand out from the ... crowd: + Experience in a Hyperscale Cloud Service Provider (public facing or not). + Understanding of networking protocols such as IP, IPv6, BGP, HTTP, ICMP, tunneling protocols (VXLAN, Geneve, FoU, GRE), etc. + Familiarity with InfiniBand networking. +…
    NVIDIA (08/08/25)
  • Senior DGX Cloud Software Engineer…

    NVIDIA (Santa Clara, CA)
    …developing multi-cloud infrastructure services. Experience teaching reliability engineering (e.g., SRE) and/or other scale-oriented cloud systems practices to peers ... and/or other companies (e.g., CRE). Experience in running private or public cloud systems based on one or more of Kubernetes, OpenStack, Docker or Slurm. + Experience with accelerated compute and communications technologies such as BlueField Networking, InfiniBand…
    NVIDIA (07/26/25)