• Research Intern - Reliability of Cloud…

    Microsoft Corporation (Redmond, WA)
    …the future of reliable, large-scale cloud and AI systems? The **Systems Reliability Group** at **Microsoft Research** is looking for motivated Research Interns to ... challenges at the intersection of distributed systems, AI systems, and software engineering. We tackle some of the toughest challenges in modern computing: designing…
    Microsoft Corporation (09/30/25)
  • System Reliability & Support Lead

    PNC (Pittsburgh, PA)
    …for this position. * The following experience and tools are preferred: -- Reliability engineering -- Observability & application monitoring -- Dynatrace -- ... and have an opportunity to contribute to the company's success. As a System Reliability and Support Lead within PNC's Technology organization, you will be based in…
    PNC (09/13/25)
  • Senior Site Reliability Engineer

    TEKsystems (Atlanta, GA)
    …with infrastructure support and development * 7+ years of experience in Site Reliability Engineering and DevOps. * Proficient in scripting languages like Python, ... 1. 7+ years of experience within SRE. The hiring manager is more focused on SRE Practice (being able...another. Key Functions/Duties of Position: * Define and track reliability and observability OKRs. This includes defining and tracking…
    TEKsystems (10/11/25)
  • Sr. Site Reliability Engineer

    Tractor Supply Company (Brentwood, TN)
    Sr. Site Reliability Engineer **Overall Job Summary** As a Sr Site Reliability Engineer you will play a key role in implementing modern Engineering and ... capabilities and automated self-healing and recovery. + Communicates state of reliability to prioritize technical debt and improvements on technology team roadmaps.…
    Tractor Supply Company (09/04/25)
  • Lead Speed and Reliability Engineer - DFP

    NVIDIA (Santa Clara, CA)
    …the best minds in NVIDIA across various teams (System Architecture, PDE, Application Engineering, Product Manager, Sales, Operations) in a dynamic, creative work ... The DFP team is looking for a Speed and Reliability Lead. You will be leading and crafting testability...need to see: + MS in EE, CE, Systems Engineering (or equivalent experience) and 8+ years of experience…
    NVIDIA (08/28/25)
  • Site Reliability Engineer

    Nightwing (Sterling, VA)
    …tools. Desired Certs: + Global Skill Development Council (GSDC) Site Reliability Engineering (SRE) Foundation Certification (CSREF). + AWS Certified ... intelligence community, defense, civil, and commercial markets. Job Title: Site Reliability Engineer Location: Sterling, VA Clearance: TS/SCI Poly **This position is…
    Nightwing (09/26/25)
  • Senior Software Engineer, Backend - Infra Core…

    Coinbase (Charlotte, NC)
    …fully supported. *What you'll be doing (i.e. job duties):* *Team* - Core Reliability team is a vital part of Infrastructure (Platform) org responsible for paving the ... path for system's reliability and scalability. We manage multiple company-wide projects...configurations & secrets by building/enhancing world-class service configuration manager systems. Your customer focus skill will help reduce…
    Coinbase (08/19/25)
  • Network Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    …practice in network operations or related fields concentrating on automation & site reliability engineering. Familiarity with both enterprise and the data center ... on the world. We are seeking a highly skilled and experienced Network Site Reliability Engineer (SRE) to join our Enterprise Network Operations and SRE team. In this…
    NVIDIA (09/23/25)
  • Reliability Engineer

    ITW (Orting, WA)
    **Job Description:** We are looking for a data-focused Reliability Engineer to fix and prevent warranty issues. This role is crucial for improving product quality ... **Scope and Function:** Support and/or complete medium to high level, complex engineering tasks/projects as assigned to meet business objectives and support product…
    ITW (09/23/25)
  • Research Intern - Cloud Reliability

    Microsoft Corporation (Redmond, WA)
    …of technical expertise in machine learning, cloud systems and software engineering. We communicate our research both internally and externally through peer-reviewed ... of the research problems we are currently working on are: improving reliability and observability of Agentic Systems, workload-aware placement of compute resources,…
    Microsoft Corporation (10/11/25)