• Carnegie Mellon University (Pittsburgh, PA)
    We are seeking a highly skilled Resiliency Automation Engineer to join our team supporting embedded systems development in a regulated environment. This role ... mission-critical environments (e.g., aerospace, defense, embedded systems) + Familiarity with observability, logging, and monitoring tools as part of the software…
    DirectEmployers Association (10/29/25)
  • Federal Reserve Bank of Boston (Richmond, VA)
    …residency commutable to one of our offices required. Responsibilities As a Principal Engineer of the SRE / Production Operations team for FedNow, you will operate ... services, and solutions. CI/CD and IaC Pipeline automation design and development. Resiliency, DR and BCP (including testing) The SRE / Production Operations team…
    Appcast IO CPC (09/25/25)
  • Unity Technologies (San Francisco, CA)
    …and deep debugging of complex issues + Drive operational excellence through improved observability, resiliency, and automation + Mentor engineers and raise the ... **San Francisco, CA, USA** **Staff Infrastructure Engineer, Ads Infrastructure** Location San Francisco, CA, USA...large scale data processing. + Deep understanding of networking, observability, and debugging distributed systems end to end **You…
    DirectEmployers Association (10/08/25)
  • Observability and Resiliency

    Vanguard (Malvern, PA)
    …are systems that reside in a technically complex and constantly evolving resiliency landscape. Passionate, technically skilled engineers are at the center of our ... resiliency operations, and we are looking to grow our...to grow our team. We are seeking an experienced engineer with broad, end-to-end software development experience, including operating…
    Vanguard (11/04/25)
  • Senior Director, Enterprise Observability

    Marriott (Bethesda, MD)
    …Senior Director to lead the global strategy and execution of **Enterprise Observability and Technology Resiliency & Recoverability** across Marriott's global ... & Technical Leadership** Define and execute a comprehensive vision for observability, infrastructure resiliency, and disaster recovery at enterprise scale…
    Marriott (10/10/25)
  • Senior Staff Engineer, Server Networking…

    MongoDB (New York, NY)
    …and resiliency of MongoDB Server + Design and implement observability improvements that enable MongoDB engineers and customers to quickly and accurately ... The Networking & Observability Team builds infrastructure for low-overhead observability and communication between MongoDB Server nodes, clients, and other…
    MongoDB (10/08/25)
  • Grafana Stack DevOps Engineer - Assistant…

    Citigroup (Tampa, FL)
    **Overview** We are seeking a highly skilled and motivated Grafana Stack DevOps Engineer to join our team as an Assistant Vice President (AVP) in Tampa. The ideal ... building, maintaining, and ensuring the stability and resilience of our Prime Observability Platform. This role requires a strong understanding of the Grafana…
    Citigroup (08/26/25)
  • Senior Software Engineer - CTJ - Poly

    Microsoft Corporation (Reston, VA)
    …of the entire team. We play a crucial role in supporting Microsoft's Resiliency services within the AirGapped clouds, which include Azure Backup, Azure Site ... and implementing best practices to safeguard and enhance the resiliency of our cloud infrastructure. Come join us as...cloud infrastructure. Come join us as a Senior Software Engineer: Come join us and make an impact as…
    Microsoft Corporation (10/30/25)
  • Senior Site Reliability Engineer

    TEKsystems (Raleigh, NC)
    …users. * Collaborate actively with development and operations teams to implement observability and resiliency requirements in order to ensure smooth deployment ... SRE practice/platform. The goal is to build a solid observability of the platform and evaluate the tool stack...are looking to augment that team with a staff engineer. What they need is someone with industry knowledge…
    TEKsystems (10/29/25)