• Senior DevOps Service Reliability Operations…

    NVIDIA (Santa Clara, CA)
    …runbooks as needed. + Help discover incidents and issues, including initiating the incident management procedure. + Bring in subject matter authorities or ... and high-performance computing environments. + Experience with observability and incident management tools (Grafana, OpenTelemetry, PagerDuty, JIRA). Cloud…
    NVIDIA (11/15/25)
  • Principal Solutions Engineer

    The Walt Disney Company (Santa Monica, CA)
    …maintain incident response processes, including incident escalation procedures, post-incident reviews, and incident management tooling + Identify ... performance, including key performance indicators (KPIs) and metrics to senior management for strategic decision-making + Drive a close engagement with Engineering…
    The Walt Disney Company (11/08/25)
  • Sr. Manager, Site Reliability Engineering

    Waystar (Atlanta, GA)
    …postmortems. + Drive automation of operational tasks and improve system observability. + **Incident Management & Response** + Lead major incident response ... efforts, ensuring timely resolution and clear communication. + Establish and refine incident management processes, including root cause analysis and follow-up…
    Waystar (11/07/25)
  • Principal DevOps Engineer (Cortex- Prisma Cloud)

    Palo Alto Networks (Santa Clara, CA)
    …platforms to optimize our infrastructure + Manage Incidents - Participate in incident management, following established processes to ensure prompt resolution of ... A strong understanding of Linux internals allowing for quick troubleshooting + Incident and Alerts Management - Clear understanding of incident and alerts…
    Palo Alto Networks (11/06/25)
  • Supervisor, Security

    NTT America, Inc. (Mesa, AZ)
    …security technologies and computer systems, including access control systems, CCTV, and incident management software. + Familiar with occupational hazards and ... the Security Site Supervisor will review and analyze loss control, incident reports, coordinate and administer security clearances, prepare, and maintain all…
    NTT America, Inc. (10/23/25)
  • Director of Cyber Defense Security Operations…

    Experian (Allen, TX)
    …contains, eradicates, and recovers from events. Reports to: CFC Senior Director of Incident Management and Security Operations. This is a remote position. You ... to improve incident response effectiveness. + Oversee the daily operations, management, and professional development of the SecOps team to support global 24x7…
    Experian (10/22/25)
  • Sr Threat Intelligence Analyst

    ADM (Erlanger, KY)
    …with various technologies such as SIEM, IDS/IPS, Proxy, endpoint and enterprise incident management systems, as well as applications such as Microsoft ... to leadership and cyber security analysts in the security operations center, incident responders, hunt teams, vulnerability management, etc. The SCTIA will…
    ADM (09/26/25)
  • Software Developer 3

    Oracle (Seattle, WA)
    …engineering team dedicated to defining, designing, and developing advanced tools for incident and problem management at Oracle Cloud Infrastructure (OCI). Our ... + Architect, design, and develop scalable solutions and automation that address incident prevention, detection, analysis, resolution, and problem management at…
    Oracle (11/25/25)
  • Senior Software Engineer - SRE

    General Motors (Warren, MI)
    …by understanding its execution and the impact on system resources. + Incident Management: Experience handling production incidents, including root cause ... our observability stack to ensure robust telemetry and actionable insights. + Incident Response: Participate in an on-call rotation to diagnose, troubleshoot, and…
    General Motors (12/13/25)
  • Principal Software Engineer, Site Reliability…

    General Motors (Montpelier, VT)
    …by understanding its execution and the impact on system resources. + **Incident Management**: Experience handling production incidents, including root cause ... observability frameworks, enabling proactive detection and resolution of incidents. + **Incident Response**: Participate in an on-call rotation to diagnose,…
    General Motors (12/10/25)