  • Principal Site Reliability Engineer, AI…

    NVIDIA (Santa Clara, CA)
    …make a lasting impact on the world. What You Will Be Doing: + Architect, lead, and scale globally distributed production systems supporting AI/ML, HPC, and critical ... system architecture. What We Need to See: + 15+ years of experience in SRE, Production Engineering, or Cloud Infrastructure, with a strong track record of leading…
    NVIDIA (07/31/25)
  • Principal Observability Customer Success…

    Amazon (East Palo Alto, CA)
    …serverless) - Experience with DevOps practices and tools - Knowledge of SRE principles and practices. About the team: Diverse Experiences: AWS values diverse ... SCRUM/Agile, SAFe certification - AWS certifications such as AWS Solutions Architect Associate and/or AWS SysOps Administrator - Experience implementing cloud…
    Amazon (07/31/25)
  • Sr Staff Software Engineer (Agent Security).…

    Palo Alto Networks (Santa Clara, CA)
    …decisions on technology, architecture, integration, and reusability + Influence and architect cloud native, distributed computing system design, data ingestion ... for emerging agentic AI workflows + Work cross-functionally with Product Management, SRE, Software, and Quality Engineering teams to deliver new security as a…
    Palo Alto Networks (07/24/25)
  • Principal Platform Engineer, Infrastructure…

    The Walt Disney Company (Glendale, CA)
    …role with strategic reach. You'll collaborate closely with security, DevOps, SRE, and application teams to deliver platform capabilities that improve developer ... integrated Istio service mesh for traffic management and observability. + Architect secure network configurations, including VPC design, IAM, peering, and…
    The Walt Disney Company (07/17/25)
  • Principal Product Security Engineer (US Citizen)

    Palo Alto Networks (Santa Clara, CA)
    …cross-functional executive leadership and teams in Product Management, Development, and DevOps/SRE to embed and advance security throughout the entire product ... lifecycle. **Your Impact** + Architect, champion, and oversee the implementation of next-gen AppSec technologies with advanced automation into complex, large-scale…
    Palo Alto Networks (07/16/25)
  • Principal Site Reliability Engineer

    JPMorgan Chase (Palo Alto, CA)
    …enhance reliability and ensure operational efficiency. **Job responsibilities** + Architect and implement observability platforms and tools for proactive detection ... for anomaly detection and automated insights. + Collaborate with engineering and SRE teams to define service-level objectives (SLOs) and error budgets. + Provide…
    JPMorgan Chase (07/16/25)
  • Distinguished Software Engineer, Reliability Infra

    LinkedIn (Mountain View, CA)
    …the long-term reliability and observability strategy across LinkedIn's infrastructure. Re-architect LinkedIn's backend systems to enable granular failure domains ... a high-growth or web-scale technology company. Suggested Skills: - Site Reliability Engineering (SRE) - Leadership - Large scale infrastructure. LinkedIn is committed to fair and equitable…
    LinkedIn (06/04/25)