- NVIDIA (Santa Clara, CA)
- …make a lasting impact on the world. What You Will Be Doing: + Architect, lead, and scale globally distributed production systems supporting AI/ML, HPC, and critical ... system architecture. What We Need to See: + 15+ years of experience in SRE, Production Engineering, or Cloud Infrastructure, with a strong track record of leading…
- Amazon (East Palo Alto, CA)
- …serverless) - Experience with DevOps practices and tools - Knowledge of SRE principles and practices About the team Diverse Experiences AWS values diverse ... SCRUM/Agile, SAFe certification - AWS certifications such as AWS Solutions Architect Associate and/or AWS SysOps Administrator - Experience implementing cloud…
- Palo Alto Networks (Santa Clara, CA)
- …decisions on technology, architecture, integration, and reusability + Influence and architect cloud native, distributed computing system design, data ingestion ... for emerging agentic AI workflows + Work cross-functionally with Product Management, SRE, Software, and Quality Engineering teams to deliver new security as a…
- The Walt Disney Company (Glendale, CA)
- …role with strategic reach. You'll collaborate closely with security, DevOps, SRE, and application teams to deliver platform capabilities that improve developer ... integrated Istio service mesh for traffic management and observability. + Architect secure network configurations, including VPC design, IAM, peering, and…
- Palo Alto Networks (Santa Clara, CA)
- …cross-functional executive leadership and teams in Product Management, Development, and DevOps/SRE to embed and advance security throughout the entire product ... lifecycle. **Your Impact** + Architect, champion, and oversee the implementation of next-gen AppSec technologies with advanced automation into complex, large-scale…
- JPMorgan Chase (Palo Alto, CA)
- …enhance reliability and ensure operational efficiency. **Job responsibilities** + Architect and implement observability platforms and tools for proactive detection ... for anomaly detection and automated insights. + Collaborate with engineering and SRE teams to define service-level objectives (SLOs) and error budgets. + Provide…
- LinkedIn (Mountain View, CA)
- …the long-term reliability and observability strategy across LinkedIn's infrastructure + Re-architect LinkedIn's backend systems to enable granular failure domains ... a high-growth or web-scale technology company Suggested Skills: - Site Reliability Engineering (SRE) - Leadership - Large scale infrastructure LinkedIn is committed to fair and equitable…