- Palo Alto Networks (Santa Clara, CA)
- …automation, architecture, performance, observability, troubleshooting, security, and reliability. Our Infrastructure Platform stack includes Terraform, Kubernetes, ... GitLab CI/CD, GitOps, Prometheus, Grafana, Loki, Docker, GCP, Backstage, MySQL, PagerDuty, FireHydrant, Python, Bash, Java, NodeJS and Go. **Your Impact** + **Design, build, and operate** reliable, secure Cloud infrastructure across multi-cloud environments…
- ServiceNow, Inc. (San Diego, CA)
- It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ... engineers who are tasked with maintaining and developing the reliability, scalability and performance of the ServiceNow infrastructure. The...as a company and the SRE role. **As an Engineer on the SRE team you will:** + Provide…
- NVIDIA (Santa Clara, CA)
- Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... internal and external facing GPU cloud services run with maximum reliability and uptime as promised to the users and...be doing: + Design, implement and support operational and reliability aspects of large scale Observability & Telemetry collection…
- NVIDIA (Santa Clara, CA)
- Site Reliability Engineering (SRE) at NVIDIA is an engineering discipline to design, build and maintain large scale production systems with high efficiency and ... internal and external facing GPU cloud services run with maximum reliability and uptime as promised to the users and...be doing: + Design, implement and support operational and reliability aspects of large scale Kubernetes clusters with focus…
- NVIDIA (Santa Clara, CA)
- …NTP/PTP, DHCP, and LDAP. This includes building for performance and reliability at global scale, covering automation, monitoring, high availability, capacity ... efficiency of services and drive efficiency with software and hardware optimizations (SR-IOV/DPU) + Experience with technologies like eBPF and XDP for Observability…
- Oracle (Sacramento, CA)
- …development lifecycle. Provides a broad set of guidance to technical and senior technical members of staff. Interprets input from leaders (technical and managers) ... related engineering field (or equivalent) with 5+ years Network Engineer experience or Masters with 5+ years Network Engineer experience. Experience working in a large ISP…
- MongoDB (San Francisco, CA)
- We are looking for an experienced Senior or Staff Engineer for our SRE, InfraSec team, to guide the security of our cloud-based infrastructure. As a Staff SRE, ... with a strong focus on security work, with ideally 2+ years in a senior or staff engineering role Security Mindset: + A comprehensive understanding of all facets…
- Zscaler (San Jose, CA)
- …a cloud-first strategy. We're seeking a highly skilled and experienced SRE Platform Engineer to join our SRE Cloud Platform Engineering Team. Reporting to the ... Director of Cloud Engineering, you will be responsible for: + Designing and maintaining scalable infrastructure solutions to support Zscaler's global cloud services + Enhancing observability practices across infrastructure and applications through monitoring,…
- Insight Global (Santa Clara, CA)
- …fast-paced Infrastructure, Planning and Processes organization where you will be working as a Senior SRE Engineer. The position will be part of a fast-paced crew ... that develops and maintains sophisticated internal cloud provisioning products. The team works with various other business units such as Graphics Processors, Mobile Processors, Deep Learning, Artificial Intelligence and Driverless Cars to cater to their…
- LinkedIn (Mountain View, CA)
- …in Sunnyvale, CA or San Francisco, CA. **Responsibilities** + Serve as a senior technical leader driving the long-term reliability and observability strategy ... enable the right business decisions around improving quality and reliability of our services and products + Act as...availability and performance + Previous experience in a Distinguished Engineer or equivalent role at a high-growth or web-scale…