- NVIDIA (Santa Clara, CA)
- …advancement. Our team is searching for a Graduate Intern to join our Board Level Reliability team to evaluate PCBA/Module level reliability. You will work in ... the Board Level Reliability lab environment testing for various NV products including large server systems and perform various functional tests for GPU/Tegra products.…
- Oracle (Sacramento, CA)
- …management, and postmortems + Solid grasp of networking, security fundamentals, and performance engineering **Nice to have** + Experience in regulated or ... **Job Description** Senior Site Reliability Engineer - Cloud Automation (Oracle Health |...operations, and on-call excellence + Build automation and self-healing systems using IaC (e.g., Terraform) and CI/CD + Design,…
- IBM (San Jose, CA)
- …knowledge in monitoring/observability, issue response, and troubleshooting for optimal system performance. * Automation: knowledge in automation for ... systems and services around the clock, ensuring continuous reliability and optimal customer experience. * Cross-Functional Troubleshooting: Collaborate with …
- LinkedIn (Mountain View, CA)
- …1 billion members and the application layer. We do this with a focus on performance, security, and reliability. As a Sr. Staff Software Engineer, you will fill ... will be based in Mountain View, CA. LinkedIn's Edge Engineering team builds and operates the infrastructure that resolves,...L4/7 proxies, CDN, RUM, WAF and DDoS, and web performance. + Experience with Linux operating systems …
- NVIDIA (Santa Clara, CA)
- …in Computer Science or related field. + 8+ years of experience in site reliability engineering and/or software development roles. + Fluency in Python + In-depth ... and networking Ways to stand out from the crowd: + Experience with C++, high-performance computing, Kubernetes and/or system administration would be an asset +…
- Oracle (Sacramento, CA)
- …services deployed in more than 40 regions worldwide. The mission of our Network Reliability Engineering team is to provide services that allow our customers to ... Scrum & Agile Methodologies + Strong technical knowledge in cloud networking, high-performance computing, and GPU systems. Oracle is an Equal Employment…
- Oracle (Sacramento, CA)
- **Job Description** The mission of our Network Reliability Engineering team is to provide exceptional network reliability and automation services that enable ... minimize downtime, quickly resolve incidents, and continuously enhance network performance through automation, advanced monitoring, and a customer-centric approach.…
- Oracle (Sacramento, CA)
- **Job Description** The mission of our Network Reliability Engineering team is to provide exceptional network reliability and automation services that enable ... minimize downtime, quickly resolve incidents, and continuously enhance network performance through automation, advanced monitoring, and a customer-centric approach.…
- Google (Sunnyvale, CA)
- …on machine learning systems. + Experience with GPU/TPU architectures, AI system integration, and performance techniques. + Experience with data center ... Data Center Design Lead, System Engineering and Architecture, Google...infrastructure, including power, networking, storage, and cooling systems. + Experience with cost and performance …
- Palo Alto Networks (Santa Clara, CA)
- … are robust and performant. This includes automation, architecture, performance, observability, troubleshooting, security, and reliability. Our Infrastructure ... and weekend, to support critical business operations and production systems and for incident response. + **Lead root cause...align cloud operations with business goals. **The Team** Our engineering team is at the core of our products…