- NVIDIA (Santa Clara, CA)
- …wave of artificial intelligence. Join our team at NVIDIA as a Senior Site Reliability Engineer focused on HPC storage and play a crucial role in designing, implementing, and optimizing on-prem High-Performance Computing (HPC) storage solutions while harnessing the power of cloud computing. You will be responsible for crafting and deploying distributed storage solutions, building automation tools, and ensuring the…
- Huntington Ingalls Industries (San Diego, CA)
- …Action System (FRACAS) to streamline corrective action processes using Reliability, Availability, Maintainability - Cost (RAM-C). Tracks parts consumption, maintains documentation, and deploys technical support to troubleshoot maintenance issues. Provides reports on repair action costs, particularly for high-cost scenarios, justifying the economic feasibility of corrective actions. Contributes to technology refreshment…
- Tarana Wireless (Milpitas, CA)
- …bridging the digital divide in ways previously thought impossible. As a Senior Site Reliability Engineer, you will help us manage software that runs on the cloud and remotely manages millions of radio devices. You will work on a team, serve as a main point of contact during offshore hours, and be responsible for all aspects of cloud operations, such as: + Infrastructure as Code + Manage environments in AWS + Monitoring and…
- NVIDIA (Santa Clara, CA)
- …and board designers, software/firmware engineers, HW/SW applications engineering, process/reliability specialists, DFx engineers, ATE engineers, product managers, sales, and operations, in a multifaceted, high-energy work environment to bring industry-defining products to market. + Designing tools and scripts to automate characterization, data collection, test case execution, and results analysis. + Prototype and…
- Coinbase (Sacramento, CA)
- …fully supported. Coinbase is hiring! We are looking for an experienced Site Reliability Engineer (SRE) to join the IT Operations Corporate Engineering team to build and scale our identity and access management tooling. A successful candidate will have demonstrated previous success in similar role(s) in rapidly growing, security-first environments. The right person is passionate about infrastructure as code, open source…
- LinkedIn (Mountain View, CA)
- …Suggested Skills: Distributed Systems, Technical Leadership, Infrastructure Reliability, Systems Infrastructure, Java/Golang/Rust/Python. You will Benefit from our Culture: We strongly believe in the well-being of our employees and their families. That is why we offer generous health and wellness programs and time away for employees of all levels. LinkedIn is committed to fair and equitable compensation practices.…
- ServiceNow, Inc. (Santa Clara, CA)
- …in the future. **As a Senior Staff Machine Learning Engineer - Site Reliability Engineer you will:** + Contribute to the design, development and implementation of infrastructure, platform, deployment and observability features that power AI workloads. + Collaborate with researchers, AI engineers, and infrastructure teams to ensure our GPU clusters perform efficiently, scale well, and remain reliable. + Contribute to the…
- LiveRamp (San Francisco, CA)
- …issues with Engineering teams** + **Set up and maintain Infrastructure & Product Reliability monitoring and alerting** + **Maintain and enhance CI/CD Tooling and Terraform scripts in support of the mission in close collaboration with DevOps team** + **Maintain and enhance Engineering Operational Documentation for supported products.** + **Provide expertise to build and maintain products operational documentation and…
- Palo Alto Networks (Santa Clara, CA)
- …automation, architecture, performance, observability, troubleshooting, security, and reliability. Our Infrastructure Platform stack includes Terraform, Kubernetes, GitLab CI/CD, GitOps, Prometheus, Grafana, Loki, Docker, GCP, Backstage, MySQL, PagerDuty, FireHydrant, Python, Bash, Java, NodeJS and Go. **Your Impact** + Design, build, and operate reliable, secure Cloud infrastructure across multi-cloud environments +…
- Graphic Packaging International, LLC (Visalia, CA)
- Reliability Engineer Requisition ID: 10615 Location: Visalia, CA, US, 93278 Department: Manufacturing & Operations Travel: Up to 25% **If you are a GPI employee, please click the Employee Login before applying. (https://graphicpact2test.valhalla55.stage.jobs2web.com/)** **At Graphic Packaging International, we produce the paper cup that held your coffee this morning, the basket that transported those bottles of craft…