- NVIDIA (Santa Clara, CA)
- …resource utilization. + Directly administer internal research clusters, conduct upgrades, incident response, and reliability improvements. + Develop and improve ... most advanced computing workloads. NVIDIA is looking for an AI/ML HPC Cluster Engineer to join our MARS team. You will provide technical engagement and problem…
- Insight Global (South San Francisco, CA)
- …Prometheus, and Grafana. * Background in incident response and post-mortem strategy. * Prior involvement in cloud governance and policy implementation. ... Job Description We are seeking a hands-on Senior CloudOps Engineer to lead the execution of our AWS cloud operations strategy. This role is critical to driving…
- General Motors (Mountain View, CA)
- …you run it" culture from initial design through deployment, monitoring, and production incident response. **What Will Give You a Competitive Edge (Preferred ... Role** The Infrastructure Engineering organisation at GM is building a cloud-native platform that transforms how developers interact with automotive test hardware. …
- Robert Half-Robert Half Corporate (San Ramon, CA)
- …and resolution of moderate to complex issues in production platforms, defining incident response approaches and resolution playbooks. + Provides Level III ... **Who We Are** Robert Half is seeking a Senior Software Engineer III - ATI to join our team supporting the underlying infrastructure, platforms, and services that…
- Oracle (Santa Clara, CA)
- …supply chain + Establish and/or participate (as needed) in PSIRT (Product Security Incident Response Team) relationships with key Oracle hardware suppliers and ... the next generation of Oracle hardware that underlies all of Oracle's Cloud and Enterprise platform offerings. These systems utilize leading edge technology to…
- Charles Schwab (San Francisco, CA)
- … with infrastructure as code. + Experience implementing monitoring, alerting, and incident response for large-scale distributed systems. + Proven track record ... serve our clients. As a Senior AI Site Reliability Engineer on AI.x, you will play a key role...datasets. + 3+ years of experience with containers and cloud-native applications, and the ability to operationalize them in…
- KBR (El Segundo, CA)
- …(AWS EKS, Azure AKS). + Help with vulnerability assessments, security monitoring, and incident response automation. + Work with developers to implement secure ... Title: Senior DevOps Software Engineer Belong. Connect. Grow. with KBR! KBR's National...the instantiation of DevOps pipelines in both on-prem and cloud environments. Work Environment: + Location: On-site + Travel…
- DoorDash (San Francisco, CA)
- …them away for other engineers. You understand concepts like SLOs, error budgets, and incident response, though this is a platform development team, not an ... Control - building the systems engineers use to provision services, request cloud resources, and safely make config changes across traffic, compute, and secrets…
- Deloitte (Costa Mesa, CA)
- …artifacts + Manage audit trails and automated compliance checks + Implement AI-specific incident response and develop regulatory disclosure playbooks + Manage AI ... Work you'll do As a Deloitte Manager, AI Security Engineer, you will be crucial in safeguarding our advanced...week. + 5+ years of experience in cybersecurity (application, cloud and data security) with strong proficiency in security…
- Planetart (Calabasas, CA)
- …issues in production and development environments, with a focus on urgent incident response. + AWS Infrastructure Management: Deploy, configure, and manage ... well as in Europe. Job Overview PlanetArt is seeking a Senior Full-Stack Engineer to support the company's technology initiatives and lead its engineering efforts. …