ML Inference Engineer Systems Jobs in California

159 jobs (page 1)

Staff ML Engineer, Inference…

General Motors (Sunnyvale, CA)

…the business. **This job is eligible for relocation assistance.** **About the Team:** The ML Inference Platform is part of the AI Compute Platforms organization ... efficiency. **About the Role:** We are seeking a Staff ML Infrastructure engineer to help build and...shaping the architecture, roadmap and user-experience of a robust ML inference service supporting real-time, batch, and… more

General Motors (10/21/25)
Senior Software Development Engineer - AI/…

Amazon (Cupertino, CA)

…integrates with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and ... learning and GenAI workloads on Amazon's Inferentia and Trainium ML accelerators. This comprehensive toolkit includes an ML...models like the Llama family, DeepSeek and beyond. The Inference Enablement and Acceleration team works side by side… more

Amazon (12/10/25)
Senior Software Development Engineer, AI/…

Amazon (Cupertino, CA)

…integrates with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and ... learning and GenAI workloads on Amazon's Inferentia and Trainium ML accelerators. This comprehensive toolkit includes an ML...models like the Llama family, DeepSeek and beyond. The Inference Enablement and Acceleration team works side by side… more

Amazon (01/06/26)
Software Development Engineer AI/ML…

Amazon (Cupertino, CA)

…applications. Key job responsibilities * Architect and lead the design of distributed ML serving systems optimized for generative AI workloads * Drive technical ... the boundaries of what's possible in large-scale ML serving. Recent shares: https://github.com/aws-neuron/upstreaming-to-vllm/releases/tag/2.25.0 https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/nxd- inference… more

Amazon (12/21/25)
Senior Software Engineer, AI…

NVIDIA (Santa Clara, CA)

…highly skilled and motivated software engineers to join us and build AI inference systems that serve large-scale models with extreme efficiency. You'll architect ... that pushes the pareto frontier for the field of ML Systems; survey recent publications and find...theories. + Knowledgeable and passionate about performance engineering in ML frameworks (e.g., PyTorch) and inference engines… more

NVIDIA (01/10/26)
Lead Engineer, Inference Platform

MongoDB (Palo Alto, CA)

…in multi-tenant environments + 1+ years of experience serving as TL for a large-scale ML inference or training platform SW project **Nice to Have** + Prior ... We're looking for a Lead Engineer, Inference Platform to join our...of experience in managing a technical team focused on ML inference or training infrastructure **Why Join… more

MongoDB (12/27/25)
Senior Software Engineer, Inference…

MongoDB (Palo Alto, CA)

…for developer-first experiences. As a Senior Engineer, you'll focus on building core systems and services that power model inference at scale. You'll own key ... **About the Role** We're looking for a Senior Engineer to help build the next-generation inference platform that supports embedding models used for semantic… more

MongoDB (01/08/26)
Senior Software Engineer, AI…

NVIDIA (CA)

…benchmarking, automation, and documentation processes to ensure low-latency, robust, and production-ready inference systems on GPU clusters. What we need to see: ... systems, including Rust-based runtime components, for large-scale AI inference workloads. + Implement inference scheduling and deployment solutions… more

NVIDIA (11/29/25)
Lead AI Engineer (FM Hosting, LLM…

Capital One (San Francisco, CA)

Lead AI Engineer (FM Hosting, LLM Inference) **Overview** At Capital One, we are creating responsible and reliable AI systems, changing banking for good. For ... cost, latency, throughput - of large scale production AI systems. + Contribute to the technical vision and the...and supporting AI services + Experience developing AI and ML algorithms or technologies (e.g., LLM Inference,… more

Capital One (11/04/25)
Senior GenAI Algorithms Engineer - Model…

NVIDIA (Santa Clara, CA)

…open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative generative AI models like LLMs, VLMs, multimodal and ... as large language models (LLM) and diffusion models for maximal inference efficiency using techniques ranging from quantization, speculative decoding, sparsity,… more

NVIDIA (01/10/26)
