Senior ML Inference Engineer Jobs

146 jobs (page 1)


Senior Software Development Engineer…

Amazon (Seattle, WA)

…integrates with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and ... learning and GenAI workloads on Amazon's Inferentia and Trainium ML accelerators. This comprehensive toolkit includes an ML...models like the Llama family, DeepSeek and beyond. The Inference Enablement and Acceleration team works side by side… more

Amazon (01/06/26)
Senior Software Development Engineer…

Amazon (Seattle, WA)

…integrates with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and ... learning and GenAI workloads on Amazon's Inferentia and Trainium ML accelerators. This comprehensive toolkit includes an ML...models like the Llama family, DeepSeek and beyond. The Inference Enablement and Acceleration team works side by side… more

Amazon (12/26/25)
Machine Learning Engineer, AWS Neuron…

Amazon (Seattle, WA)

…the Trn2 and future Trn3 servers that use them. This role is for a software engineer in the Machine Learning Applications (ML Apps) team for AWS Neuron. This ... enables and performance tunes building blocks for all key ML model families, including Llama3, GPT OSS, Qwen3, DeepSeek...Llama3, GPT OSS, Qwen3, DeepSeek and beyond. The Neuron Inference Technology team works side by side with the… more

Amazon (12/24/25)
Software Engineer - AI/ML, AWS…

Amazon (Seattle, WA)

…cloud-scale machine learning accelerators. This role is for a senior software engineer in the Machine Learning Inference Applications team. This role is ... for development and performance optimization of core building blocks of LLM Inference - Attention, MLP, Quantization, Speculative Decoding, Mixture of Experts, etc.… more

Amazon (12/21/25)
Senior Software Engineer…

MongoDB (Palo Alto, CA)

**About the Role** We're looking for a Senior Engineer to help build the next-generation inference platform that supports embedding models used for semantic ... with Atlas and designed for developer-first experiences. As a Senior Engineer, you'll focus on building core...a cloud-native environment + Work across product, infrastructure, and ML teams to ensure the inference platform… more

MongoDB (01/08/26)
Senior Software Engineer, AI…

NVIDIA (CA)

…how you can make a lasting impact on the world. We are now looking for a Senior System Software Engineer to work on user facing tools for Dynamo Inference ... of modern ML architectures with a keen intuition for optimizing inference performance. + Take full ownership of problems end-to-end, proactively acquiring any… more

NVIDIA (11/29/25)
Senior Software Engineer, AI…

NVIDIA (Santa Clara, CA)

…systems, deep learning theories. + Knowledgeable and passionate about performance engineering in ML frameworks (e.g., PyTorch) and inference engines (e.g., vLLM and ... motivated software engineers to join us and build AI inference systems that serve large-scale models with extreme efficiency....that pushes the Pareto frontier for the field of ML Systems; survey recent publications and find a way… more

NVIDIA (01/10/26)
Senior Principal Machine Learning…

Red Hat (Boston, MA)

…bring the power of open-source LLMs and vLLM to every enterprise. Red Hat Inference team accelerates AI for the enterprise and brings operational simplicity to GenAI ... build, optimize, and scale LLM deployments. As a Machine Learning Engineer focused on distributed vLLM (https://github.com/vllm-project/) infrastructure in the LLM-D… more

Red Hat (01/08/26)
Senior GenAI Algorithms Engineer…

NVIDIA (Santa Clara, CA)

…streamlined deployment strategies with open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative generative ... as large language models (LLM) and diffusion models for maximal inference efficiency using techniques ranging from quantization, speculative decoding, sparsity,… more

NVIDIA (01/10/26)
Senior Technical Marketing Engineer…

NVIDIA (Santa Clara, CA)

…full-stack software ecosystem to power AI at scale. We are looking for a Senior Technical Marketing Engineer to join our growing accelerated computing product ... ensure a consistent, high-impact go-to-market strategy. This role will focus on AI inference at scale, ensuring that customers and partners understand how to best… more

NVIDIA (11/06/25)
