  • Senior Software Development Engineer

    Amazon (Seattle, WA)
    …integrates with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and ... learning and GenAI workloads on Amazon's Inferentia and Trainium ML accelerators. This comprehensive toolkit includes an ML...models like the Llama family, DeepSeek and beyond. The Inference Enablement and Acceleration team works side by side…
    Amazon (01/06/26)
  • Senior Software Development Engineer

    Amazon (Seattle, WA)
    …integrates with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and ... learning and GenAI workloads on Amazon's Inferentia and Trainium ML accelerators. This comprehensive toolkit includes an ML...models like the Llama family, DeepSeek and beyond. The Inference Enablement and Acceleration team works side by side…
    Amazon (12/26/25)
  • Machine Learning Engineer, AWS Neuron…

    Amazon (Seattle, WA)
    …the Trn2 and future Trn3 servers that use them. This role is for a software engineer in the Machine Learning Applications (ML Apps) team for AWS Neuron. This ... enables and performance tunes building blocks for all key ML model families, including Llama3, GPT OSS, Qwen3, DeepSeek...Llama3, GPT OSS, Qwen3, DeepSeek and beyond. The Neuron Inference Technology team works side by side with the…
    Amazon (12/24/25)
  • Software Engineer - AI/ML, AWS…

    Amazon (Seattle, WA)
    …cloud-scale machine learning accelerators. This role is for a senior software engineer in the Machine Learning Inference Applications team. This role is ... for development and performance optimization of core building blocks of LLM Inference - Attention, MLP, Quantization, Speculative Decoding, Mixture of Experts, etc.…
    Amazon (12/21/25)
  • Senior Software Engineer

    MongoDB (Palo Alto, CA)
    **About the Role** We're looking for a Senior Engineer to help build the next-generation inference platform that supports embedding models used for semantic ... with Atlas and designed for developer-first experiences. As a Senior Engineer, you'll focus on building core...a cloud-native environment + Work across product, infrastructure, and ML teams to ensure the inference platform…
    MongoDB (01/08/26)
  • Senior Software Engineer, AI…

    NVIDIA (CA)
    …how you can make a lasting impact on the world. We are now looking for a Senior System Software Engineer to work on user facing tools for Dynamo Inference ... of modern ML architectures with a keen intuition for optimizing inference performance. + Take full ownership of problems end-to-end, proactively acquiring any…
    NVIDIA (11/29/25)
  • Senior Software Engineer, AI…

    NVIDIA (Santa Clara, CA)
    …systems, deep learning theories. + Knowledgeable and passionate about performance engineering in ML frameworks (e.g., PyTorch) and inference engines (e.g., vLLM and ... motivated software engineers to join us and build AI inference systems that serve large-scale models with extreme efficiency....that pushes the Pareto frontier for the field of ML Systems; survey recent publications and find a way…
    NVIDIA (01/10/26)
  • Senior Principal Machine Learning…

    Red Hat (Boston, MA)
    …bring the power of open-source LLMs and vLLM to every enterprise. Red Hat Inference team accelerates AI for the enterprise and brings operational simplicity to GenAI ... build, optimize, and scale LLM deployments. As a Machine Learning Engineer focused on distributed vLLM (https://github.com/vllm-project/) infrastructure in the LLM-D…
    Red Hat (01/08/26)
  • Senior GenAI Algorithms Engineer

    NVIDIA (Santa Clara, CA)
    …streamlined deployment strategies with open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative generative ... as large language models (LLM) and diffusion models for maximal inference efficiency using techniques ranging from quantization, speculative decoding, sparsity,…
    NVIDIA (01/10/26)
  • Senior Technical Marketing Engineer

    NVIDIA (Santa Clara, CA)
    …full-stack software ecosystem to power AI at scale. We are looking for a Senior Technical Marketing Engineer to join our growing accelerated computing product ... ensure a consistent, high-impact go-to-market strategy. This role will focus on AI inference at scale, ensuring that customers and partners understand how to best…
    NVIDIA (11/06/25)