• Staff ML Engineer, Inference

    General Motors (Sunnyvale, CA)
    …the business. **This job is eligible for relocation assistance.** **About the Team:** The ML Inference Platform is part of the AI Compute Platforms organization ... efficiency. **About the Role:** We are seeking a Staff ML Infrastructure engineer to help build and...shaping the architecture, roadmap and user-experience of a robust ML inference service supporting real-time, batch, and… more
    General Motors (10/21/25)
  • Senior Software Development Engineer - AI/…

    Amazon (Cupertino, CA)
    …integrates with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and ... learning and GenAI workloads on Amazon's Inferentia and Trainium ML accelerators. This comprehensive toolkit includes an ML...models like the Llama family, DeepSeek and beyond. The Inference Enablement and Acceleration team works side by side… more
    Amazon (12/10/25)
  • Senior Software Development Engineer, AI/…

    Amazon (Cupertino, CA)
    …integrates with popular ML frameworks like PyTorch and JAX enabling unparalleled ML inference and training performance. The Inference Enablement and ... learning and GenAI workloads on Amazon's Inferentia and Trainium ML accelerators. This comprehensive toolkit includes an ML...models like the Llama family, DeepSeek and beyond. The Inference Enablement and Acceleration team works side by side… more
    Amazon (01/06/26)
  • Software Development Engineer AI/ML

    Amazon (Cupertino, CA)
    …applications. Key job responsibilities * Architect and lead the design of distributed ML serving systems optimized for generative AI workloads * Drive technical ... the boundaries of what's possible in large-scale ML serving. Recent shares: https://github.com/aws-neuron/upstreaming-to-vllm/releases/tag/2.25.0 https://awsdocs-neuron.readthedocs-hosted.com/en/latest/libraries/nxd- inference more
    Amazon (12/21/25)
  • Senior Software Engineer, AI…

    NVIDIA (Santa Clara, CA)
    …highly skilled and motivated software engineers to join us and build AI inference systems that serve large-scale models with extreme efficiency. You'll architect ... that pushes the Pareto frontier for the field of ML Systems; survey recent publications and find...theories. + Knowledgeable and passionate about performance engineering in ML frameworks (e.g., PyTorch) and inference engines… more
    NVIDIA (01/10/26)
  • Lead Engineer, Inference Platform

    MongoDB (Palo Alto, CA)
    …in multi-tenant environments + 1+ years of experience serving as TL for a large-scale ML inference or training platform SW project **Nice to Have** + Prior ... We're looking for a Lead Engineer, Inference Platform to join our...of experience in managing a technical team focused on ML inference or training infrastructure **Why Join… more
    MongoDB (12/27/25)
  • Senior Software Engineer, Inference

    MongoDB (Palo Alto, CA)
    …for developer-first experiences. As a Senior Engineer, you'll focus on building core systems and services that power model inference at scale. You'll own key ... **About the Role** We're looking for a Senior Engineer to help build the next-generation inference platform that supports embedding models used for semantic… more
    MongoDB (01/08/26)
  • Senior Software Engineer, AI…

    NVIDIA (CA)
    …benchmarking, automation, and documentation processes to ensure low-latency, robust, and production-ready inference systems on GPU clusters. What we need to see: ... systems, including Rust-based runtime components, for large-scale AI inference workloads. + Implement inference scheduling and deployment solutions… more
    NVIDIA (11/29/25)
  • Lead AI Engineer (FM Hosting, LLM…

    Capital One (San Francisco, CA)
    Lead AI Engineer (FM Hosting, LLM Inference) **Overview** At Capital One, we are creating responsible and reliable AI systems, changing banking for good. For ... cost, latency, throughput - of large scale production AI systems. + Contribute to the technical vision and the...and supporting AI services + Experience developing AI and ML algorithms or technologies (e.g. LLM Inference,… more
    Capital One (11/04/25)
  • Senior GenAI Algorithms Engineer - Model…

    NVIDIA (Santa Clara, CA)
    …open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative generative AI models like LLMs, VLMs, multimodal and ... as large language models (LLM) and diffusion models for maximal inference efficiency using techniques ranging from quantization, speculative decoding, sparsity,… more
    NVIDIA (01/10/26)