• Senior Software Development Engineer

    NVIDIA (Santa Clara, CA)
    We are now looking for a TensorRT-LLM Software Development Engineer! NVIDIA is hiring software engineers for its TensorRT-LLM team. Academic and ... core backend software for LLM inference. + Improve the usability of the TensorRT-LLM library and build systems (CMake) What we need to see: + Masters or…
    NVIDIA (01/10/26)
  • Senior Deep Learning Software…

    NVIDIA (Santa Clara, CA)
    We are now looking for a Senior Deep Learning Software Engineer, LLM Performance! NVIDIA is seeking an experienced Deep Learning Engineer passionate ... learning community to implement the latest algorithms for public release in TensorRT LLM, vLLM, SGLang and LLM benchmarks. Identify performance opportunities…
    NVIDIA (11/25/25)
  • Principal Software Engineer - Large-Scale…

    NVIDIA (Santa Clara, CA)
    …deployment of cutting-edge LLM workloads. We are seeking a Principal Systems Engineer to define the vision and roadmap for memory management of large-scale ... large-scale LLM inference. + Architect and implement deep integrations with leading LLM serving engines (such as vLLM, SGLang, TensorRT-LLM), with a…
    NVIDIA (01/10/26)
  • Software Engineer II - AI/ML, AWS Neuron,…

    Amazon (Cupertino, CA)
    …responsible for development, enablement and performance tuning of a wide variety of LLM model families, including massive scale large language models like the Llama ... we're building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews.…
    Amazon (11/27/25)
  • Senior GenAI Algorithms Engineer

    NVIDIA (Santa Clara, CA)
    …and streamlined deployment strategies with open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative LLMs, ... (TensorRT Model Optimizer, Megatron-LM, Megatron-Bridge, Nvidia-NeMo, NeMo-AutoModel, TensorRT-LLM) and open-source frameworks (PyTorch, Hugging Face, vLLM,…
    NVIDIA (12/18/25)
  • Senior AI Engineer, NeMo Retriever…

    NVIDIA (Santa Clara, CA)
    …on pre-optimized inference engines from NVIDIA and the community, including NVIDIA TensorRT and TensorRT-LLM, NIM microservices optimize response latency ... The NeMo Retriever team is looking for an AI Engineer to join our team, focusing on the intersection...deployments, etc. + Familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM. + Excellent…
    NVIDIA (01/10/26)
  • Senior DL Algorithms Engineer

    NVIDIA (Santa Clara, CA)
    We are now looking for a Senior DL Algorithms Engineer! We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing ... deploy, and optimize models for efficient inference using frameworks such as TensorRT, TensorRT-LLM, vLLM, and SGLang. + Understand, analyze, profile, and…
    NVIDIA (11/06/25)
  • Senior Deep Learning Algorithm…

    NVIDIA (Santa Clara, CA)
    We are now looking for a Senior DL Algorithms Engineer! We are seeking a highly skilled Deep Learning Algorithms Engineer with hands-on experience optimizing ... inference. + Convert and deploy models using frameworks such as TensorRT and TensorRT-LLM + Understand, analyze, profile, and optimize performance of…
    NVIDIA (11/06/25)
  • Senior Staff Machine Learning…

    NVIDIA (Santa Clara, CA)
    …Today, we are increasingly known as "the AI computing company." We are seeking a Senior Staff Machine Learning Engineer to join our Enterprise AI team and build ... frameworks such as PyTorch or TensorFlow; familiarity with CUDA-accelerated libraries (e.g., TensorRT-LLM) is a plus. + Proven track record to take a significant…
    NVIDIA (01/12/26)
  • AI Senior Staff Systems Engineer

    Cadence Design Systems, Inc. (San Jose, CA)
    …quantization, distillation, and using high-performance serving frameworks (e.g., vLLM, TGI, TensorRT-LLM) to maximize inference throughput and minimize latency. + ... implementing CI/CD pipelines for AI model development. + Advanced LLM Deployment & Optimization: Lead the deployment, serving, and...AI infrastructure. Proven track record as a Principal or Senior Staff Engineer. + Expert-level knowledge of…
    Cadence Design Systems, Inc. (12/29/25)