- Red Hat (Sacramento, CA)
- …on GitHub. As a Machine Learning ... open-source LLMs and vLLM to every enterprise. The Red Hat Inference team accelerates AI for the enterprise and brings...challenges in model performance and efficiency. Your work with machine learning and high-performance computing will…
- Amazon (Cupertino, CA)
- …and the Trn1 and Inf1 servers that use them. This role is for a software engineer in the Machine Learning Applications (ML Apps) team for AWS Neuron. ... compiler engineers and runtime engineers to create, build, and tune distributed inference solutions with Trn1. Experience optimizing inference performance for…
- Palo Alto Networks (Santa Clara, CA)
- …while ensuring a formidable security posture from development through runtime. As a Principal Machine Learning Inference Engineer, you will serve as ... and long-term strategy of our AI platform: ML inference. Beyond individual contribution, you will lead complex technical...a deep focus on MLOps, ML systems, or productionizing machine learning models at scale. + Expert-level…
- Amazon (Cupertino, CA)
- …scaling) of new and existing systems experience - Fundamentals of machine learning and LLMs, their architecture, training, and inference lifecycles, along with ... learning and GenAI workloads on Amazon's custom machine learning accelerators, Inferentia and Trainium. The...ML frameworks like PyTorch and JAX, enabling unparalleled ML inference and training performance. The Inference Enablement…
- NVIDIA (Santa Clara, CA)
- …out from the crowd: + Contributions to PyTorch, JAX, vLLM, SGLang, or other machine learning training and inference frameworks. + Hands-on experience ... strategies with open-source inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative generative AI models like…
- General Motors (Sunnyvale, CA)
- …use cases. Our platform supports the serving of state-of-the-art (SOTA) machine learning models for experimental and bulk inference, with a focus on ... eligible for relocation assistance. **About the Team:** The ML Inference Platform is part of the AI Compute Platforms...+ 8+ years of industry experience, with a focus on machine learning systems or high-performance backend…
- quadric.io, Inc (Burlingame, CA)
- …network accelerators in the industry today that can only accelerate a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and ... conventional C++ DSP and control code. Role: The AI Inference Engineer at Quadric is the key bridge between the world of AI/LLM models and Quadric's unique…
- Capital One (San Francisco, CA)
- …for good. For years, Capital One has been an industry leader in using machine learning to create real-time, personalized customer experiences. Our investments in ... Lead AI Engineer (FM Hosting, LLM Inference) **Overview**...world-class talent - along with our deep experience in machine learning - position us to be…
- Amazon (Cupertino, CA)
- …is the software stack powering AWS Inferentia and Trainium machine learning accelerators, designed to deliver high-performance, low-cost inference at scale. ... The Neuron Serving team develops infrastructure to serve modern machine learning models, including large language models (LLMs) and multimodal workloads, reliably…