- NVIDIA (Santa Clara, CA)
- …open-sourced inference frameworks. Seeking a Senior Deep Learning Algorithms Engineer to improve innovative generative AI models like LLMs, VLMs, multimodal and ... as large language models (LLM) and diffusion models for maximal inference efficiency using techniques ranging from quantization, speculative decoding, sparsity,… more
- Amazon (Cupertino, CA)
- …The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software development kit used to accelerate deep learning and GenAI workloads on ... Labs team at AWS, is the backbone for accelerating deep learning and GenAI workloads on Amazon's Inferentia and Trainium ML accelerators. This comprehensive toolkit… more
- NVIDIA (Santa Clara, CA)
- NVIDIA is now looking for AI Software Engineers for our GenAI Frameworks (Megatron Core (https://github.com/NVIDIA/Megatron-LM/tree/main/megatron/core) and NeMo ... and Multimodal (MM) foundation model pretraining and post-training. Our GenAI Frameworks provide end-to-end model training, including pretraining, alignment,… more
- Red Hat (Sacramento, CA)
- …bring the power of open-source LLMs and vLLM to every enterprise. Red Hat Inference team accelerates AI for the enterprise and brings operational simplicity to ... GenAI deployments. As leading developers, maintainers of the vLLM project, and inventors of state-of-the-art techniques for model quantization and sparsification, our… more
- NVIDIA (Santa Clara, CA)
- We are now looking for a Senior Deep Learning Software Engineer, LLM Performance! NVIDIA is seeking an experienced Deep Learning Engineer passionate about ... set of teams involving performance modeling, performance analysis, kernel development and inference software development. What you'll be doing: + Performance… more
- DataRobot (San Francisco, CA)
- …that makes sense for their business - today and in the future. As a Principal Software Engineer for Generative AI at DataRobot, you will be the technical anchor ... & Libraries, LLM Onboarding, Tools, Multi-Agent Evaluations, Multimodality, etc.) and GenAI systems (e.g. Inference optimization, Distributed Training, Finetuning,… more
- Amazon (Palo Alto, CA)
- …strong entrepreneurial spirit and bias for action. We are looking for a talented Software Engineer with a strong background in machine learning engineering to ... the future of advertising. Key job responsibilities As a Software Development Engineer in Machine Learning, you...inference systems. * Pioneer the development of LLM inference infrastructure to support next-generation GenAI workloads… more
- Meta (Menlo Park, CA)
- …space of GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer, SystemML - Scaling / Performance Responsibilities: 1. ... role, you will be a member of the Network.AI Software team and part of the bigger DC networking...and innovations to leverage our large-scale GPU training and inference fleet through an observable, reliable and high-performance distributed… more
- Walmart (Sunnyvale, CA)
- …Prometheus) and distributed tracing for actionable insights. + Optimize LLM inference (prompt caching, quantization, retrieval filtering) and system throughput. + ... engineering playbooks. + Drive experimentation (A/B testing, multi-armed bandits, causal inference) and champion innovation. + **Product Integration & Delivery** +… more