- Apple Inc. (San Francisco, CA)
- Senior Software Engineer, Model Inference (San Francisco Bay Area, California, United States; Software and Services). Join Apple Maps to help build the best ... production-ready systems, providing technical guidance and feedback to influence upstream model design. Optimize inference execution across heterogeneous compute…
- OpenAI (San Francisco, CA)
- A leading AI research company in San Francisco seeks an engineer to optimize their powerful AI models for high-volume production environments. The ideal candidate ... has over 5 years of software engineering experience, strong familiarity with ML architectures, and experience with distributed systems. This role involves collaboration with researchers and a focus on performance optimization. Compensation ranges from $325K to…
- Amazon (San Francisco, CA)
- Senior Software Development Engineer, AI/ML, AWS Neuron, Model Inference (Job ID: 3067759 | Amazon.com Services LLC). The Annapurna Labs team at Amazon Web ... with popular ML frameworks like PyTorch and JAX, enabling unparalleled ML inference and training performance. The Inference Enablement and Acceleration team…
- Amazon (San Francisco, CA)
- Software Development Engineer, AI/ML, AWS Neuron, Model Inference. The Annapurna Labs team at Amazon Web Services (AWS) builds AWS Neuron, the software ... with popular ML frameworks like PyTorch and JAX, enabling unparalleled ML inference and training performance. The Inference Enablement and Acceleration team…
- NVIDIA Corporation (Santa Clara, CA)
- Senior Deep Learning Software Engineer, Inference (Locations: US, CA, Santa Clara ... Requisition ID: JR2002670). NVIDIA seeks a Senior Software Engineer specializing in Deep Learning Inference for...vLLM, which are at the forefront of efficient large-scale model serving and inference. You will play…
- Hamilton Barnes Associates Limited (San Francisco, CA)
- …thousands of H100s, H200s, and B200s, ready to go for experimentation, full-scale model training, or inference. Our client operates high-performance GPU clusters ... with cost-efficient batch inference and expanding into low-latency, real-time inference and custom model hosting. This is a unique chance to join at an early…
- Menlo Ventures (San Francisco, CA)
- …to new inference features (e.g., structured sampling, prompt caching); supporting inference for new model architectures; analyzing observability data to tune ... to build beneficial AI systems. About the role: Our Inference team is responsible for building and maintaining the...by serving our models via the industry's largest compute-agnostic inference deployments. We are responsible for the entire stack…
- quadric.io, Inc (Burlingame, CA)
- …model deployment for efficient inference; [3] profile and benchmark the model performance. This senior technical role demands deep knowledge of AI ... conventional C++ DSP and control code. Role: The AI Inference Engineer at Quadric is the key...and/or Electrical Engineering. 5+ years of experience in AI/LLM model inference and deployment frameworks/tools; experience with…
- The Association of Technology, Management and Applied… (Morgan Hill, CA)
- …qualifications: Minimum 8 years of relevant experience required. Experience in Model Ops and design, software development with proven effectiveness in delivering ... levels within the organization. Experience with deploying models using vLLM/Triton Inference Server; performance tuning those models and deployment to provide higher…
- OpenAI (San Francisco, CA)
- …research progression via model inference. About the Role: We're looking for a senior engineer to design and build the load balancer that will sit at the ... About the Team: Our Inference team brings OpenAI's most capable research and...jobs where requests must stay "sticky" to the same model instance for hours or days and where even…