- Google (Seattle, WA)
- …working across the full stack, from low-level hardware acceleration and compiler optimizations to high-level model architecture and production APIs, transforming ... your research expertise into robust, scalable products. + Optimize complex system performance by analyzing and fixing performance bottlenecks, memory inefficiencies, and errors in production systems to meet stringent customer goals. + Elevate engineering…
- Amazon (Seattle, WA)
- …of Experts, etc. The team works side by side with chip architects, compiler engineers and runtime engineers to deliver performance and accuracy on Neuron devices ... across a range of models such as Llama 3.3 70B, 3.1 405B, DBRX, Mixtral, and so on. Key job responsibilities: Responsibilities of this role include adapting latest research in LLM optimization to Neuron chips to extract best performance from both open source as…
- Amazon (Seattle, WA)
- …designed by Annapurna Labs inside AWS. The Neuron SDK consists of a compiler, runtime, frameworks, and tooling customers need. It's also preinstalled in AWS Deep ... Learning AMIs and Deep Learning Containers for customers to quickly get started with running high performance and cost-effective inference and training. This position is for a Software Engineer for the AWS Neuron SDK team with a deep background in Linux and…
- Amazon (Seattle, WA)
- …of Experts, etc. The team works side by side with chip architects, compiler engineers and runtime engineers to deliver performance and accuracy on Neuron devices ... across a range of models such as Llama 3.3 70B, 3.1 405B, DBRX, Mixtral, and so on. Key job responsibilities: Responsibilities of this role include adapting latest research in LLM optimization to Neuron chips to extract best performance from both open source as…