  • Sr. Research Engineer, Machine Learning, AGI…

    Amazon (Bellevue, WA)
    …post-training multimodal LLMs. - Scale training of models on hyper-large GPU and AWS Trainium clusters - Optimize training workflows using distributed ... training/parallelism techniques - Optimize low-level details of the training stack, including CUDA kernels, communication collectives, and network I/O. - Utilize, build and extend upon industry-leading frameworks (NeMo, Megatron Core, PyTorch, JAX, vLLM, TRT, etc.)…
    Amazon (07/26/25)
  • Sr. Software Development Engineer- ML Engineer,…

    Amazon (Seattle, WA)
    …as data, tensor, model, and pipeline parallelism. - Monitor and optimize GPU memory and throughput for training large models efficiently. - Collaborate ... cross-functionally with research and data infra teams to integrate new models and features - Deep understanding of LLM algorithms and deep learning frameworks like PyTorch - Mathematics and Statistics: Strong understanding of linear algebra, calculus, probability,…
    Amazon (07/26/25)
  • EFA Network Software Engineer, EFA Software Team

    Amazon (Seattle, WA)
    …multiple projects written in C, our team enables customers to network thousands of GPU and CPU instance types to handle the toughest clustered workloads. Be a part ... of a dynamic, fast-paced group that has a big impact every day on the hottest companies doing AI and HPC today. Key job responsibilities: You will write the highest-performing code in C for multiple open source projects supporting EFA, such as Libfabric and…
    Amazon (07/25/25)
  • Research Scientist

    Meta (Bellevue, WA)
    …best enterprise modern parallel environments: distributed clusters, multicore SMP, or GPU 9. Developing highly scalable classifiers and tools leveraging machine ... learning, statistics, regression, rules-based models, or mathematical models 10. Java, C++, Perl, PHP, or Python **Public Compensation:** $184,695/year to $200,200/year + bonus + equity + benefits **Industry:** Internet **Equal Opportunity:** Meta is proud…
    Meta (07/24/25)
  • Senior Firmware Engineer

    NVIDIA (Seattle, WA)
    NVIDIA's invention of the GPU fueled the PC gaming market. The company's groundbreaking work in accelerated computing, a supercharged form of computing at the ... intersection of computer graphics, high performance computing and AI, is reshaping industries, such as transportation, healthcare and manufacturing, and fueling the growth of others. In 2020, NVIDIA acquired Mellanox, a leading supplier of end-to-end Ethernet…
    NVIDIA (07/23/25)
  • Software Development Manager, Neuron Tools,…

    Amazon (Seattle, WA)
    …large models, working with PyTorch and/or TensorFlow using large distributed fleets of GPU or other accelerated systems. - Experience with Linux distributions such ... as Ubuntu or CentOS, kernel development, and tooling such as perf and gdb. - Experience with performance profiling, tracing, and analysis of AI training/inference applications. - Experience with large scale, distributed AI training/inference applications,…
    Amazon (07/22/25)
  • Senior Researcher - CoreAI

    Microsoft Corporation (Redmond, WA)
    …continual pre-training, large-scale deep reinforcement learning running on extensive GPU resources, and significant efforts to curate and synthesize training ... data. In addition, the team employs various fine-tuning approaches to support both research and product development. The team also develops advanced AI technologies that integrate language and multi-modality for a range of Microsoft products. The team is…
    Microsoft Corporation (07/22/25)
  • Research Engineer - Conversational AI - Reality…

    Meta (Redmond, WA)
    …to best exploit modern parallel environments (e.g. distributed clusters, multicore SMP, and GPU) 5. Work with a large and globally distributed team 6. Contribute to ... publications and open-sourcing efforts **Minimum Qualifications:** 7. Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience 8. Research experience in machine learning,…
    Meta (07/22/25)
  • Research Scientist

    Meta (Bellevue, WA)
    …best enterprise modern parallel environments: distributed clusters, multicore SMP, or GPU 10. Developing highly scalable classifiers and tools leveraging ... machine learning, statistics, regression, rules-based models, or mathematical models and 11. Java, C++, Perl, PHP, or Python. **Public Compensation:** $186,437/year to $200,200/year + bonus + equity + benefits **Industry:** Internet **Equal Opportunity:**…
    Meta (07/19/25)
  • Senior Software Engineer - Generative AI, AGI…

    Amazon (Seattle, WA)
    …equivalent - Experience with Large Language Model inference - Experience with GPU programming (e.g. TensorRT-LLM) or Amazon AI chip programming (Trainium) - Experience ... with Python, PyTorch, and C++ programming, particularly performance optimization Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status. Our inclusive culture…
    Amazon (07/18/25)