• ML Acceleration / Framework Engineer…

    Amazon (Seattle, WA)
    …Machine Learning Engineer on one of our AWS Neuron teams: - The ML Distributed Training team works side by side with chip architects, compiler engineers and runtime ... engineers to create, build and tune distributed training solutions with Trainium instances. Experience with training these large models using Python is a must. FSDP… more
    Amazon (07/15/25)
  • Sr. Software Engineer- AI/ML, AWS Neuron…

    Amazon (Seattle, WA)
    …well as Stable Diffusion, Vision Transformers (ViT) and many more. The ML Distributed Training team works side by side with chip architects, compiler engineers and ... runtime engineers to create, build and tune distributed training solutions with Trainium instances. Experience with training these large models using Python is a… more
    Amazon (07/15/25)
  • Software Dev Engineer III, AWS Distributed

    Amazon (Seattle, WA)
    …you will work on the storage component of the Aurora DSQL Data Plane. Distributed data management is at the heart of AWS database services and is responsible ... highly scalable performance. We are building and operating large-scale, distributed, fault-tolerant data and transaction management solutions using specialized data… more
    Amazon (06/21/25)
  • Sr. Software Engineer- AI/ML, AWS Neuron…

    Amazon (Seattle, WA)
    …compiler engineers and runtime engineers to create, build and tune distributed training solutions with Trn1. Experience training these large models using Python ... is a must. FSDP, DeepSpeed and other distributed training libraries are central to this and extending...responsibilities This role will help lead the efforts building distributed training and inference support into PyTorch, TensorFlow, JAX… more
    Amazon (07/16/25)
  • Senior Engineering Manager, Google…

    Google (Kirkland, WA)
    …+ 8 years of experience building and developing large-scale infrastructure, distributed systems or networks, or experience with compute technologies, storage, or ... such as information retrieval, artificial intelligence, natural language processing, distributed computing, large-scale system design, networking, security, data compression,… more
    Google (08/08/25)
  • Software Development Manager, AWS…

    Amazon (Seattle, WA)
    Description Amazon Aurora DSQL is a serverless, distributed SQL database with virtually unlimited scale, highest availability, and zero infrastructure management. ... coaching a high performing team of engineers with expertise in both low-level and distributed systems. You will be responsible for the full development life cycle and… more
    Amazon (07/18/25)
  • Software Development Engineer, AWS…

    Amazon (Seattle, WA)
    …base. You'll bring a passion for innovation, data, search, analytics, and distributed systems. You'll also: - Solve challenging technical problems, often ones not ... solved before, at every layer of the stack. - Design, implement, test, deploy and maintain innovative software solutions to transform service performance, durability, cost, and security. - Build high-quality, highly available, always-on products. - Research… more
    Amazon (07/18/25)
  • Principal Engineer, VCF Cluster Management Team

    Broadcom (Bellevue, WA)
    …solutions. We are dedicated to building robust, scalable, and high-performance distributed systems that empower enterprises to achieve their digital transformation ... for defining the technical vision, architecting, and leading the implementation of complex distributed systems that are central to our VCF offerings. You will work… more
    Broadcom (07/29/25)
  • Manager III, Software Dev

    Amazon (Seattle, WA)
    …development, testing, deployment, and delivery of large-scale, multi-tiered, distributed software applications, systems, platforms, services or technologies using ... Java, C++, service-oriented architecture, and distributed programming. Provide technical leadership and project management for all aspects of the software… more
    Amazon (08/08/25)
  • Principal Product Manager (Compute)

    DataRobot (Olympia, WA)
    …power agentic AI at DataRobot. Your mission is to deliver a world-class distributed compute platform that enables our customers to tackle complex challenges, from ... for DataRobot's core compute platform, encompassing general-purpose compute, specialized distributed compute for AI model training, and low-latency inference serving… more
    DataRobot (05/30/25)