  • Senior AI and ML HPC Cluster Engineer

    NVIDIA (Santa Clara, CA)
    …models + Familiarity with InfiniBand with IPoIB and RDMA + Understanding of fast, distributed storage systems like Lustre and GPFS for AI/HPC workloads + ... + Provide leadership and strategic guidance on the management of large-scale HPC systems including the deployment of compute, networking, and storage. + Develop and… more
    NVIDIA (01/10/26)
  • Senior Product Manager - Datacenter GPU

    NVIDIA (Santa Clara, CA)
    …of GPU/accelerated compute architectures and their contributions to AI, HPC, and distributed storage systems is necessary. + Experience with storage, security, ... networking, and high-performance computing workflows is also required. + Shown success in moving products from concept to launch and broad customer adoption is important. + Excellent interpersonal skills and cross-functional collaboration abilities are needed.… more
    NVIDIA (01/10/26)
  • Software Developer Engineer - iOS, Ring

    Amazon (Hawthorne, CA)
    …opportunity to contribute your creative ideas and energy, working with world-class experts, distributed cloud systems and home security devices. About the team ... design or architecture (design patterns, reliability and scaling) of new and existing systems experience - Experience as a mentor, tech lead or leading an… more
    Amazon (01/09/26)
  • Staff Software Engineer

    Safran Passenger Innovations (Brea, CA)
    …I2C, SPI, MDIO, CAN o Bootloaders (U-Boot) o Highly available, fault-tolerant, distributed, or clustered systems development o Audio/video (A/V) device drivers, ... with one or more of the following: o UNIX/Linux or embedded operating systems using C/C++ o Windows using C/C++/C#, .Net, web programming, JavaScript, ASP, SQL o… more
    Safran Passenger Innovations (01/03/26)
  • Software Engineer, Data Center, Power Modeling

    Google (Sunnyvale, CA)
    …of experience in software development. + 2 years of experience building and managing distributed software systems and infrastructure. + 1 year of experience with ... front-end development, Angular. + Familiarity with managing Google-specific production systems (Compute infrastructure Corp, etc.). + Excellent coding skills in… more
    Google (12/20/25)
  • Staff Technical Program Manager, Infrastructure

    LinkedIn (Mountain View, CA)
    …to store, access/query, process and manage data. These include massive-scale distributed data systems, pub/sub and streaming technologies, DataLake storage ... and batch data processing systems. In addition to strong technical knowledge, the ideal candidate should thrive in a fast-paced rapidly evolving environment, be… more
    LinkedIn (12/03/25)
  • Software Development Engineer AI/ML, Inference…

    Amazon (Cupertino, CA)
    …AI applications. Key job responsibilities * Architect and lead the design of distributed ML serving systems optimized for generative AI workloads * Drive ... focus on developing model-agnostic inference innovations, including disaggregated serving, distributed KV cache management, CPU offloading, and container-native solutions.… more
    Amazon (12/21/25)
  • Staff Full Stack Software Engineer, Climate…

    Google (Mountain View, CA)
    …experience with relational databases (BigQuery, PostgreSQL), containerization (Docker) and distributed compute management systems (CloudRun, Kubernetes) + ... Stack Engineer, you will be the architect and developer of our distributed modeling and design infrastructure. This includes simulations, ML models, optimization… more
    Google (12/21/25)
  • Staff Embedded Software Engineer

    General Motors (Mountain View, CA)
    …powers hundreds of test benches and work at the intersection of embedded systems and large-scale distributed infrastructure. **What You'll Do** + Develop Nomad ... Give You a Competitive Edge (Preferred Qualifications)** + Experience building agent-based systems for distributed infrastructure or edge device orchestration. +… more
    General Motors (12/03/25)
  • Sr. Software Engineer- AI/ML, AWS Neuron…

    Amazon (Cupertino, CA)
    …well as Stable Diffusion, Vision Transformers (ViT) and many more. The ML Distributed Training team works side by side with chip architects, compiler engineers and ... runtime engineers to create, build and tune distributed training solutions with Trainium instances. Experience with training these large models using Python is a… more
    Amazon (12/19/25)