  • Senior Software Architect - Deep Learning…

    NVIDIA (Santa Clara, CA)
    …vision? What you will be doing: + Investigate opportunities to improve communication performance by identifying bottlenecks in today's systems. + Design and ... implement new communication technologies to accelerate AI and HPC workloads. + Explore innovative solutions in HW and SW for our next generation platforms as… more
    NVIDIA (07/29/25)
  • Senior Solutions Architect, HPC

    NVIDIA (Santa Clara, CA)
    …Machine Learning ecosystems. You'll be called on to help architect and scale high-performance, distributed AI infrastructure on-prem or in the cloud built with ... profilers/performance analysis tools (NSys). + Familiarity with NVIDIA systems/SDKs (e.g., CUDA), NVIDIA Networking technologies (e.g., RoCE, InfiniBand), Switch… more
    NVIDIA (08/30/25)
  • Sr. Software Development Engineer, HPC/ML…

    Amazon (Cupertino, CA)
    Description We are seeking an experienced engineer to work on distributed AI/ML systems. This role involves working on collective operations - the fundamental ... operations that enable AI to scale across multiple accelerators & servers. Most...building networking solutions for Machine Learning (ML) and High-Performance Computing (HPC) workloads on AWS. We… more
    Amazon (08/11/25)
  • Senior Software Engineer - HPC

    NVIDIA (Santa Clara, CA)
    …long term maintenance strategy. What you'll be doing: + Design highly available and scalable systems to meet the demands of our HPC clusters + Evaluate new and ... graphics, and revolutionized parallel computing. More recently, GPU deep learning ignited modern AI and enabled the next era of computing. NVIDIA is a "learning… more
    NVIDIA (08/27/25)
  • Senior HPC Architect

    NVIDIA (Santa Clara, CA)
    …improved workflows and develop new, leading differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialist to architect, develop ... parallel computing. More recently, GPU deep learning ignited modern AI - the next era of computing. NVIDIA is...looking for an outstanding hands-on architect/engineer for a Senior HPC architect role to support deployment and bringup of… more
    NVIDIA (07/09/25)
  • HPC Middleware Developer

    NVIDIA (Santa Clara, CA)
    …Networking Protocols InfiniBand, Ethernet + Knowledge in computer architecture and operating systems + Experience in performance optimizations + MSc or ... We are now looking for a senior HPC software engineer. As a member of our High Performance Computing Software development team, you will be responsible for… more
    NVIDIA (09/29/25)
  • Senior GPU and HPC Infrastructure Engineer…

    NVIDIA (Santa Clara, CA)
    …, and excellent communication and planning abilities. Experience working with High Performance Computing (HPC), GPUs, and high-performance networking (RDMA, ... of Linux system administration and management. + Understanding of cluster management systems (Kubernetes, SLURM) + Understanding of performance, security and… more
    NVIDIA (07/10/25)
  • Senior ML Platform Engineer, AI

    NVIDIA (Santa Clara, CA)
    …with AI/HPC workflows that use MPI + Experience analyzing and tuning performance for a variety of AI/HPC workloads. + Passion for continual learning ... GPU compute clusters that run demanding deep learning, high performance computing, and computationally intensive workloads. We seek a...storage systems like Lustre and GPFS for AI/HPC workloads + Familiarity with deep learning… more
    NVIDIA (08/21/25)
  • Research Scientist, AI & Systems

    Meta (Sunnyvale, CA)
    …on existing accelerator systems and guiding the future of models and AI HW at Meta. This drives improved performance, new model architectures and ... the following areas: Accelerators/GPU architectures, High Performance Computing (HPC), Machine Learning Compilers, Training/Inference ML Systems, Model… more
    Meta (08/23/25)
  • Software Manager, AI Infrastructure System

    NVIDIA (Santa Clara, CA)
    …maintain infrastructure and large-scale applications for LLM-based solutions. Optimize these systems for performance, scalability, reliability, and secure data ... Strong technical background in cloud/distributed infrastructure + Experience debugging functional and performance issues in HPC GPU clusters + Background in… more
    NVIDIA (09/30/25)