  • AI / HPC Systems

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...workloads that expect a loss-less fabric interconnect. To improve performance of these systems we constantly look…
    Meta (06/18/25)
  • Production Systems Engineer, Sustaining

    Meta (Menlo Park, CA)
    …hardware and software components, co-design 15. Experience in developing or debugging AI / HPC systems, performance optimizations, including familiarity ... or supporting production hardware at scale 9. Experience in deploying and productionizing AI / HPC systems and/or related components at scale 10. Experience in…
    Meta (06/25/25)
  • Senior AI - HPC Cluster Engineer

    NVIDIA (Santa Clara, CA)
    …to work effectively with diverse teams and individuals. + Experience analyzing and tuning performance for a variety of AI / HPC workloads. + Passion for ... GPU compute clusters that run demanding deep learning, high performance computing, and computationally intensive workloads. We seek a...storage systems like Lustre and GPFS for AI / HPC workloads + Familiarity with deep learning…
    NVIDIA (07/12/25)
  • Senior AI - HPC Storage Engineer

    NVIDIA (Santa Clara, CA)
    …designing and operating large scale storage infrastructure. + Experience analyzing and tuning performance for a variety of AI / HPC workloads. + Experience ... join us today! As a member of the GPU AI / HPC Infrastructure team, you will provide leadership...solutions to enable runs of demanding deep learning, high performance computing, and computationally intensive workloads. We seek an…
    NVIDIA (05/07/25)
  • Senior Observability Architect, AI

    NVIDIA (Santa Clara, CA)
    …looking for a technical leader to define a vision and roadmap for distributed observability systems for large-scale AI and HPC clusters and workloads and ... and visualization to spectacularly improve efficiency, performance, and productivity of AI and HPC workloads. You will lead technical teams to develop,…
    NVIDIA (05/15/25)
  • Senior HPC and AI Networking…

    NVIDIA (Santa Clara, CA)
    …fit for you, we'd love to hear from you! NVIDIA is seeking a Senior High Performance Computing (HPC) and AI Networking Performance Research and Analysis ... In this exciting role, you will profile and analyze AI workloads on large GPUs and CPUs scale clusters...and platforms, such as HCAs, Switches, CPUs, GPUs, and Systems. You will develop performance analysis tools…
    NVIDIA (07/11/25)
  • AI Infrastructure Engineer - HPC

    Cisco (San Jose, CA)
    AI Infrastructure Engineer - HPC Apply (https://jobs.cisco.com/jobs/Login?projectId=1443781) + Location: San Jose, California, US + Alternate Location: Anywhere is ... and managing the internal NVIDIA DGX and Cisco-UCS based AI platforms at Cisco. You will provide leadership in...SaltStack, Puppet and/or Chef + Deep understanding of operating systems, computer networks, and high-performance applications. +…
    Cisco (07/15/25)
  • Senior Solution Architect, HPC

    NVIDIA (Santa Clara, CA)
    …Be Doing: + Primary responsibilities will include building and enabling robust AI / HPC infrastructure for customers + Support operational and reliability aspects ... of large-scale AI clusters, focusing on performance at scale,...in working with customers + Expertise with parallel file systems (e.g., Lustre, GPFS, BeeGFS, WekaIO) and high-speed interconnects…
    NVIDIA (06/18/25)
  • Sr. Worldwide Specialist Solutions Architect,…

    Amazon (Santa Clara, CA)
    …computing and its potential to overcome some of the biggest challenges in High Performance Computing (HPC)? Do you have a unique combination of deep technical ... C++, Python, CUDA, Bash - Deep GPU knowledge in HPC and/or AI/ML frameworks. Preferred Qualifications -...life sciences or related discipline. - Working knowledge of HPC schedulers and distributed/parallel file systems, underlying…
    Amazon (06/12/25)
  • Senior Software Architect - Deep Learning…

    NVIDIA (Santa Clara, CA)
    …vision? What you will be doing: + Investigate opportunities to improve communication performance by identifying bottlenecks in today's systems. + Design and ... implement new communication technologies to accelerate AI and HPC workloads. + Explore innovative solutions in HW and SW for our next generation platforms as…
    NVIDIA (05/05/25)