• Senior AI - HPC

    NVIDIA (Santa Clara, CA)
    …of experience crafting and operating large-scale compute infrastructure. + Experience with AI/HPC job schedulers and orchestrators, such as Slurm, K8s or LSF. ... Applied experience with AI/HPC workflows that use MPI and NCCL. + Proficient in using Linux including CentOS/RHEL and/or Ubuntu Linux distributions. A solid… more
    NVIDIA (07/31/25)
  • Senior Observability Architect, AI

    NVIDIA (Santa Clara, CA)
    …, HW, and SW engineering and research teams to define a vision and roadmap for AI/HPC cluster observability. + Architect and lead teams to develop, test, and ... NVIDIA's Hardware Infrastructure organization is seeking a Senior or Principal Data and Observability Architect....vision and roadmap for distributed observability systems for large-scale AI and HPC clusters and workloads and… more
    NVIDIA (05/15/25)
  • Senior HPC Engineer, Infrastructure…

    NVIDIA (Santa Clara, CA)
    NVIDIA is looking for a Senior HPC Engineer to join its...the team building many of the largest and fastest AI/HPC systems in the world! NVIDIA is ... customers, partners and internal teams to analyze, define, and implement large-scale AI/HPC projects. These efforts include a combination of networking, system… more
    NVIDIA (06/12/25)
  • Senior Site Reliability Engineer,…

    NVIDIA (Santa Clara, CA)
    …a variety of HPC or EDA workloads. + Solid understanding of cluster configuration management tools such as Ansible. + Proficiency in Perl for maintaining legacy ... NVIDIA is the leader in AI, machine learning and datacenter acceleration. NVIDIA is...and support workload and resource schedulers in a large-scale HPC environment. + Automate Everything: Develop automation scripts to… more
    NVIDIA (07/03/25)
  • Senior Solutions Architect, HPC

    NVIDIA (TX)
    …Do you want to be part of a team that brings new Artificial Intelligence (AI) hardware and software technologies to production in customer data centers? As part of ... What you will be doing: + Working with NVIDIA AI Native, Consumer Internet and Enterprise customers on large...on network design, compute/storage and support bring up of server/network/cluster deployments. You will need to visit customer data… more
    NVIDIA (06/05/25)
  • Senior Solutions Architect - AI

    NVIDIA (MA)
    …an experienced systems architect with an interest in advancing artificial intelligence (AI) and high-performance computing (HPC) in academic and research ... requires a strong background in building and deploying research computing clusters, deploying AI and HPC workloads, and optimizing system performance at scale.… more
    NVIDIA (06/18/25)
  • Senior Software Engineer, AI

    NVIDIA (Santa Clara, CA)
    We are now looking for a Senior Software Engineer for AI Resiliency. At NVIDIA, we are pushing the boundaries of what's possible in AI. We are currently ... Senior Software Engineer to lead the development of AI software resiliency for the most powerful AI...GPUs. Your expertise will be crucial in driving down cluster downtime towards zero, ensuring that our AI… more
    NVIDIA (07/22/25)
  • TS/SCI Systems Engineer Senior

    IBM (Chantilly, VA)
    …Polygraph to support a small standalone system dedicated to high-performance computing (HPC) and artificial intelligence (AI) workloads. This role demands a ... **Your role and responsibilities** We are seeking an experienced Senior Systems Engineer with US Government Top Secret/SCI security...focusing on the management and optimization of our standalone HPC/AI system. The ideal candidate will manage… more
    IBM (07/30/25)
  • Senior High Performance Computing Engineer

    Amgen (Washington, DC)
    …Join us and transform the lives of patients while transforming your career. **Senior High Performance Computing Engineer** **What you will do** Let's do this. Let's ... be responsible for the design, integration, and management of high-performance computing (HPC) systems that encompass both hardware and software components into the… more
    Amgen (06/26/25)
  • Associate Director, Sr Principal Systems Engineer

    Bristol Myers Squibb (Princeton, NJ)
    …Myers Squibb is looking for an experienced Sr Principal Systems Engineer in HPC/AI infrastructure to work with our technology teams and various stakeholders ... to design, manage, and support cutting-edge HPC/AI infrastructure platforms to serve our community...in the cloud, provide guidance and technical expertise to senior research leaders and scientists, and work to build… more
    Bristol Myers Squibb (07/26/25)