  • Senior AI - HPC

    NVIDIA (Santa Clara, CA)
    …intelligence. Make the choice to join us today! As a member of the GPU AI / HPC Infrastructure team, you will provide leadership in the design and implementation ... years of experience designing and operating large scale compute infrastructure + Experience with AI / HPC advanced job schedulers, such as Slurm, K8s, RTDA or LSF… more
    NVIDIA (04/02/25)
  • Senior Software Developer, HPC

    NVIDIA (Santa Clara, CA)
    …fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An era in which our GPU ... integration and bare-metal provisioning-related functionality in our Linux-based cluster management software environment. NVIDIA's Bright Cluster Manager… more
    NVIDIA (04/14/25)
  • Senior AI - HPC Storage…

    NVIDIA (Santa Clara, CA)
    …solutions on any of the leading Cloud environments [AWS, Azure or GCP] + Experience with AI / HPC cluster job schedulers such as SLURM, LSF + In-depth ... InfiniBand with IPoIB and RDMA + Background with Software Defined Networking and AI / HPC cluster networking + Familiarity with deep learning frameworks like… more
    NVIDIA (02/05/25)
  • Senior Observability Architect, AI

    NVIDIA (Santa Clara, CA)
    …, HW, and SW engineering and research teams to define a vision and roadmap for AI / HPC cluster observability. + Architect and lead teams to develop, test, and ... NVIDIA's Hardware Infrastructure organization is seeking a Senior or Principal Data and Observability Architect...vision and roadmap for distributed observability systems for large-scale AI and HPC clusters and workloads and… more
    NVIDIA (02/13/25)
  • Sr. HPC Architect - Hybrid

    Caris Life Sciences (Irving, TX)
    …than yourself, Caris is where your impact begins. Position Summary: A Senior HPC Architect is responsible for designing and optimizing high-performance ... as their DNA. Backed by cutting-edge molecular science and AI, we ask ourselves every day: "What would I...for parallel processing to leverage the power of the HPC cluster. + User Support: + Providing… more
    Caris Life Sciences (03/25/25)
  • Senior Site Reliability Engineer,…

    NVIDIA (Santa Clara, CA)
    …a variety of HPC or EDA workloads. + Solid understanding of cluster configuration management tools such as Ansible. + Proficiency in Perl for maintaining legacy ... NVIDIA is the leader in AI, machine learning and datacenter acceleration. NVIDIA is...and support workload and resource schedulers in a large-scale HPC environment. + Automate Everything: Develop automation scripts to… more
    NVIDIA (04/04/25)
  • Senior Site Reliability Engineer…

    NVIDIA (Santa Clara, CA)
    …Make the choice, join our diverse team today! As a member of the GPU AI / HPC Infrastructure team, you will provide leadership in the design and implementation of ... You will also be maintaining and building deep learning AI - HPC GPU clusters at scale and supporting...cluster. + Deep understanding of GPU computing and AI infrastructure. + Passion for solving complex technical challenges… more
    NVIDIA (03/26/25)
  • Senior Software Engineer, AI

    NVIDIA (Santa Clara, CA)
    We are now looking for a Senior Software Engineer for AI Resiliency. At NVIDIA, we are pushing the boundaries of what's possible in AI. We are currently ... Senior Software Engineer to lead the development of AI software resiliency for the most powerful AI...GPUs. Your expertise will be crucial in driving down cluster downtime towards zero, ensuring that our AI… more
    NVIDIA (03/19/25)
  • Advanced Technology Senior Software…

    Wells Fargo (West Des Moines, IA)
    …with Parallel I/O. Optimization of memory and data storage. + 1 year experience with Cluster HPC, HPC schedulers and familiarity with cloud-based HPC ... About this role: We are seeking a High-Performance Computing (HPC) Engineer with experience in Machine Learning to optimize and scale AI/ML workloads. The… more
    Wells Fargo (04/23/25)
  • Associate Director, Sr Principal Systems Engineer

    Bristol Myers Squibb (Princeton, NJ)
    …Myers Squibb is looking for an experienced Sr Principal Systems Engineer in HPC / AI infrastructure to work with our technology teams and various stakeholders ... to design, manage, and support cutting-edge HPC / AI infrastructure platforms to serve our community...in the cloud, provide guidance and technical expertise to senior research leaders and scientists, and work to build… more
    Bristol Myers Squibb (04/25/25)