  • AI / HPC Systems

    Meta (Menlo Park, CA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI / HPC Systems Performance Engineer Responsibilities: 1. ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...workloads that expect a loss-less fabric interconnect. To improve performance of these systems we constantly look…
    Meta (11/18/25)
  • Senior AI and ML HPC Cluster…

    NVIDIA (Santa Clara, CA)
    …with AI / HPC workflows that use MPI + Experience analyzing and tuning performance for a variety of AI / HPC workloads. + Passion for continual learning ... GPU compute clusters that run demanding deep learning, high performance computing, and computationally intensive workloads. We seek a...storage systems like Lustre and GPFS for AI / HPC workloads + Familiarity with deep learning…
    NVIDIA (10/19/25)
  • Senior AI - HPC Cluster Engineer…

    NVIDIA (Santa Clara, CA)
    …analyzing and tuning performance for a variety of AI / HPC workloads. Excellent problem-solving to analyze complex systems, identify bottlenecks, and ... and implement GPU compute clusters for deep learning and high-performance computing. What you'll be doing: + Provide leadership...storage systems like Lustre and GPFS for AI / HPC workload. Experience working with deep learning…
    NVIDIA (10/30/25)
  • Senior Engineer - AI and HPC

    NVIDIA (Santa Clara, CA)
    …, time-series databases, and large-scale monitoring systems. + Familiarity with AI/ML pipelines, GPU-based workloads, and HPC environments. + Experience ... teams to optimize observability for model training, inference workloads, and HPC performance. + Leverage machine learning and statistical techniques…
    NVIDIA (10/22/25)
  • HPC Systems Engineer

    University of Pennsylvania (Philadelphia, PA)
    …Computing Center (PARCC) core facility is seeking a highly qualified and motivated High Performance Computing (HPC) Systems Engineer to join the team. ... PARCC's main cluster (Betty), delivers HPC, data-intensive science and Artificial Intelligence (AI)...of the research community. + Optimize, monitor, and troubleshoot HPC file systems for performance
    University of Pennsylvania (10/11/25)
  • Sr. HPC Systems Engineer (IT@JH…

    Johns Hopkins University (Baltimore, MD)
    …research mission. This position focuses on the reliable operation, configuration, and optimization of HPC and AI systems, including multi-node CPU and GPU ... in Linux systems administration, configuration management (Ansible, Puppet, or Salt), performance monitoring, and tuning for HPC workloads. + Experience with…
    Johns Hopkins University (11/22/25)
  • AI / HPC System Performance

    Meta (New York, NY)
    …and host networking, communications lib and scheduling infrastructure. **Required Skills:** AI / HPC System Performance Engineer Responsibilities: 1. Lead ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially...a loss-less fabric interconnect with minimal latency. To improve performance of these systems we constantly look…
    Meta (11/06/25)
  • Senior HPC and AI Networking…

    NVIDIA (Santa Clara, CA)
    …fit for you, we'd love to hear from you! NVIDIA is seeking a Senior High Performance Computing (HPC) and AI Networking Performance Research and Analysis ... In this exciting role, you will profile and analyze AI workloads on large GPUs and CPUs scale clusters...and platforms, such as HCAs, Switches, CPUs, GPUs, and Systems. You will develop performance analysis tools…
    NVIDIA (09/03/25)
  • AI / HPC Network Engineering Manager

    Meta (Menlo Park, CA)
    …These workloads expect a loss-less fabric interconnect with minimal latency. To improve performance of these systems we constantly look for opportunities across ... host networking, communications lib and scheduling infrastructure. **Required Skills:** AI / HPC Network Engineering Manager Responsibilities: 1. Manage engineers…
    Meta (10/16/25)
  • HPC Sr. Scientific Software Engineer (IT@JH…

    Johns Hopkins University (Baltimore, MD)
    …Deployment and Design** + Develop and refine deployment strategies for scientific software on HPC and AI systems. + Design computational workflows, selecting ... Agents). _Performance Optimization_ + Analyze and optimize the performance of AI models and HPC...Ensure compliance with security and regulatory standards for all HPC and AI systems. _In…
    Johns Hopkins University (11/21/25)