  • Software Engineering Manager - AI

    Meta (Menlo Park, CA)
    …Qualifications: 7. Experience in leading teams working on high performance computing (HPC) and AI/ML systems, including: 8. Communication libraries (e.g., ... of Meta AI infrastructure! **Required Skills:** Software Engineering Manager - AI Systems Co-Design Responsibilities: 1. Lead and support the communications…
    Meta (08/01/25)
  • Senior Performance Engineer - AI

    NVIDIA (Santa Clara, CA)
    …problems of our time. What you'll be doing: + Optimizing the computational performance of the latest innovations in AI that address specific, domain-relevant ... machine learning models in digital biology and beyond + Collaborating with multiple HPC, AI infrastructure, and research teams + Driving the testing and…
    NVIDIA (09/25/25)
  • Systems Development Eng (AWS Generative…

    Amazon (Cupertino, CA)
    …and operating AWS cloud offerings that enable high performance and scalability in AI/ML and HPC workloads. You are intrigued by the continuous release of ... Want to do industry leading work delivering continuous price performance improvements in the cloud for AI ... have tremendous interest in cloud scale and curious how systems and software decisions impact the user. You insist…
    Amazon (07/09/25)
  • Senior Math Libraries Engineer - Sparsity…

    NVIDIA (Santa Clara, CA)
    …out from the crowd: + Strong understanding of sparse computations, in particular sparsity in AI and HPC + Good understanding of LLMs, Deep Learning methods and ... to simplify and accelerate computing for unstructured sparsity in DL and HPC. Around the world, leading commercial and academic organizations are revolutionizing…
    NVIDIA (08/19/25)
  • Engineering Manager - Rack Scale AI

    NVIDIA (Santa Clara, CA)
    …Lead IPP's (Infrastructure, Planning and Process) Cloud Platform Team focused on Rack Scale AI Systems. IPP is a global organization within NVIDIA. This group ... + Work with NVIDIA Product Teams to understand new product requirements including HPC and AI/ML Products. + Collaborate with multi-functional teams, including…
    NVIDIA (07/29/25)
  • Technical Program Manager, AI Network Infra

    Meta (Menlo Park, CA)
    AI product introductions and AI operations initiatives supporting Meta's growing AI/HPC infrastructure for our Family of Apps. They will be responsible ... deliver on shared goals 10. The ideal candidate will have experience in AI/HPC product development and operations, demonstrated experience in the Network…
    Meta (08/01/25)
  • Principal AI and ML Infra Software…

    NVIDIA (Santa Clara, CA)
    …Science or related area (or equivalent experience). + 15+ years of demonstrated expertise in AI/ML and HPC tasks and systems. + Hands-on experience in using ... We are seeking a Principal AI and ML Infra Software Engineer, GPU Clusters at ... or operating High Performance Computing (HPC) grade infrastructure as well as…
    NVIDIA (08/27/25)
  • Senior AI and ML Storage Infra Software…

    NVIDIA (Santa Clara, CA)
    …in Computer Science or related field, with 6+ years of shown experience in AI/ML and HPC workloads and infrastructure. + Hands-on experience in using or ... people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An ... operating High Performance Computing (HPC) grade infrastructure as well…
    NVIDIA (08/08/25)
  • Technical Sourcing Manager, Advanced Thermal…

    Meta (Fremont, CA)
    …and associated system design trade-offs, particularly for AI and High Performance Computing (HPC) systems 21. Experience interfacing with internal ... chain organizations related to data center products, infrastructure, rack design, AI, Compute Hardware, or Mechanical Engineering 12. Proven experience building and…
    Meta (08/01/25)
  • Software Engineer, SystemML - AI Networking

    Meta (Menlo Park, CA)
    …following machine learning/deep learning domains: Distributed ML Training, GPU architecture, ML systems, AI infrastructure, high performance computing, ... large-scale GPU training and inference fleet through an observable, reliable and high-performance distributed AI/GPU communication stack. Currently, one of the…
    Meta (08/01/25)