• Senior System Software Engineer, NCCL…

    NVIDIA (Santa Clara, CA)
    …We deliver communication runtimes like NCCL and NVSHMEM for Deep Learning and HPC applications. We are looking for a motivated Partner Enablement Engineer ... guide our key partners and customers with NCCL. Most DL/HPC applications run on large clusters with high-speed networking ... Develop tools and automation to isolate issues on new systems and platforms, including cloud platforms (Azure, AWS, GCP,…
    NVIDIA (04/22/25)
  • Software Engineer, Accelerator Solutions…

    Meta (Menlo Park, CA)
    …Qualifications: Preferred Qualifications: 15. Full-stack experience and understanding of AI/HPC systems, from hardware and infrastructure through the ... hardware requirements and specifications (e.g., configuring hardware components, GPU, memory, network for AI/HPC workloads). 18. Understanding of the transport…
    Meta (05/01/25)
  • Senior Technical Marketing Engineer - AI…

    NVIDIA (Santa Clara, CA)
    …you can make a lasting impact on the world. As a Senior Technical Marketing Engineer for AI Infrastructure, you will join a dedicated team that is passionate about ... advancement of datacenter GPUs and large scale GPU computing systems. What you will be doing: + Evaluate and ... + Proficiency in Python and C++ for AI and HPC applications. + Experience using large scale multi node…
    NVIDIA (04/30/25)
  • Analytics DevOps and Platform Engineer

    UCLA Health (Los Angeles, CA)
    …UCLA Health IT is looking for an outstanding Analytics DevOps and Platform Engineer, (IT Architect), to join the Solutions Architecture and Engineering (SAE) group. ... will possess a well-rounded skillset encompassing software development, knowledge of HPC and Citrix environments, and relevant cloud certifications. We are looking…
    UCLA Health (02/20/25)
  • Sr Staff Engineer, ML Infrastructure…

    LinkedIn (Mountain View, CA)
    …industry experience. 8+ years of experience designing and managing large-scale, distributed systems or HPC environments, with at least 3+ years focused ... LinkedIn is the world's largest professional network, built to create economic opportunity for every ... About the Role We are seeking a Senior Staff Engineer to design, build, and maintain our large-scale GPU…
    LinkedIn (04/18/25)
  • Senior Software Engineer, GPU…

    NVIDIA (Santa Clara, CA)
    …wave of artificial intelligence. We are looking for a highly motivated senior software engineer for an exciting role in our communication libraries and network ... crew that develops and maintains software for complex heterogeneous computing systems that power disruptive products in High Performance Computing and Deep…
    NVIDIA (05/02/25)
  • Software Engineer, SystemML - AI…

    Meta (Menlo Park, CA)
    …Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on ... (e.g., Large-Scale GenAI/LLM training) from the trainer down to the inter-GPU and network communication layer. And we are seeking engineers to work on the…
    Meta (04/22/25)
  • Software Engineer, SystemML - Scaling…

    Meta (Menlo Park, CA)
    Summary: In this role, you will be a member of the Network.AI Software team and part of the bigger DC networking organization. The team develops and owns the ... Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on…
    Meta (04/18/25)
  • Software Engineer, SystemML - AI…

    Meta (Menlo Park, CA)
    …Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on ... (e.g., Large-Scale GenAI/LLM training) from the trainer down to the inter-GPU and network communication layer. And we are seeking engineers to work on the…
    Meta (03/21/25)
  • Software Engineer - Datacenter networking

    Meta (Menlo Park, CA)
    …Meta's global data center networks. Our work covers the entire network lifecycle, including hardware development, capacity planning, distributed and centralized ... control systems, modeling/provisioning/automation, monitoring/troubleshooting/analytics, and simulation/design/failure analysis. We are actively seeking Software…
    Meta (04/18/25)