  • Principal Data Platform Architect

    NVIDIA (Santa Clara, CA)
    …for distributed data platform and observability systems for large-scale AI and HPC clusters and workloads and guide implementation towards this vision. You will ... spectacularly improve efficiency, performance, and productivity of AI and HPC workloads. You will lead technical teams to develop, ... research teams to define a vision and roadmap for AI/HPC cluster observability. + Architect and lead teams to…
    NVIDIA (06/17/25)
  • Base Command Manager Engineer - Nvis NPI

    NVIDIA (Santa Clara, CA)
    …years of experience in at least two of the following: HPC/large-scale cluster administration, Linux systems engineering, infrastructure automation (e.g., Ansible, ... such as CKA/CKAD (Certified Kubernetes Administrator/Developer), RHCE, or other advanced Linux/HPC credentials. NVIDIA is widely considered one of the world's most…
    NVIDIA (08/24/25)
  • Research Scientist, AI & Systems Co-design (PhD)

    Meta (Sunnyvale, CA)
    …to algorithms, tooling, and interfaces, working across multiple accelerator types and HPC collective communication libraries such as NCCL, RCCL, UCC and MPI. 7. ... of the following areas: Accelerators/GPU architectures, High Performance Computing (HPC), Machine Learning Compilers, Training/Inference ML Systems, Model Compression,…
    Meta (08/23/25)
  • Strategic Partner, Quantum Computing

    NVIDIA (Santa Clara, CA)
    …a pioneer in accelerated computing, NVIDIA is at the forefront of AI, HPC, and now quantum technologies. Quantum computing promises to transform industries, from ... leadership in innovative technologies. + Strong understanding of Quantum computing technologies, HPC and AI, and their potential applications. + Advanced degree or…
    NVIDIA (08/22/25)
  • Solutions Architect, Hyperscale

    NVIDIA (Santa Clara, CA)
    …technical and interpersonal skills to analyze, define, implement and optimize AI/ML and HPC software and system solutions at hyper scale. What you'll be doing: ... Passion for enhancing customer experience + Proficiency in AI, ML and HPC applications + Comprehensive knowledge of computer system architecture including PCIe,…
    NVIDIA (08/22/25)
  • Senior Datacenter GPU Power Architect

    NVIDIA (Santa Clara, CA)
    …Applied Power Architecture team to develop state of the art GPUs to power AI, HPC, Automotive, GeForce, and Mobile products. What you'll be doing: + You will be ... GPUs, CPUs, Switches, and platforms. + Understand the workload characteristics for GenAI/HPC workloads at Datacenter Scale to drive new HW/SW features for Perf@Watt…
    NVIDIA (08/20/25)
  • Sr. ML Kernel Performance Engineer, AWS Neuron,…

    Amazon (Cupertino, CA)
    …full software development experience - Expertise in accelerator architectures for ML or HPC such as GPUs, CPUs, FPGAs, or custom architectures - Experience with GPU ... and/or AMD GPU ISA - Experience developing high performance libraries for HPC applications - Proficiency in low-level performance optimization for GPUs - Experience…
    Amazon (08/15/25)
  • Senior Software Engineer - Nvlink NOS

    NVIDIA (Santa Clara, CA)
    …networking technologies, and enable large-scale deployment in High-Performance Computing (HPC) data centers. Are you passionate about working on innovative ... technologies for next gen HPC? Then we want to hear from you! What you'll be doing: + As a Senior Software Engineer at NVIDIA, you will use your expertise in Python…
    NVIDIA (08/15/25)
  • Principal Firmware Engineer - Server Manageability…

    NVIDIA (Santa Clara, CA)
    …NVIDIA InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. We're looking for a strong technical architect to own the ... in standards bodies such as OCP and DMTF. + Familiarity with NVIDIA HPC programming models and libraries (CUDA, cuDNN, DOCA) + Knowledge of enterprise storage…
    NVIDIA (08/13/25)
  • Site Reliability Engineer, GNC (Falcon)

    SpaceX (Hawthorne, CA)
    …and services + Provision and maintain virtual and physical servers + Work with SpaceX HPC team to monitor and maintain a 4000+ thread HPC cluster + Closely ... collaborate with GNC software engineers to create highly operable and maintainable products + Add monitoring for web apps and respond to outages + Manage the underlying computational infrastructure of GNC in collaboration with IT + Engage in and improve the…
    SpaceX (08/13/25)