• Senior Solutions Architect, Cluster Design…

    NVIDIA (Santa Clara, CA)
    …design, network validation and troubleshooting + Proven expertise in designing large-scale distributed systems, AI clusters, or HPC infrastructure + Ability ... is building the world's most groundbreaking and innovative accelerated computing platforms for AI and HPC. Because of our work, scientists, researchers, and…
    NVIDIA (12/04/25)
  • Software Development Engineer, Annapurna Labs,…

    Amazon (Cupertino, CA)
    Description: We are seeking an experienced engineer to work on distributed AI/ML systems. This role involves working on collective operations - the fundamental ... operations that enable AI to scale across multiple accelerators & servers. Most ... building networking solutions that for Machine Learning (ML) and High-Performance Computing (HPC) workloads on AWS. We…
    Amazon (12/18/25)
  • Principal Data Platform Architect

    NVIDIA (Santa Clara, CA)
    …technical leader to define a vision and roadmap for distributed data platform and observability systems for large-scale AI and HPC clusters and workloads and ... and visualization to spectacularly improve efficiency, performance, and productivity of AI and HPC workloads. You will lead technical teams to develop,…
    NVIDIA (12/16/25)
  • Senior System Software Engineer…

    NVIDIA (Santa Clara, CA)
    …by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An era in which our tightly ... software architecture. With a targeted charter to enable best-in-class datacenter-scale performance and efficiency for our next generation of datacenter products,…
    NVIDIA (11/11/25)
  • Solutions Architect, Hyperscale

    NVIDIA (Santa Clara, CA)
    …Prepare and deliver technical presentations and workshops to customers + Address and optimize customer AI systems performance issues What we need to see: + ... and interpersonal skills to analyze, define, implement and optimize AI/ML and HPC software and system solutions ... performance + Experience in designing, running and troubleshooting performance benchmarks for AI systems…
    NVIDIA (11/21/25)
  • Software Development Snr Manager

    Oracle (Sacramento, CA)
    …network fabric, supporting millions of devices, multi-region interconnects, and high-performance compute (HPC/AI/GPU) environments. + Integrate ML ... Development Team within OCI's Network Availability organization. This team builds the AI, analytics, and automation systems that power OCI's self-healing cloud…
    Oracle (11/25/25)
  • Nvidia 2026 Internships: Systems Software…

    NVIDIA (Santa Clara, CA)
    …and Data Structures, Computer Architecture, Compiler Development, Open Source Programming, High-Performance Computing (HPC), Automation Tools (XLA, TVM, ... you're expressing interest in one of our 2026 Systems Software Engineering Internships. We'll review resumes on an ... challenges no one else can solve. Our work in AI and digital twins is transforming the world's largest…
    NVIDIA (12/01/25)
  • Senior Solutions Architect, NVIDIA Cloud Partners

    NVIDIA (Santa Clara, CA)
    …etc. + Familiarity with at-scale GPU systems in general, encompassing performance testing, AI benchmarking, and more. + Practical involvement in cluster ... expertise in data center design, development and execution for AI and HPC. + Efficient time management ... HPC cluster settings. + Practical knowledge of NVIDIA systems technology such as NCCL, DCGM, UFM, Mission Control,…
    NVIDIA (12/02/25)
  • Cluster Deployment Operations Engineer - NVIS

    NVIDIA (Santa Clara, CA)
    …10+ years of experience in at least two of the following: HPC/large-scale cluster administration, Linux systems engineering, infrastructure automation (e.g., ... optimization. + Hands-on experience using cluster telemetry and dashboard tools to assess HPC and AI clusters (e.g., Prometheus, Grafana, DCGM, and similar…
    NVIDIA (12/18/25)
  • Software Developer 4

    Oracle (Sacramento, CA)
    …the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI/ML/HPC workloads. This is your chance to be part of the ... AI revolution, creating systems that allow customers ... and diagnostic services. These are essential for running distributed AI/ML/HPC workloads across thousands of GPUs, leveraging…
    Oracle (11/25/25)