• Senior DGX Cloud

    NVIDIA (Santa Clara, CA)
    … analysis, optimization, and modeling to define the architecture and design of NVIDIA's DGX Cloud clusters. The ideal candidate will have a deep understanding of ... the methodology to conduct end-to-end performance analysis of critical AI applications running on large ... will work closely with the multi-functional teams to define DGX Cloud cluster architecture for different CSPs,…
    NVIDIA (05/31/25)
  • Senior DGX AI Cloud

    NVIDIA (Santa Clara, CA)
    Joining NVIDIA's DGX Cloud AI Efficiency Team means contributing to the infrastructure that powers our innovative AI research. This team focuses on optimizing ... Engineers to design and develop tools for AI application performance analysis. Your work will enable AI researchers to ... to work efficiently with a wide variety of DGXC cloud AI systems as they seek out opportunities for…
    NVIDIA (06/08/25)
  • Senior Manager, Technical Program…

    NVIDIA (Santa Clara, CA)
    …AI innovation powering breakthroughs in research, autonomous vehicles, robotics, and more. The DGX Cloud team builds and operates the AI infrastructure that ... Manager for Technical Program Management team to lead a high-impact team within our DGX Cloud Infrastructure organization. You will play a critical role in…
    NVIDIA (07/22/25)
  • Senior Software Engineer, DGX

    NVIDIA (Santa Clara, CA)
    We are looking for a Senior Software Engineer to join our DGX Cloud team and build the foundational systems that drive NVIDIA's high-performance GPU ... to creating an environment where diverse perspectives drive innovation. As part of the DGX Cloud team, you'll work on groundbreaking technology that powers the…
    NVIDIA (07/29/25)
  • Senior DGX Cloud Software…

    NVIDIA (Santa Clara, CA)
    …building and running private and public clouds at production scale. As part of the DGX Cloud team, you'll have the opportunity to support our customers' journeys ... bare-metal, accelerated compute infrastructure and codify reliability best practices in the broader DGX Cloud platform ecosystem. What you'll be doing: + Design,…
    NVIDIA (07/26/25)
  • Senior DGX Cloud AI…

    NVIDIA (Santa Clara, CA)
    …engineering practices to ensure high efficiency and availability of AI systems. As a senior DGX Cloud AI Infrastructure software engineer at NVIDIA, you ... Joining NVIDIA's DGX Cloud AI Efficiency Team means ... leads the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization. The GPU, our invention, serves…
    NVIDIA (07/19/25)
  • Senior Software Engineer, Distributed…

    NVIDIA (Santa Clara, CA)
    …GPU deep learning. What you will be doing: + You will be part of a DGX Cloud team responsible for production systems that enable large scalable GPU clusters to ... to ensure production AI clusters run reliably and consistently with maximum performance. Evaluating system failures and improving services based on a well-defined…
    NVIDIA (07/02/25)
  • Senior Manager, AI Infrastructure…

    NVIDIA (Santa Clara, CA)
    …AI Infrastructure Engineers at NVIDIA ensure that our internal and external-facing GPU cloud services run with maximum reliability and uptime as promised to the users and ... through careful preparation and planning while keeping an eye on capacity, latency, and performance. What you'll be doing: + Lead a team of software and AI engineers…
    NVIDIA (07/08/25)
  • Senior AI Infrastructure Engineer, Security…

    NVIDIA (Santa Clara, CA)
    …systems, tooling, and data infrastructure that enable operation of our GPU cloud services. We are enabling engineering teams to innovate while proactively ... services that surface security signals and automate enforcement across multi-tenant cloud environments. + Develop and operate risk management workflows that…
    NVIDIA (07/02/25)
  • Senior Machine Learning Infrastructure…

    NVIDIA (Santa Clara, CA)
    …that automates GPU asset provisioning, configuration, and lifecycle management across cloud providers. You'll contribute to this platform to build end-to-end ... with cluster management systems (Kubernetes, SLURM) + Understanding of performance, security, and reliability in complex distributed systems. Familiarity with…
    NVIDIA (07/10/25)