  • Senior Cloud Platform Software Engineer

    NVIDIA (Seattle, WA)
    …computing, Deep Learning, and/or GPU-accelerated computing domains + Large-scale distributed system, HPC, ML and Training experience with Slurm and Kubernetes + ... the journey to build the best cloud offering for AI workloads and to bring its latest GPU technology ... Deep knowledge of both software and hardware knowledge in HPC and ML infrastructure NVIDIA is leading… more
    NVIDIA (09/19/25)
  • AI Factory Digital Twin Engineer

    NVIDIA (Santa Clara, CA)
    NVIDIA's AI Factories are built to accelerate AI and HPC workloads. At their core the Digital Twin (physics-based model used to design, validate, and operate ... to stand out from the crowd + Background in AI / HPC data center cooling, including immersion and ... or OCP advancing digital twin interoperability. + Experience applying AI / ML for simulation acceleration, surrogate modeling, or… more
    NVIDIA (10/22/25)
  • Senior Software Developer - AI Infra…

    Oracle (Frankfort, KY)
    …the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI / ML / HPC workloads. This is your chance to be part of ... triage automation, and diagnostic services. These are essential for running distributed AI / ML / HPC workloads across thousands of GPUs, leveraging technologies… more
    Oracle (11/25/25)
  • Principal Software Developer - AI Infra…

    Oracle (Austin, TX)
    …the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI / ML / HPC workloads. This is your chance to be part of ... automation, and diagnostic services. These are essential for running distributed AI / ML / HPC workloads across thousands of GPUs, leveraging technologies like… more
    Oracle (11/25/25)
  • Technical Program Manager, AI Network Infra

    Meta (Seattle, WA)
    …1. Lead technical program management of next-generation Artificial Intelligence/Machine Learning (AI / ML) platform(s) for Meta's Network Infrastructure in ... product introductions and AI operations initiatives supporting Meta's growing AI / HPC infrastructure for our Family of Apps. They will be responsible for… more
    Meta (11/19/25)
  • Architect, AI Operations, OCI, NA

    Oracle (Honolulu, HI)
    …the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI / ML / HPC workloads. This is your chance to be part of ... continues to meet the rapidly evolving demands of both Enterprise and AI / ML customers. + Ensure reliability and customer satisfaction through proactive issue… more
    Oracle (11/25/25)
  • Senior Solution Engineer, AI Factory Triage

    NVIDIA (Santa Clara, CA)
    …the GB200. We are looking for an experienced engineer to triage customers' hardware platform issues and AI / ML workloads in huge datacenters of rack-scale ... and the ability to analyze, optimize, and customize Linux environments for AI / ML workloads. + Containerized solutions experience with Docker, Kubernetes, Slurm… more
    NVIDIA (11/01/25)
  • Senior Principal Software Engineer, AI

    Oracle (Nashville, TN)
    …triage automation, and diagnostic services. These are essential for running distributed AI / ML / HPC workloads across thousands of GPUs, leveraging technologies ... to scale and optimize Monitoring and Repair solutions for AI infrastructure components like GPU control plane and GPU ... governance + Cloud infrastructure: OCI, AWS, Azure, Google Cloud Platform (GCP) + Operating Systems: Linux, macOS + Scripting… more
    Oracle (11/25/25)
  • Senior Software Developer - AI Infra…

    Oracle (Santa Clara, CA)
    …triage automation, and diagnostic services. These are essential for running distributed AI / ML / HPC workloads across thousands of GPUs, leveraging technologies ... and be a part of the team that's pushing the boundaries of AI technology! **Responsibilities** **Minimum Qualifications** + 4+ years of backend software development… more
    Oracle (11/25/25)
  • Senior Platform Telemetry Engineer

    NVIDIA (Santa Clara, CA)
    …world. NVIDIA GH200 superchip provides performance and productivity required for strong scaling for HPC and generative AI workload. Scale out is inherent to the ... can perceive and understand the world. Today, we are increasingly known as "the AI computing company." We are looking to grow our company and establish teams with… more
    NVIDIA (11/14/25)