• Senior Data Center Performance Engineer…

    NVIDIA (Santa Clara, CA)
    …NVIDIA NVLink, NVIDIA InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. We are searching for a highly motivated ... our data center platforms and products + Characterize real-world AI training, inference, and HPC workloads at...to stand out from the crowd: + Experience with AI/ML frameworks (PyTorch, TensorFlow, JAX). Knowledge of…
    NVIDIA (12/09/25)
  • Software Developer 3

    Oracle (Seattle, WA)
    …the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI/ML/HPC workloads. This is your chance to be part of ... automation, and diagnostic services. These are essential for running distributed AI/ML/HPC workloads across thousands of GPUs, leveraging technologies like…
    Oracle (11/25/25)
  • Senior Software Engineer - Storage

    NVIDIA (Santa Clara, CA)
    …and tools that enable researchers and engineers to develop the next generation of AI/ML systems. By joining us, you'll help design solutions that power some ... of GPUs and petabytes of storage in multi-region clusters. + Collaborate with AI/ML research teams to understand their requirements and translate them into…
    NVIDIA (12/02/25)
  • Principal Software Engineer - Copilot Security

    Microsoft Corporation (Redmond, WA)
    …to improve defenses and enablement. + Align with central Microsoft security and AI roadmaps, landing platform capabilities in Copilot and MAI consumer scenarios. ... Slurm, HPC), containerization and orchestration technologies (Docker, Kubernetes) for ML model deployment, and ML lifecycle management in production…
    Microsoft Corporation (12/12/25)
  • Principal Software Engineer - Copilot Security

    Microsoft Corporation (Redmond, WA)
    …improve defenses and enablement. + Align with central Microsoft security and AI roadmaps, influencing platform capabilities and landing them in Copilot ... Slurm, HPC), containerization and orchestration technologies (Docker, Kubernetes) for ML model deployment, and ML lifecycle management in production…
    Microsoft Corporation (11/26/25)
  • Senior Software Engineer, Observability

    NVIDIA (Santa Clara, CA)
    …by collaborating with teams with varied strengths including GPU Compute, Distributed Systems, Networking, ML Infra, AI Platform, and Cloud Services to ensure ... reliability and cost efficiency of telemetry pipelines while supporting high-volume workloads (AI/ML, HPC clusters, GPU infrastructure) + Embedding security…
    NVIDIA (12/09/25)
  • Principal Member of Technical Staff

    Oracle (Nashville, TN)
    …background in distributed cloud systems **with direct experience in GPU computing, AI/ML workloads, and high-performance infrastructure.** They will be an ... driver installation, firmware management, and performance troubleshooting Familiarity with AI/ML frameworks (e.g., PyTorch, TensorFlow, JAX) and distributed…
    Oracle (11/25/25)
  • Senior Deep Learning Engineer - Autonomous…

    NVIDIA (Santa Clara, CA)
    …professional experience building and scaling high-performance distributed systems, ideally in ML, HPC, or large-scale data infrastructure. + Extensive knowledge ... people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An...speed and improve safety, working closely with research and platform teams across NVIDIA. What you'll be doing: +…
    NVIDIA (10/03/25)
  • Senior Performance and Development Engineer

    NVIDIA (Santa Clara, CA)
    …of GPUs. Join our team of experts and help us build a supercharged AI platform that improves efficiency, resilience, and Model FLOPs Utilization (MFU). In ... This team focuses on optimizing efficiency and resiliency of ML workloads, as well as developing scalable AI...in building a highly scalable, fault tolerant and optimized AI platform. What you will be doing:…
    NVIDIA (11/01/25)
  • Senior Product Manager - Observability…

    NVIDIA (Santa Clara, CA)
    …etc.) and integration into large‑scale telemetry systems. + Deep knowledge of AI/ML infrastructure, high‑performance computing (HPC), networking, and cloud ... NVIDIA has become the platform upon which every new AI-powered...with enterprise platforms; deployments at modern data‑center scale; delivered ML/AI observability solutions for LLMOps, predictive incident…
    NVIDIA (10/07/25)