- NVIDIA (Seattle, WA)
- …computing, Deep Learning, and/or GPU accelerated computing domains + Large-scale distributed system, HPC, ML and Training experience with Slurm and Kubernetes + ... the journey to build the best cloud offering for AI workloads and to bring its latest GPU technology ... Deep knowledge of both software and hardware in HPC and ML infrastructure NVIDIA is leading… more
- NVIDIA (Santa Clara, CA)
- NVIDIA's AI Factories are built to accelerate AI and HPC workloads. At their core the Digital Twin (physics-based model used to design, validate, and operate ... to stand out from the crowd + Background in AI / HPC data center cooling, including immersion and...or OCP advancing digital twin interoperability. + Experience applying AI / ML for simulation acceleration, surrogate modeling, or… more
- Oracle (Frankfort, KY)
- …the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI / ML / HPC workloads. This is your chance to be part of ... triage automation, and diagnostic services. These are essential for running distributed AI / ML / HPC workloads across thousands of GPUs, leveraging technologies… more
- Oracle (Austin, TX)
- …the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI / ML / HPC workloads. This is your chance to be part of ... automation, and diagnostic services. These are essential for running distributed AI / ML / HPC workloads across thousands of GPUs, leveraging technologies like… more
- Meta (Seattle, WA)
- …1. Lead technical program management of next-generation Artificial Intelligence/Machine Learning (AI / ML) platform(s) for Meta's Network Infrastructure in ... product introductions and AI operations initiatives supporting Meta's growing AI / HPC infrastructure for our Family of Apps. They will be responsible for… more
- Oracle (Honolulu, HI)
- …the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI / ML / HPC workloads. This is your chance to be part of ... continues to meet the rapidly evolving demands of both Enterprise and AI / ML customers. + Ensure reliability and customer satisfaction through proactive issue… more
- NVIDIA (Santa Clara, CA)
- …the GB200. We are looking for an experienced engineer to triage customers' hardware platform issues and AI / ML workloads in huge datacenters of rack-scale ... and the ability to analyze, optimize, and customize Linux environments for AI / ML workloads. + Containerized solutions experience with Docker, Kubernetes, Slurm… more
- Oracle (Nashville, TN)
- …triage automation, and diagnostic services. These are essential for running distributed AI / ML / HPC workloads across thousands of GPUs, leveraging technologies ... to scale and optimize Monitoring and Repair solutions for AI infrastructure components like GPU control plane and GPU...governance + Cloud infrastructure: OCI, AWS, Azure, Google Cloud Platform (GCP) + Operating Systems: Linux, macOS + Scripting… more
- Oracle (Santa Clara, CA)
- …triage automation, and diagnostic services. These are essential for running distributed AI / ML / HPC workloads across thousands of GPUs, leveraging technologies ... and be a part of the team that's pushing the boundaries of AI technology! **Responsibilities** **Minimum Qualifications** + 4+ years of backend software development… more
- NVIDIA (Santa Clara, CA)
- …world. The NVIDIA GH200 superchip provides the performance and productivity required for strong scaling of HPC and generative AI workloads. Scale out is inherent to the ... can perceive and understand the world. Today, we are increasingly known as "the AI computing company." We are looking to grow our company and establish teams with… more