- NVIDIA (Santa Clara, CA)
- …in AI/HPC data center cooling, including immersion and two-phase systems. + Experience building predictive digital twin frameworks combining physical modeling ... NVIDIA's AI Factories are built to accelerate AI and HPC workloads. At their core the Digital Twin (physics-based...tokens per watt across GPUs, cooling, power, and control systems. We are seeking a Senior AI Factory Digital…
- NVIDIA (Santa Clara, CA)
- …The data center platforms like GB200 NVL72 by NVIDIA are redefining AI, HPC, and cloud computing. To accommodate leading workloads globally, our diagnostic ... systems need to evolve across diverse hardware technologies. We're...We're in search of a visionary technical leader to engineer and propel innovation in diagnostics for NVIDIA's partner…
- NVIDIA (Santa Clara, CA)
- NVIDIA is looking for an expert software engineer to help us deliver CUDA-X libraries across the NVIDIA CPU and GPU ecosystem. For over a decade, NVIDIA's ... accelerated computing platform has revolutionized HPC and AI with applications ranging from COVID-19 research...domain expert by continuously surveying current trends in software systems. What we need to see: + PhD or…
- NVIDIA (Santa Clara, CA)
- NVIDIA is looking for an expert software engineer to help us expand our catalog of Device eXtension (Dx) APIs for our math libraries. For over a decade, NVIDIA's ... accelerated computing platform has revolutionized HPC and AI with applications ranging from COVID-19 research...domain expert by continuously surveying current trends in software systems. What we need to see: + PhD or…
- NVIDIA (Santa Clara, CA)
- …Docker containers & Jenkins pipelines + Certifications in storage (e.g., SNIA) or HPC systems or Storage Performance experience with mdtest or FIO tool. ... be. We are looking for a Senior Software Validation Engineer to lead software validation activities in the Datacenter...streamlining our testing processes. + Validation of distributed Storage systems (e.g., Lustre) on AI/HPC Datacenter scale…
- Meta (Menlo Park, CA)
- …Communications Library), which enables multi-GPU and multi-node data communication through HPC-style collectives. NCCL has been integrated into PyTorch and is on ... of GenAI/LLM scaling reliability and performance. **Required Skills:** Software Engineer, SystemML - Scaling / Performance Responsibilities: 1. Enabling reliable…
- NVIDIA (Santa Clara, CA)
- …10+ years of experience in at least two of the following: HPC/large-scale cluster administration, Linux systems engineering, infrastructure automation (e.g., ... the world. We are seeking a dedicated Base Command Manager (BCM) Engineer to support product deployments/escalations and collaborate with Engineering and our Field…
- NVIDIA (Santa Clara, CA)
- NVIDIA data center systems, such as DGX and HGX, have become core to NVIDIA's rapidly growing enterprise and cloud provider businesses. These platforms bring ... Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. We're looking for a strong technical...the system software level. Including firmware, kernel drivers, operating systems, and user mode drivers. You will work with…
- Amazon (Cupertino, CA)
- …base. Working at the intersection of software, hardware, and machine learning systems, you'll bring expertise in low-level optimization, system architecture, and ML ... (design patterns, reliability and scaling) of new and existing systems experience - 5+ years of full software development...experience - Expertise in accelerator architectures for ML or HPC such as GPUs, CPUs, FPGAs, or custom architectures…
- NVIDIA (Santa Clara, CA)
- …develops the next generation simulation framework that spans across multiple Networking Operating Systems related to HPC, Ethernet AI, and more. We expect you ... NVIDIA is looking for a highly motivated, creative, and passionate Software Engineer to design and develop a simulation software to integrate with many networking…