- Oracle (Olympia, WA)
- …in the RDMA cluster networking domain and enable seamless, accelerated High-Performance Compute (HPC), Artificial Intelligence and Machine Learning advancements. ... force, driving the development and design of state-of-the-art RDMA clusters tailored specifically for AI, ML, HPC workloads. We strive to be the go-to experts in…
- Microsoft Corporation (Redmond, WA)
- …and external, and operate at the intersection of AI algorithmic innovation, purpose-built AI hardware, systems, and software. We are a team of highly capable ... **Overview** The Artificial Intelligence (AI) Frameworks team at Microsoft develops AI ... + Speeding up/reducing complexity of key components/pipelines to improve performance and/or efficiency of our systems +…
- Amazon (Seattle, WA)
- …Elastic Fabric Adapter (EFA) network card work for Machine Learning (ML) and High-Performance Computing (HPC) customers on AWS. Across multiple projects written ... cloud possible? Do you have a laser focus on performance in your code? We want to talk to ... big impact every day on the hottest companies doing AI and HPC today. Key job responsibilities…
- Microsoft Corporation (Redmond, WA)
- …our infrastructure team. In this role, you'll blend software engineering and systems engineering to keep our large-scale distributed AI infrastructure reliable ... + **Reliability & Availability**: Ensure uptime, resiliency, and fault tolerance of AI model training and inference systems. + **Observability**: Design and…
- Amazon (Seattle, WA)
- …Elastic Fabric Adapter (EFA) network card work for Machine Learning (ML) and High-Performance Computing (HPC) customers on AWS. Across multiple projects written ... cloud possible? Do you have a laser focus on performance in your team's code? We want to talk ... big impact every day on the hottest companies doing AI and HPC today. Key job responsibilities…
- Oracle (Seattle, WA)
- …in distributed cloud systems **with direct experience in GPU computing, AI/ML workloads, and high-performance infrastructure.** They will be an exceptionally ... (NVIDIA or AMD), including driver installation, firmware management, and performance troubleshooting. Familiarity with AI/ML frameworks (e.g., PyTorch, TensorFlow,…
- Teradata (Olympia, WA)
- …That's why we built the most complete cloud analytics and data platform for AI. By delivering harmonized data, trusted AI, and faster innovation, we uplift ... companies across every major industry trust Teradata to improve business performance, enrich customer experiences, and fully integrate data across the enterprise.…
- Oracle (Seattle, WA)
- …a cutting-edge, ultra-high-performance GPU cluster based Data Centers designed to support AI/ML/HPC workloads. This is your chance to be part of the ... AI revolution, creating systems that allow customers ... scale from tens to thousands of GPUs without compromising performance. We are the AI Infrastructure Delivery…
- Meta (Bellevue, WA)
- …10. Experience in leading teams working on high performance computing (HPC) and AI/ML systems, including: GPU/ASIC-based kernel development and ... ROCm), distributed systems for large scale training and serving, and systems architecture and performance 11. Accelerator (GPU/ASIC) kernel development and…
- Meta (Bellevue, WA)
- …networks, powering our global data centers and supporting cutting-edge technologies like AI, Generative AI, Recommendation engines, and Metaverse. Our network ... to join our teams and help build scalable distributed systems, develop innovative solutions to our challenges, and ship ... firmware, and software for network devices, transport stacks, and AI workloads 2. Debug complex system-level issues and lead…