- Oracle (Sacramento, CA)
- …in the RDMA cluster networking domain and enable seamless, accelerated High-Performance Compute (HPC), Artificial Intelligence and Machine Learning advancements. ... force, driving the development and design of state-of-the-art RDMA clusters tailored specifically for AI, ML, HPC workloads. We strive to be the go-to experts in…
- Google (Sunnyvale, CA)
- …designs, with a specific focus on TPU architecture and its integration within AI/ML-driven systems. As a Quality and Reliability Engineer for Google Cloud, ... this role, you'll work to shape the future of AI/ML hardware acceleration. You will have an opportunity to ... reliability. You will be responsible for ensuring that High Performance Computing (HPC) SOC products meet stringent…
- NVIDIA (Santa Clara, CA)
- …. + Detailed understanding of GPU/accelerated compute architectures and their contributions to AI, HPC, and distributed storage systems is necessary. + ... system, and software requirements across compute, networking, storage, and security to support AI, HPC, cloud graphics and video workloads. + Partner with…
- NVIDIA (Santa Clara, CA)
- …performance computing, and artificial intelligence. Our technology powers everything from generative AI to autonomous systems, and we continue to shape the ... researchers and engineers to develop the next generation of AI/ML systems. By joining us, you'll help ... 8+ years of experience developing and operating large-scale distributed systems, infrastructure platforms, or HPC environments. +…
- Microsoft Corporation (Mountain View, CA)
- …and external, and operate at the intersection of AI algorithmic innovation, purpose-built AI hardware, systems, and software. We are a team of highly capable ... The Artificial Intelligence Cloud Inference team at Microsoft develops AI software that enables running AI models ... + Speeding up/reducing complexity of key components/pipelines to improve performance and/or efficiency of our systems +…
- NVIDIA (Santa Clara, CA)
- …executives. Ways to Stand Out from the Crowd: + Experience with GPU-accelerated compute, HPC systems, or large-scale AI clusters. + Knowledge of Kubernetes ... workflows, and resilience tooling that enables consistent GPU fleet performance. You will build the next generation of health ... at scale. If you are motivated by building foundational systems that enable large AI clusters to…
- NVIDIA (Santa Clara, CA)
- …hyperscalers. + Analyzing and developing solutions for customer performance issues for both AI and systems performance. What we need to see: + BS/MS/PhD ... an expert Solutions Architect to assist customers in building AI/ML and HPC software solutions at scale. ... experience building performance benchmarks for data center systems, including large scale AI training and…
- Microsoft Corporation (Mountain View, CA)
- …the boundaries of scale, performance, and deployment, creating frontier AI systems that power transformative experiences across Microsoft. The Multimodal ... (Pandas, NumPy, etc.) + OR equivalent experience. + **Experience with large-scale AI systems** - design and deployment of distributed architectures, multimodal…
- NVIDIA (Santa Clara, CA)
- …distributed storage systems, and ensuring low-latency data access for high-performance computing (HPC) and AI/ML workloads. Storage Production ... and alerting systems to ensure proactive detection and resolution of performance issues. + Work with AI/ML workloads to optimize storage architectures…
- NVIDIA (Santa Clara, CA)
- …NVIDIA InfiniBand and Ethernet networking, NVIDIA ARM CPUs, and a fully optimized NVIDIA AI and HPC software stack. We're looking for a strong technology leader ... is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the…