- NVIDIA (Santa Clara, CA)
- …a discipline that involves designing, building, and maintaining large-scale production systems with high efficiency and availability. It encompasses various areas, ... including software and systems engineering practices, storage, data management, and services. Production..., and ensuring low-latency data access for high-performance computing (HPC) and AI/ML workloads. Storage Production Engineers at NVIDIA…
- NVIDIA (Santa Clara, CA)
- …Infrastructure Specialist to design, develop, and operationalize next-generation thermal systems. This role will be deeply involved in heat-rejection architecture, ... waste-heat recovery integration, and full-stack MEP systems, transforming how we think about thermal performance at...to stand out from the crowd: + Experience with AI/HPC data centers and advanced cooling technologies, including two-phase…
- Microsoft Corporation (Redmond, WA)
- **Overview** The HPC/AI (High-Performance Computing and Artificial Intelligence) organization is on a mission to build the next generation of distributed AI ... supercomputers: systems that deliver unprecedented computational power, scalability, and reliability...some of the largest and most complex distributed training systems in the world. This is a rare opportunity…
- Amazon (Austin, TX)
- …operate next-generation infrastructure that powers breakthrough innovation in AI/ML and HPC workloads. If you're passionate about pushing the limits of performance, ... the cloud, this is your opportunity to build the systems that define what's next for AWS - and...have tremendous interest in cloud scale and curious how systems and software decisions impact the user. You insist…
- Microsoft Corporation (Mountain View, CA)
- …to push the boundaries of AI toward **Humanist Superintelligence: ultra-capable systems that remain controllable, safety-aligned, and anchored to human values.** ... experience. + Apply strong software engineering fundamentals in distributed systems, networking, and storage while building large-scale distributed applications on…
- NVIDIA (Santa Clara, CA)
- …ecosystem of data center platform & node designs. From single node HGX/DGX systems all the way up to large multi-node NVLink domain rack architectures. These ... InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. NVIDIA NVLink Fusion will enable industry-leading AI scale-up and…
- Amazon (Austin, TX)
- …operate next-generation infrastructure that powers breakthrough innovation in AI/ML and HPC workloads. If you're passionate about pushing the limits of performance, ... the cloud, this is your opportunity to build the systems that define what's next for AWS - and...the current customer experience as well as developing improved systems for future designs. You will work directly with…
- RTX Corporation (Woburn, MA)
- …internships). + Applied and/or academic experience in modeling and simulation of physical systems. + The ability to obtain and maintain a US security clearance ... a security clearance. **Qualifications We Prefer** + Specific majors: Systems Engineering, Operations Research, Math, Applied Math, Industrial Engineering, Physics,…
- NVIDIA (Santa Clara, CA)
- …years of relevant professional experience encompassing large-scale ML training, AV systems, simulation, and AI infrastructure development. + Deep proficiency in RL ... Exceptional programming skills in C++ and Python, vital for developing efficient systems and data pipelines. + Extensive experience with large-scale GPU clusters,…