- NVIDIA (Santa Clara, CA)
- We are seeking a Lead Post-Silicon Validation Engineer within the GPU Engineering Team to help drive development of future GPUs to be used in 3D graphics, deep learning, ... HPC and automotive markets. Make the choice to join ... proven experience with three years and working with memory systems in the lab. + Direct experience in taking…
- NVIDIA (Santa Clara, CA)
- …ecosystem of data center platform & node designs. From single node HGX/DGX systems all the way up to large multi-node NVLink domain rack architectures. These ... InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. We're searching for a highly motivated, technical architect to…
- NVIDIA (Santa Clara, CA)
- …ecosystem of data center platform & node designs. From single node HGX/DGX systems all the way up to large multi-node NVLink domain rack architectures. These ... InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. NVIDIA NVLink Fusion will enable industry-leading AI scale-up and…
- NVIDIA (Santa Clara, CA)
- …NVIDIA GH200 superchip provides performance and productivity required for strong scaling for HPC and generative AI workload. Scale out is inherent to design of this ... I/O bus (PCIe, etc.) and CPU. + Architecting complex systems, I/O error handling from PCIe & other I/O ... datacenter requirements and improve resiliency of GPU-based systems + Identify gaps in platform debuggability and drive…
- NVIDIA (Santa Clara, CA)
- …You Will Be Doing: + Architect, lead, and scale globally distributed production systems supporting AI/ML, HPC, and critical engineering platforms across hybrid ... forecasting strategies, and uncertainty testing approaches for sophisticated distributed systems. + Lead cross-organizational efforts to assess operational maturity,…
- Amazon (Cupertino, CA)
- …of applications and workloads: databases, web services, games, video encoding, ML and HPC, and a variety of internal and customer services and applications to ensure ... Develop frameworks to analyze hardware and software performance. Create automated systems for collecting and analyzing processor, OS, and workload performance data.…
- Amazon (Cupertino, CA)
- …range of applications including databases, web services, games, video encoding, ML, and HPC workloads. This doesn't mean you have or will have all those skills, ... other open source projects - Develop analysis frameworks and automation systems Tool(s) Development - Enhance APerf (our open-source Rust-based performance tool)…
- Google (Mountain View, CA)
- …DAS + Large scale optimization/inversion experience + High performance computing (HPC) experience. + GCP experience. + Experience in infrastructure-as-code, e.g. ... Terraform + Exposure to production systems that rely heavily on ML models, and/or experience with model deployment + Experience working in start-up like…
- Amazon (Cupertino, CA)
- …cloud offerings that enable high performance and scalability in AI/ML and HPC workloads. AWS Infrastructure Services owns the design, planning, delivery, and ... goal of improving the current customer experience as well as developing improved systems for future designs. You will work directly with vendors and ODM/JDM design…
- NVIDIA (Santa Clara, CA)
- …paced team that crafts the next generation of process technologies for AI, HPC, automotive, and graphics. You'll take on hard problems, partner with world-class ... infrastructure improvements for next-gen process technologies: tool integrations, build systems, integrate compute resources + Identifying bottlenecks, advocating for…