- NVIDIA (Santa Clara, CA)
- …detection and remediation of performance and reliability issues. + Optimize AI/ML and HPC workloads by crafting intelligent caching, low-latency storage ... Team's internal and partner-facing storage environments. We focus on delivering high-performance, highly available storage systems that scale while enabling…
- NVIDIA (Santa Clara, CA)
- …and integration into large‑scale telemetry systems. + Deep knowledge of AI/ML infrastructure, high‑performance computing (HPC), networking, and cloud ... to stand out from the crowd: + Master's/PhD or expertise in distributed systems, performance modeling, or fault‑tolerant computing. + Experience with MLOps and…
- NVIDIA (Santa Clara, CA)
- …experience with GPU systems in general including but not limited to AI workflow development, performance development, AI benchmarking, etc. Your base ... NVIDIA is searching for Solutions Architect with expertise in AI, Machine Learning, and HPC for Hyperscale and Cloud Providers focus. Primary responsibilities…
- TE Connectivity (CA)
- …optical transceivers, near/co-package optical transceivers, optical interconnects for advanced AI/HPC environment, compute, storage, and networking hardware ... + Define novel, scalable CPO/NPO architectures to meet future performance, power, and density demands of AI, ML, and hyperscale workloads. + Architect CPO/NPO systems across optical, electrical, thermal, and mechanical domains. + …
- NVIDIA (Santa Clara, CA)
- …training deep learning models at scale, and a good mathematical foundation to analyze new AI algorithms. We focus on AI models for autonomous driving such as ... agent behavior models, end-to-end AV architectures, AI safety, closed-loop training approaches, and AV foundation models ... monitoring and debugging tools to ensure the reliability and performance of training workflows on large GPU clusters. What…
- NVIDIA (Santa Clara, CA)
- …with performance modeling, profiling, debug, and code optimization of a DL/HPC/high-performance application + Architectural knowledge of CPU and GPU + GPU ... for us. Does the idea of contributing to and pushing the boundaries of state-of-the-art AI and Compute systems excite you? Interested in getting exposure to the…
- Meta (Menlo Park, CA)
- …levels 9. Experience in leading teams working on high performance computing (HPC) and AI/ML systems, including: 10. GPU/ASIC-based kernel development and ... systems for our fleet 4. Technical management 5. Experience in systems architecture, performance, workload-analysis and large scale distributed systems…
- Cisco (Milpitas, CA)
- …agile team engaged in the design, development and execution of tests to qualify network performance for AI/ML capability. You will be a part of our solutions ... a customer-facing environment + Previous experience leading teams + Exposure to network operating systems, preferably SONiC + Exposure to RDMA, HPC networks + …
- Micron Technology, Inc. (San Jose, CA)
- …position in the Artificial Intelligence (AI), Machine Learning (ML) and High Performance Computing (HPC) business segments. You will be working on innovative ... you will be charged with defining and accomplishing the strategy for a High Performance Memory product portfolio that will further fortify Micron's leadership…
- NVIDIA (Santa Clara, CA)
- …NVLink, NVIDIA InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. NVIDIA NVLink Fusion will enable ... center platform & node designs. From single node HGX/DGX systems all the way up to large multi-node NVLink ... industry-leading AI scale-up and scale-out performance with NVIDIA…