- Oracle (Santa Clara, CA)
- …triage automation, and diagnostic services. These are essential for running distributed AI / ML / HPC workloads across thousands of GPUs, leveraging technologies ... and be a part of the team that's pushing the boundaries of AI technology! **Responsibilities** **Minimum Qualifications** + 4+ years of backend software development…
- NVIDIA (Santa Clara, CA)
- …world. NVIDIA GH200 superchip provides performance and productivity required for strong scaling for HPC and generative AI workload. Scale out is inherent to the ... can perceive and understand the world. Today, we are increasingly known as "the AI computing company." We are looking to grow our company and establish teams with…
- Oracle (Sacramento, CA)
- …senior software engineers and applied ML developers building the next-generation AI-driven operations platform for OCI. + Partner with Network Engineering, ... fabric, supporting millions of devices, multi-region interconnects, and high-performance compute (HPC/AI/GPU) environments. + Integrate ML and LLM-based…
- NVIDIA (Santa Clara, CA)
- …NVIDIA NVLink, NVIDIA InfiniBand networking, NVIDIA Grace CPUs, and a fully optimized NVIDIA AI and HPC software stack. We are searching for a highly motivated ... our data center platforms and products + Characterize real-world AI training, inference, and HPC workloads at ... to stand out from the crowd: + Experience with AI / ML frameworks (PyTorch, TensorFlow, JAX). Knowledge of…
- Oracle (Sacramento, CA)
- …the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI / ML / HPC workloads. This is your chance to be part of ... automation, and diagnostic services. These are essential for running distributed AI / ML / HPC workloads across thousands of GPUs, leveraging technologies like…
- NVIDIA (Santa Clara, CA)
- …and tools that enable researchers and engineers to develop the next generation of AI / ML systems. By joining us, you'll help design solutions that power some ... of GPUs and petabytes of storage in multi-region clusters. + Collaborate with AI / ML research teams to understand their requirements and translate them into…
- NVIDIA (Santa Clara, CA)
- …by collaborating with teams with varied strengths including GPU Compute, Distributed Systems, Networking, ML Infra, AI Platform, and Cloud Services to ensure ... reliability and cost efficiency of telemetry pipelines while supporting high-volume workloads (AI/ML, HPC clusters, GPU infrastructure) + Embedding security…
- NVIDIA (Santa Clara, CA)
- …professional experience building and scaling high-performance distributed systems, ideally in ML, HPC, or large-scale data infrastructure. + Extensive knowledge ... people. Today, we're tapping into the unlimited potential of AI to define the next era of computing. An ... speed and improve safety, working closely with research and platform teams across NVIDIA. What you'll be doing: +…
- NVIDIA (Santa Clara, CA)
- …of GPUs. Join our team of experts and help us build a supercharged AI platform that improves efficiency, resilience, and Model FLOPs Utilization (MFU). In ... This team focuses on optimizing efficiency and resiliency of ML workloads, as well as developing scalable AI ... in building a highly scalable, fault tolerant and optimized AI platform. What you will be doing:…
- NVIDIA (Santa Clara, CA)
- …etc.) and integration into large‑scale telemetry systems. + Deep knowledge of AI / ML infrastructure, high‑performance computing (HPC), networking, and cloud ... NVIDIA has become the platform upon which every new AI-powered ... with enterprise platforms; deployments at modern data‑center scale; delivered ML / AI observability solutions for LLMOps, predictive incident…