- Oracle (Topeka, KS)
- …support the end-to-end lifecycle of AI and machine learning workloads. From GPU infrastructure and training pipelines to model serving and deployment tools, we ... will work on critical components of OCI's AI platform, including **high-scale GPU cluster management**, **self-service ML infrastructure**, and **model training**…
- Oracle (Jackson, MS)
- …The Compute Scaled Manufacturing organization's mission is to meet surging GPU demand for Oracle's AI infrastructure by scaling the server qualification ... program manager with a background in supply chain, server (CPU and GPU) hardware, data center capacity delivery, cloud services, server technology, and software…
- Oracle (Tallahassee, FL)
- …heart of OCI is the large-scale distributed infrastructure to provide compute CPU and GPU bare metal and virtual machine capacity to our customers. We are the group ... that ingests CPU/GPU servers as they land in the data centers, ... provide technical guidance and mentorship across the lifecycle of CPU/GPU server systems, from manufacturing and validation including firmware…
- Meta (New York, NY)
- …working on high performance computing (HPC) and AI/ML systems, including: GPU/ASIC-based kernel development and optimization (e.g., CUDA, ROCm), distributed systems for ... and serving, and systems architecture and performance 11. Accelerator (GPU/ASIC) kernel development and optimization 12. Experience in accelerating libraries…
- Amazon (Culver City, CA)
- …(e.g., media transcoding and processing engine, real-time collaboration infrastructure, GPU-accelerated rendering system, or content delivery and QC automation), ... media transfer, sub-second latency playback for 4:4:4 10-bit video, elastic GPU compute for rendering workloads, and multi-region disaster recovery for productions…
- Microsoft Corporation (Redmond, WA)
- …: Analyze system performance and scalability, optimize resource utilization (compute, GPU clusters, storage, networking). + **Automation & Tooling**: Build ... deployments, incident response, scaling, and failover in hybrid cloud/on-prem CPU+GPU environments. + **Incident Management**: Lead on-call rotations, troubleshoot…
- SHI (Lansing, MI)
- …goals. + Architect Advanced Compute Environments: Optimize end-to-end infrastructure (GPU clusters, fabric interconnects, and AI-optimized storage) for large model ... Benchmarking: Direct technical validation for AI workloads, including multi-tenant GPU performance testing, model training simulations, and inferencing benchmark…
- NVIDIA (CA)
- Interested in architecting innovative AI GPU platforms for data center and edge deployments at the world's top Retail, CPG, and QSR companies? We are looking for a ... Support the business development team through the sales process for GPU/Network hardware/software products. Owning the technical relationship and enabling customer…
- Microsoft Corporation (Mountain View, CA)
- …and other state-of-the-art LLMs + Measure and benchmark performance on Nvidia/AMD GPUs and first-party Microsoft silicon + Optimize and monitor performance of LLMs and ... on high performance applications and performance debug and optimization on CPUs/GPUs **Other Requirements** + Ability to meet Microsoft, customer and/or government…
- Oracle (Seattle, WA)
- …maintenance, configuration and validation systems that enable us to deliver high-quality GPU clusters to our customers and operate them with high availability. These ... the high-speed backend NICs and automated validation and for GPU servers. Together they build the GPU specialization on top of the general compute data plane…
Recent Jobs
- SW Development and Release Operations Technical Program Manager
- Cadence Design Systems, Inc. (San Jose, CA)