Performance Benchmarking Engineer Cluster Jobs

13 jobs (page 1)

Performance Benchmarking…

Oracle (Seattle, WA)

…Design and code solutions for performance benchmarking. + Troubleshoot performance problems on RDMA clusters and perform cluster performance ... team strives to be the go-to experts on RDMA cluster architecture and its relationship to AI/ML/HPC performance ... field with 5+ years of relevant experience + Experience with benchmarking and troubleshooting or optimizing performance of… more

Oracle (11/25/25)
AI and ML HPC Cluster Engineer

NVIDIA (Santa Clara, CA)

…the world's most advanced computing workloads. NVIDIA is looking for an AI/ML HPC Cluster Engineer to join our MARS team. You will provide technical engagement ... including performance analysis and optimizations + Analyze and optimize cluster efficiency, job fragmentation, and GPU waste to meet internal SLA targets.… more

NVIDIA (01/10/26)
Senior HPC Cluster Engineer - EDA

NVIDIA (Santa Clara, CA)

…lasting impact on the world. We are seeking a highly skilled and experienced HPC Cluster Engineer to design, deploy, and operate GPU Compute Clusters for EDA and ... high-performance computing workloads used across multiple teams and projects. ... experience crafting and operating large scale compute infrastructure, including cluster configuration management tools such as BCM or Ansible.… more

NVIDIA (12/10/25)
Senior AI and ML HPC Cluster…

NVIDIA (Santa Clara, CA)

…breaking GPU compute clusters that run demanding deep learning, high performance computing, and computationally intensive workloads. We seek a technical leader ... compute, networking, and storage design for large scale, high performance workloads, effective resource utilization in a heterogeneous compute environment,… more

NVIDIA (01/10/26)
Senior AI-HPC Cluster Engineer…

NVIDIA (Santa Clara, CA)

…graphics. Design and implement GPU compute clusters for deep learning and high-performance computing. What you'll be doing: + Provide leadership and strategic ... user needs. + Support our researchers to run their workloads including performance analysis and optimizations. + Conduct root cause analysis and suggest corrective… more

NVIDIA (10/30/25)
Senior System Software Engineer - AI…

NVIDIA (Santa Clara, CA)

…developing tools for AI researchers and SW/HW teams running AI workload in GPU cluster. As a member of the software development team, we will work with users ... debugging tricky failures and issues to help improve the performance and efficiency of the system. What you'll be ... common encountered problems like memory or networking + Create benchmarking and simulation technologies for AI system or GPU… more

NVIDIA (12/19/25)
Principal, Software Engineer - Cloud…

Walmart (Sunnyvale, CA)

…atop for advanced debugging. + Perform deep analysis of OSD, MON, MDS, RGW performance and optimize cluster parameters. + Debug network congestion, packet loss, ... hardware (NVMe SSDs, RDMA NICs, high-density HDDs) and their impact on storage performance. + Evaluate next-gen server SKUs, perform benchmarking, and compare… more

Walmart (11/20/25)
Senior MLOps Engineer, GenAI Framework

NVIDIA (Santa Clara, CA)

…cloud compute technologies, e.g., SLURM, Lustre, k8s + Software and hardware Benchmarking on high-performance computing systems. #LI-Hybrid Your base salary will ... dedicated and motivated senior build and continuous integration (CI/CD) engineer for its GenAI Frameworks (Megatron-LM (https://github.com/NVIDIA/Megatron-LM) and NeMo… more

NVIDIA (10/15/25)
Senior DGX Cloud Performance…

NVIDIA (Santa Clara, CA)

…performance and AI workloads on large scale systems + Experience with performance modeling and benchmarking at scale + Strong background in Computer ... seeking highly skilled Parallel and Distributed Systems engineers to drive the performance analysis, optimization, and modeling to define the architecture and design… more

NVIDIA (01/10/26)
Principal / Sr. Principal HPC Network…

Northrop Grumman (Jessup, MD)

…making history. We are looking for you to join our team as a High-Performance Computing (HPC) Network Engineer based out of Annapolis Junction ... Responsibilities + Monitor and maintain performance of network within a high-performance compute cluster + Contribute to design of new high-… more

Northrop Grumman (12/05/25)