- NVIDIA (Santa Clara, CA)
- …that power some of the world's most advanced computing workloads. NVIDIA is looking for an AI/ML HPC Cluster Engineer to join our MARS team. You will provide ... + Minimum 2 years of experience administering multi-node compute infrastructure + Background in managing AI/HPC job schedulers like Slurm, K8s, PBS, RTDA,…
- NVIDIA (Santa Clara, CA)
- …and intelligence. Make the choice to join us today! As a member of the GPU AI/HPC Infrastructure team, you will provide leadership in the design and ... 5+ years of experience designing and operating large scale compute infrastructure + Experience with AI/HPC advanced job schedulers, such as Slurm, K8s, PBS,…
- NVIDIA (Santa Clara, CA)
- …Observability is at the heart of this transformation. We are looking for a Senior AI & HPC Observability Engineer to design and build the next-generation ... innovation and collaboration. Within this mission, our team, Managed AI Superclusters (MARS) builds and scales the infrastructure...systems covering metrics, logs, traces, and events for GPU-powered AI and HPC workloads. + Build large-scale…
- NVIDIA (Santa Clara, CA)
- …+ Minimum of 6 years of experience crafting and operating large scale compute infrastructure. + Experience with AI/HPC job schedulers and orchestrators, such ... learning and staying ahead of new technologies and effective approaches in the HPC and AI/ML infrastructure fields. Ways to stand out from the crowd: +…
- Lilly (Indianapolis, IN)
- …Advanced Intelligence and Data science teams through implementing advancements across our AI/HPC infrastructure tooling and operational excellence You will ... - You will bring a high learning agility and Infrastructure availability and reliability Engineer skills to...initiatives in areas such as: AI/ML acceleration, Infrastructure AI OPS automation, HPC…
- Johns Hopkins University (Baltimore, MD)
- …and computational workflows on advanced HPC Systems and related infrastructure. Working primarily within Linux-based environments, the engineer manages and ... Develop and refine deployment strategies for scientific software on HPC and AI systems. + Design computational...fields, with advanced training in scientific computing. Classified Title: HPC Scientific Software Engineer Job Posting Title…
- Johns Hopkins University (Baltimore, MD)
- …will design, build, and support Johns Hopkins University's high-performance computing and AI research infrastructure. This role integrates elements of both ... and Design** + Develop and refine deployment strategies for scientific software on HPC and AI systems. + Design computational workflows, selecting optimal…
- Meta (New York, NY)
- …host networking, communications lib and scheduling infrastructure. **Required Skills:** AI/HPC System Performance Engineer Responsibilities: 1. Lead ... **Summary:** Meta's AI Training and Inference Infrastructure is growing exponentially to support ever increasing use cases of AI. This results in a dramatic…
- NVIDIA (Santa Clara, CA)
- NVIDIA is hiring engineers to scale up its AI Infrastructure. We expect you to have a strong programming background, knowledge of datacenter hardware, ... hardware fleet management systems. Proven operational excellence in designing and maintaining AI infrastructure NVIDIA is widely considered to be one of…
- NVIDIA (Santa Clara, CA)
- …continual learning and staying ahead of new technologies and effective approaches in the HPC infrastructure fields. Ways to stand out from the crowd: + ... world. We are seeking a highly skilled and experienced HPC Cluster Engineer to design, deploy, and...tools such as BCM or Ansible. + Experience with AI/HPC job schedulers and orchestrators, such as…