- Lawrence Berkeley National Laboratory (Berkeley, CA)
- …Lab's (LBNL) Information Technology Division (IT) has an opening for a Senior HPC Cluster Systems Administrator to join their ScienceIT Team! In ... by building, integrating, and maintaining Linux-based resources, high-performance computing cluster systems, and Kubernetes clusters. This role provides…
- NVIDIA Corporation (Santa Clara, CA)
- Senior AI-HPC EDA Cluster ... leadership and strategic mentorship on the management of large-scale HPC systems including the deployment of compute, ... Senior AI-HPC EDA Cluster Engineer; locations: US, CA, Santa Clara; US, TX, Austin ... Experience analyzing and tuning performance for a variety of AI/HPC workloads. Excellent problem-solving to analyze complex systems…
- The Voleon Group (Berkeley, CA)
- …multibillion-dollar asset manager, and we have ambitious goals for the future. As a Senior Cluster Site Reliability Engineer (SRE), you will help scale our ... research compute cluster to meet our growing needs, and you will ... in SRE or DevOps roles, preferably working as a senior engineer or tech lead. Knowledge of HPC…
- NVIDIA Corporation (Santa Clara, CA)
- …to stand out from the crowd: Experience leading large-scale AI Factory or HPC cluster bring-ups or builds. Hands-on experience with NVIDIA networking products ... Senior Solutions Architect, Cluster Design and ... validation and troubleshooting. Proven expertise in designing large-scale distributed systems, AI clusters, or HPC infrastructure. Ability…
- Ring Inc (San Francisco, CA)
- …networking, observability, security, disaster recovery, and cost management. Familiarity with HPC cluster management software such as Slurm. Familiarity with ... and retrieval workloads. Previous success managing engineering teams delivering production-grade, HPC-scale RAG systems. Deep understanding of infra domains:…
- NVIDIA Corporation (Santa Clara, CA)
- …ETH/IB networking components, storage, etc.) within extensive AI and HPC cluster settings. Practical knowledge of NVIDIA systems technology such as NCCL, ... Senior Solutions Architect, NVIDIA Cloud Partners ... with partners and customers. Experience crafting and deploying large-scale cluster environments. Practical expertise in data center design, development…
- NVIDIA Corporation (Santa Clara, CA)
- Senior Software Architect - Data Center Systems ... systems, particularly at the SW/HW interface. Understanding of HPC or deep learning workloads and use of accelerated…
- Zettabyte (Palo Alto, CA)
- …mindset, comfortable with ambiguity and rapid iteration. Bonus qualifications: GPU or HPC cluster management experience; understanding of ML/AI workload patterns ... world. Why this role exists: We need a Backend Engineer to build the systems that orchestrate GPU clusters for AI workloads. You'll create APIs that handle GPU…
- Fluidstack (San Francisco, CA)
- …infrastructure. We treat our customers' outcomes as our own, taking pride in the systems we build and the trust we earn. If you're motivated by purpose, obsessed ... join us in building what's next. About the Role: Senior/Staff SREs at Fluidstack sit at the ... networking, platform engineering, and data center operations to build systems that scale with the demands of AI workloads…
- Promote Project (Santa Clara, CA)
- …are seeking a distributed software engineer to join our team! As a senior engineer, you'll be instrumental in developing and optimizing AI infrastructure services to ... on: Developing solutions at the intersection of machine learning, distributed systems, and high-performance computing, contributing to the advancement of AI…