- NVIDIA (Santa Clara, CA)
- …from the crowd: + Technical competency in managing and automating large-scale distributed systems independent of cloud providers. Advanced hands-on experience ... part of a DGX Cloud team responsible for production systems that enable large scalable GPU clusters to be...years in similar role and experience on large-scale production systems. Experience with common software engineering principles, tools and…
- NVIDIA (Santa Clara, CA)
- …enthusiastic about building the next generation of scalable AI systems. As a Senior Applied AI Software Engineer on the Dynamo project, you will address some ... Go for Kubernetes controllers and operators development. + Deep understanding of distributed systems, parallel computing, and GPU architectures. + Experience…
- TP-Link North America, Inc. (Irvine, CA)
- …performance and enable consumers to enjoy a seamless, effortless lifestyle. OVERVIEW As a Senior Cloud Engineer - Distributed Database & Middleware, you will ... etc.) to ensure project success. + Mentor team members in distributed systems, database governance, and performance tuning practices to elevate the overall…
- NVIDIA (Santa Clara, CA)
- …and fleet management engineering. + Experience with infrastructure automation and distributed systems design developing tools for running large scale ... one or more of the following: Linux, Slurm, Kubernetes, Local and Distributed Storage, and Systems Networking. Ways to stand out from the crowd: + Demonstrating…
- NVIDIA (Santa Clara, CA)
- …ecosystem to power AI at scale! We are seeking a highly technical and creative Senior Technical Marketing Engineer to join our team to showcase the innovations ... world's largest AI models. This role will focus on distributed AI model training, ensuring that customers and partners...7+ years of experience in deep learning engineering, HPC systems, AI infrastructure, or technical evangelism roles. + Strong…
- Google (Sunnyvale, CA)
- …and configuration management + Experience with Kubernetes cluster management systems and technical leadership. Google's software engineers develop the ... bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage,...on and is growing every day. As a software engineer, you will work on a specific project critical…
- Google (Sunnyvale, CA)
- …design and architecture. + 3 years of experience developing large-scale infrastructure, distributed systems or networks, or experience with compute technologies, ... bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage,...on and is growing every day. As a software engineer, you will work on a specific project critical…
- Amazon (East Palo Alto, CA)
- …base. You'll bring a passion for innovation, data, search, analytics, and distributed systems. You'll also: Solve challenging technical problems, often ones ... into technological breakthroughs? Join Amazon as a Software Development Engineer (SDE) and help shape the future of global...lead or leading an engineering team - Experience with distributed computing and enterprise-wide systems Proficiency in…
- Amazon (Cupertino, CA)
- …accelerators and the Trn1 and Inf1 servers that use them. This role is for a senior software engineer in the Machine Learning Applications (ML Apps) team for AWS ... degree in computer science or equivalent - Preferred previous software engineer expertise with PyTorch/JAX/TensorFlow, Distributed libraries and Frameworks,…
- Amazon (Cupertino, CA)
- …Inferentia (Inf1/Inf2) our cloud-scale Machine Learning accelerators. This role is for a Senior Machine Learning Engineer in the Distributed Training team for AWS ... well as Stable Diffusion, Vision Transformers (ViT) and many more. The ML Distributed Training team works side by side with chip architects, compiler engineers and…