- Meta (Menlo Park, CA)
- **Summary:** Meta is seeking a Systems Engineer to join our Release to Production (RTP) team working on AI/ML initiatives supporting large-scale AI Training and ... to hyperscaler bring-up and validation. **Required Skills:** Hardware Systems Engineer, NPI AI Lead Responsibilities: 1. ...rack level and at scale, as well as debugging AI/HPC systems, performance optimizations, including familiarity with…
- Meta (Menlo Park, CA)
- **Summary:** Meta is seeking a Systems Engineer to join our Release to Production (RTP) team working on AI/ML initiatives supporting large-scale AI Training and ... to Meta Silicon hyperscaler bring-up and validation. **Required Skills:** Hardware Systems Engineer, NPI AI Responsibilities: 1. Lead the bring-up, validation,…
- SLAC National Accelerator Laboratory (Menlo Park, CA)
- …parallel applications (e.g., gdb, Valgrind, Nvidia Nsight). + In-depth knowledge of Linux operating systems and advanced shell scripting. + Proven expertise ... Senior High Performance Computing Engineer Job ID 6383 Location SLAC - Menlo...role in managing and optimizing our High Performance Computing (HPC) environment in support of these groundbreaking scientific projects.…
- UCLA Health (Los Angeles, CA)
- …UCLA Health IT is looking for an outstanding Analytics DevOps and Platform Engineer (IT Architect) to join the Solutions Architecture and Engineering (SAE) group. ... professional with a strong foundation in cloud computing, Windows and Linux administration, Citrix virtualization, DevOps principles, and automation. The ideal…
- NVIDIA (Santa Clara, CA)
- …We deliver communication runtimes like NCCL and NVSHMEM for Deep Learning and HPC applications. We are looking for a motivated Partner Enablement Engineer ... guide our key partners and customers with NCCL. Most DL/HPC applications run on large clusters with high-speed networking...Develop tools and automation to isolate issues on new systems and platforms, including cloud platforms (Azure, AWS, GCP,…
- Meta (Menlo Park, CA)
- **Summary:** Meta is seeking a Production Systems Engineer to join our Release to Production (RTP) team. Our servers and data centers are the foundation upon ... and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems ...issues. 15. 2+ years of experience supporting AI or HPC systems and/or related systems,…
- Meta (Menlo Park, CA)
- **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP) team. Our servers and data centers are the ... and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems ...issues. 18. 4+ years of experience supporting AI or HPC systems and/or related systems,…
- Amazon (San Diego, CA)
- …quickly and confidently with robust verification frameworks that scale with our systems. About the team The Kuiper Silicon teams deliver custom communication silicon ... in Python programming and automation tools - Strong knowledge of systems engineering fundamentals (networking, storage, operating systems) - Experience…
- Insight Global (Santa Clara, CA)
- …and Requirements - 8-10+ years of experience - Extensive experience working with enterprise systems - Proficient in Linux - Background with Python, Jenkins, Ansible, ... Description Insight Global is looking for a Senior DevOps Engineer to support one of our largest clients onsite...with Product Teams to understand new product requirements including HPC and AI/ML Products. Finding Optimum Solutions to…
- The Walt Disney Company (Emeryville, CA)
- …significant impact on our studio. **RESPONSIBILITIES:** + Build and support our on-prem HPC storage systems + Develop software tools that enhance storage ... We seek a Senior Storage Engineer who is passionate about building and maintaining...the key technology pillars of storage, software tools, and Linux administration. As an essential team member, you will…