- Meta (Olympia, WA)
- …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Active ... of RDMA workloads that expects a loss-less fabric interconnect. To improve performance of these systems we constantly look for opportunities across stack: network…
- Meta (Bellevue, WA)
- …evolving AI workload needs. We are hiring in multiple locations. **Required Skills:** Software Engineer, Systems ML - HPC Specialist Responsibilities: 1. ... **Summary:** Meta is seeking an AI Software Engineer to join our Research & Development teams. ... on the web. Some aspects of this role as an HPC specialist may include authoring components such as cuBLAS,…
- Meta (Bellevue, WA)
- …host networking, communication libraries, and scheduling infrastructure. **Required Skills:** AI/HPC Network Engineer Responsibilities: 1. Design, develop, test ... loss-less fabric interconnect. To enhance the performance of these systems, we continuously seek opportunities for improvement across our ... and operate networking systems to support large scale AI training jobs. 2. …
- Amazon (Seattle, WA)
- …other AWS teams across different locations. This is an opportunity to operate and engineer systems on a global scale, while touching and influencing large parts ... Description The AWS High Performance Computing (HPC) team is looking for experienced SDE to ... (design patterns, reliability and scaling) of new and existing systems experience - Experience programming with at least one…
- Amazon (Seattle, WA)
- …of peer teams? We want to talk to you! We seek a Software Development Engineer for the Machine Learning (ML) Infrastructure team to build the tools that are used ... top performance of AWS ML and High Performance Computing (HPC) technologies developed by our organization. Bring your exceptional ... Fabric Adapter (EFA). Key job responsibilities Be an autonomous engineer on a team that builds and maintains the…
- Amazon (Seattle, WA)
- …of peer teams? We want to talk to you! We seek a Sr. Software Development Engineer for the Machine Learning (ML) Infrastructure team to build the tools that are used ... top performance of AWS ML and High Performance Computing (HPC) technologies developed by our organization. Bring your exceptional ... Fabric Adapter (EFA). Key job responsibilities Be the lead engineer on a team that builds and maintains the…
- Meta (Bellevue, WA)
- …approach to the new product introduction (NPI) phase. **Required Skills:** Hardware Systems Engineer, AI NPI Responsibilities: 1. Drive and execute end-to-end ... validation strategy (hardware and software), with a focus on various AI/HPC hardware systems in datacenter applications. 2. Lead the bring-up, validation, and…
- Meta (Bellevue, WA)
- **Summary:** Meta is seeking a Production Systems Engineer to join our Release to Production (RTP) team. Our servers and data centers are the foundation upon ... and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems ... issues. 15. 2+ years of experience supporting AI or HPC systems and/or related systems,…
- Meta (Bellevue, WA)
- **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP) team. Our servers and data centers are the ... and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems ... issues. 18. 4+ years of experience supporting AI or HPC systems and/or related systems,…
- Amazon (Redmond, WA)
- …quickly and confidently with robust verification frameworks that scale with our systems. About the team The Kuiper Silicon teams deliver custom communication silicon ... in Python programming and automation tools - Strong knowledge of systems engineering fundamentals (networking, storage, operating systems) - Experience…