  • AI/HPC Systems Performance…

    Meta (Olympia, WA)
    …fabric and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Active ... of RDMA workloads that expects a loss-less fabric interconnect. To improve performance of these systems we constantly look for opportunities across stack: network…
    Meta (03/22/25)
  • Software Engineer, Systems ML…

    Meta (Bellevue, WA)
    …evolving AI workload needs. We are hiring in multiple locations. **Required Skills:** Software Engineer, Systems ML - HPC Specialist Responsibilities: 1. ... **Summary:** Meta is seeking an AI Software Engineer to join our Research & Development teams....on the web. Some aspects of this role as an HPC specialist may include authoring components such as cuBLAS,…
    Meta (02/14/25)
  • AI/HPC Network Engineer

    Meta (Bellevue, WA)
    …host networking, communication libraries, and scheduling infrastructure. **Required Skills:** AI/HPC Network Engineer Responsibilities: 1. Design, develop, test ... loss-less fabric interconnect. To enhance the performance of these systems, we continuously seek opportunities for improvement across our...and operate networking systems to support large scale AI training jobs. 2.…
    Meta (04/09/25)
  • Software Engineer - AWS PCS, High…

    Amazon (Seattle, WA)
    …other AWS teams across different locations. This is an opportunity to operate and engineer systems on a global scale, while touching and influencing large parts ... Description The AWS High Performance Computing (HPC) team is looking for experienced SDE to...(design patterns, reliability and scaling) of new and existing systems experience - Experience programming with at least one…
    Amazon (04/30/25)
  • Software Development Engineer, ML…

    Amazon (Seattle, WA)
    …of peer teams? We want to talk to you! We seek a Software Development Engineer for the Machine Learning (ML) Infrastructure team to build the tools that are used ... top performance of AWS ML and High Performance Computing (HPC) technologies developed by our organization. Bring your exceptional...Fabric Adapter (EFA). Key job responsibilities Be an autonomous engineer on a team that builds and maintains the…
    Amazon (04/23/25)
  • Sr. Software Development Engineer, ML…

    Amazon (Seattle, WA)
    …of peer teams? We want to talk to you! We seek a Sr. Software Development Engineer for the Machine Learning (ML) Infrastructure team to build the tools that are used ... top performance of AWS ML and High Performance Computing (HPC) technologies developed by our organization. Bring your exceptional...Fabric Adapter (EFA). Key job responsibilities Be the lead engineer on a team that builds and maintains the…
    Amazon (02/13/25)
  • Hardware Systems Engineer, AI NPI

    Meta (Bellevue, WA)
    …approach to the new product introduction (NPI) phase. **Required Skills:** Hardware Systems Engineer, AI NPI Responsibilities: 1. Drive and execute end-to-end ... validation strategy (hardware and software), with a focus on various AI/HPC hardware systems in datacenter applications. 2. Lead the bring-up, validation, and…
    Meta (02/05/25)
  • Production Systems Engineer, Fleet…

    Meta (Bellevue, WA)
    **Summary:** Meta is seeking a Production Systems Engineer to join our Release to Production (RTP) team. Our servers and data centers are the foundation upon ... and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems ...issues. 15. 2+ years of experience supporting AI or HPC systems and/or related systems,…
    Meta (03/15/25)
  • Production Systems Engineer, Fleet…

    Meta (Bellevue, WA)
    **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP) team. Our servers and data centers are the ... and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems ...issues. 18. 4+ years of experience supporting AI or HPC systems and/or related systems,…
    Meta (03/29/25)
  • Systems Development Engineer - SRE,…

    Amazon (Redmond, WA)
    …quickly and confidently with robust verification frameworks that scale with our systems. About the team The Kuiper Silicon teams deliver custom communication silicon ... in Python programming and automation tools - Strong knowledge of systems engineering fundamentals (networking, storage, operating systems) - Experience…
    Amazon (05/01/25)