- Meta (Olympia, WA)
- …and host networking, comms lib and scheduling infrastructure. **Required Skills:** AI/HPC Systems Performance Engineer Responsibilities: 1. Active member of ... a multi-disciplinary team to develop solutions for large-scale training systems. 2. Responsible for the overall performance of the communication system, including performance benchmarking, monitoring and troubleshooting production issues. 3. Identify potential…
- Meta (Bellevue, WA)
- …AI workload needs. We are hiring in multiple locations. **Required Skills:** Software Engineer, Systems ML - HPC Specialist Responsibilities: 1. Apply relevant ... **Summary:** Meta is seeking an AI Software Engineer to join our Research & Development teams. ... on the web. Some aspects of this role as an HPC specialist may include authoring components such as cuBLAS,…
- Amazon (Seattle, WA)
- Description: The AWS High Performance Computing (HPC) team is looking for an experienced SDE to work on a new HPC service. The HPC team is building a core set of ... that allow our customers to plan, schedule, and execute HPC workloads across the full range of AWS compute ... different locations. This is an opportunity to operate and engineer systems on a global scale, while touching and…
- Amazon (Seattle, WA)
- Description: We are seeking an experienced engineer to work on distributed AI/ML systems. This role involves working on collective operations - the fundamental ... systems is valued, and experience with high-speed networking or HPC interconnects is valued highly. If you like solving ... you like solving hard problems, want to work with HPC and ML customers, iterate fast and deliver meaningful…
- Meta (Bellevue, WA)
- …the new product introduction (NPI) phase. **Required Skills:** Hardware Systems Engineer, AI NPI Responsibilities: 1. Drive and execute end-to-end system validation ... strategy (hardware and software), with a focus on various AI/HPC hardware systems in datacenter applications. 2. Lead the bring-up, validation, and deployment of…
- Meta (Bellevue, WA)
- **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP) team. Our servers and data centers are the foundation ... and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems Lead Responsibilities: 1. Lead interfacing with external…
- Meta (Bellevue, WA)
- **Summary:** Meta is seeking a Production Systems Engineer to join our Release to Production (RTP) team. Our servers and data centers are the foundation upon which ... and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems Responsibilities: 1. Interface with external vendors and…
- Amazon (Redmond, WA)
- …Qualifications - Experience working with ASIC teams and High-Performance Computing (HPC) environments - AWS certifications (e.g., AWS Certified Solutions Architect, ... AWS Certified DevOps Engineer) - Experience with container orchestration, monitoring tools, and database administration - Familiarity with incident management and…
- Amazon (Seattle, WA)
- …To achieve our ambitious goals, we're expanding our team and looking for a Senior Engineer to lead the development of a new EC2 Service critical to scale our current ... and next-generation Machine Learning (ML) and HPC Platforms. You will also be a technical leader for a team that owns the software-defined networking (SDN) dataplane…
- Meta (Bellevue, WA)
- …of our network engineering teams is for you! **Required Skills:** Software Engineer - Datacenter networking Responsibilities: 1. Design and implement drivers (and/or ... PHY, FPGAs, sensors, fan control, power etc). 3. Develop and enhance HPC collective communication and parallel computing libraries such as NCCL, RCCL, OneCCL,…