- NVIDIA (Santa Clara, CA)
- …We deliver communication runtimes like NCCL and NVSHMEM for Deep Learning and HPC applications. We are looking for a motivated Partner Enablement Engineer ... guide our key partners and customers with NCCL. Most DL/HPC applications run on large clusters with high-speed networking... Develop tools and automation to isolate issues on new systems and platforms, including cloud platforms (Azure, AWS, GCP,…
- Northrop Grumman (Los Angeles, CA)
- …is currently looking for an experienced **Principal or Senior Principal** level engineer in Modeling and Simulations, Systems engineering or software ... to modeling and simulation. Basic Qualifications for a Principal Engineer, Modeling and Simulation Systems/Software - (Level... running Monte Carlo simulations + Experience working with an HPC system + Experience with hardware in the loop…
- Meta (Menlo Park, CA)
- **Summary:** Meta is seeking an experienced Production Systems Engineer to join our Release to Production (RTP) team. Our servers and data centers are the ... and lifecycle of servers in production. **Required Skills:** Production Systems Engineer, Fleet AI Systems ... contributor 18. 3+ years of experience supporting AI or HPC systems and/or related systems,…
- NVIDIA (Santa Clara, CA)
- Do you have expertise in CUDA kernel optimization, C++ systems programming, or compiler infrastructure? Join NVIDIA's nvFuser (https://github.com/NVIDIA/Fuser) team ... of GPUs! We're looking for engineers who excel at parallel programming and systems-level performance work and want to directly impact the future of AI compilation.…
- Amazon (Cupertino, CA)
- Description: We are seeking an experienced engineer to work on distributed AI/ML systems. This role involves working on collective operations - the fundamental ... kernels, and performant code is important. Experience with embedded systems is valued, and experience with high-speed networking or HPC interconnects is valued highly. If you like solving…
- SpaceX (Hawthorne, CA)
- Site Reliability Engineer, GNC (Falcon). SpaceX was founded under the belief that a future where humanity is out exploring the stars is ... goal of enabling human life on Mars. SpaceX is looking for a Site... simulations on a high-performance computing cluster, automated data analysis systems, continuous integration systems for rocket and…
- NVIDIA (Santa Clara, CA)
- …Senior AI Observability Engineer to help architect and implement distributed observability systems for AI and HPC clusters. We serve and collaborate directly ... You will be working with a team of dedicated engineers on systems for data collection, aggregation, enrichment, storage, retrieval, and visualization to…
- NVIDIA (Santa Clara, CA)
- …AI, ideally across the entire lifecycle (from design to deployment) of large-scale High-Performance Computing (HPC) systems. Ways to Stand Out from the Crowd: + ... We are seeking a Distinguished Engineer for AI Resiliency at NVIDIA! Join NVIDIA... GPU, memory, storage, and networking. + Experience in implementing HPC software development best practices in large-scale systems…
- Stanford University (Stanford, CA)
- …researchers from a variety of Stanford and SLAC organizations. The majority of the HPC systems are hosted in the Stanford Research Computing Facility (SRCF), ... Research Data Center Facility Engineer. Business Affairs: University IT (UIT), Stanford, California,... Stanford Research Computing. Research Computing offers High Performance Computing (HPC) hosting services, computational and data systems,…
- UCLA Health (Los Angeles, CA)
- …UCLA Health IT is looking for an outstanding Analytics DevOps and Platform Engineer (IT Architect) to join the Solutions Architecture and Engineering (SAE) group. ... will possess a well-rounded skillset encompassing software development, knowledge of HPC and Citrix environments, and relevant cloud certifications. We are looking…