- Applied Materials (Santa Clara, CA)
- …benefits (https://hrportal.ehr.com/applied/). Job Description: The Applied Global Services (AGS) DDP Service Business Unit (SBU) is seeking an ambitious candidate ... as business acumen. This person will work closely with DDP Business Unit, AGS OCE and Operations, and AGS ... other disciplines within the function. + 3-5 years of DDP BU / Application / Platform experience. + Program…
- NVIDIA (Santa Clara, CA)
- …in supervising and improving substantial distributed training operations using PyTorch (DDP, FSDP), NeMo, or JAX. Moreover, an in-depth understanding of AI/ML ... workflows, involving data processing, model training, and inference pipelines. + Proficiency in programming & scripting languages such as Python, Go, Bash, as well as familiarity with cloud computing platforms (e.g., AWS, GCP, Azure) in addition to experience…
- Walmart (Sunnyvale, CA)
- …or embedded multimedia systems. + Strong grasp of distributed training (DDP/Horovod) and MLOps (Kubeflow, Airflow, MLflow, feature stores). + Knowledge of ... data privacy frameworks (FL, DPSGD, HE, SMPC) and ability to translate compliance constraints into code. + Bachelor's or higher in CS, EE, Math, or related field. **Preferred experiences:** + Background in ad tech or retail media optimization (incrementally real…
- Amazon (Cupertino, CA)
- …in PyTorch, XLA, JAX as well as distributed training libraries like FSDP, DDP and others. Includes enabling models using MoE architectures and future newer ... architectures. Lead the way to ensure support for key ML functionality in a combined chip / software platform. Ensure the right thing is being built and delivered to customers. Key job responsibilities: Our engineers and managers collaborate across diverse teams,…
- NVIDIA (Santa Clara, CA)
- …+ Expertise in running and optimizing large-scale distributed training workloads using PyTorch (DDP, FSDP), NeMo, or JAX. Also, possess a deep understanding of AI/ML ... workflows, encompassing data processing, model training, and inference pipelines. + Proficiency in programming & scripting languages such as Python, Go, Bash, as well as familiarity with cloud computing platforms (e.g., AWS, GCP, Azure, OCI) in addition to…
- NVIDIA (Santa Clara, CA)
- …of the attention mechanisms. + Hands-on experience with large-scale training (e.g., ZeRO, DDP, FSDP, TP, CP) and data processing (e.g., Ray, Spark). + All we do is ... in Python and we open-source our product; therefore, production-quality software engineering skills are highly relevant. + MS or PhD or equivalent experience in Computer Science, Machine Learning, Applied Math, Physics, or a related field. Ways to stand out from…