  • SBU Product Line Management IV - (E4)

    Applied Materials (Santa Clara, CA)
    …benefits (https://hrportal.ehr.com/applied/). Job Description: The Applied Global Services (AGS) DDP Service Business Unit (SBU) is seeking an ambitious candidate ... as business acumen. This person will work closely with DDP Business Unit, AGS OCE and Operations, and AGS ... other disciplines within the function. + 3-5 years of DDP BU / Application / Platform experience. + Program…
    Applied Materials (09/03/25)
  • Principal AI and ML Infra Software Engineer, GPU…

    NVIDIA (Santa Clara, CA)
    …in supervising and improving substantial distributed training operations using PyTorch (DDP, FSDP), NeMo, or JAX. Moreover, an in-depth understanding of AI/ML ... workflows, involving data processing, model training, and inference pipelines. + Proficiency in programming & scripting languages such as Python, Go, Bash, as well as familiarity with cloud computing platforms (e.g., AWS, GCP, Azure) in addition to experience…
    NVIDIA (08/27/25)
  • (USA) Staff, Software Engineer

    Walmart (Sunnyvale, CA)
    …or embedded multimedia systems. + Strong grasp of distributed training (DDP/Horovod) and MLOps (Kubeflow, Airflow, MLflow, Feature stores). + Knowledge of ... data privacy frameworks (FL, DPSGD, HE, SMPC) and ability to translate compliance constraints into code. + Bachelor's or higher in CS, EE, Math, or related field. **Preferred experiences:** + Background in ad tech or retail media optimization (incrementally real…
    Walmart (08/24/25)
  • Software Development Manager, AWS Neuron Machine…

    Amazon (Cupertino, CA)
    …in PyTorch, XLA, JAX as well as distributed training libraries like FSDP, DDP and others. Includes enabling models using MoE architectures and future newer ... architectures. Lead the way to ensure support for key ML functionality in a combined chip / software platform. Ensure the right thing is being built and delivered to customers. Key job responsibilities: Our engineers and managers collaborate across diverse teams,…
    Amazon (08/15/25)
  • Senior AI and ML Storage Infra Software Engineer,…

    NVIDIA (Santa Clara, CA)
    …+ Expertise in running and optimizing large-scale distributed training workloads using PyTorch (DDP, FSDP), NeMo, or JAX. Also, possess a deep understanding of AI/ML ... workflows, encompassing data processing, model training, and inference pipelines. + Proficiency in programming & scripting languages such as Python, Go, Bash, as well as familiarity with cloud computing platforms (e.g., AWS, GCP, Azure, OCI) in addition to…
    NVIDIA (08/08/25)
  • Senior ML Software Engineer

    NVIDIA (Santa Clara, CA)
    …of the attention mechanisms. + Hands-on experience with large-scale training (e.g., ZeRO, DDP, FSDP, TP, CP) and data processing (e.g., Ray, Spark). + All we do is ... in Python and we open source our product, therefore production-quality software engineering skills are highly relevant. + MS or PhD or equivalent experience in Computer Science, Machine Learning, Applied Math, Physics, or a related field. Ways to stand out from…
    NVIDIA (07/26/25)