  • Senior AI-HPC Cluster Engineer - MLOps

    NVIDIA (Santa Clara, CA)
    …to HPC including InfiniBand, RDMA, RoCE and Amazon EFA. + Understanding of fast, distributed storage systems like Lustre and GPFS for AI/HPC workload. Experience ... Provide leadership and strategic mentorship on the management of large-scale HPC systems including the deployment of compute, networking, and storage. + Develop and…
    NVIDIA (07/31/25)
  • Software Development Engineer, Amazon…

    Amazon (San Francisco, CA)
    …artists, while also having access to their favorite established musicians. We build systems that are distributed on a large scale, spanning our music ... or architecture (design patterns, reliability and scaling) of new and existing systems experience Preferred Qualifications - 3+ years of full software development…
    Amazon (07/25/25)
  • Senior Software Development Engineer

    Amazon (East Palo Alto, CA)
    …readable and maintainable code and debug complex problems spanning across different systems * Communicate complex features, problems and solutions effectively and in ... (design patterns, reliability and scaling) of new and existing systems experience - Experience as a mentor, tech lead...of experience in design and development of large scale distributed databases or filesystem. - 5+ years experience writing…
    Amazon (07/24/25)
  • Security Engineer, Kuiper Security

    Amazon (Sunnyvale, CA)
    …delivering security software in a production environment - 3+ years experience delivering distributed software systems in Python, Java, Rust, GoLang or C/C++ - 3+ ... the research & development, deployment and operation of several mission-critical security systems and mechanisms. You will work in a start-up like environment,…
    Amazon (07/11/25)
  • Principal Engineer - DL and AI Software

    NVIDIA (Santa Clara, CA)
    …and tools and a passion for engineering process optimization + Understanding of distributed systems and High Performance Computing workloads NVIDIA is leading ... + 15+ yrs of experience with designing and building complex software systems, especially in C++ and Python + Practical experience with industry-standard…
    NVIDIA (07/07/25)
  • Sr. Security Engineer, Kuiper Trust…

    Amazon (Sunnyvale, CA)
    …delivering security software in a production environment - 3+ years experience developing distributed software systems in Java, Rust, GoLang or C/C++ - 3+ years ... the research & development, deployment and operation of several mission-critical security systems and mechanisms. You will work in a start-up like environment,…
    Amazon (07/02/25)
  • Security Engineer, Kuiper Security

    Amazon (Sunnyvale, CA)
    …delivering security software in a production environment - 3+ years experience delivering distributed software systems in Python, Java, Rust, GoLang or C/C++ - 3+ ... the research & development, deployment and operation of several mission-critical security systems and mechanisms. You will work in a start-up like environment,…
    Amazon (07/02/25)
  • Senior Software Engineer, Amazon Games AI…

    Amazon (San Diego, CA)
    …effective Senior Game Developer - AI/ML to build and integrate innovative AI systems and tools into our game pipelines and customer experiences. Our mission to ... (design patterns, reliability and scaling) of new and existing systems experience - Experience as a mentor, tech lead...into game engines - Experience with RL, DRL, and distributed ML architectures - Experience with designing and deploying…
    Amazon (06/12/25)
  • Senior Data Engineer - Audience Manager…

    The Walt Disney Company (Glendale, CA)
    …experience modeling and developing large data pipelines + Hands-on experience with distributed systems such as Spark, Hadoop (HDFS, Hive, Presto, PySpark) ... Work with engineering teams to collect required data from internal and external systems. + Support Agile methodologies such as Scrum by actively participating in…
    The Walt Disney Company (06/05/25)
  • (USA) Staff, Software Engineer

    Walmart (Sunnyvale, CA)
    …/ ACR pipelines (frame grabbers, fingerprinting, video embeddings) or embedded multimedia systems. + Strong grasp of distributed training (DDP/Horovod) and MLOps ... Option 1: Bachelor's degree in computer science, computer engineering, computer information systems, software engineering, or related area and 4 years' experience in…
    Walmart (05/28/25)