• Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    …drive foundational improvements and automation to improve engineer's productivity. As a Site Reliability Engineer, you are responsible for the big ... be doing: + Troubleshoot incoming support requests in a large-scale HPC environment. + Contribute enhancements to existing deployment automation, configuration…
    NVIDIA (11/05/25)
  • Sr. Software Development Engineer

    Amazon (Cupertino, CA)
    Description: We are seeking an experienced engineer to work on distributed AI/ML systems. This role involves working on collective operations - the fundamental ... systems is valued, and experience with high-speed networking or HPC interconnects is valued highly. If you like solving ... 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience…
    Amazon (08/11/25)
  • Site Reliability Engineer

    LTD Global (Berkeley, CA)
    Position overview: We are seeking a Site Reliability Engineer to join our Operations Group. This role plays a key part in advancing scientific discovery by ... supporting high-performance computing (HPC) and data analysis for the organization. Our center ... energy efficiency, environmental science, and other missions. As a Site Reliability Engineer, you will…
    LTD Global (09/23/25)
  • Site Reliability Engineer

    SpaceX (Hawthorne, CA)
    Site Reliability Engineer, GNC (Falcon) ... maintain virtual and physical servers + Work with SpaceX HPC team to monitor and maintain a 4000+ thread ... the ultimate goal of enabling human life on Mars. SITE RELIABILITY ENGINEER, GNC (FALCON) ... HPC cluster + Closely collaborate with GNC software engineers…
    SpaceX (10/16/25)
  • Senior High Performance Computing Engineer

    SLAC National Accelerator Laboratory (Menlo Park, CA)
    Senior High Performance Computing Engineer Job ID 6383 Location SLAC - Menlo Park, CA Full-Time Regular SLAC Job Postings About SLAC: The SLAC National ... the nature of this position, SLAC is open to on-site and hybrid work options. Position Overview: As a Senior High Performance Computing Engineer in the Scientific Computing Services Division of the…
    SLAC National Accelerator Laboratory (10/25/25)
  • Research Data Center Facility Engineer

    Stanford University (Stanford, CA)
    Research Data Center Facility Engineer Business Affairs: University IT (UIT), Stanford, California, United States Facilities Post Date Sep 30, 2025 Requisition # ... during the hiring process. Stanford Research Computing is looking for an experienced facility engineer to join our team. Our staff work directly with some of the…
    Stanford University (10/01/25)
  • Sr. ML Kernel Performance Engineer, AWS…

    Amazon (Cupertino, CA)
    …language experience - 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience - 5+ years of ... full software development experience - Expertise in accelerator architectures for ML or HPC such as GPUs, CPUs, FPGAs, or custom architectures - Experience with GPU…
    Amazon (08/15/25)
  • Sr. System Development Engineer

    Amazon (Cupertino, CA)
    …operate next-generation infrastructure that powers breakthrough innovation in AI/ML and HPC workloads. If you're passionate about pushing the limits of performance, ... complex problems. You will decompose big difficult server system testability, reliability and diagnosis problems into straightforward tasks, components or features…
    Amazon (10/25/25)
  • Package Design Engineer

    Broadcom (San Jose, CA)
    …Job Description: Broadcom is seeking an experienced IC package-design engineer for complex flip-chip-BGA packages for industry-leading ASICs with high-speed ... package designs for ASICs for artificial intelligence (AI), networking, high-performance computing (HPC), and 5G base stations. These designs include SerDes at 224G…
    Broadcom (08/19/25)
  • Sr Hardware Development Engineer, High…

    Amazon (Cupertino, CA)
    …operate next-generation infrastructure that powers breakthrough innovation in AI/ML and HPC workloads. If you're passionate about pushing the limits of performance, ... in product development disciplines such as thermal, mechanical, power, FW/SW, reliability, and sustaining - Experience deploying and operating hardware and…
    Amazon (11/05/25)