• Senior AI-HPC Cluster…

    NVIDIA (Santa Clara, CA)
    …Make the choice to join us today! As a member of the GPU AI/HPC Infrastructure team, you will provide leadership in the design and implementation of ground ... + Provide leadership and strategic guidance on the management of large-scale HPC systems including the deployment of compute, networking, and storage. + Develop… more
    NVIDIA (04/02/25)
  • Senior AI-HPC Storage…

    NVIDIA (Santa Clara, CA)
    …Make the choice to join us today! As a member of the GPU AI/HPC Infrastructure team, you will provide leadership in the design and implementation of ground ... implementation of distributed storage services. + Design, implement an on-prem AI/HPC infrastructure supplemented with cloud computing to support the growing needs… more
    NVIDIA (02/05/25)
  • Senior Site Reliability Engineer

    NVIDIA (Santa Clara, CA)
    …efficiency, and performance and drive foundational improvements and automation to improve engineer's productivity. As a Site Reliability Engineer, you are ... + Manage and support workload and resource schedulers in a large-scale HPC environment. + Automate Everything: Develop automation scripts to automate deployment,… more
    NVIDIA (04/04/25)
  • Sr Core Infrastructure Engineer HPC

    Children's Mercy Kansas City (Kansas City, MO)
    …we can improve the lives of children beyond the walls of our hospital. Overview Senior Core Infrastructure Engineer - HPC plans, designs, implements and ... and turning them into working systems and services. The Senior Core Infrastructure Engineer - HPC...and skills to perform advanced management and troubleshooting for Linux Servers in a hybrid environment. Requires knowledge of… more
    Children's Mercy Kansas City (03/28/25)
  • Software Development Engineer, HPC

    Amazon (Cupertino, CA)
    Description We are seeking an experienced engineer to work on distributed AI/ML systems. This role involves working on collective operations - the fundamental ... C/C++ and relatively low level, so solid knowledge of Linux, kernels, and performant code is important. Experience with...systems is valued, and experience with high-speed networking or HPC interconnects is valued highly. If you like solving… more
    Amazon (05/03/25)
  • Software Development Engineer, Nitro High…

    Amazon (Sunnyvale, CA)
    …we're building an environment that celebrates knowledge sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. ... will help each team member develop into a better-rounded engineer and enable them to take on more complex...with systems knowledge and experience in areas such as Linux OS boot sequencing, Kernel, Hypervisor (Xen or KVM),… more
    Amazon (04/29/25)
  • HPC Operations Manager - Hardware…

    NVIDIA (Santa Clara, CA)
    …to support their future chip design needs, understand their workflow characteristics, and engineer an efficient HPC environment. Work with IT and engineering ... intelligence to autonomous cars. We are now looking for a highly motivated HPC Operations Manager to join this multifaceted and innovative infrastructure team to… more
    NVIDIA (03/12/25)
  • Senior Linux IT Systems…

    Lockheed Martin (Orlando, FL)
    …IT Systems Engineer with a focus on supporting Red Hat-based Linux distributions, Kubernetes and High Performance Computing (HPC) clusters. * Linux ... * Maintain operating system cyber compliance, upgrades, and patching. (Windows/Linux/VMWare) * Perform guidance and management for hardware, including… more
    Lockheed Martin (04/06/25)
  • Senior Systems Administrator - Windows/…

    Mount Sinai Health System (New York, NY)
    …Mount Sinai. The Administrator is the principal technology expert for Windows and Linux systems, and helps support high-performance computing (HPC) environment in ... and a research data services team. The Senior Systems Administrator/Engineer, as a member of the Scientific Computing and...TSM system is integrated with the 25,000-core, 30 petabyte HPC system. This position reports to the Director for… more
    Mount Sinai Health System (03/25/25)
  • Linux System Administrator IV (Server)

    Leidos (Annapolis Junction, MD)
    …Administrator IV - Server for a new customer on a strategic High-Performance Computing (HPC) program. The Senior System Administrator will need to be a ... judgment, and the ability to work within a team to mature the HPC capabilities of our customer. Primary Responsibilities: + Responsible for overseeing the most… more
    Leidos (04/10/25)