• Senior Deep Learning Systems Engineer,…

    NVIDIA (Santa Clara, CA)
    …growing datacenter deployments as well as establishing a data-driven approach to hardware design and system software development. The role of a Deep Learning Systems ... of deep learning applications on datacenter-class hardware and significantly influence the design and optimization of datacenters. Do you want to influence the… more
    NVIDIA (07/23/25)
  • (Senior) Datacenter System Administrator

    pony.ai (Fremont, CA)
    …problems (repair or replace parts, debugging R&D system issues, etc.) + Design and standardize system or server solutions to meet AI/Autonomous Driving related ... assessments and contingency plans based on company business requirements. + Design and implement monitoring, configuration management and reporting functions that… more
    pony.ai (06/09/25)
  • Senior Technical Program Manager I, Optical…

    Google (Sunnyvale, CA)
    …experience. + 8 years of experience in program management. + Experience in network infrastructure, architecture, and design of optical networks, as well as ... operational delivery in a changing environment. + Understanding of network infrastructure, Optical transceiver technologies, Optical transmission technologies, Optical… more
    Google (09/01/25)
  • Senior Solutions Architect, Networking

    NVIDIA (Santa Clara, CA)
    NVIDIA is looking for an experienced network and systems infrastructure Solutions Architect. Do you want to be part of a team that brings new Artificial Intelligence ... account and program managers, working closely with the team to secure design wins + Identifying new business/project opportunities for NVIDIA products and technology… more
    NVIDIA (08/27/25)
  • Senior Software Architect Networking

    NVIDIA (Santa Clara, CA)
    …for many years. The next unit of computing is the datacenter, and the network makes it all possible! We are growing our networking architecture team with people ... prevalent in AI applications, such as using NCCL. + Design and implement new techniques and protocols to accelerate...Excellent C/C++ programming and debugging skills. + Experience with network simulations. + Deep understanding of RDMA. + Proven… more
    NVIDIA (06/17/25)
  • Senior Software Engineer, Infrastructure,…

    Google (Sunnyvale, CA)
    …+ Experience with Kubernetes networking principles, managing and troubleshooting network configurations in production Kubernetes environments for large enterprise. + ... all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural… more
    Google (08/31/25)
  • Senior Staff Software Engineer, Media…

    Google (Mountain View, CA)
    …data analysis, and generative models. + Excellent investigative thinking, technical design, audio or video processing, and machine learning skills. Google's software ... all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural… more
    Google (08/31/25)
  • Senior Staff Software Engineer, TPU…

    Google (Sunnyvale, CA)
    …all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural ... language processing, UI design and mobile; the list goes on and is...from developing our latest TPUs to running a global network, while driving towards shaping the future of hyperscale… more
    Google (08/30/25)
  • Senior Software Engineer

    Datavant (Sacramento, CA)
    …the right format. Our platform is powered by the largest, most diverse health data network in the US, enabling data to be secure, accessible and usable to inform ... Staff Engineer you will work on the architecture and design level solutioning of our product while also driving...code, performing code reviews, and working on full stack design and architecture of applications + Exceptional ability to… more
    Datavant (08/29/25)
  • Senior Performance and Resilience Engineer…

    Red Hat (Sacramento, CA)
    …for vLLM and llm-d (distributed LLM inference on Kubernetes/OpenShift). You will design and automate failure and resiliency experiments across vLLM, llm-d, and ... fault scenarios, and establish go/no-go gates for releases and CI/CD + Design GPU/accelerator-aware fault experiments that target vLLM and the stack beneath it… more
    Red Hat (08/28/25)