  • Senior GPU and HPC Infrastructure Engineer…

    NVIDIA (Santa Clara, CA)
    …+ Implement monitoring and health management capabilities that enable industry-leading reliability, availability, and scalability of GPU assets. You will be ... systems (Kubernetes, SLURM) + Understanding of performance, security and reliability in complex distributed systems. Familiarity with system level architecture,…
    NVIDIA (07/10/25)
  • Senior Silicon Product Definition Engineer

    NVIDIA (Santa Clara, CA)
    …and board designers, software/firmware engineers, HW/SW applications engineering, process/reliability specialists, ATE engineers, product managers, sales, and ... path analysis, power analysis, process technologies, transistor/device physics, silicon reliability, and aging mechanisms. + Familiarity with Perl, C/C++, tool…
    NVIDIA (07/09/25)
  • Senior Android Engineer, Health Mobile…

    Amazon (San Francisco, CA)
    …are responsible for the app architecture, developer onboarding, mobile app releases, reliability, and ensuring production issues get routed to the right teams and ... language experience - 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience - Experience as a…
    Amazon (07/04/25)
  • Senior Software Engineer, Bare Metal…

    NVIDIA (Santa Clara, CA)
    …Implementing monitoring and health management capabilities that enable industry-leading reliability, availability, and scalability of GPU assets. You will be ... Working with teams across NVIDIA to ensure production AI clusters run reliably and consistently with maximum performance. Evaluating system failures and improving…
    NVIDIA (06/30/25)
  • Senior Silicon Circuits System Design…

    NVIDIA (Santa Clara, CA)
    …and board designers, software/firmware engineers, HW/SW applications engineering, process/reliability specialists, ATE engineers, product managers, sales, and ... path analysis, power analysis, process technologies, transistor/device physics, silicon reliability, and aging mechanisms. + Familiarity with Perl, C/C++, tool…
    NVIDIA (06/13/25)
  • Senior Systems Integration Engineer,…

    Ford Motor Company (Palo Alto, CA)
    …specifications, and interfaces considering factors such as bandwidth, latency, reliability, and scalability. Characterize the Vehicle Network by laying ... performance analysis, simulation, and testing to validate the functionality, reliability, and robustness of network communication systems under various operating…
    Ford Motor Company (06/10/25)
  • Senior Staff Software Engineer, TPU…

    Google (Sunnyvale, CA)
    …who use Google services around the world. We prioritize security, efficiency, and reliability across everything we do - from developing our latest TPUs to running ... a global network, while driving towards shaping the future of hyperscale computing. Our global impact spans software and hardware, including Google Cloud's Vertex AI, the leading AI platform for bringing Gemini models to enterprise customers. The US base…
    Google (08/30/25)
  • Senior Technical Account Manager, Google…

    Google (Sunnyvale, CA)
    …for customer events and launches, partnering with Support, Engineers, and Site Reliability Engineers to ensure customer success, and work with customers and support ... to guide issues/escalations to resolution. + Develop best practices and assets based on learnings from customer engagements to support initiatives to scale through partners and accelerate Google Cloud adoption. Google is proud to be an equal opportunity…
    Google (08/30/25)
  • Senior Staff Software Engineer,…

    Google (Sunnyvale, CA)
    …who use Google services around the world. We prioritize security, efficiency, and reliability across everything we do - from developing our latest TPUs to running ... a global network, while driving towards shaping the future of hyperscale computing. Our global impact spans software and hardware, including Google Cloud's Vertex AI, the leading AI platform for bringing Gemini models to enterprise customers. The US base…
    Google (08/30/25)
  • Senior Staff Software Engineer, GenAI…

    Google (Mountain View, CA)
    …who use Google services around the world. We prioritize security, efficiency, and reliability across everything we do - from developing our latest TPUs to running ... a global network, while driving towards shaping the future of hyperscale computing. Our global impact spans software and hardware, including Google Cloud's Vertex AI, the leading AI platform for bringing Gemini models to enterprise customers. The US base…
    Google (08/29/25)