  • Principal Software Engineer - Copilot Security

    Microsoft Corporation (Redmond, WA)
    …+ Hands-on experience with distributed training frameworks (Ray, Slurm, HPC), containerization and orchestration technologies (Docker, Kubernetes) for ML model ... deployment, and ML lifecycle management in production environments. + Experience designing evaluation frameworks for LLM-based applications and implementing observability for agent systems using tools such as Phoenix, MLFlow, LangFuse, or custom eval…
    Microsoft Corporation (12/12/25)
  • Cloud Hardware Dev Engineer (AWS Generative AI…

    Amazon (Seattle, WA)
    …cloud offerings that enable high performance and scalability in AI/ML and HPC workloads. Utility Computing (UC) AWS Utility Computing (UC) provides product ... innovations - from foundational services such as Amazon's Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS's services and features apart in the industry. As a member…
    Amazon (12/10/25)
  • Systems Development Eng (AWS Generative AI & ML…

    Amazon (Seattle, WA)
    …AWS cloud offerings that enable high performance and scalability in AI/ML and HPC workloads. You are intrigued by the continuous release of newer AWS services ... and instance types that solve newer, bigger and more interesting business problems every day? Does that make you wish your talents were applied to those at cloud scale? If yes, then come join us - we are looking for builders like you. The AWS Hardware…
    Amazon (12/10/25)
  • Group Leader - Future Computing Technologies

    Pacific Northwest National Laboratory (Richland, WA)
    …+ Strong publication record in advanced architectures, design automation, co-design, HPC, quantum computing, AI, microelectronics, or other related area + ... Demonstrated success securing competitive research funding (DOE, DoD, NSF, or industry) + Experience leading large, interdisciplinary R&D efforts and managing multi-million-dollar portfolios + Proven ability to mentor and develop staff, including supporting…
    Pacific Northwest National Laboratory (12/09/25)
  • Senior Software Engineer

    Microsoft Corporation (Redmond, WA)
    …performance analysis and optimization of state of the art LLMs, HPC applications including proficiency using GPU profiling tools Cross-team collaboration skills ... and the desire to collaborate in a team of researchers and developers + Ability to independently lead projects Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:…
    Microsoft Corporation (12/06/25)
  • Senior Principal System Solution Architect

    Microsoft Corporation (Redmond, WA)
    …hardware development lifecycle. + Proficient understanding of state of the art of AI/HPC physical infrastructure. + Ability to analyze solutions from a full TCO ... perspective, including but not limited to: CAPEX, OPEX, the system constraints that drive design tradeoffs, technology, quality and serviceability. + Deep and broad knowledge of current and emerging copper and optical based interconnect technologies. +…
    Microsoft Corporation (12/06/25)
  • Senior Capacity Delivery Reliability Planner, AWS…

    Amazon (Seattle, WA)
    …cross-functional strategic capacity modeling to support emerging workloads (such as AI/ML and HPC) and future region expansions will be a crucial part of your role. ... Additionally, you'll drive long-term innovation in planning frameworks and digital planning tools to support dynamic business shifts. Enterprise Level: Your responsibilities include delivering enterprise-wide SIOP transformations that break down silos between…
    Amazon (12/06/25)
  • Senior Software Engineer - Copilot Security

    Microsoft Corporation (Redmond, WA)
    …+ Hands-on experience with distributed training frameworks (Ray, Slurm, HPC), containerization and orchestration technologies (Docker, Kubernetes) for ML model ... deployment, and ML lifecycle management in production environments + Experience designing evaluation frameworks for LLM-based applications and implementing observability for agent systems using tools such as Phoenix, MLFlow, LangFuse, or custom eval harnesses;…
    Microsoft Corporation (12/06/25)
  • Software Engineering Manager, MTIA

    Meta (Bellevue, WA)
    …10. Experience in leading teams working on high performance computing (HPC) and AI/ML systems, including: GPU/ASIC-based kernel development and optimization (e.g. ... CUDA, ROCm), distributed systems for large scale training and serving, and systems architecture and performance 11. Accelerator (GPU/ASIC) kernel development and optimization 12. Experience in accelerating libraries on AI hardware, similar to cuBLAS, cuDNN,…
    Meta (12/06/25)
  • Software Developer 4

    Oracle (Olympia, WA)
    …(OCI) Cluster Networking team is building an ultra-high-performance network to support AI/ML/HPC workloads. Join us to design systems that scale from tens to ... hundreds of thousands of GPUs without sacrificing performance. Our team develops and tunes the software and hardware stack for distributed workloads using libraries such as NCCL on high-speed networks. Strong knowledge and practical experience with NCCL is…
    Oracle (12/02/25)