  • Senior Software Engineer, AI Resiliency

    NVIDIA (Santa Clara, CA)
    …straggler/hang detection. + Hands-On Coding & Optimization: Contribute to large-scale distributed systems with high-quality, production-level C++ and Python ... We are now looking for a Senior Software Engineer for AI Resiliency. At NVIDIA, we are ... 6+ years of relevant experience + Strong understanding of distributed systems concepts, parallel programming, and…
    NVIDIA (10/15/25)
  • Senior Software Engineer

    Amazon (San Francisco, CA)
    …to ingest and watch live video. If you're excited by the idea of building backend distributed systems that operate at a worldwide scale, and have an interest in ... Senior Software Engineer @ Twitch San Francisco, CA **About Us** ... and other engineers to design, develop, launch, and operate distributed systems at scale. + Mentor and…
    Amazon (09/27/25)
  • Senior AI Infrastructure Engineer - DGX…

    NVIDIA (Santa Clara, CA)
    …well on projects initiated by others. + Experience with infrastructure automation and distributed systems design developing tools for running large scale private ... from the crowd: + Interest in crafting, analyzing and fixing large-scale distributed systems. + Systematic problem-solving approach, coupled with strong…
    NVIDIA (11/06/25)
  • Software Engineer, Machine Learning

    Meta (Menlo Park, CA)
    systems, computer vision, natural language processing, data mining, or distributed systems 13. Translating insights into business recommendations 14. ... and matching patterns from different areas of computer science in production systems and 25. Distributed systems **Public Compensation:** $178,360/year…
    Meta (10/28/25)
  • Software Engineer, Machine Learning

    Meta (Menlo Park, CA)
    systems, computer vision, natural language processing, data mining, or distributed systems 14. Translating insights into business recommendations 15. Hadoop, ... and matching patterns from different areas of computer science in production systems 26. Distributed systems **Public Compensation:** $227,920/year to…
    Meta (10/24/25)
  • Software Engineer, Machine Learning

    Meta (Menlo Park, CA)
    …5 years of experience in the following: 10. Filesystems, server architectures, and distributed systems 11. Machine learning, recommendation systems, pattern ... simple commands 18. Building highly-scalable performant solutions 19. Design scalable distributed systems with established partition tolerance, consistency, and…
    Meta (10/24/25)
  • Senior Engineer - AI and HPC Observability

    NVIDIA (Santa Clara, CA)
    …large-scale telemetry data pipelines leveraging OpenTelemetry, Kafka, Prometheus, and other distributed systems to ingest, process, and analyze massive data ... ML or statistical techniques. + Excellent problem-solving, debugging, and performance-tuning skills in distributed systems. Ways To Stand Out from The Crowd: +…
    NVIDIA (10/22/25)
  • Software Engineer, Machine Learning

    Meta (Menlo Park, CA)
    systems, computer vision, natural language processing, data mining, or distributed systems 12. 3. Translating insights into business recommendations 13. ... recognizing and matching patterns from different areas of computer science in production systems and 24. 15. Distributed systems **Public Compensation:**…
    Meta (10/20/25)
  • Software Engineer, Machine Learning

    Meta (Menlo Park, CA)
    systems, computer vision, natural language processing, data mining, or distributed systems 14. 3. Translating insights into business recommendations 15. ... recognizing and matching patterns from different areas of computer science in production systems and, 26. 15. Distributed systems **Public Compensation:**…
    Meta (10/01/25)
  • Software Engineer (Machine Learning)

    Meta (Menlo Park, CA)
    systems, computer vision, natural language processing, data mining, or distributed systems 13. 3. Translating insights into business recommendations 14. ... recognizing and matching patterns from different areas of computer science in production systems and 24. 14. Distributed systems **Public Compensation:**…
    Meta (09/30/25)