  • Sight Machine, Inc. (San Francisco, CA)
    …and the down to earth style of Detroit manufacturing. Together, we bring deep industry expertise and a shared commitment to advancing manufacturing towards a more ... Load (ETL), information retrieval, data aggregation and analytics, factory automation, distributed computing, security, and is leading the expansion into machine … more
    job goal (01/12/26)
  • PriorLabs GmbH (San Francisco, CA)
    …building complex, scalable software, preferably for data processing, automated testing, or distributed systems. Deep, practical knowledge of the modern ML ... Hard Problems: Work closely with our ML researchers to translate deep technical challenges into well‑designed, scalable software systems. Qualifications Exceptional… more
    job goal (01/13/26)
  • Pantera Capital (Palo Alto, CA)
    …or open to relocation. Focus Design, build, and implement large‑scale distributed training systems. Profiling, debugging, and optimizing multi‑host GPU utilization. ... not limited to Scalable orchestration framework and tools Machine learning compilers and runtime such as XLA, MLIR, and Triton Distributed training strategies such as FSDP, Megatron, and pipeline… more
    job goal (01/13/26)
  • Comfy (San Francisco, CA)
    …that run millions of AI workflows. This means building rock solid distributed systems, designing data models that support explosive growth, and solving difficult ... engineering behind it You have a strong understanding of distributed systems and scaled large systems before What you'll...one week Nice to have Are not afraid to deep dive into infrastructure (Kubernetes / Helm) to get… more
    job goal (01/13/26)
  • Sweya Information Technologies LLP (San Francisco, CA)
    …5+ years of experience in AI/ML engineering, Strong background in Python and deep learning frameworks, Experience with distributed systems and cloud ... form feedback-driven, self-improving systems for enterprise operations. Python TensorFlow/PyTorch Distributed Systems LLMs Apply for this Position We're excited to… more
    job goal (01/13/26)
  • LucidLink Corp. (San Francisco, CA)
    …the Media & Entertainment industry and expanding into data-intensive sectors, you'll gain deep insight into cutting-edge technologies and play a role in shaping the ... a company with triple-digit growth rates means unparalleled opportunities for advancement, learning, and being part of an exciting journey toward unicorn status.… more
    job goal (01/13/26)
  • Liquid AI (San Francisco, CA)
    …other remote locations. This Role Is For You If: You have experience with machine learning at scale You have worked with audio models and understand the effects of ... runtime, latency, and quality You're proficient in PyTorch, and familiar with distributed training frameworks like DeepSpeed, FSDP, or Megatron-LM You've worked with… more
    job goal (01/13/26)
  • Scale AI, Inc. (San Francisco, CA)
    …specialities in back‑end systems. Extensive experience in software development and a deep understanding of distributed systems and public cloud platforms (AWS ... and actually useful, these models need human eval and reinforcement learning through human feedback (RLHF) during pre‑training, fine‑tuning, and production… more
    job goal (01/13/26)
  • OpenAI (San Francisco, CA)
    …and cutting-edge research environments. Have experience working on large-scale Machine Learning infrastructure and distributed systems. Know how to reason ... internet, and take actions in secure environments. We're looking for people with deep experience building AI infrastructure and who are used to working closely with… more
    job goal (01/13/26)
  • Lambda Inc. (San Francisco, CA)
    …artificial intelligence. One person, one GPU. If you'd like to build the world's best deep learning cloud, join us. *Note: This position requires presence in our ... and create new designs, architectures, standards, and methods for large-scale distributed systems. Engage in service capacity planning and demand forecasting,… more
    job goal (01/13/26)