- Charles Schwab (San Francisco, CA)
- …impact. + Champion reliability, monitoring, observability, and operational best practices for AI systems and data pipelines. + Collaborate with cross-functional ... + Ability to solve complex problems with ambiguous or incomplete data in distributed systems. + Demonstrated business domain knowledge relevant to previous…
- New York Times (New York, NY)
- …Go. + 7+ years of experience designing, building, and operating complex, large-scale distributed systems and APIs. + 5+ years of experience database ... it's worth paying for. About the Role The New AI Products and Platforms mission is a central hub ... + Lead the design and implementation of highly scalable, reliable, and performant backend services and APIs. + Make…
- Lilly (Indianapolis, IN)
- …will be driving the engineering and operations of advanced Linux platforms supporting AI and HPC workloads, managing Nvidia DGX systems using Mission Control, ... Come help us unlock the power of HPC and AI based POGPU and Accelerated Compute infrastructure! The Cloud ... will have groundbreaking chances to build secure, resilient, and reliable hybrid cloud services using proactive, predictive, and automated…
- The Hertz Corporation (Atlanta, GA)
- …team. + **Architectural Guidance:** Oversee the design, development, and scaling of robust, reliable, and secure AI systems and infrastructure, ensuring ... + **System Architecture:** Demonstrated ability to design scalable, fault-tolerant, and high-performance distributed systems for AI workloads. + **Generative…
- NVIDIA (Santa Clara, CA)
- …subject area (or equivalent experience). + 10+ years of experience managing large-scale distributed systems or enterprise AI infrastructure. + Expert-level ... secure, reliable production deployments. + Industry-leading expertise in AI/LLM infrastructure and agentic systems, including end-to-end design and…
- General Motors (Sunnyvale, CA)
- …must have:** + 10+ years of experience, with a strong background in large-scale distributed systems preferred. + 5+ years of experience leading and driving ... **Job Description** Principal AI/ML Engineer, AV ML Infra We're General Motors ... model performance by running large-scale simulation workloads and managing reliable ML inference pipelines. + **ML Compute:** Streamlines and optimizes large-scale…
- Robert Half Technology (Miami, FL)
- …AI/ML Engineer will design, build, and deploy production-grade machine learning and AI systems that power core products and features. This role bridges ... cutting-edge research with reliable, scalable engineering, turning prototypes into high-performance services that ... equivalent experience + 3-8+ years of hands-on experience shipping ML/AI systems to production + Expert-level Python…
- General Motors (Mountain View, CA)
- …research efforts, and guide teams in transitioning scientific advances into reliable robotic capabilities deployed in real systems. **What You'll** ... **Job Description** Our AI Research team is building end-to-end robot policies ... integrating multimodal perception, robot learning architectures, and physical execution systems to solve manipulation, autonomy, and simulation challenges at…
- LinkedIn (Mountain View, CA)
- …and some or all of big data technologies like Hadoop, Spark, distributed key-value stores, streaming processes, recommender systems, statistical methods, and ... for leading a team of world class applied researchers, ML/AI, and full stack engineers who are solving cutting-edge ... define the bar for quality and efficiency of software systems by developing best practices and strategies that balance…
- Microsoft Corporation (Redmond, WA)
- …in one or more programming languages (e.g., C#, Java, Python). + Experience with distributed systems, cloud services (Azure, AWS, or GCP), or large-scale data ... **Overview** Join Microsoft's CoreAI - AI Platform team in Bay Area/Redmond to build ... agent-driven automation for key stages. + **Develop secure and reliable infrastructure** for data access, entitlement management, and operational…