- Snap Inc. (Bellevue, WA)
- …role in scaling our ML Infrastructure, optimizing AI training and inference systems, and driving innovations that make Snapchat's ranking and recommendation ... systems more efficient and impactful. We're looking for a... for machine learning workloads at scale and drive reliability and efficiency improvements across Snapchat's ML Infrastructure +…
- Amazon (Redmond, WA)
- …technical field - 5+ years of professional experience in hardware or systems development roles in aerospace, automotive, or high-reliability electronics ... Hands-on experience with lab setup, test rack integration, and support of large systems - Proficiency with Ubuntu/Linux system configuration, netplan, and basic…
- Oracle (Seattle, WA)
- …the AI space building systems that operate at unprecedented speed, scale and reliability. You should be a systems specialist with exposure to low level ... customers we're building provisioning, repair, monitoring, maintenance, configuration and validation systems that enable us to deliver high quality GPU clusters to…
- Oracle (Olympia, WA)
- …generalist, able to dive deep into any part of the stack and low-level systems, as well as design broad distributed system interactions. You should value ... SQL databases, NoSQL systems, and distributed file systems such as HDFS, ensuring reliability and scalability. + Build platform capabilities that serve the…
- Oracle (Seattle, WA)
- …ambiguous problems into elegant, maintainable software. + **Instrument and monitor systems** to ensure reliability, performance, and observability. + Contribute ... data, and ideas. You'll design and deliver internal tools and systems that power decision-making, observability, and automation across OCI's strategic customer…
- Oracle (Olympia, WA)
- …and tools to operationalize Large Language Models (LLMs) and agentic AI systems. Our goal is to empower developers and enterprises to deploy intelligent ... (IC4), you will contribute to the design and implementation of scalable, distributed systems that serve LLMs and support agent-based workflows. You will work in a…
- SpaceX (Redmond, WA)
- …high-reliability electronics for satellites and spacecraft. + Drive system trades, requirements capture, component selection, analysis, schematic capture, PCB ... and spacecraft to deploy Starlink, the world's most advanced broadband internet system. Starlink is the world's largest satellite constellation and is providing…
- Broadcom (Bellevue, WA)
- …digital workspaces to improve mobile experiences, and transforming cyber security. ESXi Operating System is at the core of VCF virtualization technology. It is an ... operating system and virtualization infrastructure built from scratch for the ... scheduling-related resource management features, with emphasis on scalability, performance, reliability, and support of new hardware technologies. + Work…
- Microsoft Corporation (Redmond, WA)
- …developing and following the playbook, working on call to monitor system/product/service for degradation, downtime, or interruptions. + Communicating status updates ... clearly and initiates actions to restore system/product/service for simple and complex problems when appropriate. + Proactively seeks new knowledge and adapts to new…
- Microsoft Corporation (Redmond, WA)
- …and training pipelines, lead model quantization and performance optimization, and design systems that measure real user engagement at global scale. This opportunity ... Perform ongoing quality tuning and AI feature optimization to improve reliability and user outcomes. **On-Device Performance Optimization** + Lead cold-start…