- TP-Link North America, Inc. (Irvine, CA)
- …Azure, OCI) and cloud-based databases (e.g., MongoDB, SQL databases). + Proficiency in distributed systems, including middleware such as message queues. + Advanced ... the United States, TP-Link Systems Inc. is a global provider of reliable networking devices and smart home products, consistently ranked as the world's top…
- Walmart (Bentonville, AR)
- …technologies such as Kafka for building scalable, event-driven architectures and ensuring reliable data streaming between distributed systems. + Experience ... related field, with 5+ years of experience in large-scale distributed systems. + Strong communication skills and...and analytics. Nice to have: + Familiarity with AI/GenAI, LLMs, chatbots, and using AI tools…
- Amazon (Lockbourne, OH)
- …of the fundamentals of Computer Science, and practical experience building large-scale distributed systems. This person has thrived and succeeded in delivering ... to lead the development of industry-leading models with multimodal systems. As a Senior SDE with the AGI team,...Large Language Models (LLMs) and Generative Artificial Intelligence (Gen AI). You will have significant influence on our overall…
- MongoDB (Pittsburgh, PA)
- …As a senior SRE, you will be expected to be able to design & build complex systems, operate with autonomy and act as owner for everything you do. The SRE Atlas team ... the various Atlas software engineering teams to provide expertise about running systems at scale, build new tooling and automation and perform essential maintenance…
- Cisco (Chicago, IL)
- …Lead a team of super smart engineers who are passionate about large scale distributed systems for Splunk Cloud Observability in FedRAMP environments + Manage ... etc. + Excellent problem-solving, triaging, and debugging skills in large-scale distributed systems **Preferred Qualifications** + Familiarity working with…
- Microsoft Corporation (Redmond, WA)
- …As a Software Engineer, the candidate will design, develop, and maintain robust, reliable, and highly distributed software systems using modern technologies. ... thrive at work and beyond. **Responsibilities** + Designs, develops, and maintains distributed software systems using modern technologies. + Collaborates with…
- NVIDIA (Santa Clara, CA)
- …HPC including InfiniBand, RDMA and RoCE. + Understanding of fast, distributed storage systems such as Lustre and GPFS for AI/HPC workload. + Familiarity with ... ). + Experience analyzing and tuning performance for a variety of AI/HPC workloads. Excellent problem-solving to analyze complex systems, identify bottlenecks,…
- Meta (Boston, MA)
- …you will work closely with other engineers and researchers to ensure that our AI training infrastructure is reliable, efficient, and scalable. You will also have ... **Summary:** The AI Production Engineering team at Meta is responsible...the advancement of the field. Production Engineering is a hybrid software/systems group that ensures Meta's services and products run…
- NVIDIA (Santa Clara, CA)
- …in automation farm or in cloud. You will continuously innovate and develop scalable, reliable, high performance systems and tools to enable the next generation ... develop test content using C/C++? Do you excel using AI tools to aid in solving complex issues? We'd...large scale, running hundreds of tests per day in distributed heterogeneous servers with NVIDIA's GPUs connect to verify…
- NVIDIA (WA)
- …proficiency in Go and experience building scalable Go services that manage complex distributed systems + Hands-on experience with Helm, Kustomize, and managing ... to seamlessly install, upgrade, and manage cluster runtime packages powering NVIDIA's AI Accelerators. You'll work on innovative controller systems that manage…