- Rubrik (Palo Alto, CA)
- …services for system monitoring, detecting faults, and automatically self-healing the distributed systems + Design, develop, and operationalize high-performance, ... Computer Science or related field + 2+ years of software development experience on Linux, preferably in Platform/Systems...domain + Strong fundamentals in data structures, algorithms, and distributed systems design + Strong background in…
- NVIDIA (Santa Clara, CA)
- …from the crowd: + Technical competency in managing and automating large-scale distributed systems independent of cloud providers. Advanced hands-on experience ... + 5+ years in similar role and experience on large-scale production systems. Experience with common software engineering principles, tools and techniques.…
- NVIDIA (Santa Clara, CA)
- …building the next generation of scalable AI systems. As a Senior Applied AI Software Engineer on the Dynamo project, you will address some of the most ... Go for Kubernetes controllers and operators development. + Deep understanding of distributed systems, parallel computing, and GPU architectures. + Experience…
- Amazon (East Palo Alto, CA)
- …base. You'll bring a passion for innovation, data, search, analytics, and distributed systems. You'll also: Solve challenging technical problems, often ones ... about transforming business challenges into technological breakthroughs? Join Amazon as a Software Development Engineer (SDE) and help shape the future of…
- Amazon (Cupertino, CA)
- …- Bachelor's degree in computer science or equivalent - Preferred previous software engineer expertise with PyTorch/JAX/TensorFlow, Distributed libraries and ... customers and raise our performance bar. You'll design fault-tolerant systems that run at massive scale as we continue...that use them. This role is for a senior software engineer in the Machine Learning Applications…
- NVIDIA (Santa Clara, CA)
- …and fleet management engineering. + Experience with infrastructure automation and distributed systems design developing tools for running large scale ... We are seeking Software Engineers with previous experience building and running...more of the following: Linux, Slurm, Kubernetes, Local and Distributed Storage, and Systems Networking. Ways to…
- Amazon (Cupertino, CA)
- …design or architecture (design patterns, reliability and scaling) of new and existing systems experience - 5+ years of full software development life cycle, ... science or equivalent - Experience in computer architecture - Previous software engineering expertise with PyTorch/JAX/TensorFlow, Distributed libraries and…
- Amazon (Cupertino, CA)
- …and the Trn1 and Inf1 servers that use them. This role is for a senior software engineer in the Machine Learning Applications (ML Apps) team for AWS Neuron. This ... and runtime engineers to create, build and tune distributed training solutions with Trn1. Experience training these large...systems experience - 5+ years of full software development life cycle, including coding standards, code reviews,…
- Google (Sunnyvale, CA)
- …academic or industry setting. + Experience building and supporting large scale distributed systems and infrastructure. + Familiarity with Kubernetes development, ... of experience with an advanced degree. + Experience in distributed computing or machine learning infrastructure. Preferred qualifications: +...goes on and is growing every day. As a software engineer, you will work on a…