- NVIDIA (Santa Clara, CA)
- …production systems with high efficiency and availability using the combination of software and systems engineering practices. This is a highly specialized discipline ... Kubernetes and OpenStack. SRE at NVIDIA ensures that our internal and external-facing GPU cloud services run with maximum reliability and uptime as promised to the users…
- Amazon (Seattle, WA)
- …training infrastructure that scales to a large model size and hardware such as GPU. - Collaborate with other talented applied scientists and engineers to develop an ... AI. Basic Qualifications - 5+ years of non-internship professional software development experience - 5+ years of programming with at least one software programming language experience - 5+ years of leading…
- NVIDIA (Santa Clara, CA)
- …with high efficiency and availability. It encompasses various areas, including software and systems engineering practices, storage, data management, and services. ... Storage Production Engineers at NVIDIA ensure that our internal and external-facing GPU cloud services meet reliability and uptime goals as promised to the…
- NVIDIA (Santa Clara, CA)
- …Work alongside system architects, chip and board designers, software/firmware engineers, HW/SW applications engineering, process/reliability specialists, ATE ... to digital design, circuit analysis, computer architecture, BIOS, drivers, and software applications. NVIDIA is leading the way in groundbreaking developments in…
- General Motors (Frankfort, KY)
- …skills in modern C++ or Python + Experience with profiling CPU and/or GPU software, process scheduling, and prioritization + Passionate about self-driving car ... infrastructure, compute platform, labeling, as well as simulation + Strong software engineering (SWE) skills, focusing on distributed backend development, batch data…
- NVIDIA (Santa Clara, CA)
- …production systems with high efficiency and availability using the combination of software and systems engineering practices. This is a highly specialized discipline ... approaches to running better production systems and optimizations. Much of our software development focuses on building components to eliminate manual work through…
- NVIDIA (Santa Clara, CA)
- …overcome system/infrastructure failures, ensuring fault tolerance. + Collaborate with software teams to pinpoint performance bottlenecks. Design, prototype, and ... of training applications using PyTorch or similar framework + Building distributed software applications using collective communication libraries such as MPI or NCCL…
- JPMorgan Chase (Jersey City, NJ)
- …In this role, you will leverage your deep knowledge of machine learning, software engineering, and product management to spearhead multiple complex ML projects and ... leading, and mentoring a team of Machine Learning and Software Engineers, focusing on best practices in ML engineering, ... Background in High Performance Computing, ML Hardware Acceleration (e.g., GPU, TPU, RDMA), or ML for Systems. + Strategic…
- University of Maine System (Orono, ME)
- …processing and storage. + Experience with diagnosing system and application software problems. + Knowledge of infrastructure for migrating jobs between on-premise ... with version control using Git. + Demonstrated skills automating or optimizing software, code, and/or processes. + Experience maintaining the stability and security…
- Amazon (Cupertino, CA)
- …people who want to help. You'll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, ... speed bus design and signal integrity, failure analysis, server components (e.g., CPU, GPU, SSDs, drives), BIOS, BMC, and networking - Excellent written and oral…