- Northrop Grumman (Redondo Beach, CA)
- …code deployment, maintenance, and optimization efforts. The lessons learned from existing HPC systems will inform the architecture, deployment, and utilization ... but are not limited to: + Develop and deploy architectures for future HPC systems based on engineering computing requirements, collaborating with engineering to…
- NVIDIA (Santa Clara, CA)
- …doing: + Provide leadership and strategic mentorship on the management of large-scale HPC systems including the deployment of compute, networking, and storage. + ... ). + Experience analyzing and tuning performance for a variety of AI/HPC workloads. Excellent problem-solving to analyze complex systems, identify bottlenecks,…
- NVIDIA (Santa Clara, CA)
- …like NCCL, NVSHMEM, and UCX that are crucial for scaling Deep Learning and HPC. We're seeking a Senior Software Architect to help co-design next-gen data ... to grow with the increasing scale of next generation systems. This is an outstanding opportunity to advance the...topology, algorithms, and communication scaling relevant to AI and HPC workloads. + Strong experience with Linux.…
- NVIDIA (Santa Clara, CA)
- …long term maintenance strategy. What you'll be doing: + Design highly available and scalable systems to meet the demands of our HPC clusters + Evaluate new and ... to join us today. We are looking for a Senior Software Engineer to join our mission to continue improving our HPC infrastructure. Our team builds and operates sophisticated infrastructure…
- NVIDIA (Santa Clara, CA)
- …team today! We are looking for an outstanding hands-on architect/engineer for a Senior HPC architect role to support deployment and bringup of large-scale ... develop new, leading differentiated solutions. You will interact with HPC, OS, GPU compute, and systems specialist...and visualization pipelines + Exposure to container technology and Linux performance tools. Widely considered to be one of…
- NVIDIA (Santa Clara, CA)
- …As a Site Reliability Engineer, you are responsible for the big picture of how our systems relate to each other, we use a breadth of tools and approaches to tackle a ... + Manage and support workload and resource schedulers in a large-scale HPC environment. + Automate Everything: Develop automation scripts to automate deployment,…
- NVIDIA (Santa Clara, CA)
- …familiarity with software testing and deployment, familiarity with distributed systems, and excellent communication and planning abilities. Experience working with ... High Performance Computing (HPC), GPUs, and high-performance networking (RDMA, InfiniBand, RoCE) are strongly preferred. We also welcome out-of-the-box thinkers who…
- NVIDIA (Santa Clara, CA)
- …a fit for you, we'd love to hear from you! NVIDIA is seeking a Senior High Performance Computing (HPC) and AI Networking Performance Research and Analysis ... and platforms, such as HCAs, Switches, CPUs, GPUs, and Systems. You will develop performance analysis tools and methodologies...Languages: Python, Bash and C languages + Experience with Linux OS distros. + Great teammate with good communication…
- Cisco (San Jose, CA)
- …performance bottlenecks to drive system and workflow efficiency. + Administer Linux systems, ranging from powerful GPU-enabled servers to general-purpose ... AI Infrastructure Engineer - HPC Apply (https://jobs.cisco.com/jobs/Login?projectId=1443781) + Location: San Jose, California, US...Showcase the power of Cisco: our people, products, processes, systems, and data. Please join us and make this…
- NVIDIA (Santa Clara, CA)
- We are now looking for a senior HPC software engineer. As a member of our High Performance Computing Software development team, you will be responsible for ... technical leaders solving some of the biggest challenges in HPC, machine learning, cloud computing, and system co-design. What...of Programming in C/C++ + 3 years' experience in Linux environment and tools + Knowledge of Networking Protocols…