- Meta (Menlo Park, CA)
- …4. GPU, CPU, or AI hardware accelerator architectures, 5. Compiler optimizations such as loop optimizations, vectorization, parallelization, AND 6. System ... performance optimizations such as runtime analysis of latency, memory bandwidth, I/O access, compute utilization analysis and associated tooling development **Public Compensation:** $178,360/year to $200,200/year + bonus + equity + benefits **Industry:**…
- Amazon (Cupertino, CA)
- …many more. The Inference Model Enablement team works side by side with compiler engineers and runtime engineers to create, build and tune distributed inference ... solutions with Trainium and Inferentia. Experience optimizing inference performance for both latency and throughput on these large models using Python, PyTorch or JAX is a must. Experience with DeepSpeed and other distributed inference libraries is a bonus, as…
- NVIDIA (Santa Clara, CA)
- …TensorFlow, Pandas, NumPy, PyTorch a plus. + Exposure to tools/flows such as Design Compiler, PTPX, PowerArtist, etc. a huge plus. + Experience with lab setup ... and measurement using equipment such as scope/DAQ is helpful. Ways to stand out from the crowd: + A master's degree/internship with a focus/projects in Low Power Architecture, power modeling, and deep learning is a plus! NVIDIA is widely considered to be one…
- Arrow Electronics (San Jose, CA)
- …physical verification challenges. + Proficient with industry-standard tools: Synopsys ICC2, Fusion Compiler and scripting in Tcl, Perl, Python for automation and flow ... development. + Expertise in clock gating, logic optimization, and integration of high-speed interfaces like DDR and PCIe. + Collaborate cross-functionally with RTL, STA, DFT, verification, and packaging teams to ensure smooth integration and closure. + Manage…
- NVIDIA (Santa Clara, CA)
- …in software performance benchmarking, profiling, and optimizations. + Background in compiler development + Experience in working with TensorRT, PyTorch, ONNX ... Runtime, JAX, TRT-LLM, vLLM, SGLang, or other ML frameworks. + Experience developing custom GPU kernels in OpenAI Triton or CUDA for deep learning inference. NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some…
- pony.ai (Fremont, CA)
- …experience in building deep learning infrastructure + Experience with machine learning compiler / runtime or GPU accelerator + Passionate about autonomous driving ... Preferred Experience + Experience in building the infrastructure for large language or vision models + Experience building and architecting large-scale, production-quality backend systems, especially in applied machine learning or data pipelines + Familiar with…
- quadric.io, Inc (Burlingame, CA)
- …and DSP algorithms using Quadric's technology + Collaborate with Scheduler, Compiler and Hardware teams to build best-in-class Quadric ecosystem Requirements + ... MS or Ph.D. in Electrical or Computer Science/Engineering with a minimum of ten years of experience in the industry + Demonstrated experience in successfully leading teams to deliver production SDKs + Proficiency in C++ >= 11 + Background in NN frameworks such…
- Microsoft Corporation (Mountain View, CA)
- …experience in AI frameworks, large scale distributed computation, system programming, compiler or machine learning. + 4+ years of experience successfully ... collaborating with cross-functional teams, owning deliverables and driving results to meet business objectives. + 2+ years of experience building Android applications from scratch. **Other Requirements:** Ability to meet Microsoft, customer and/or government…
- Microsoft Corporation (Mountain View, CA)
- …experience in AI frameworks, large scale distributed computation, system programming, compiler or machine learning. + 6+ years of experience successfully ... collaborating with cross-functional teams, owning deliverables and driving results to meet business objectives. **Preferred Qualifications:** + Experience building Android applications from scratch. + Experience working on systems performance optimization. +…
- Amazon (Cupertino, CA)
- …of a vertically integrated system stack consisting of the PyTorch inference library, Neuron compiler, runtime, and collectives. A day in the life You will work with ... your senior management and technical leaders to define the model enablement and performance optimization for the latest SOTA LLMs, build and deliver them to customers. Meanwhile, lead the team to continue improving the model onboarding experience, as well as…