"Alerted.org

  • Senior AI Software Engineer, LLM Inference…

    NVIDIA (Santa Clara, CA)



    We're now seeking a Senior AI Software Engineer for our LLM Inference Performance Analysis and Optimization team!

     

    NVIDIA leads the generative AI revolution. We're now seeking an experienced AI Software Engineer to optimize LLM inference performance. Our team collaborates with compiler, kernel, hardware, and framework teams to assess bottlenecks, create optimization methods, and validate improvements. If you’re passionate about system-level performance, compiler IR, and GPU kernel optimization for deep learning inference, we’d love to consider you for our team.

    What you'll be doing:

    + Analyze the performance of LLMs on NVIDIA GPUs by employing advanced profiling and projection tools.

    + Find opportunities for performance improvements in the IR-based compiler middle end optimizer and/or in precompiled kernel optimizations driven by Graph IR transformations.

    + Build and develop new compiler passes and optimization techniques to deliver outstanding, robust, and maintainable compiler infrastructure and tools.

    + Collaborate closely with architecture teams to influence and co-design future hardware features that improve compiler and runtime efficiency.

    + Work with geographically distributed teams across compiler, hardware, kernel, and framework domains to drive performance improvements and resolve complex issues.

    + Contribute to a core team at the forefront of deep learning and LLM inference technology, spanning hardware architecture development, kernel optimization, and integration with higher-level deep learning frameworks.

    What we need to see:

    + Master’s or PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience.

    + 5+ years relevant experience.

    + Strong hands-on programming expertise in C++ and Python, with solid software engineering fundamentals.

    + Experience with innovative LLM architectures, including inference optimization, profiling, and compiler-level performance tuning.

    + Significant background in optimizing kernels through IR-based techniques and code generation, including graph transformations, fusion, scheduling, and developing custom kernel generation frameworks like OpenAI Triton or other compiler-based code generation pipelines.

    + Hands-on experience with deep learning frameworks like TensorRT-LLM, vLLM, SGLang, JAX/XLA, or related compiler/runtime environments.

    + Proven ability to analyze and optimize LLM performance bottlenecks across model development, kernel execution, and runtime systems.

    + Excellent communication and collaboration skills, with the ability to work independently and effectively across distributed teams in a fast-paced environment.

    + Strong determination to continuously improve software and hardware performance through profiling, analysis, and optimization.

    + Proficiency in CUDA programming and familiarity with GPU-accelerated deep learning frameworks and performance tuning techniques.

    Ways to stand out from the crowd:

    + Showcase innovative applications of agentic AI tools that enhance productivity and workflow automation.

    + Proven background in LLVM, MLIR, and/or Clang compiler development.

    + Active engagement with the open-source LLVM or MLIR community to ensure tighter integration and alignment with upstream efforts.

     

    NVIDIA is recognized as one of the world’s most desirable engineering environments, built by teams who value technical depth, innovation, and impact. We work alongside some of the best minds in GPU computing, systems software, and AI. If you’re driven by performance, enjoy solving complex problems, and thrive in an environment that rewards initiative and technical excellence, we’d love to hear from you!

     

    #LI-Hybrid

     

    Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 148,000 USD - 235,750 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.

     

    You will also be eligible for equity and benefits (https://www.nvidia.com/en-us/benefits/).

     

    Applications for this job will be accepted at least until November 4, 2025.

     

    NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

     






© 2025 Alerted.org