Summer Intern - Large Language Models (Prescient…
Genentech (New York, NY)
The Position
2026 Summer Intern - Large Language Models (Prescient Design / AI for Drug Discovery)
Department Summary
At Roche's AI for Drug Discovery (AIDD) group (formerly Prescient Design), we are revolutionizing drug discovery with cutting-edge machine learning techniques. We are seeking talented researchers and engineers with a passion for building machine learning systems that transform how scientific data is represented, modeled, and evaluated.
AIDD’s Foundation Model team is seeking a Machine Learning Research Intern to work on data interfaces between structured biochemical measurements and large language models, supporting next-generation foundation models for drug discovery as part of our broader Lab-in-the-Loop approach.
The intern will collaborate closely with researchers and engineers to design, implement, and evaluate data transformation and modeling pipelines, gaining hands-on experience with real-world scientific datasets and foundation-model workflows. This role is well suited for candidates who enjoy careful technical reasoning, experimentation, and building reusable components that sit at the intersection of machine learning and scientific data.
The group provides a dynamic and challenging environment for multidisciplinary research, including access to heterogeneous data sources, close links to top academic institutions around the world, and collaborations with internal Genentech and Roche teams.
This internship is on-site in New York City, NY.
The Opportunity
+ Design and implement textification pipelines that translate structured biochemical assay data (e.g., affinity, expression) into precise, uncertainty-aware natural language for LLM training.
+ Develop parsing logic and round-trip validation to recover structured values (numbers, units, QC indicators) from model-generated text, enabling consistent evaluation.
+ Integrate and evaluate optional sequence-neighborhood enrichment using embedding-based retrieval, and study its effect on model calibration and robustness.
+ Run controlled experiments and ablations to analyze how rendering and enrichment choices affect downstream property prediction, ranking, and calibration.
+ Contribute production-quality code to internal frameworks, including clear documentation, README-style usage examples, and comprehensive unit tests.
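To give candidates a feel for the first two responsibilities, here is a minimal sketch of textification with round-trip validation. It is an illustration only, not the team's actual pipeline: the `AssayRecord` schema, the sentence template, the function names, and the example values are all hypothetical assumptions.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AssayRecord:
    """Hypothetical schema for one affinity measurement (an assumption)."""
    target: str
    kd_nm: float      # dissociation constant, in nanomolar
    kd_sd_nm: float   # reported standard deviation, same units
    qc_pass: bool     # quality-control indicator


def textify(rec: AssayRecord) -> str:
    """Render a record as one sentence carrying value, uncertainty, units, QC."""
    qc = "passed" if rec.qc_pass else "failed"
    return (
        f"Binding affinity to {rec.target}: KD = {rec.kd_nm:.2f} ± "
        f"{rec.kd_sd_nm:.2f} nM (QC {qc})."
    )


# Mirrors the template in textify(); any change there must be reflected here.
_PATTERN = re.compile(
    r"Binding affinity to (?P<target>.+?): KD = (?P<kd>[0-9.]+) ± "
    r"(?P<sd>[0-9.]+) nM \(QC (?P<qc>passed|failed)\)\."
)


def parse(text: str) -> Optional[AssayRecord]:
    """Recover structured values from text; return None if nothing matches."""
    m = _PATTERN.search(text)
    if m is None:
        return None
    return AssayRecord(
        target=m["target"],
        kd_nm=float(m["kd"]),
        kd_sd_nm=float(m["sd"]),
        qc_pass=(m["qc"] == "passed"),
    )


def round_trips(rec: AssayRecord, tol: float = 0.01) -> bool:
    """Check that textify -> parse recovers the record.

    Rendering keeps two decimals, so numeric fields are compared
    within one unit of the last printed digit.
    """
    back = parse(textify(rec))
    return (
        back is not None
        and back.target == rec.target
        and back.qc_pass == rec.qc_pass
        and abs(back.kd_nm - rec.kd_nm) <= tol
        and abs(back.kd_sd_nm - rec.kd_sd_nm) <= tol
    )


# Made-up example values, for illustration only.
example = AssayRecord(target="HER2", kd_nm=1.234, kd_sd_nm=0.15, qc_pass=True)
sentence = textify(example)
```

The same `parse` function could in principle be applied to model-generated text, so that predictions and ground truth are compared as structured values rather than as strings.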
Program Highlights
+ **Intensive 12-week, full-time (40 hours per week) paid internship.**
+ **Program start dates are in May/June.**
+ **A stipend, based on location, will be provided to help alleviate costs associated with the internship.**
+ Ownership of challenging and impactful business-critical projects.
+ Work with some of the most talented people in the biotechnology industry.
Who You Are
Required Education
+ Must be pursuing a Master's degree or a PhD (enrolled student).
Required Majors
Computer Science, Machine Learning, Data Science, Bioinformatics or Computational Biology, Statistics, Applied Mathematics, Physics, or a related quantitative field
Required Skills
+ Strong programming skills, particularly in Python, with experience writing clean and maintainable code.
+ Solid understanding of machine learning or NLP fundamentals, including model training and evaluation concepts.
+ Experience working with structured scientific or technical data (e.g., tables, fields, or schemas) in the context of data analysis or modeling.
+ Ability to reason carefully about experimental results and communicate technical ideas clearly.
Preferred Knowledge, Skills, and Qualifications
+ Excellent communication, collaboration, and interpersonal skills.
+ Complements our culture and the standards that guide our daily behavior & decisions: Integrity, Courage, and Passion.
+ Familiarity with biological or biochemical data (e.g., proteins, antibodies, or assays).
Relocation benefits are not available for this job posting.
The expected pay rate for this position, based on the primary location of New York City, is $50.00 per hour. Actual pay will be determined based on experience, qualifications, geographic location, and other job-related factors permitted by law. This position also qualifies for paid holiday time off.
Genentech is an equal opportunity employer. It is our policy and practice to employ, promote, and otherwise treat all employees and applicants on the basis of merit, qualifications, and competence. The company's policy prohibits unlawful discrimination, including but not limited to discrimination on the basis of protected veteran status or disability status, consistent with all federal, state, or local laws.
If you have a disability and need an accommodation in relation to the online application process, please contact us by completing the Accommodations for Applicants form (https://docs.google.com/forms/d/e/1FAIpQLSdZWlsbfQOvFVIQgHE_iDzWUTlhZvj6FytIzjS7xq6IGh1H5g/viewform).