Sr. Data Scientist, Special Projects
Amazon (Seattle, WA)
Description
As a Data Scientist, you will work at the intersection of machine learning and advanced analytics and help develop innovative products that enhance customer experiences. Our team values intellectual curiosity while maintaining a sharp focus on bringing products to market. Successful candidates are responsive and adaptable, and thrive in our open, collaborative, entrepreneurial environment.
Working at the forefront of both academic and applied research, you will join a diverse team of scientists, engineers, and product managers to solve complex business and technology problems using scientific approaches. You will collaborate closely with other teams to implement innovative solutions and drive improvements.
Key job responsibilities
- Work hands-on with complex, noisy datasets to derive actionable insights and explain/debug black-box models using interpretability and data-attribution methods.
- Design and analyze experiments and observational studies with rigorous statistical inference, including confidence intervals, power/sample-size estimation, variance reduction, and appropriate hypothesis testing.
- Benchmark models and datasets using classical and modern techniques; select ML methods based on data and operational constraints, and evaluate using robust metrics and diagnostic analyses.
- Apply production-grade measurement and MLOps practices, including data quality monitoring, drift/shift detection, and A/B test design and readouts with disciplined diagnosis of metric movement.
- Deliver end-to-end analyses that improve team execution and decision-making—define goal-driving metrics with stakeholders, build clear reporting (tables, dashboards, and visualizations), and communicate results that translate into concrete actions.
- Investigate anomalies and data integrity issues across diverse data sources using structured root-cause analysis, correlation diagnostics, significance testing, and simulation across high- and low-fidelity datasets.
- Partner closely with cross-functional domain experts to design experiments and interpret results, applying modern statistical methods to evaluate predictive and generative models as well as operational and process performance.
- Develop production-quality analytics and modeling code—write well-tested, maintainable SQL/Python scripts and analysis workflows that can be promoted into production pipelines, and continuously adopt new statistical methods and best practices as the field evolves.
A day in the life
New data has just landed and been promoted to our data lake. You load the data and verify its overall integrity by visualizing variation across target subsets. You realize we may have made progress toward our goals and begin to test the validity of your nominal results. At midday you grab lunch with new coworkers and learn about their fields or weird interests (there are many). You generate visualizations for the entire dataset and perform significance tests that reinforce specific findings. You meet with peers in the afternoon to discuss your findings and break down the remaining tasks to finalize your group report!
About the team
Innovators wanted! Are you an entrepreneur? A builder? A dreamer? This role is part of an Amazon Special Projects team that takes the company’s Think Big leadership principle to the limits. We focus on creating entirely new products and services with a goal of positively impacting the lives of our customers. No industries or subject areas are out of bounds. If you’re interested in innovating at scale to address big challenges in the world, this is the team for you.
Basic Qualifications
- Master's degree in computer science, machine learning, engineering, or a related field, or Bachelor's degree and 3+ years of data science experience.
- Experience delivering end-to-end analyses: data cleaning/QC, modeling, evaluation, and stakeholder-ready reporting.
- Experience with the statistical foundations of experimentation and inference: confidence intervals, power/sample size, two-sample and permutation tests, sequential testing, and multiple-comparison control.
- Experience with model evaluation and diagnostics, including ranking/discrimination metrics (e.g., AUROC and precision-recall-gain), probabilistic metrics (e.g., proper scoring rules and divergence-based measures), and calibration assessment (e.g., reliability diagrams, ECE).
- Proficient in SQL and Python; able to write production-quality analysis code (testing, version control, reproducible workflows).
- Experience working with large-scale data pipelines and production analytics environments (data warehouses/lakes, workflow orchestration, batch or streaming processing).
Preferred Qualifications
- Ph.D. in Computer Science, Engineering, Statistics/Mathematics, Bioinformatics, or a related field.
- Hands-on experience with model interpretability and black-box debugging (e.g., SHAP/TreeSHAP, Anchors, Integrated Gradients, counterfactuals, influence/data attribution).
- Experience with production monitoring and experimentation diagnostics: data quality checks, drift/shift detection (e.g., KS, MMD/embedding drift), and diagnosing metric movement (e.g., instrumentation changes, sample-ratio mismatch, batch effects).
- Familiarity with modern deep learning and representation learning, including Transformers and LLM embedding pipelines, with experience in fine-tuning and post-training (e.g., PEFT/LoRA, instruction tuning, preference optimization).
- Experience with optimization and simulation to improve operational or experimental processes (e.g., Bayesian optimization, bandits, constrained optimization).
- Publications in top ML/AI or statistics venues (conferences/journals) or equivalent evidence of research impact (patents, widely used open-source, internal platforms at scale).
- Experience with high-performance or distributed systems (e.g., GPU training, distributed inference) supporting large-scale modeling and experimentation.
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.
The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and an option for Supplemental Life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave. Learn more about our benefits at https://amazon.jobs/en/benefits.
USA, WA, Seattle - 159,200.00 - 215,300.00 USD annually