Data Scientist - Researcher
- Insight Global (Woonsocket, RI)
-
Job Description
We are seeking an exceptional Data Scientist Researcher to join our advanced analytics team. This role will focus on cutting-edge machine learning techniques with particular emphasis on dimension reduction, feature selection, and signal detection. Working within our Databricks environment, you will research, develop, and implement sophisticated algorithms that extract meaningful patterns from complex, high-dimensional data. The ideal candidate combines strong theoretical knowledge with practical implementation skills to solve challenging real-world problems.
Key Responsibilities
Lead research initiatives focused on dimension reduction and feature selection methodologies
Develop novel approaches for variable importance ranking and optimal feature subset identification
Design and implement signal detection algorithms to identify patterns in noisy data
Conduct experiments to evaluate and benchmark different feature selection techniques
Create reproducible research workflows in Databricks using notebooks and ML pipelines
Collaborate with cross-functional teams to translate research findings into production solutions
Stay current with the latest academic research in dimension reduction and feature selection
Author technical documentation, white papers, and potentially academic publications
Mentor junior data scientists and share knowledge across the organization
Present research findings to both technical and non-technical stakeholders
Technical Skills
Programming Languages: Python (advanced), R, SQL
ML/AI Frameworks: scikit-learn, PyTorch, TensorFlow, Keras
Big Data Technologies: Apache Spark, Databricks, Delta Lake
Dimension Reduction: PCA, t-SNE, UMAP, LDA, autoencoders
Feature Selection: Recursive feature elimination, LASSO, tree-based methods, mutual information
Signal Processing: Fourier analysis, wavelets, filtering techniques
Statistical Analysis: Hypothesis testing, experimental design, Bayesian methods
Version Control: Git, GitHub
Visualization: Matplotlib, Seaborn, Plotly
Compensation
$60/hr to $70/hr
Exact compensation may vary based on several factors, including skills, experience, and education.
Benefit packages for this role will start on the 31st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401K retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.
We are a company committed to creating inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity employer that believes everyone matters. Qualified candidates will receive consideration for employment opportunities without regard to race, religion, sex, age, marital status, national origin, sexual orientation, citizenship status, disability, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please submit a request through the Human Resources Request Form (https://airtable.com/app21VjYyxLDIX0ez/shrOg4IQS1J6dRiMo). The EEOC "Know Your Rights" poster is available here (https://www.eeoc.gov/sites/default/files/2023-06/22-088_EEOC_KnowYourRights6.12ScreenRdr.pdf).
To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/ .
Skills and Requirements
PhD or MS in Computer Science, Statistics, Applied Mathematics, or related quantitative field
5+ years of experience applying advanced ML/AI techniques to real-world problems
Strong expertise in dimension reduction techniques (PCA, t-SNE, UMAP, autoencoders)
Deep knowledge of feature selection methods (filter, wrapper, embedded approaches)
Experience with signal processing and detection algorithms
Proficiency in Python and its scientific computing ecosystem (NumPy, SciPy, scikit-learn)
Hands-on experience with Databricks and Apache Spark for large-scale data processing
Strong understanding of statistical concepts and experimental design
Experience with deep learning frameworks (PyTorch, TensorFlow)
Published research in feature selection, dimension reduction, or related fields
Experience with MLflow on Databricks for experiment tracking and model management
Knowledge of information theory approaches to feature selection
Familiarity with GPU acceleration for machine learning algorithms
Experience with time series analysis and anomaly detection
Understanding of distributed computing principles
Knowledge of Bayesian methods for feature selection
Experience with reinforcement learning techniques
Familiarity with graph-based dimension reduction approaches
Contributions to open-source ML/AI projects
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal employment opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment without regard to race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request to [email protected].