Junior Data Engineer - Databricks
- Insight Global (Woonsocket, RI)
Job Description
Our client is a leading health solutions organization with over 300,000 dedicated employees nationwide. As the Data Engineer on the Lumina team, you will own the analytics and telemetry foundation for our platform. You’ll design and operate the data pipelines that power internal and external dashboards for product and engineering partners and end users. You will instrument and collect telemetry with OpenTelemetry, model and serve trustworthy metrics, and manage the data workflows supporting our vector database that enables Lumina’s GenAI/RAG experiences. This role is ideal for an engineer who thrives at the intersection of data platforms, observability, and product outcomes, and who is hands-on in Databricks and modern data/analytics tooling. Key responsibilities include:
• Own Lumina analytics end-to-end: define metric sources of truth, build semantic layers, and deliver reliable, documented datasets for self-service and curated dashboards.
• Design and operate data pipelines in Databricks: implement ELT/ETL with PySpark/SQL, Delta Lake, and Databricks Jobs (orchestrations), including testing, monitoring, and cost/performance optimization.
• Telemetry instrumentation & observability: design event schemas, implement tracing/metrics/logs using OpenTelemetry, and build health KPIs (latency, throughput, error budgets, SLOs) for Lumina services and data products.
• Dashboards for internal & external audiences: build, publish, and govern high-quality Databricks dashboards for leadership, product/engineering teams, and end users; manage refresh SLAs and access controls.
• Vector database support: build and maintain ingestion/indexing pipelines for embeddings and metadata; implement data quality checks, drift monitoring, and retention policies for the vector store (MongoDB).
• Data quality & governance: incorporate validation (unit/integration tests), documentation, lineage, and incident playbooks; partner with security to ensure PII/PHI handling, privacy, and compliance controls.
• Collaboration & product mindset: partner with Product, ML/Platform Engineering, and Support to define analytics requirements, operationalize new telemetry, and iterate dashboards based on feedback and adoption metrics.
• Operational excellence: establish alerting and runbooks for pipelines; drive root cause analysis and continuous improvement for stability, cost, and developer efficiency.
Compensation:
$30/hr to $40/hr
Exact compensation may vary based on several factors, including skills, experience, and education.
Benefit packages for this role will start on the 1st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401K retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to [email protected]. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Skills and Requirements
• 2+ years of professional data engineering experience delivering production-grade pipelines and analytics.
• Knowledge of data modeling principles
• High proficiency in Databricks (PySpark, SQL, Delta Lake; Databricks Jobs/Workflows; familiarity with MLflow is a plus).
• 2+ years of experience with MongoDB, including writing MongoDB queries.
• Hands-on experience with OpenTelemetry for application/service instrumentation (traces, metrics, logs) and integrating with observability backends.
• Strong SQL and Python skills; experience with distributed data processing (Spark) and ELT/ETL patterns.
• Experience building dashboards and semantic models for both internal stakeholders and external/end-user experiences.
• Familiarity with orchestration, CI/CD, and testing of data pipelines (e.g., Databricks Jobs, GitHub Actions).
• Practical understanding of data quality, lineage, access controls, and governance in a cloud environment.
• Exposure to GenAI and RAG architectures (embeddings, retrieval evaluation, chunking strategies, guardrails) and related tooling (e.g., LangChain/LlamaIndex).
• Experience with traceability tools such as Langfuse.
• Experience operating or integrating with vector databases (MongoDB, Milvus, Pinecone, Weaviate) and evaluating retrieval quality (precision/recall, MRR, nDCG).
• Experience designing software with managed databases
• Familiarity with BI governance (row-level security, certified datasets, data catalog/lineage) and UX best practices for dashboard design.
• Cloud expertise in Azure services for data storage, compute, identity, and observability.
• Experience with Agile delivery (Scrum) in an enterprise environment.