Data Specialist/Scientist - PageGroup SSC

Barcelona | Permanent | Remote / hybrid
The Data Scientist focuses on enhancing data profiling, reconciliation, and automation capabilities while leveraging machine learning (ML) and large language models (LLMs) to improve data accuracy and usability. The ideal candidate has strong technical expertise in Python and ML, experience with data platforms like Databricks, and a passion for solving data quality challenges.

Updated on 13/01/2025

  • New challenge in an international company. SSC. English. Barcelona.
  • Python, SQL, Machine Learning

Where will you work?

Are you looking for a place to work that inspires and challenges you? A place to unleash your potential? Then the PageGroup Barcelona Shared Service Center (SSC), with its flexible, open culture and meritocratic structure, is the place for you.

https://www.pagepersonnel.es/clientprofile/pagegroup-shared-services-centre

Description

  • Data Profiling & Monitoring:
    • Develop and enhance data profiling engines to assess completeness, validity, and integrity of datasets.
    • Collaborate on AI-driven cataloguing projects, including a Purview Proof of Concept (POC).
    • Monitor data lake quality and performance using tools such as Databricks and other cloud-based data platforms.
  • Machine Learning for Data Quality:
    • Apply ML techniques (e.g., Random Forest) to improve duplicate identification and matching.
    • Build models that use LLMs to validate taxonomy mapping, for example by inferring roles or departments from job titles.
  • Automation and API Integration:
    • Automate data update processes using APIs, reducing reliance on manual scripts, extracts, and CSVs.
    • Design scalable solutions for automated data reconciliation and integrity checks.
  • Data Quality Analysis:
    • Conduct detailed data quality assessments, measuring completeness, validity, and consistency across datasets.
    • Identify gaps in data pipelines and propose actionable solutions.

Who are we looking for (M/F/D)?

Expertise

  • ML Model Development and Evaluation:
    • Understanding statistical distributions and probabilities is key to choosing the right features, algorithms, and evaluation metrics for ML tasks (e.g., precision, recall, F1-score).
    • Advanced tasks like enhancing duplicate detection or inferring roles with LLMs may involve probabilistic approaches.
  • Data Quality Analysis:
    • Quantifying and diagnosing data completeness, validity, and integrity often require statistical tests and descriptive analytics.
  • General Problem-Solving:
    • Statistical reasoning aids in diagnosing anomalies, reconciling datasets, and creating predictive models.



Must-Have:

  • Proficiency in Python (essential for ML and automation tasks).
  • Strong understanding of statistics and probability, including hypothesis testing, regression, and probabilistic reasoning.
  • Experience with machine learning techniques (e.g., Random Forest, clustering, or NLP-based models).
  • Solid grasp of data quality concepts: completeness, validity, reconciliation, and profiling.
  • Strong problem-solving skills and the ability to design scalable solutions.



Should-Have:

  • Hands-on experience with Databricks (for data lake monitoring and ML implementation).
  • Familiarity with data cataloguing tools like Purview or similar platforms.
  • Working knowledge of SQL and large datasets.



Could-Have:

  • Experience with R for statistical analysis or visualization.
  • Knowledge of LLMs for advanced text or taxonomy-related projects.
  • Familiarity with data governance frameworks or compliance requirements.

What are your benefits?

  • Meal vouchers
  • Bonus
  • Remote working (2 days per week)
  • Medical insurance (after 6 months)
  • Life insurance
  • Private pension (after 2 years)
  • Flexible compensation (after 6 months)
  • 36-hour working week in July and August
  • 25 days of holiday per year
  • 20 working days per year to work from abroad
  • EAP (from day one)
Jordi Aluma
Quote the job reference number
JN-012025-6634567

Job summary

Sector: Technology
Sub-sector: Big Data
Industry: Technology & Telecoms
Location: Barcelona
Contract type: Permanent
Consultant name: Jordi Aluma
Reference number: JN-012025-6634567
Working arrangement: Remote / hybrid

At Michael Page we believe in diversity and inclusion. We uphold equal opportunities and do not discriminate on the basis of gender, race, age, religion, sexual orientation, or any other characteristic that could be considered grounds for exclusion.