Updated on 13/01/2025

New challenge in an international company. SSC. English. Barcelona.
Python, SQL, Machine Learning

Where will you work?

Are you looking for a place to work that inspires and challenges you? A place to unleash your potential? Then the PageGroup Barcelona Shared Service Center (SSC), with its flexible, open culture and meritocratic structure, is the place for you.
https://www.pagepersonnel.es/clientprofile/pagegroup-shared-services-centre

Description

Data Profiling & Monitoring:
- Develop and enhance data profiling engines to assess the completeness, validity, and integrity of datasets.
- Collaborate on AI-driven cataloguing projects, including a Purview Proof of Concept (POC).
- Monitor data lake quality and performance using tools such as Databricks/cloud-based data platforms.

Machine Learning for Data Quality:
- Apply ML techniques (e.g., Random Forest) to improve duplicate identification and matching.
- Build models that use LLMs to validate taxonomy mappings, similar to inferring roles/departments from job titles.

Automation and API Integration:
- Automate data update processes using APIs, reducing reliance on manual scripts, extracts, and CSVs.
- Design scalable solutions for automated data reconciliation and integrity checks.

Data Quality Analysis:
- Conduct detailed data quality assessments, measuring completeness, validity, and consistency across datasets.
- Identify gaps in data pipelines and propose actionable solutions.

Who are we looking for (M/F/D)?

Expertise

ML Model Development and Evaluation:
- Understanding statistical distributions and probabilities is key to choosing the right features, algorithms, and evaluation metrics for ML tasks (e.g., precision, recall, F1-score).
- Advanced tasks like enhancing duplicate detection or inferring roles with LLMs may involve probabilistic approaches.

Data Quality Analysis:
- Quantifying and diagnosing data completeness, validity, and integrity often require statistical tests and descriptive analytics.

General Problem-Solving:
- Statistical reasoning aids in diagnosing anomalies, reconciling datasets, and creating predictive models.

Must-Have:
- Proficiency in Python (essential for ML and automation tasks).
- Strong understanding of statistics and probability, including hypothesis testing, regression, and probabilistic reasoning.
- Experience with machine learning techniques (e.g., Random Forest, clustering, or NLP-based models).
- Solid grasp of data quality concepts: completeness, validity, reconciliation, and profiling.
- Strong problem-solving skills and the ability to design scalable solutions.

Should-Have:
- Hands-on experience with Databricks (for data lake monitoring and ML implementation).
- Familiarity with data cataloguing tools like Purview or similar platforms.
- Working knowledge of SQL and large datasets.

Could-Have:
- Experience with R for statistical analysis or visualization.
- Knowledge of LLMs for advanced text or taxonomy-related projects.
- Familiarity with data governance frameworks or compliance requirements.

What are your benefits?

- Meal vouchers
- Bonus
- Remote working (2 days per week)
- Medical insurance (after 6 months)
- Life insurance
- Private pension (after 2 years)
- Flexible compensation (after 6 months)
- July & August: 36h per week
- Holidays per year: 25 days
- 20 working days per year to work from abroad
- EAP from day one

Jordi Aluma
Quote the job reference number: JN-012025-6634567

Job summary

Sector: Technology
Sub-sector: Big Data
Industry: Technology & Telecoms
Location: Barcelona
Contract type: Permanent
Consultant name: Jordi Aluma
Reference number: JN-012025-6634567
Work mode: Remote / hybrid