Senior Data Scientist | Full Remote | Portugal

ABOUT THE OPPORTUNITY

Join a leading technology-driven gaming company as a Senior Data Scientist and build production-grade machine learning models that power data-driven automation and personalized customer experiences for millions of users globally. 

This is a senior-level position for experienced data scientists who can own the complete ML lifecycle, from research through production deployment. We're specifically looking for candidates from product companies who understand what it takes to build models that don't just work in notebooks but deliver real business value in production environments serving millions of users with high availability and performance requirements.

The role combines strong machine learning expertise with software engineering best practices - you'll translate business requirements into ML problems, perform exploratory data analysis and feature engineering, run comparative experiments for model training, and apply best practices in model selection and parameter tuning. Working within cross-functional teams that include data scientists, ML engineers, and data engineers, you'll have the full skill set around you to deliver end-to-end projects in an Agile/Scrum environment.

PROJECT & CONTEXT

You'll be working within a machine learning team dedicated to making data-driven decisions that automate services while delivering tailored customer experiences. The team builds a variety of models - from binary classification and regression tasks to sophisticated recommendation systems - covering a wide range of business areas, data types, and projects across the gaming platform.

The work environment emphasizes transforming business needs into production applications rather than just academic research or proof-of-concepts. You'll work with real-world constraints around latency, scalability, data quality, and business impact, requiring pragmatic approaches that balance model sophistication with operational requirements. This end-to-end ownership - from understanding business problems through deploying and monitoring production models - is central to the role.

Your technical responsibilities span the complete data science workflow. You'll translate business requirements into well-defined machine learning problems, identifying the right problem formulation (classification, regression, ranking, etc.) and success metrics. Exploratory Data Analysis (EDA) and feature engineering form a significant part of your work - understanding data distributions, identifying patterns, handling missing values, creating meaningful features, and preparing datasets that enable effective model training.
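To make that day-to-day work concrete, here is a minimal, purely illustrative EDA and feature-engineering sketch in pandas. The file path, column names (sessions_last_7d, total_stake, num_bets, signup_date, acquisition_channel), and derived features are hypothetical assumptions for this example, not part of the actual platform or the job requirements.

```python
# Illustrative EDA / feature-engineering sketch; dataset and columns are hypothetical.
import pandas as pd

df = pd.read_parquet("player_activity.parquet")  # hypothetical dataset

# Understand distributions and missingness before any modelling.
print(df.describe(include="all"))
print(df.isna().mean().sort_values(ascending=False).head(10))

# Handle missing values and derive simple behavioural features.
df["sessions_last_7d"] = df["sessions_last_7d"].fillna(0)
df["avg_stake"] = df["total_stake"] / df["num_bets"].clip(lower=1)
df["days_since_signup"] = (
    pd.Timestamp.now(tz="UTC") - pd.to_datetime(df["signup_date"], utc=True)
).dt.days

# Encode a low-cardinality categorical column for downstream models.
df = pd.get_dummies(df, columns=["acquisition_channel"], dummy_na=True)
```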

Experimentation and model development require running comparative experiments across different algorithms and approaches, implementing rigorous evaluation methodologies to ensure models generalize well, applying best practices in model selection based on problem characteristics and constraints, and conducting systematic parameter tuning to optimize performance. You'll work extensively with the Python machine learning ecosystem including scikit-learn, pandas, NumPy, and various specialized libraries.
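A hedged sketch of what such a comparative experiment could look like with scikit-learn is shown below; the synthetic dataset, candidate models, and parameter grid are illustrative assumptions, not the team's actual setup.

```python
# Illustrative model comparison and tuning with scikit-learn (synthetic data).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=5000, n_features=20, weights=[0.9], random_state=42)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Compare candidate algorithms under the same cross-validation scheme and metric.
candidates = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "gradient_boosting": GradientBoostingClassifier(random_state=42),
}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    print(f"{name}: AUC {scores.mean():.3f} +/- {scores.std():.3f}")

# Systematic parameter tuning on one candidate, still cross-validated to limit overfitting.
grid = GridSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_grid={"n_estimators": [100, 300], "learning_rate": [0.05, 0.1], "max_depth": [2, 3]},
    cv=cv,
    scoring="roc_auc",
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```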

Big data processing is handled through Spark (PySpark), requiring you to design and implement data processing pipelines that can handle large-scale datasets efficiently, write PySpark code for distributed feature engineering and model training, and optimize Spark jobs for performance and resource utilization. Your solid software engineering background in OOP ensures your code is maintainable, testable, and follows engineering best practices rather than being one-off notebook scripts.
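As a rough illustration of that kind of pipeline, the PySpark sketch below aggregates raw events into per-player features; the storage paths, schema, and feature definitions are hypothetical assumptions rather than the platform's real pipeline.

```python
# Illustrative distributed feature-engineering job in PySpark; paths and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("player-features").getOrCreate()

events = spark.read.parquet("s3://example-bucket/events/")  # hypothetical input

# Filter early and aggregate per player to keep the shuffle small on large datasets.
features = (
    events
    .where(F.col("event_date") >= F.date_sub(F.current_date(), 30))
    .groupBy("player_id")
    .agg(
        F.countDistinct("session_id").alias("sessions_30d"),
        F.sum("stake_amount").alias("total_stake_30d"),
        F.avg("stake_amount").alias("avg_stake_30d"),
        F.max("event_date").alias("last_active_date"),
    )
)

features.write.mode("overwrite").parquet("s3://example-bucket/features/player_30d/")
```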

Working in Agile/Scrum methodology, you'll participate in sprint planning, daily stand-ups, and retrospectives, collaborating closely with ML engineers who help operationalize your models, data engineers who build data pipelines, product managers who define requirements, and business stakeholders who use model outputs. Strong teamwork, communication, and analytical thinking skills are essential for navigating complex projects and making sound technical decisions.

  • Core Tech Stack: Python (primary), PySpark (distributed processing), ML libraries (scikit-learn, pandas, NumPy)
  • ML Focus: Binary classification, regression, recommendation systems, various supervised/unsupervised approaches
  • Scale: Large-scale data processing, production models serving millions of users
  • Methodology: Agile/Scrum with end-to-end project delivery
  • Domain: Gaming/iGaming with diverse business applications and data types

WHAT WE'RE LOOKING FOR (Required)

  • Senior Level Experience: This is a SENIOR position - we're seeking experienced data scientists, not junior or entry-level candidates
  • Product Company Background: Strong preference for candidates from product companies (e.g., Farfetch, Talkdesk, Outsystems, Feedzai, or similar) who understand production ML at scale
  • End-to-End Experience: Proven track record working E2E from research through production deployment - not just building models but seeing them through to real-world impact
  • Education: Background in Computer Science, Statistics, Mathematics, or related field (Master's or PhD advantageous)
  • ML Algorithm Knowledge: Strong knowledge of machine learning algorithms and respective theory - understanding when to apply different approaches and why
  • Production Experience: 2-8 years of hands-on experience delivering machine learning models to production environments with real users and business impact
  • Python ML Ecosystem: Deep knowledge of the Python machine learning ecosystem including scikit-learn, pandas, NumPy, matplotlib/seaborn, and specialized ML libraries
  • PySpark Proficiency: Experience with Spark (PySpark) for distributed data processing and large-scale ML pipelines
  • Software Engineering Background: Solid software background in OOP - ability to write clean, maintainable, production-quality code following engineering best practices
  • Business Translation: Ability to translate business requirements into machine learning problems with appropriate problem formulation and success metrics
  • EDA Expertise: Strong skills in exploratory data analysis - understanding data, identifying patterns, and deriving insights
  • Feature Engineering: Hands-on experience with feature engineering - creating meaningful features that improve model performance
  • Experimental Design: Capability to run comparative experiments for model training with rigorous evaluation methodologies
  • Model Selection Best Practices: Knowledge of best practices in model selection, parameter tuning, and avoiding overfitting
  • Teamwork: Strong teamwork skills - ability to collaborate with data engineers, ML engineers, and business stakeholders effectively
  • Communication: Excellent communication skills for explaining complex ML concepts to non-technical stakeholders and collaborating with technical teams
  • Analytical Thinking: Strong analytical thinking abilities for problem decomposition and solution design
  • Agile Experience: Working knowledge of Agile methodologies and Scrum framework with participation in agile ceremonies
  • Language: Fluency in English (B2+ minimum) both oral and written for international team collaboration and documentation
  • Location: Based in Portugal with availability for fully remote work

NICE TO HAVE (Preferred)

  • Azure/Databricks Experience: Hands-on experience with Azure cloud platform and Databricks for scalable ML workflows
  • Deep Learning: Knowledge of deep learning frameworks (TensorFlow, PyTorch, Keras) and neural network architectures
  • Recommendation Systems: Experience building recommendation systems - collaborative filtering, content-based, hybrid approaches, or modern neural recommenders
  • Advanced Spark: Deep Spark expertise including Spark MLlib, optimization techniques, and distributed training
  • MLOps Practices: Experience with MLOps tools and practices (MLflow, model versioning, automated retraining, monitoring)
  • Feature Stores: Experience with feature stores (Feast, Databricks Feature Store) for feature management
  • Model Deployment: Hands-on experience deploying models to production (REST APIs, batch processing, real-time inference)
  • A/B Testing: Understanding of A/B testing methodologies and experimentation frameworks for model evaluation
  • Additional ML Domains: Experience with time-series forecasting, NLP, computer vision, or anomaly detection
  • SQL Proficiency: Strong SQL skills for data extraction and validation beyond PySpark
  • Data Visualization: Advanced data visualization skills (Plotly, Tableau, PowerBI) for insights communication
  • Advanced Statistics: Deep statistical knowledge beyond standard ML - causal inference, Bayesian methods, experimental design
  • AutoML: Experience with AutoML tools and understanding when automated approaches are appropriate
  • Model Interpretability: Experience with model explainability techniques (SHAP, LIME, feature importance analysis)
  • Docker/Kubernetes: Understanding of containerization for ML model deployment
  • CI/CD for ML: Experience with CI/CD pipelines specifically for ML workflows
  • Performance Optimization: Skills in optimizing model inference latency and computational efficiency
  • Gaming Industry: Previous experience in gaming, iGaming, or high-transaction entertainment environments
  • Streaming Data: Experience with real-time data processing and online learning scenarios
  • Graph ML: Knowledge of graph neural networks or graph-based ML approaches
  • Reinforcement Learning: Understanding of RL concepts and applications
  • Publications: Research publications in ML/AI conferences or journals
  • Kaggle/Competitions: Strong performance in ML competitions demonstrating practical skills
  • Open Source Contributions: Contributions to ML open source projects or libraries

Location: Portugal (100% Remote)