Senior Data Scientist | Full Remote | Portugal

ABOUT THE OPPORTUNITY

Join a leading technology-driven gaming company as a Senior Data Scientist and build production-grade machine learning models that power data-driven automation and personalized customer experiences for millions of users globally. 

This is a senior-level position for experienced data scientists who can own the complete ML lifecycle, from research through production deployment. We're specifically looking for candidates from product companies who understand what it takes to build models that don't just work in notebooks but deliver real business value in production environments serving millions of users with high availability and performance requirements.

The role combines strong machine learning expertise with software engineering best practices - you'll translate business requirements into ML problems, perform exploratory data analysis and feature engineering, run comparative experiments for model training, and apply best practices in model selection and parameter tuning. Working within cross-functional teams that include data scientists, ML engineers, and data engineers, you'll have the full skill set around you to deliver end-to-end projects in an Agile/Scrum environment.

PROJECT & CONTEXT

You'll be working within a machine learning team dedicated to making data-driven decisions that automate services while delivering tailored customer experiences. The team builds a variety of models - from binary classification and regression tasks to sophisticated recommendation systems - covering a wide range of business areas, data types, and projects across the gaming platform.

The work environment emphasizes transforming business needs into production applications rather than just academic research or proof-of-concepts. You'll work with real-world constraints around latency, scalability, data quality, and business impact, requiring pragmatic approaches that balance model sophistication with operational requirements. This end-to-end ownership - from understanding business problems through deploying and monitoring production models - is central to the role.

Your technical responsibilities span the complete data science workflow. You'll translate business requirements into well-defined machine learning problems, identifying the right problem formulation (classification, regression, ranking, etc.) and success metrics. Exploratory Data Analysis (EDA) and feature engineering form a significant part of your work - understanding data distributions, identifying patterns, handling missing values, creating meaningful features, and preparing datasets that enable effective model training.
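To make that day-to-day work concrete, here is a minimal, purely illustrative EDA and feature-engineering sketch in pandas. The file path, column names (sessions_last_7d, total_stake, num_bets, signup_date, acquisition_channel), and derived features are hypothetical assumptions for this example, not part of the actual platform or the job requirements.

```python
# Illustrative EDA / feature-engineering sketch; dataset and columns are hypothetical.
import pandas as pd

df = pd.read_parquet("player_activity.parquet")  # hypothetical dataset

# Understand distributions and missingness before any modelling.
print(df.describe(include="all"))
print(df.isna().mean().sort_values(ascending=False).head(10))

# Handle missing values and derive simple behavioural features.
df["sessions_last_7d"] = df["sessions_last_7d"].fillna(0)
df["avg_stake"] = df["total_stake"] / df["num_bets"].clip(lower=1)
df["days_since_signup"] = (
    pd.Timestamp.now(tz="UTC") - pd.to_datetime(df["signup_date"], utc=True)
).dt.days

# Encode a low-cardinality categorical column for downstream models.
df = pd.get_dummies(df, columns=["acquisition_channel"], dummy_na=True)
```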

Experimentation and model development require running comparative experiments across different algorithms and approaches, implementing rigorous evaluation methodologies to ensure models generalize well, applying best practices in model selection based on problem characteristics and constraints, and conducting systematic parameter tuning to optimize performance. You'll work extensively with the Python machine learning ecosystem including scikit-learn, pandas, NumPy, and various specialized libraries.
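A hedged sketch of what such a comparative experiment could look like with scikit-learn is shown below; the synthetic dataset, candidate models, and parameter grid are illustrative assumptions, not the team's actual setup.

```python
# Illustrative model comparison and tuning with scikit-learn (synthetic data).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=5000, n_features=20, weights=[0.9], random_state=42)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Compare candidate algorithms under the same cross-validation scheme and metric.
candidates = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "gradient_boosting": GradientBoostingClassifier(random_state=42),
}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    print(f"{name}: AUC {scores.mean():.3f} +/- {scores.std():.3f}")

# Systematic parameter tuning on one candidate, still cross-validated to limit overfitting.
grid = GridSearchCV(
    GradientBoostingClassifier(random_state=42),
    param_grid={"n_estimators": [100, 300], "learning_rate": [0.05, 0.1], "max_depth": [2, 3]},
    cv=cv,
    scoring="roc_auc",
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```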

Big data processing is handled through Spark (PySpark), requiring you to design and implement data processing pipelines that can handle large-scale datasets efficiently, write PySpark code for distributed feature engineering and model training, and optimize Spark jobs for performance and resource utilization. Your solid software engineering background in OOP ensures your code is maintainable, testable, and follows engineering best practices rather than being one-off notebook scripts.
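As a rough illustration of that kind of pipeline, the PySpark sketch below aggregates raw events into per-player features; the storage paths, schema, and feature definitions are hypothetical assumptions rather than the platform's real pipeline.

```python
# Illustrative distributed feature-engineering job in PySpark; paths and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("player-features").getOrCreate()

events = spark.read.parquet("s3://example-bucket/events/")  # hypothetical input

# Filter early and aggregate per player to keep the shuffle small on large datasets.
features = (
    events
    .where(F.col("event_date") >= F.date_sub(F.current_date(), 30))
    .groupBy("player_id")
    .agg(
        F.countDistinct("session_id").alias("sessions_30d"),
        F.sum("stake_amount").alias("total_stake_30d"),
        F.avg("stake_amount").alias("avg_stake_30d"),
        F.max("event_date").alias("last_active_date"),
    )
)

features.write.mode("overwrite").parquet("s3://example-bucket/features/player_30d/")
```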

Working in Agile/Scrum methodology, you'll participate in sprint planning, daily stand-ups, and retrospectives, collaborating closely with ML engineers who help operationalize your models, data engineers who build data pipelines, product managers who define requirements, and business stakeholders who use model outputs. Strong teamwork, communication, and analytical thinking skills are essential for navigating complex projects and making sound technical decisions.

  • Core Tech Stack: Python (primary), PySpark (distributed processing), ML libraries (scikit-learn, pandas, NumPy)
  • ML Focus: Binary classification, regression, recommendation systems, various supervised/unsupervised approaches
  • Scale: Large-scale data processing, production models serving millions of users
  • Methodology: Agile/Scrum with end-to-end project delivery
  • Domain: Gaming/iGaming with diverse business applications and data types

WHAT WE'RE LOOKING FOR (Required)

  • Senior Level Experience: This is a SENIOR position - we're seeking experienced data scientists, not junior or entry-level candidates
  • Product Company Background: Strong preference for candidates from product companies (e.g., Farfetch, Talkdesk, Outsystems, Feedzai, or similar) who understand production ML at scale
  • End-to-End Experience: Proven track record working E2E from research through production deployment - not just building models but seeing them through to real-world impact
  • Education: Background in Computer Science, Statistics, Mathematics, or related field (Master's or PhD advantageous)
  • ML Algorithm Knowledge: Strong knowledge of machine learning algorithms and respective theory - understanding when to apply different approaches and why
  • Production Experience: 2-8 years of hands-on experience delivering machine learning models to production environments with real users and business impact
  • Python ML Ecosystem: Deep knowledge of the Python machine learning ecosystem including scikit-learn, pandas, NumPy, matplotlib/seaborn, and specialized ML libraries
  • PySpark Proficiency: Experience with Spark (PySpark) for distributed data processing and large-scale ML pipelines
  • Software Engineering Background: Solid software background in OOP - ability to write clean, maintainable, production-quality code following engineering best practices
  • Business Translation: Ability to translate business requirements into machine learning problems with appropriate problem formulation and success metrics
  • EDA Expertise: Strong skills in exploratory data analysis - understanding data, identifying patterns, and deriving insights
  • Feature Engineering: Hands-on experience with feature engineering - creating meaningful features that improve model performance
  • Experimental Design: Capability to run comparative experiments for model training with rigorous evaluation methodologies
  • Model Selection Best Practices: Knowledge of best practices in model selection, parameter tuning, and avoiding overfitting
  • Teamwork: Strong teamwork skills - ability to collaborate with data engineers, ML engineers, and business stakeholders effectively
  • Communication: Excellent communication skills for explaining complex ML concepts to non-technical stakeholders and collaborating with technical teams
  • Analytical Thinking: Strong analytical thinking abilities for problem decomposition and solution design
  • Agile Experience: Working knowledge of Agile methodologies and Scrum framework with participation in agile ceremonies
  • Language: Fluency in English (B2+ minimum) both oral and written for international team collaboration and documentation
  • Location: Based in Portugal with availability for fully remote work

NICE TO HAVE (Preferred)

  • Azure/Databricks Experience: Hands-on experience with Azure cloud platform and Databricks for scalable ML workflows
  • Deep Learning: Knowledge of deep learning frameworks (TensorFlow, PyTorch, Keras) and neural network architectures
  • Recommendation Systems: Experience building recommendation systems - collaborative filtering, content-based, hybrid approaches, or modern neural recommenders
  • Advanced Spark: Deep Spark expertise including Spark MLlib, optimization techniques, and distributed training
  • MLOps Practices: Experience with MLOps tools and practices (MLflow, model versioning, automated retraining, monitoring)
  • Feature Stores: Experience with feature stores (Feast, Databricks Feature Store) for feature management
  • Model Deployment: Hands-on experience deploying models to production (REST APIs, batch processing, real-time inference)
  • A/B Testing: Understanding of A/B testing methodologies and experimentation frameworks for model evaluation
  • Additional ML Domains: Experience with time-series forecasting, NLP, computer vision, or anomaly detection
  • SQL Proficiency: Strong SQL skills for data extraction and validation beyond PySpark
  • Data Visualization: Advanced data visualization skills (Plotly, Tableau, PowerBI) for insights communication
  • Advanced Statistics: Deep statistical knowledge beyond standard ML - causal inference, Bayesian methods, experimental design
  • AutoML: Experience with AutoML tools and understanding when automated approaches are appropriate
  • Model Interpretability: Experience with model explainability techniques (SHAP, LIME, feature importance analysis)
  • Docker/Kubernetes: Understanding of containerization for ML model deployment
  • CI/CD for ML: Experience with CI/CD pipelines specifically for ML workflows
  • Performance Optimization: Skills in optimizing model inference latency and computational efficiency
  • Gaming Industry: Previous experience in gaming, iGaming, or high-transaction entertainment environments
  • Streaming Data: Experience with real-time data processing and online learning scenarios
  • Graph ML: Knowledge of graph neural networks or graph-based ML approaches
  • Reinforcement Learning: Understanding of RL concepts and applications
  • Publications: Research publications in ML/AI conferences or journals
  • Kaggle/Competitions: Strong performance in ML competitions demonstrating practical skills
  • Open Source Contributions: Contributions to ML open source projects or libraries

Location: Portugal (100% Remote)