Double Master’s student in Data Science at the University of Padua and Universitat Politècnica de Catalunya, with a background in Statistics and Information Management and a strong focus on applied machine learning and experimental rigor.

🔭 I’m currently:
- Maintaining and improving open-source ML implementations on GitHub, including methods such as Friedman's Regularized Discriminant Analysis.

🎓 Education:
- Double Master’s Degree in Data Science – University of Padua & Universitat Politècnica de Catalunya (2024 – present).
- BSc in Statistics and Information Management – University of Milan-Bicocca.

🧪 Some recent projects and research:
- Benchmarked 20+ categorical encoders on 10+ tabular datasets in supervised learning, with a strong emphasis on reproducibility and rigorous evaluation.
- Implemented distributed k-means (Lloyd’s algorithm) and a fair variant in PySpark, plus streaming frequency estimation with Count-Min Sketch vs Count Sketch.
- Built NLP pipelines with Transformers, BERT, and RAG for tasks like text classification, fake-news detection, and domain-specific question answering.

💡 What I enjoy:
- Turning messy, real-world data into clear insights and robust models, from implementing ML algorithms to building recommender systems and solving optimization problems.
- Bridging theory and practice through open, reproducible experiments and clean, well-documented code.

📫 How to reach me:
- Email: fez.cle@gmail.com
- LinkedIn: linkedin.com/in/federicoclericii
- GitHub: github.com/F3z11

Also working with: R, SQL, SAS, Apache Spark/PySpark, Jupyter, VS Code, Tableau, Git/GitHub.