🎓 PhD Student in Computer Science @ LSU
🔬 Empirical Software Engineering | MSR | OSS Sustainability
📊 Large-scale GitHub Data Mining
🤖 Machine Learning + LLMs for Software Repositories
I build datasets and predictive models to understand how open-source projects evolve, survive, and onboard newcomers.
- Pull Request & Issue Lifetime Prediction
- Newcomer Onboarding in Open Source
- OSS Project Liveness (Survival Analysis)
- Mining Software Repositories at scale
- LLM-based semantic signals for SE
🔹 PR & Issue Lifetime Prediction
Machine learning models using repository activity, social signals, and semantic features.
🔹 OSS Survival Analysis
Kaplan–Meier estimators and Cox proportional hazards models to study project liveness in open-source ecosystems.
🔹 Newcomers in ROS Repositories
Linking ROS packages to their GitHub repositories to analyze newcomer onboarding patterns.
🔹 GitHub Data Mining Pipelines
Automated extraction of commits, pull requests, issues, reviews, and comments → ML-ready datasets.
Data & Machine Learning
Python • Pandas • NumPy • scikit-learn • Survival Analysis
LLMs for Software Engineering
Prompt-based labeling • Semantic feature extraction • Local LLM inference
Software & Infrastructure
GitHub REST & GraphQL APIs • FastAPI • Docker • PostgreSQL • Git
💼 LinkedIn: https://www.linkedin.com/in/julianasantosfreitas/
🎓 Google Scholar: https://scholar.google.com/citations?user=-l1zxwMAAAAJ
🧪 ORCID: https://orcid.org/0000-0002-8824-280X
🌐 Website: https://womeninoss.com/



