Releases: r-world-devs/GitStats
GitStats 2.5.0
This release introduces external storage backends (PostgreSQL and SQLite) for persisting pulled data across sessions, and optional parallel processing via `mirai` for faster API calls. It also brings several performance improvements, including faster file tree retrieval and optimized GitLab repository queries.
- Added `add_languages` parameter to `get_repos()` (`TRUE` by default). When set to `FALSE`, languages data is excluded from the output and the GitLab REST languages API calls are skipped, speeding up the process.
- Fixed `get_repos()` returning `NA` for `commit_sha` on archived GitLab projects. When the GraphQL API returns `null` for `lastCommit`, a REST Branches API fallback can now retrieve the SHA. Use `fill_empty_sha = TRUE` in `get_repos()` to enable this (#746).
- Added optional parallel processing for API calls via `mirai` package. Use `set_parallel()` to enable concurrent data fetching across repositories and organizations (#736).
- Cached `set_owner_type()` results to avoid redundant GraphQL calls when multiple `get_*` functions are used in the same session (#738).
- Replaced per-directory GraphQL file tree traversal with single-call REST recursive tree API for `get_repos_trees()`, substantially improving speed of retrieving repository file trees (#740).
- Added `set_postgres_storage()`, `set_sqlite_storage()`, and `set_local_storage()` to configure external storage backends. PostgreSQL (via `RPostgres`/`DBI`) and SQLite (via `RSQLite`/`DBI`) are supported for persisting data in a database. Metadata (R classes, attributes) is preserved via a `_metadata` table (#602).
- Added `remove_from_storage()` to remove a named table from the active storage backend (#747).
- Added `remove_postgres_storage()` and `remove_sqlite_storage()` to fully remove a database storage backend and revert to local storage: the PostgreSQL variant drops the GitStats schema, and the SQLite variant deletes the database file (#759).
- Added `get_storage_metadata()` to retrieve metadata (R classes, custom attributes, column types) for a stored table (#748).
- Fixed slow `get_repos()` for GitLab when specific repos are set. Previously, the `repos_by_user` GraphQL query searched the entire GitLab instance; now repos are queried directly by `fullPath` (#750).
- Sped up vignettes generation (#504, @marcinkowskak).
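
The caching entry above (#738) follows a common memoization pattern. Below is a minimal base R sketch of that idea. The names `owner_type_cache`, `query_owner_type()` and `get_owner_type()` are hypothetical and used only for illustration; they are not taken from the GitStats internals, and the "API call" is simulated.

```r
# Environment used as a simple in-session cache.
owner_type_cache <- new.env(parent = emptyenv())
api_calls <- 0

# Hypothetical stand-in for an expensive API request (no network involved).
query_owner_type <- function(owner) {
  api_calls <<- api_calls + 1
  if (grepl("-org$", owner)) "organization" else "user"
}

# Return the cached value when present; otherwise query once and store the result.
get_owner_type <- function(owner) {
  if (!exists(owner, envir = owner_type_cache, inherits = FALSE)) {
    assign(owner, query_owner_type(owner), envir = owner_type_cache)
  }
  get(owner, envir = owner_type_cache, inherits = FALSE)
}

first  <- get_owner_type("acme-org")
second <- get_owner_type("acme-org")  # served from the cache, no second query
```

After both calls, `api_calls` is still 1, which is the effect the changelog entry describes: repeated lookups within one session reuse the stored result.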
GitStats 2.4.0
The newest minor release includes new functions for retrieving pull requests (`get_pull_requests()`) and their statistics (`get_pull_requests_stats()`), prettified repository URL outputs in `get_repos_urls()`, along with refactoring and code cleanup.
- Prettified output of repositories URLs in `get_repos_urls()` function (#710).
- Added `get_pull_requests()` function for getting information about pull requests (#722).
- Cleaned up unnecessary comments (#723).
- Added `get_pull_requests_stats()` function (#726).
- Reorganized fixtures and test helpers (#727).
- Prettified messages with new icons (#148, #361).
- Refactored code for getting repositories with code to make it more readable (#612).
GitStats 2.3.9
This patch release covers fixes for the `get_files()` function and updates to the `until` parameter in the `get_release_logs()`, `get_commits()` and `get_issues()` functions.
- Handled getting multiple files for GitHub in case some of these files did not exist in scanned repositories (#713).
- Added skipping of file content retrieval when pulling files with `pattern` results in an empty file structure (#711).
- Updated the logic for the `until` parameter in `get_release_logs()`, `get_commits()` and `get_issues()`. Functions will now include records from the specified date (e.g., passing "2025-12-08" to `until` will include data from December 8th, 2025), whereas previously they only fetched data up to (but not including) that date (#718).
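
The change to `until` can be illustrated with a small base R sketch. The `commits` data frame and the `filter_until()` helper are hypothetical and only mimic the inclusive versus exclusive comparison; they are not part of the GitStats API, and the real implementation may differ.

```r
# Hypothetical commit timestamps (UTC); not real GitStats output.
commits <- data.frame(
  id = c("a1", "b2", "c3"),
  committed_date = as.POSIXct(
    c("2025-12-07 18:30:00", "2025-12-08 09:15:00", "2025-12-09 00:05:00"),
    tz = "UTC"
  )
)

# Hypothetical helper: keep commits up to `until`.
# inclusive = TRUE keeps the whole `until` day (the behaviour described above),
# inclusive = FALSE stops at midnight at the start of that day (the earlier one).
filter_until <- function(data, until, inclusive = TRUE) {
  cutoff <- as.POSIXct(until, tz = "UTC")
  if (inclusive) {
    cutoff <- cutoff + 24 * 60 * 60  # move the cutoff to the start of the next day
  }
  data[data$committed_date < cutoff, , drop = FALSE]
}

old_result <- filter_until(commits, "2025-12-08", inclusive = FALSE)
new_result <- filter_until(commits, "2025-12-08", inclusive = TRUE)
```

With the exclusive cutoff only commit `a1` is kept, while the inclusive cutoff also keeps `b2`, which was made on December 8th.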
GitStats 2.3.8
This patch introduces the `show_hosts()` function to display host information, addresses empty GitLab project values in file tables, and updates `verbose` logic to default to `FALSE` while retaining critical messages.
- Added `show_hosts()` function to print info on hosts set with `set_*_host()` functions (#672).
- Handled empty GitLab project values when generating a files table (#702).
- Changed `verbose` logic: by default, user functions now have `verbose` set to `FALSE`. The most important messages are still printed (e.g. time span of the whole process) (#704).
GitStats 2.3.7
Patch release with improvements and fixes for pulling repository data when getting commits and files, as well as a reworked approach to progress bars, which are now displayed at the GitHost level instead of the organization level.
- Handled GitLab GraphQL error for `get_repos_data()` method in `get_commits()` and `get_files()` by switching to the REST API engine (#690).
- Introduced caching of repositories data, as a number of functions for getting commits, files, issues etc. make use of this data (#693).
- Simplified code for handling complexity error when getting files from GitLab repositories (#695).
- Introduced changes to progress bars, most notably moved them to the GitHost level to display high-level progress (#687, #697).
- Made integration tests work for local testing (#595).
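
The GraphQL-to-REST switch mentioned in #690 is an instance of a general fallback pattern. Here is a minimal base R sketch of it, assuming the primary engine signals failure with an R error. The functions `fetch_repos_graphql()`, `fetch_repos_rest()` and `fetch_repos()` are hypothetical stand-ins, not GitStats functions, and no network requests are made.

```r
# Hypothetical stand-ins for the two API engines; neither is a GitStats function.
fetch_repos_graphql <- function(org) {
  stop("GraphQL query failed for organization: ", org)
}
fetch_repos_rest <- function(org) {
  data.frame(repo_name = c("repo-a", "repo-b"), organization = org)
}

# Try GraphQL first and switch to REST only if the GraphQL call errors.
fetch_repos <- function(org) {
  tryCatch(
    fetch_repos_graphql(org),
    error = function(e) {
      message("GraphQL failed, switching to REST: ", conditionMessage(e))
      fetch_repos_rest(org)
    }
  )
}

repos <- fetch_repos("my-org")
```

The caller receives a result either way and does not need to know which engine produced it.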
GitStats 2.3.6
A minor release with substantial performance improvements in searching repositories by code, plus new features such as filtering repository data by language and new columns in the `get_repos()` and `get_files()` outputs.
- Added `commit_sha` column to `get_repos()` and `get_files()` outputs (#546).
- Fixed `depth` parameter in `get_files()`: previously, values `0` and `1` returned the same output, i.e. files from `root`. Now it works as explained in the function documentation: value `0` returns files from the `root` and value `1` goes 1 level deeper (#663).
- Added `language` parameter to `get_repos()` function to pull only repositories with the defined language (#654). For the GitHub Search API it translates into a language query, whereas in other cases the repositories output is simply filtered by the given language.
- Introduced new `repo_fullpath` column, which replaced `fullname` in the output of `get_repos()`. The `fullname` column was flawed in the case of GitLab repositories, as it was created out of repository `name` with `organization`. In GitLab, the repository `name` (which is more of a user-friendly label) differs from the repository `path` (which is in the `URL`), unlike in GitHub, where the repository `name` is the repository `path` (#659). The `repo_name` column for GitLab repositories now mirrors the repository `path`.
- Enhanced `verbose` role to control displaying of response error statuses (#669).
- Improved code for searching code blobs, so `get_repos()` does not fail when the user passes text, e.g. with spaces, to the `with_code` parameter (#673).
- Standardized `repo_id` column in `get_repos()` and `get_files()` outputs for GitLab hosts: it now consists only of digits formatted as a `character` (#675).
- Optimized parsing search response to repositories response (#679).
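
The corrected `depth` semantics (#663) can be sketched in base R. The file paths and the helpers `path_depth()` and `files_up_to_depth()` are hypothetical, and the sketch assumes that `depth = 1` returns the root files plus those one level below; it does not reproduce how GitStats queries the API.

```r
# Hypothetical file paths in a repository tree.
paths <- c(
  "README.md", "DESCRIPTION", "R/utils.R",
  "R/api/github.R", "tests/testthat/test-api.R"
)

# Depth of a path = number of "/" separators (0 means the file sits in the root).
path_depth <- function(path) {
  lengths(regmatches(path, gregexpr("/", path, fixed = TRUE)))
}

# Keep files located at most `depth` levels below the root.
files_up_to_depth <- function(paths, depth) {
  paths[path_depth(paths) <= depth]
}

root_files <- files_up_to_depth(paths, depth = 0)
one_level  <- files_up_to_depth(paths, depth = 1)
```

Here `root_files` holds only `README.md` and `DESCRIPTION`, while `one_level` additionally includes `R/utils.R`, so the two values no longer give the same output.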
GitStats 2.3.5
GitStats 2.3.4
- Enabled possibility to set public hosts without specifying `organizations` and `repositories` scope (#640). This change was motivated by the need to enable the call of functions based on Search API on public hosts (such as `get_repos(with_code = {code})`), whose performance is acceptable on large public repositories. In the case of other slower functions, users will be informed of the estimated data retrieval time via a progress bar.
- Enabled getting repositories trees for whole hosts (#641).
- Removed `get_repos_with_R_packages()` function (#644), as it is not in line with `GitStats` logic: getting formatted git data from repositories.
- Standardized some column names (`repo_name` instead of `repository`, `githost` instead of `platform`) for some tables (`issues`, `files_content` and `repos`) (#632).