Releases: r-world-devs/GitStats

GitStats 2.5.0

02 Apr 12:14
190c358

This release introduces external storage backends (PostgreSQL and SQLite) for persisting pulled data across sessions, and optional parallel processing via mirai for faster API calls. It also brings several performance improvements, including faster file tree retrieval and optimized GitLab repository queries.

  • Added add_languages parameter to get_repos() (TRUE by default). When set to FALSE, languages data is excluded from the output and the GitLab REST languages API calls are skipped, speeding up the process.
  • Fixed get_repos() returning NA for commit_sha on archived GitLab projects. When the GraphQL API returns null for lastCommit, a REST Branches API fallback can now retrieve the SHA. Use fill_empty_sha = TRUE in get_repos() to enable this (#746).
  • Added optional parallel processing for API calls via the mirai package. Use set_parallel() to enable concurrent data fetching across repositories and organizations (#736).
  • Cached set_owner_type() results to avoid redundant GraphQL calls when multiple get_* functions are used in the same session (#738).
  • Replaced per-directory GraphQL file tree traversal with a single-call REST recursive tree API for get_repos_trees(), substantially improving the speed of retrieving repository file trees (#740).
  • Added set_postgres_storage(), set_sqlite_storage(), and set_local_storage() to configure external storage backends. PostgreSQL (via RPostgres/DBI) and SQLite (via RSQLite/DBI) are supported for persisting data in a database. Metadata (R classes, attributes) is preserved via a _metadata table (#602).
  • Added remove_from_storage() to remove a named table from the active storage backend (#747).
  • Added remove_postgres_storage() and remove_sqlite_storage() to fully remove a database storage backend — the PostgreSQL variant drops the GitStats schema, the SQLite variant deletes the database file — and revert to local storage (#759).
  • Added get_storage_metadata() to retrieve metadata (R classes, custom attributes, column types) for a stored table (#748).
  • Fixed slow get_repos() for GitLab when specific repos are set. Previously, the repos_by_user GraphQL query searched the entire GitLab instance; now repos are queried directly by fullPath (#750).
  • Sped up vignette generation (#504, @marcinkowskak).

GitStats 2.4.0

17 Mar 08:48
9ec104e

The newest minor release includes new functions for retrieving pull requests (get_pull_requests()) and their statistics (get_pull_requests_stats()), prettified repository URL outputs in get_repos_urls(), along with refactoring and code cleanup.

  • Prettified the output of repository URLs in the get_repos_urls() function (#710).
  • Added get_pull_requests() function for getting information about pull requests (#722).
  • Cleaned up unnecessary comments (#723).
  • Added get_pull_requests_stats() function (#726).
  • Reorganized fixtures and test helpers (#727).
  • Prettified messages with new icons (#148, #361).
  • Refactored code for getting repositories with code to make it more readable (#612).

GitStats 2.3.9

12 Jan 07:52

This patch release covers fixes for the get_files() function and updates to the until parameter in the get_release_logs(), get_commits() and get_issues() functions.

  • Handled getting multiple files for GitHub when some of these files do not exist in the scanned repositories (#713).
  • Skipped fetching file contents when pulling files by pattern results in an empty file structure (#711).
  • Updated the logic for the until parameter in get_release_logs(), get_commits() and get_issues(). These functions now include records from the specified date (e.g., passing "2025-12-08" to until will include data from December 8th, 2025), whereas previously they only fetched data up to (but not including) that date (#718).
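
The change to until described above moves from an exclusive to an inclusive upper bound. The following is a minimal base R sketch of that difference only; the dates and variable names are hypothetical, and it does not reproduce how GitStats filters records internally.

```r
# Hypothetical record dates for illustration; not data returned by GitStats.
record_dates <- as.Date(c("2025-12-07", "2025-12-08", "2025-12-09"))
until <- as.Date("2025-12-08")

# Previous behavior as described in the note: exclusive upper bound,
# so records dated on the `until` day itself are dropped.
before_fix <- record_dates[record_dates < until]

# Current behavior: inclusive upper bound, so December 8th is kept.
after_fix <- record_dates[record_dates <= until]

print(before_fix)
print(after_fix)
```

In practice this means that a call such as get_commits() with until = "2025-12-08" should no longer require passing the following day to capture December 8th.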

GitStats 2.3.8

08 Dec 07:58
b2cef30

This patch introduces the show_hosts() function to display host information, addresses empty GitLab project values in file tables, and updates verbose logic to default to FALSE while retaining critical messages.

  • Added show_hosts() function to print info on hosts set with set_*_host() functions (#672).
  • Handled empty GitLab project values when generating a files table (#702).
  • Changed verbose logic: user functions now have verbose set to FALSE by default. The most important messages are still printed (e.g. the time span of the whole process) (#704).

GitStats 2.3.7

16 Oct 07:24
dede384

Patch release with improvements and fixes to pulling repository data when getting commits and files. Progress bars have also been reworked and are now displayed at the GitHost level instead of the organization level.

  • Handled the GitLab GraphQL error for the get_repos_data() method in get_commits() and get_files() by switching to the REST API engine (#690).
  • Introduced caching of repository data, since several functions for getting commits, files, issues, etc. make use of this data (#693).
  • Simplified code for handling the complexity error when getting files from GitLab repositories (#695).
  • Introduced changes to progress bars, most notably moved them to the GitHost level to display high-level progress (#687, #697).
  • Made integration tests work for local testing (#595).

GitStats 2.3.6

17 Sep 07:46

A minor release with substantial performance improvements in searching repositories by code, new features such as filtering repository data by language, and new columns in the get_repos() and get_files() outputs.

  • Added commit_sha column to get_repos() and get_files() outputs (#546).
  • Fixed the depth parameter in get_files(). Previously, the values 0 and 1 returned the same output, i.e. files from the root. Now it works as explained in the function documentation: 0 returns files from the root and 1 goes one level deeper (#663).
  • Added language parameter to get_repos() function to pull only repositories with the defined language (#654). For the GitHub Search API it translates into a language query, whereas in other cases the repositories output is simply filtered by the given language.
  • Introduced a new repo_fullpath column, which replaced fullname in the output of get_repos(). The fullname column was flawed for GitLab repositories, as it was built from the repository name and the organization. In GitLab, the repository name (which is more of a user-friendly label) can differ from the repository path (which is in the URL), unlike in GitHub, where the repository name is the repository path (#659). The repo_name column for GitLab repositories now mirrors the repository path.
  • Extended the role of verbose to control the display of response error statuses (#669).
  • Improved code for searching code blobs, so get_repos() does not fail when the user passes text (e.g. with spaces) to the with_code parameter (#673).
  • Standardized the repo_id column in get_repos() and get_files() outputs for GitLab hosts: it now consists only of digits formatted as a character (#675).
  • Optimized parsing of the search response into the repositories response (#679).
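
The corrected depth semantics in get_files() can be pictured with a small base R sketch. It is an illustration only: the file paths and the files_up_to_depth() helper are hypothetical, and it assumes that depth is cumulative (depth 1 returns root files plus files one directory down), which the note above suggests but does not state outright.

```r
# Hypothetical file paths for illustration; not output from GitStats.
paths <- c("README.md", "R/get_files.R", "tests/testthat/test-files.R")

# Hypothetical helper: keep files whose directory nesting is at most `depth`.
# The nesting level of a path is the number of "/" separators it contains.
files_up_to_depth <- function(paths, depth) {
  nesting <- lengths(regmatches(paths, gregexpr("/", paths, fixed = TRUE)))
  paths[nesting <= depth]
}

root_only <- files_up_to_depth(paths, 0)  # files in the repository root
one_level <- files_up_to_depth(paths, 1)  # root plus one directory level

print(root_only)
print(one_level)
```

Before the fix, both calls would have corresponded to the root_only result.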

GitStats 2.3.5

19 Aug 05:40

  • Unified approach to handling GraphQL errors in GitHub and GitLab (#622).
  • Added graceful handling for HTTP 404 error (#653).

GitStats 2.3.4

08 Jul 13:33

  • Enabled setting public hosts without specifying an organization or repository scope (#640). This change was motivated by the need to call functions based on the Search API on public hosts (such as get_repos(with_code = {code})), whose performance is acceptable on large public repositories. For other, slower functions, users are informed of the estimated data retrieval time via a progress bar.
  • Enabled getting repository trees for whole hosts (#641).
  • Removed get_repos_with_R_packages() function (#644), as it is not in line with GitStats logic: getting formatted git data from repositories.
  • Standardized some column names (repo_name instead of repository, githost instead of platform) for some tables (issues, files_content and repos) (#632).

GitStats 2.3.3

03 Jun 08:06

  • Handled connection errors to GraphQL (502 Bad Gateway) that occur while pulling commits (#636).
  • Fixed pulling issues when an issue has no author (#637).
  • Fixed pulling GitLab commits when org is set as a scope (#639).

GitStats 2.3.2

20 May 09:21

  • Added get_repos_trees() function (#614).
  • Fixed errors when pulling data on repositories with code (#634).