chore(backend): Repurpose /contributors endpoint to return all contributors without ranking #121

@jonasyr

🎯 Issue Type

  • 🧹 Chore – Maintenance, refactoring, cleanup

📋 Description

Problem / Need

The current endpoint POST /contributors aggregates commit statistics and returns only the top 5 contributors with their commit count, lines added/deleted and a percentage of contributions. DeepWiki’s documentation confirms that this endpoint “groups commits by author name, calculates totals and sorts by commit count to return the top 5 contributors by default”. The route in repositoryRoutes.ts invokes gitService.getTopContributors() and caches the result under contributors:{repoUrl}.

This behaviour constitutes profiling under the DSGVO (GDPR), because it exposes individual commit statistics. To comply with privacy regulations, we must remove the ranking logic and avoid exposing per‑author statistics.

Expected Behavior

  • The endpoint /contributors returns a list of all unique contributor names for the given repository, either in no particular order or, optionally, sorted alphabetically.
  • The response contains only the contributor’s login (author name) – no commit counts, line statistics or contribution percentages.
  • The caching layer stores the entire list of names and still uses a consistent key (e.g., contributors:{repoUrl}:{filterOptions}). It must use the existing three‑tier caching logic built on the hybridLRU Cache and its helper functions, as the other endpoints do, so that caching stays unified.
  • All code related to lines added/deleted and contribution percentages is removed. The service should use a simplified method like gitService.getContributors() that returns all authors.
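
The expected behaviour above could be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: parseContributors is a hypothetical helper name, and it assumes the service can obtain one author name per line, for example from `git log --format=%aN` instead of `git log --numstat`.

```typescript
// Hypothetical helper (not part of the existing codebase).
// Input: raw text with one author name per line, e.g. the output of
// `git log --format=%aN`. Output: unique names, with no statistics.
function parseContributors(
  gitLogOutput: string,
  sortAlphabetically: boolean = false
): string[] {
  const unique = new Set<string>();
  for (const line of gitLogOutput.split(/\r?\n/)) {
    const name = line.trim();
    // Skip blank lines such as the trailing newline of the git output.
    if (name.length > 0) {
      unique.add(name);
    }
  }
  // A Set keeps first-seen order, so unsorted output follows commit order.
  const names = Array.from(unique);
  return sortAlphabetically ? names.sort((a, b) => a.localeCompare(b)) : names;
}
```

Because no counts are kept, there is nothing left to rank by; sorting alphabetically is the only ordering this sketch offers.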

Current Behavior

  • gitService.getTopContributors() parses git log --numstat, aggregates commits, sums lines added and deleted, sorts by commit count and slices the top five results.
  • The endpoint returns objects containing login, commitCount, linesAdded, linesDeleted and contributionPercentage.
  • The response is limited to five contributors and exposes per‑author statistics that amount to profiling under the GDPR.

🔄 Steps to Reproduce

  1. Send a POST request to /contributors for a repository with more than five commit authors.
  2. Observe that the response contains only the top 5 authors along with commit counts and line statistics.
  3. Note that statistics are exposed and the list is incomplete.

🎨 Mockups / Code Example

Before (existing response):

[
  {
    "login": "Alice",
    "commitCount": 21,
    "linesAdded": 1522,
    "linesDeleted": 230,
    "contributionPercentage": 41
  },
  { ... }
]

After (expected response):

{
  "contributors": ["Alice", "Bob", "Charlie", ""]
}
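
For consumers of the endpoint, the new shape could be described and checked along these lines. This sketch follows the expected response example (an object with a contributors array of strings); the acceptance criteria also mention a Contributor type with a login field, so the final type may differ. ContributorsResponse and isContributorsResponse are hypothetical names.

```typescript
// Assumed response shape, based on the expected response example.
interface ContributorsResponse {
  contributors: string[];
}

// Hypothetical runtime check a front end could apply to the parsed JSON body.
function isContributorsResponse(value: unknown): value is ContributorsResponse {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = (value as { contributors?: unknown }).contributors;
  // Every entry must be a plain name; objects carrying statistics are rejected.
  return Array.isArray(candidate) && candidate.every((n) => typeof n === "string");
}
```

A guard like this would reject the old array-of-objects payload, which may help surface front-end code that still expects statistics.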

🧪 Acceptance Criteria

  • The /contributors endpoint returns all unique contributor names without ranking or statistics.
  • No fields such as commitCount, linesAdded, linesDeleted or contributionPercentage appear in the response.
  • A new method like gitService.getContributors() aggregates unique authors and replaces getTopContributors().
  • The caching key remains consistent and stores only the list of names.
  • Type definitions in packages/shared-types are updated to reflect a Contributor with only a login field.
  • Unit tests and integration tests are updated to verify the new behaviour.
  • Manual testing shows that the endpoint returns complete contributor lists for repositories of varying sizes.
  • Documentation (including DeepWiki) reflects the new endpoint behaviour and removal of ranking logic.
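
The criterion that no statistics fields appear in the response could be verified in a test with a small recursive scan. The sketch below is only a suggestion; findForbiddenFields is a hypothetical helper, and the field list is taken from the acceptance criteria above.

```typescript
// Field names that must no longer appear anywhere in the response.
const FORBIDDEN_FIELDS = new Set<string>([
  "commitCount",
  "linesAdded",
  "linesDeleted",
  "contributionPercentage",
]);

// Hypothetical test helper: walks arrays and objects recursively and
// returns every forbidden key it encounters (each key at most once).
function findForbiddenFields(
  payload: unknown,
  found: Set<string> = new Set<string>()
): string[] {
  if (Array.isArray(payload)) {
    payload.forEach((item) => findForbiddenFields(item, found));
  } else if (typeof payload === "object" && payload !== null) {
    const record = payload as Record<string, unknown>;
    for (const key of Object.keys(record)) {
      if (FORBIDDEN_FIELDS.has(key)) {
        found.add(key);
      }
      findForbiddenFields(record[key], found);
    }
  }
  return Array.from(found);
}
```

A route test could then assert that findForbiddenFields(responseBody) returns an empty array.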

🛠 Technical Details

Affected Files / Components

  • apps/backend/src/routes/repositoryRoutes.ts – Modify the /contributors route to call the new getContributors() method and return only names.
  • apps/backend/src/services/gitService.ts – Refactor getTopContributors() into getContributors(), removing numstat parsing and ranking.
  • packages/shared-types – Simplify the contributor interface to { login: string }.
  • repositoryCache – Adjust caching logic to store the full list.
  • Tests – Update or add tests in services/gitService.unit.test.ts and route tests.
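
One way to keep the cache key consistent, as required above, is to serialize filter options in a stable order. The following sketch assumes a flat options object; FilterOptions and buildContributorsCacheKey are hypothetical names, and the code does not touch the existing hybridLRU Cache helpers, which should remain the actual storage path.

```typescript
// Hypothetical filter options; the issue does not specify their fields.
type FilterOptions = Record<string, string | number | boolean | undefined>;

// Builds a key of the form contributors:{repoUrl}:{filterOptions}.
function buildContributorsCacheKey(
  repoUrl: string,
  filterOptions: FilterOptions = {}
): string {
  // Sort the option names so that equivalent option objects map to the
  // same cache entry regardless of property order.
  const serialized = Object.keys(filterOptions)
    .filter((key) => filterOptions[key] !== undefined)
    .sort()
    .map((key) => `${key}=${String(filterOptions[key])}`)
    .join("&");
  return `contributors:${repoUrl}:${serialized}`;
}
```

With this sketch, { b: 2, a: "x" } and { a: "x", b: 2 } both yield the suffix a=x&b=2, so the same cached list is reused.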

Dependencies

  • No external dependencies, but ensure no other features rely on the old ContributorStat fields.

Breaking Changes

  • Yes. This change modifies the API response shape for /contributors and will affect the front end and any other consumers that expect statistics.

🏷 Categorization

Scope

  • scope:backend – Backend/API changes only.

Priority

  • prio:medium – Important for GDPR compliance but not critical for uptime.

Effort

  • effort:small – Estimated < 2 hours to implement and update tests.

🌍 Environment

Not applicable (endpoint logic only).

🔗 Related Issues/PRs

None.

📝 Additional Notes

  • Front‑end code may need adjustments if it expects commit statistics; update accordingly.
  • Removing profiling ensures compliance with DSGVO and simplifies contributor aggregation.

✅ Checklist for Reviewer

  • Issue description and expected behaviour are clear.
  • Acceptance criteria cover all necessary conditions.
  • Code changes are feasible and align with architecture.
