Description
Feature and its Use Cases
I explored two possible approaches for implementing a duplicate issue detector that labels new issues as “possible duplicate” with an X% similarity score.
This issue is to compare both approaches before finalizing implementation at the org level.
Problem Statement
- **Currently, contributors must manually search through existing issues before opening a new one. This is time-consuming and still leads to duplicate issues.**
- We need an automated way to:
  - Detect similar issues when a new issue is opened
  - Suggest possible duplicates (without auto-closing)
  - Label them clearly
- The system should assist, not enforce.
### Approach 1: Using CodeRabbit (Issue Enrichment)
- Org template: Template-Repo/.coderabbit.yaml
How it Works
- Configure duplicate detection via .coderabbit.yaml
- CodeRabbit enriches issues automatically
- Can be rolled out across org via template repo
Pros
- Centralized configuration (via template repo)
- Org-wide scalability
- No need to maintain custom workflows per repo
- AI-powered enrichment (context-aware analysis)
- Cleaner long-term maintenance
Cons
- May not allow fine-grained control over scoring logic
- Requires org-level alignment before rollout
### Approach 2: Using GitHub Action Bot (Custom Workflow)
- Implemented example in PictoPy testing issue.
How it Works
- Custom GitHub Action triggers on issues: opened
- Compares issue title + body with existing issues
- Calculates similarity score
- Comments with similar issues
- Labels as possible duplicate
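The compare-and-score steps above could be sketched roughly as follows. This is a minimal illustration, not the workflow used in the PictoPy test: it assumes issues are already fetched as dictionaries with `number`, `title`, and `body` keys, and it uses Jaccard similarity over word tokens as a stand-in for whatever scoring the real bot applies. The function names and the 60% default threshold are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity_percent(a, b):
    """Jaccard similarity of the two token sets, as a percentage (0-100)."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a or not tokens_b:
        return 0.0
    overlap = len(tokens_a & tokens_b)
    union = len(tokens_a | tokens_b)
    return 100.0 * overlap / union


def find_possible_duplicates(new_issue, existing_issues, threshold=60.0):
    """Return (number, title, score) for issues at or above threshold, best first."""
    # Title and body are compared together, as described in the steps above.
    new_text = f"{new_issue['title']} {new_issue['body']}"
    matches = []
    for issue in existing_issues:
        old_text = f"{issue['title']} {issue['body']}"
        score = similarity_percent(new_text, old_text)
        if score >= threshold:
            matches.append((issue["number"], issue["title"], round(score, 1)))
    return sorted(matches, key=lambda m: m[2], reverse=True)
```

Token overlap is cheap and transparent, which suits the "full control over logic" goal, but it misses paraphrases; an embedding-based score could replace `similarity_percent` without changing the rest.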
Pros
- Full control over logic and scoring
- Customizable similarity threshold
- Can mark as “X% duplicate”
- No dependency on third-party enrichment
- Transparent workflow
Cons
- Needs per-repo workflow unless added to template repo
- Maintenance burden on org
- AI capability depends on implementation quality
- Might require API rate handling
- Less intelligent compared to AI-native enrichment (unless enhanced)
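To make the "API rate handling" concern concrete, one common pattern is retrying with exponential backoff. The sketch below is illustrative only: `RateLimitError` and `call_with_backoff` are hypothetical names, and `fetch` stands for whatever function the workflow would use to list existing issues.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error raised by fetch() when the API reports a rate limit."""


def call_with_backoff(fetch, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each rate-limit error."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimitError:
            # Give up and re-raise once the final attempt has failed.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.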
Additional Context
This proposal covers the duplicate issue detector that labels issues as possible duplicate, using a GitHub Action bot.
- How: Use a GitHub bot and a simple GitHub Action to automatically check new issues and show similar existing ones.
- Impact: Duplicate issues get clearly marked, so maintainers don’t have to spend extra time checking or replying again.
- Notes: The system will only suggest duplicates and will not auto-close any issue.
- Note: It marks issues as X% duplicate.
- For more details, check out this testing issue: BUG: A Time-of-Check to Time-of-Use (TOCTOU) race condition exists in sync-microservice/app/utils/watcher.py (aniket866/PictoPy#5)
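As a rough illustration of the "X% duplicate" output, the comment and label could be built along these lines. The wording of the comment, the function names, and the `(number, title, score)` tuple shape are assumptions made for this sketch, not taken from the test issue.

```python
def build_duplicate_comment(matches):
    """Render a Markdown comment listing each candidate with its similarity score."""
    if not matches:
        return ""
    lines = ["Possible duplicates found (suggestion only, nothing is closed automatically):"]
    for number, title, score in matches:
        lines.append(f"- #{number} {title} ({score:.0f}% similar)")
    return "\n".join(lines)


def duplicate_label(matches):
    """Return the label to apply, or None when there are no candidates."""
    return "possible duplicate" if matches else None
```

Returning an empty comment and no label when nothing matches keeps the bot silent on issues that look unique.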
Code of Conduct
- I have joined the Discord server and will post updates there
- I have searched existing issues to avoid duplicates