feat: AI auto-tagging #1232

Open
emanuelebeffa wants to merge 20 commits into sissbruecker:master from emanuelebeffa:master

@emanuelebeffa
Contributor

@emanuelebeffa emanuelebeffa commented Nov 27, 2025

Adds AI-powered auto-tagging using OpenAI-compatible APIs. Users can configure their API key, model, and base URL, and define a vocabulary of allowed tags. The "Refresh AI tags" action then generates tag suggestions for their bookmarks based on each bookmark's URL, title, and description.

The feature includes:

  • API key validation
  • Base URL support for any OpenAI-compatible API (e.g. Ollama)
  • Configurable tag vocabulary to constrain AI suggestions
  • Bulk tagging
  • Unit tests
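As a rough illustration of how a tag vocabulary can constrain AI suggestions, here is a minimal sketch that drops any suggested tag not present in the allowed list. The function name `filter_to_vocabulary` and the case-insensitive matching are assumptions made for this example; the PR's actual filtering code is not shown in this thread and may work differently.

```python
def filter_to_vocabulary(suggested: list[str], vocabulary: list[str]) -> list[str]:
    """Keep only suggested tags that appear in the allowed vocabulary.

    Hypothetical helper: matching is case-insensitive and ignores
    surrounding whitespace, which is an assumption of this sketch.
    """
    # Map the normalized form of each allowed tag to its canonical spelling.
    allowed = {tag.strip().lower(): tag.strip() for tag in vocabulary if tag.strip()}

    result: list[str] = []
    seen: set[str] = set()
    for tag in suggested:
        key = tag.strip().lower()
        # Skip tags outside the vocabulary and duplicates.
        if key in allowed and key not in seen:
            seen.add(key)
            result.append(allowed[key])
    return result
```

For example, with a vocabulary of `python`, `django`, and `linux`, a model suggestion such as `made-up` would be discarded, while `Python` would be normalized to `python`.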

Estimated cost analysis

Using OpenAI's gpt-5-nano model at $0.05 per 1M input tokens and $0.40 per 1M output tokens:

Average cost per bookmark: ~$0.000053

  • Input: ~250 tokens, cost: $0.0000125
  • Output: ~100 tokens, cost: $0.00004

Volume            Cost
100 bookmarks     ~$0.0053
1,000 bookmarks   ~$0.053
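The estimate can be reproduced with a few lines of arithmetic. The sketch below uses only the prices and token counts quoted above; the function names are made up for this example. Summing the input and output line items gives roughly $0.0000525 per bookmark.

```python
# Prices in USD per 1M tokens, as quoted above for gpt-5-nano.
INPUT_PRICE_PER_M = 0.05
OUTPUT_PRICE_PER_M = 0.40


def cost_per_bookmark(input_tokens: int = 250, output_tokens: int = 100) -> float:
    """Estimated cost in USD of tagging one bookmark."""
    input_cost = input_tokens * INPUT_PRICE_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + output_cost


def bulk_cost(bookmark_count: int) -> float:
    """Estimated cost in USD of tagging many bookmarks at the default token counts."""
    return bookmark_count * cost_per_bookmark()


if __name__ == "__main__":
    for count in (1, 100, 1_000):
        print(f"{count} bookmark(s): ~${bulk_cost(count):.7f}")
```

Actual token counts depend on the length of each bookmark's URL, title, and description, and on the size of the tag vocabulary, so these figures are only a ballpark.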

Possible future evolution

  • Batch processing optimizations for large bulk operations, further reducing costs
  • Prompt customization
  • List supported models

@emanuelebeffa emanuelebeffa mentioned this pull request Nov 27, 2025
@oliexe

oliexe commented Dec 2, 2025

This is awesome. LGTM, hopefully this will get merged soon.

@7jrxt42BxFZo4iAnN4CX

@sissbruecker we are waiting for this

@Eragos

Eragos commented Dec 8, 2025

@7jrxt42BxFZo4iAnN4CX sadly this only supports OpenAI itself, not a self-hosted Ollama or other AI API solutions… The difference is not that big, but it should be carefully thought through. My 2 cents.

@7jrxt42BxFZo4iAnN4CX

@Eragos Then why not expand it to any OpenAI-compatible API? I'd like to use OpenRouter.

@emanuelebeffa
Contributor Author

> @7jrxt42BxFZo4iAnN4CX sadly this only supports OpenAI itself, not a self-hosted Ollama or other AI API solutions… The difference is not that big, but it should be carefully thought through. My 2 cents.

I'm already working on Ollama integration. I was just waiting for this PR to be merged first and to check whether there were any blockers.

@7jrxt42BxFZo4iAnN4CX

@emanuelebeffa Great news, that would be great.
We are waiting for the merge.

@emanuelebeffa
Contributor Author

I've almost finished the Ollama integration. I'll be reopening the PR soon.

@emanuelebeffa emanuelebeffa reopened this Dec 9, 2025
@emanuelebeffa emanuelebeffa changed the title from "feat: AI auto tagging" to "feat: AI auto-tagging" Dec 9, 2025
@7jrxt42BxFZo4iAnN4CX

@emanuelebeffa

This review was generated with AI assistance

Review Summary

Thanks for this feature! The implementation is well-structured with good test coverage. I found a few issues worth addressing before merge.


🔴 Critical

Bulk "Refresh AI tags" doesn't work for bookmarks with existing tags

The documentation states:

"This will replace the existing tags with new AI-generated suggestions"

However, _auto_tag_bookmark_task skips bookmarks that already have tags:

if bookmark.tags.exists():
    logger.info(f"Skipping AI tagging - bookmark {bookmark_id} already has tags")
    return

Suggested fix: Add a force parameter to allow bulk refresh to override existing tags:

def auto_tag_bookmark(user: User, bookmark: Bookmark, force: bool = False):
    # ...
    _auto_tag_bookmark_task(bookmark.id, user.id, force)

@task()
def _auto_tag_bookmark_task(bookmark_id: int, user_id: int, force: bool = False):
    if bookmark.tags.exists() and not force:
        return
    # ...

Then in refresh_ai_tags, pass force=True.


🟠 Important

1. API key exposed in API response

ai_api_key is included in UserProfileSerializer fields without write_only=True. While only the profile owner can access it, this is still risky (XSS, logging, etc.).

Suggested fix:

ai_api_key = serializers.CharField(write_only=True, required=False, allow_blank=True)

2. No rate limiting for bulk AI operations

Selecting many bookmarks for bulk refresh could trigger hundreds of API calls, potentially exceeding provider rate limits or incurring unexpected costs.

Suggestion: Consider adding a limit or at least a warning in the UI.
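One possible shape for such a limit is sketched below: cap the number of bookmarks processed in a single bulk action and report how many were left out, so the UI can show a warning. The constant `MAX_BULK_AI_TAGGING`, its value of 50, and the helper names are hypothetical and not taken from the PR.

```python
# Hypothetical cap on bookmarks per bulk "Refresh AI tags" action.
MAX_BULK_AI_TAGGING = 50


def split_bulk_selection(
    bookmark_ids: list[int], limit: int = MAX_BULK_AI_TAGGING
) -> tuple[list[int], list[int]]:
    """Split a selection into the IDs to process now and the IDs to skip."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    return bookmark_ids[:limit], bookmark_ids[limit:]


def bulk_warning(skipped_count: int, limit: int = MAX_BULK_AI_TAGGING) -> str:
    """Build a warning message for the UI, or an empty string if nothing was skipped."""
    if skipped_count == 0:
        return ""
    return (
        f"Only the first {limit} bookmarks will be tagged; "
        f"{skipped_count} were skipped to limit API usage."
    )
```

A hard cap is the simplest option. Throttling requests over time would be an alternative if users need to tag their whole collection in one go.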


🟡 Minor / Nice-to-have

  • Timeout: Consider adding an explicit timeout to the OpenAI client (the default is 10 minutes, which is quite long)
  • Vocabulary size limit: Large tag lists increase API costs per request

Overall, great work! The Pydantic structured outputs, error handling with retry logic for 5xx errors, and hallucination filtering are all well done. 👍

@emanuelebeffa
Contributor Author

Thanks for the review! I've pushed fixes for "Bulk 'Refresh AI tags' doesn't work for bookmarks with existing tags", "API key exposed in API response", and "Timeout". I've also pushed UI warnings for "No rate limiting for bulk AI operations" and "Vocabulary size limit".

@7jrxt42BxFZo4iAnN4CX

Looks great! I've reviewed the latest changes, and they successfully address all the previous points.

The implementation looks solid and ready to go.
Waiting for @sissbruecker for the final review and merge. 🚀

@emanuelebeffa
Contributor Author

Hey @sissbruecker I just rebased the PR after the latest commits on master. Could you please take a look so we can merge it? Thanks!

@electricmessiah

To clarify: this will allow us to use a local AI model to auto-tag all of our existing bookmarks based on title, content, or both?

@emanuelebeffa
Contributor Author

The auto-tagging is based on URL, title and description.

@js94x

js94x commented Feb 3, 2026

Thanks, this is awesome. I built a Docker image locally to test your feature, but I first had to remove a Python type hint to make it work:

...
1.953   File "/etc/linkding/bookmarks/services/bookmarks.py", line 212, in <module>
1.953     def refresh_ai_tags(bookmark_ids: list[Union[int, str]], current_user: User):
1.953                                            ^^^^^
1.953 NameError: name 'Union' is not defined
------
...
ERROR: failed to solve: process "/bin/sh -c mkdir data &&     python manage.py collectstatic" did not complete successfully: exit code: 1

Did you experience the same error?
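For context, parameter annotations are evaluated when the `def` statement runs (unless `from __future__ import annotations` is in effect), so a missing `Union` import fails as soon as the module is loaded, which here happens during `collectstatic`. A minimal sketch of the kind of fix, assuming nothing about the real `bookmarks.py` beyond the annotation shown in the traceback; the function `normalize_bookmark_ids` is hypothetical:

```python
from typing import Union  # the import missing in the traceback above


def normalize_bookmark_ids(bookmark_ids: list[Union[int, str]]) -> list[int]:
    """Hypothetical helper: accept IDs as ints or numeric strings, return ints."""
    return [int(bookmark_id) for bookmark_id in bookmark_ids]
```

On Python 3.10 and later, the annotation could also be written as `list[int | str]`, which needs no import from `typing`.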

@emanuelebeffa
Contributor Author

Hi, when I merged the latest changes from origin/master, I forgot to build the Docker image. I’ve now pushed a commit that fixes the issue.
@sissbruecker do you have any updates on the review?
Thanks!

6 participants