Index to search #1276
qbey
left a comment
First review, I know work is still ongoing and I did not read all the tests... :)
Search in Docs relies on an external project like "La Suite Find". We need to declare a common external network in order to connect to the search app and index our documents.
We need content in our demo documents so that we can test indexing.
Add an indexer that loops over the documents in the database, formats them as JSON objects and indexes them in the remote "Find" micro-service.
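A minimal sketch of that loop, not the code from this PR: it assumes a hypothetical `Document` stand-in, assumed field names (`id`, `title`, `content`), and a `push` callable standing in for the HTTP call to Find.

```python
import json
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass
class Document:
    """Hypothetical stand-in for the Docs document model."""
    id: str
    title: str
    content: str


def serialize(doc: Document) -> dict:
    """Format one document as a JSON-serializable dict (field names are assumed)."""
    return {"id": doc.id, "title": doc.title, "content": doc.content}


def batches(items: Iterable[dict], size: int) -> Iterator[list]:
    """Yield consecutive lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def index_all(documents: Iterable[Document], push: Callable[[str], None],
              batch_size: int = 2) -> int:
    """Serialize every document, hand it to `push` in batches, return the count."""
    count = 0
    for batch in batches((serialize(d) for d in documents), batch_size):
        push(json.dumps(batch))  # `push` would be the request to the Find API
        count += len(batch)
    return count
```

Batching keeps the number of remote calls proportional to the number of batches rather than the number of documents; the batch size here is arbitrary.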
On document content or permission changes, start a celery job that will call the indexation API of the app "Find".
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Signed-off-by: Fabre Florian <ffabre@hybird.org>
New API view that calls the indexed documents search view (resource server) of app "Find".
Signed-off-by: Fabre Florian <ffabre@hybird.org>
New SEARCH_INDEXER_CLASS setting to define the indexer service class.
Raise ImproperlyConfigured errors instead of RuntimeError in the index service.
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Filter deleted documents from visited ones.
Set the default ordering of the Find API search call (-updated_at).
BaseDocumentIndexer.search now returns a list of document ids instead of models.
Do not call the indexer in signals when SEARCH_INDEXER_CLASS is not defined or not properly configured.
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Only documents with neither title nor content are ignored by the indexer.
Add SEARCH_INDEXER_COUNTDOWN as a configurable setting.
Make the search backend creation simpler (only 'get_document_indexer' now).
Allow indexation of deleted documents.
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Add bin/fernetkey that generates a key for the OIDC_STORE_REFRESH_TOKEN_KEY setting.
Signed-off-by: Fabre Florian <ffabre@hybird.org>
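The script itself is not shown here. As an illustration only, a Fernet key is 32 random bytes encoded as URL-safe base64, so a standard-library sketch of such a generator could look like the following (the function name is hypothetical; the real script may simply call `Fernet.generate_key()` from the `cryptography` package).

```python
import base64
import os


def generate_fernet_key() -> str:
    """Return 32 random bytes as URL-safe base64, the format Fernet keys use."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


if __name__ == "__main__":
    # Print the key so it can be copied into OIDC_STORE_REFRESH_TOKEN_KEY.
    print(generate_fernet_key())
```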
Add nginx with 'nginx' alias to the 'lasuite-net' network (keycloak calls)
Add celery-dev to the 'lasuite-net' network (Find API calls in jobs)
Set app-dev alias as 'impress' in the 'lasuite-net' network
Add indexer configuration in common settings
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Rename FindDocumentIndexer as SearchIndexer
Rename FindDocumentSerializer as SearchDocumentSerializer
Rename package core.tasks.find as core.task.search
Remove logs on http errors in SearchIndexer
Factorise some code in search API view.
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Replace the indexer_debounce_lock|release functions with indexer_throttle_acquire().
Instead of a mutex-like mechanism, simply set a flag in the cache for a given amount of time; while the flag is set, no other task is created.
Signed-off-by: Fabre Florian <ffabre@hybird.org>
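To illustrate the flag-in-cache idea, here is a self-contained sketch. Only the name `indexer_throttle_acquire` comes from the commit message; its signature, the cache key format and the `FlagCache` class are assumptions. In a Django project the same "set only if absent" step would typically rely on the cache backend's `add()` method.

```python
import time


class FlagCache:
    """Tiny in-memory stand-in for a cache with expiring keys (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._expiry = {}
        self._clock = clock

    def add(self, key, timeout):
        """Set the key only if it is absent or expired; return True when set."""
        now = self._clock()
        expires = self._expiry.get(key)
        if expires is not None and expires > now:
            return False
        self._expiry[key] = now + timeout
        return True


def indexer_throttle_acquire(cache, document_id, timeout=1.0):
    """Return True if the caller may schedule an indexation task.

    The first caller sets a flag for `timeout` seconds; later callers get
    False until the flag expires. Nothing has to be released afterwards.
    """
    return cache.add(f"indexer-throttle-{document_id}", timeout)
```

Unlike a lock, the flag is never released explicitly: it simply expires, so a crashed task cannot leave the throttle stuck.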
Keep the ordering by score from the Find API on search/ results; the fallback search still uses "-updated_at" ordering as default.
Refactor pagination to work with a list instead of a queryset.
Signed-off-by: Fabre Florian <ffabre@hybird.org>
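Paginating a plain list, rather than a queryset, preserves whatever order the search backend returned. A rough sketch of the idea, with a hypothetical `paginate` helper and an assumed response shape:

```python
def paginate(items, page, page_size):
    """Slice an already-ordered list into one page (pages are 1-based)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    start = (page - 1) * page_size
    # Slicing keeps the incoming order, e.g. the score order from the search API.
    return {"count": len(items), "results": items[start:start + page_size]}
```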
Set SEARCH_INDEXER_CLASS=None as the default configuration for dev.
Rename the docker network 'lasuite-net' as 'lasuite' to match the Drive configuration.
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Add documentation for env & Find+Docs configuration in dev mode
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Reduce the number of Find API calls by grouping all the latest changes for indexation: send all the documents updated or deleted since the task was triggered.
Signed-off-by: Fabre Florian <ffabre@hybird.org>
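As a sketch of that grouping, assuming each document carries `updated_at` and `deleted_at` timestamps (the `Doc` class and `changed_since` helper are hypothetical), the task can select everything that changed since it was triggered and send it in a single call:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Doc:
    """Hypothetical stand-in for a document row with change timestamps."""
    id: str
    updated_at: datetime
    deleted_at: Optional[datetime] = None


def changed_since(documents, since):
    """Return the documents updated or deleted at or after `since`."""
    return [
        d for d in documents
        if d.updated_at >= since
        or (d.deleted_at is not None and d.deleted_at >= since)
    ]
```

The task would record the trigger time, wait for its countdown, then index the result of `changed_since(...)` in one request instead of one request per change.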
Purpose
We want to add full-text search to Docs, with semantic search to follow in a second phase.
The goal is to enable efficient and scalable search across document content by pushing relevant data to a dedicated search backend, such as OpenSearch. The backend should be pluggable.
Proposal
Fixes #322