-
Notifications
You must be signed in to change notification settings - Fork 0
Index to search #2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Search in Docs relies on an external project like "La Suite Find". We need to declare a common external network in order to connect to the search app and index our documents.
We need to content in our demo documents so that we can test indexing.
Add indexer that loops across documents in the database, formats them as json objects and indexes them in the remote "Find" mico-service.
On document content or permission changes, start a celery job that will call the indexation API of the app "Find". Signed-off-by: Fabre Florian <ffabre@hybird.org>
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Signed-off-by: Fabre Florian <ffabre@hybird.org>
New API view that calls the indexed documents search view (resource server) of app "Find". Signed-off-by: Fabre Florian <ffabre@hybird.org>
New SEARCH_INDEXER_CLASS setting to define the indexer service class. Raise ImpoperlyConfigured errors instead of RuntimeError in index service. Signed-off-by: Fabre Florian <ffabre@hybird.org>
Signed-off-by: Fabre Florian <ffabre@hybird.org>
Filter deleted documents from visited ones. Set default ordering to the Find API search call (-updated_at) BaseDocumentIndexer.search now returns a list of document ids instead of models. Do not call the indexer in signals when SEARCH_INDEXER_CLASS is not defined or properly configured. Signed-off-by: Fabre Florian <ffabre@hybird.org>
Only documents without title and content are ignored by indexer.
Add SEARCH_INDEXER_COUNTDOWN as configurable setting. Make the search backend creation simplier (only 'get_document_indexer' now). Allow indexation of deleted documents. Signed-off-by: Fabre Florian <ffabre@hybird.org>
Generate a fernet key for the OIDC_STORE_REFRESH_TOKEN_KEY in development settings if not set. Signed-off-by: Fabre Florian <ffabre@hybird.org>
Add nginx with 'nginx' alias to the 'lasuite-net' network (keycloak calls) Add celery-dev to the 'lasuite-net' network (Find API calls in jobs) Set app-dev alias as 'impress' in the 'lasuite-net' network Add indexer configuration in common settings Signed-off-by: Fabre Florian <ffabre@hybird.org>
Set a valid Ferney for OIDC_STORE_REFRESH_TOKEN_KEY in env.d/development/common Add .gitguardian.yaml configuration to ignore this key. Signed-off-by: Fabre Florian <ffabre@hybird.org>
Rename FindDocumentIndexer as SearchIndexer Rename FindDocumentSerializer as SearchDocumentSerializer Rename package core.tasks.find as core.task.search Remove logs on http errors in SearchIndexer Factorise some code in search API view. Signed-off-by: Fabre Florian <ffabre@hybird.org>
Replace indexer_debounce_lock|release functions by indexer_throttle_acquire() Instead of mutex-like mechanism, simply set a flag in cache for an amount of time that block any other task creation. Signed-off-by: Fabre Florian <ffabre@hybird.org>
Keep ordering by score from Find API on search/ results and fallback search still uses "-update_at" ordering as default Refactor pagination to work with a list instead of a queryset Signed-off-by: Fabre Florian <ffabre@hybird.org>
| @@ -0,0 +1,5 @@ | |||
| secret: | |||
| ignored_matches: | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ignored value contains a 44‐character key; storing static secrets in configuration is risky and may be better done via environment variables.
| env.d/terraform | ||
|
|
||
| # Docker | ||
| compose.override.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding "docker/auth/*.local" could inadvertently ignore local authentication files for all Docker builds; confirm this is intended.
| - 🍱(frontend) Fonts GDPR compliants #1453 | ||
| - ♻️(service-worker) improve SW registration and update handling #1473 | ||
| - ✨(backend) add async indexation of documents on save (or access save) #1276 | ||
| - ✨(backend) add debounce mechanism to limit indexation jobs #1276 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same ticket number suitenumerique#1276 appears twice in the added feature list; duplicates can confuse readers and should be consolidated.
| .PHONY: demo | ||
|
|
||
| index: ## index all documents to remote search | ||
| @$(MANAGE) index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new index target references $(MANAGE) index but no phony dependency is added for 'demo', potentially causing unintended dependency ordering.
| @@ -0,0 +1,6 @@ | |||
| #!/usr/bin/env bash | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The generated key is output with an extra newline; while harmless, it may cause downstream parsing errors if the consumer expects a plain key.
| ports: | ||
| - "8071:8000" | ||
| networks: | ||
| default: {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Network alias "impress" defined for the backend service may conflict with existing aliases; verify that no other service uses the same alias to avoid name clashes.
| OIDC_AUTH_REQUEST_EXTRA_PARAMS={"acr_values": "eidas1"} | ||
|
|
||
| # Store OIDC tokens in the session | ||
| OIDC_STORE_ACCESS_TOKEN = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The placeholder value for OIDC_STORE_REFRESH_TOKEN_KEY looks like a toy 44‐character string; replace it with a real Fernet key before deployment.
| default=enums.MoveNodePositionChoices.LAST_CHILD, | ||
| ) | ||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The string literal for the page field ends with an unmatched single quote, causing a syntax error that will prevent the module from loading.
| from django.core.exceptions import ValidationError | ||
| from django.core.files.storage import default_storage | ||
| from django.core.paginator import InvalidPage, Paginator | ||
| from django.core.validators import URLValidator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
search_serializer_class is set to ListDocumentSerializer, but the new SearchDocumentSerializer was added; this mismatch will make search responses use the wrong serializer.
| @@ -0,0 +1,5 @@ | |||
| secret: | |||
| ignored_matches: | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ignored value contains a 44‐character key; storing static secrets in configuration is risky and may be better done via environment variables.
| env.d/terraform | ||
|
|
||
| # Docker | ||
| compose.override.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding "docker/auth/*.local" could inadvertently ignore local authentication files for all Docker builds; confirm this is intended.
| - 🍱(frontend) Fonts GDPR compliants #1453 | ||
| - ♻️(service-worker) improve SW registration and update handling #1473 | ||
| - ✨(backend) add async indexation of documents on save (or access save) #1276 | ||
| - ✨(backend) add debounce mechanism to limit indexation jobs #1276 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same ticket number suitenumerique#1276 appears twice in the added feature list; duplicates can confuse readers and should be consolidated.
| .PHONY: demo | ||
|
|
||
| index: ## index all documents to remote search | ||
| @$(MANAGE) index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new index target references $(MANAGE) index but no phony dependency is added for 'demo', potentially causing unintended dependency ordering.
| @@ -0,0 +1,6 @@ | |||
| #!/usr/bin/env bash | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The generated key is output with an extra newline; while harmless, it may cause downstream parsing errors if the consumer expects a plain key.
| ports: | ||
| - "8071:8000" | ||
| networks: | ||
| default: {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Network alias "impress" defined for the backend service may conflict with existing aliases; verify that no other service uses the same alias to avoid name clashes.
| OIDC_AUTH_REQUEST_EXTRA_PARAMS={"acr_values": "eidas1"} | ||
|
|
||
| # Store OIDC tokens in the session | ||
| OIDC_STORE_ACCESS_TOKEN = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The placeholder value for OIDC_STORE_REFRESH_TOKEN_KEY looks like a toy 44‐character string; replace it with a real Fernet key before deployment.
| default=enums.MoveNodePositionChoices.LAST_CHILD, | ||
| ) | ||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The string literal for the page field ends with an unmatched single quote, causing a syntax error that will prevent the module from loading.
| from django.core.exceptions import ValidationError | ||
| from django.core.files.storage import default_storage | ||
| from django.core.paginator import InvalidPage, Paginator | ||
| from django.core.validators import URLValidator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
search_serializer_class is set to ListDocumentSerializer, but the new SearchDocumentSerializer was added; this mismatch will make search responses use the wrong serializer.
| file_key = self.file_key | ||
| bytes_content = self._content.encode("utf-8") | ||
| self.save_content(self._content) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The save method is not properly handling the case where the document's content is empty.
| @@ -0,0 +1,295 @@ | |||
| """Document search index management utilities and indexers""" | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The get_document_indexer function is not properly handling the case where the indexer class is not configured.
| @@ -0,0 +1,65 @@ | |||
| """ | |||
| Unit test for `index` command. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test_index command is not properly handling the case where the indexer is not configured.
| @@ -0,0 +1,429 @@ | |||
| """ | |||
| Tests for Documents API endpoint in impress's core app: list | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test_api_documents_search endpoint is not properly handling the case where the indexer is not configured.
| import smtplib | ||
| from logging import Logger | ||
| from operator import itemgetter | ||
| from unittest import mock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test_models_documents test case is not properly handling the case where the document's content is empty.
| @@ -0,0 +1,540 @@ | |||
| """Tests for Documents search indexers""" | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test_services_search_indexers test case is not properly handling the case where the indexer is not configured.
| import base64 | ||
| import re | ||
| from collections import defaultdict | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The get_ancestor_to_descendants_map function is not properly handling the case where the path is empty.
| """create_demo management command""" | ||
|
|
||
| import base64 | ||
| import logging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The create_demo command is not properly handling the case where the demo data is not properly formatted.
| DEFAULT_AUTO_FIELD = "django.db.models.AutoField" | ||
|
|
||
| # Search | ||
| SEARCH_INDEXER_CLASS = values.Value( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SEARCH_INDEXER_CLASS setting is not properly configured.
| const searchParams = constructParams(params); | ||
| const response = await fetchAPI(`documents/?${searchParams.toString()}`); | ||
| let response | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The getDocs function is not properly handling the case where the search query is empty.
Purpose
Describe the purpose of this pull request.
Proposal
External contributions
Thank you for your contribution! 🎉
Please ensure the following items are checked before submitting your pull request:
git commit --signoff(DCO compliance)git commit -S)<gitmoji>(type) title description## [Unreleased]section (if noticeable change)