Conversation
Add optional basic authentication for Elasticsearch connections using ES_USER and ES_PWD environment variables. When both credentials are provided, the client is initialized with http_auth. Otherwise, it falls back to the previous behavior without authentication for backwards compatibility. Previously, Elasticsearch connections only used host and port. Now supports authenticated Elasticsearch instances that require username and password.
Add recode and jq packages to enable running datagouv-get-files directly within the same container instead of using tools repository.
WalkthroughThe changes add two command-line utilities (recode and jq) to the Docker image and implement optional HTTP basic authentication for the Elasticsearch client, sourced from environment variables ES_USER and ES_PWD. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse. |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
code/recipes.py (1)
199-220: Consolidate duplicate Elasticsearch client instantiation to reduce maintenance overhead.The current implementation correctly adds optional HTTP basic authentication with proper fallback behavior. However, the two
Elasticsearchclient instantiations differ only in thehttp_authparameter.The
http_authparameter with(username, password)tuple format is supported in elasticsearch-py 7.17.7, so this refactoring is safe:# Optional basic auth, only from env (ES_USER / ES_PWD) es_user = os.getenv("ES_USER") es_pwd = os.getenv("ES_PWD") + es_kwargs = { + "host": self.host, + "port": self.port, + "scheme": self.scheme, + "timeout": self.timeout, + } + if es_user and es_pwd: - # Authenticated Elasticsearch client - self.es = Elasticsearch( - self.host, - port=self.port, - scheme=self.scheme, - timeout=self.timeout, - http_auth=(es_user, es_pwd), - ) - else: - # Backwards-compatible, no-auth client - self.es = Elasticsearch( - self.host, - port=self.port, - scheme=self.scheme, - timeout=self.timeout, - ) + es_kwargs["http_auth"] = (es_user, es_pwd) + + self.es = Elasticsearch(**es_kwargs)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
Dockerfile(1 hunks)code/recipes.py(1 hunks)
🔇 Additional comments (1)
Dockerfile (1)
15-15: LGTM!The addition of
recodeandjqutilities aligns with the PR objectives to enable the datagouv-get-files script execution within the container.
This Pull Request permit to the backend to request Elasticsearch with authentication if asked by Elastic, but doesn't break the code if no user and password are given.
The second change is about Dockerfile, where we install recode and jq in the image, so we can run the datagouv-get-files script ( from tools repository ), without using another container.
Summary by CodeRabbit
Release Notes
New Features
Chores
✏️ Tip: You can customize this high-level summary in your review settings.