-
Notifications
You must be signed in to change notification settings - Fork 3
Add authentication #97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add authentication (#96)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds authentication support for Elasticsearch connections throughout the application. The changes implement both Basic authentication (username/password) and Bearer token authentication with proper priority handling.
- Adds authentication configuration options for Elasticsearch (username, password, and token)
- Updates all Elasticsearch client code to include authentication headers
- Implements comprehensive test coverage for the new authentication functionality
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| src/utils/config.py | Adds auth configuration variables and get_elasticsearch_auth_header() function |
| src/utils/wait_for_service.py | Updates service connection function to accept and use auth tokens |
| src/server/main.py | Updates service startup to use authentication when connecting to Elasticsearch |
| src/es_client/query.py | Adds authentication headers to Elasticsearch search requests |
| tests/helpers/init_elasticsearch.py | Updates test helper to use centralized auth header function |
| tests/unit/utils/test_config.py | Adds comprehensive tests for authentication header generation |
| tests/unit/utils/test_wait_for_service.py | Adds tests for service connection with and without authentication |
| .github/workflows/docker-image.yml | Adds CI/CD workflow for Docker image building |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #97 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 21 21
Lines 660 685 +25
=========================================
+ Hits 660 685 +25 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
src/utils/config.py
Outdated
| logger.info("Using Basic Authentication for Elasticsearch.") | ||
| return f"Basic {base64_credentials}" | ||
| elif auth_token: | ||
| # Bearer authentication |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no way to test this live right now, as this requires enabling SSL communication to be able to generate API keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so with our current elastic search configuration we can't use auth tokens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. We would have to create and distribute SSL certs and enable HTTPS. @jbezouska-ANL were there any plans to enable HTTPS on the backend?
src/utils/config.py
Outdated
| Priority: | ||
| 1. If username and password are set, use Basic authentication | ||
| 2. If only auth token is set, use Bearer authentication |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this precedence?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I picked it arbitrarily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thinking is that tokens are more secure than user/pwd so they should be preferred
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we can get tokens working, we can delete the username and password code too, but since they aren't working yet, I just got it working with what we have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might as well leave it if it works and just swap the preference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked this up the other day and AI says they aren't more secure.
TL;DR: With One Username Per Service, They're Identical
If you have one username per service, then API keys and username/password are literally the same thing security-wise:
Both are:
Base64 encoded strings in headers
Unique per service (no shared credentials)
Equally vulnerable if intercepted
Equally easy to revoke (just disable that one user/key)
Can have identical RBAC permissions
The only remaining differences are cosmetic:
API Key: Authorization: ApiKey <base64(id:key)>
Basic Auth: Authorization: Basic <base64(username:password)>
Both decode to two strings separated by a colon. Both can be scoped, rotated, and revoked independently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Equally easy to revoke (just disable that one user/key)
I don't think that's true, it's easier to revoke a token than change a password
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is it easier? Both will require shutting down and restarting the service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assumed it was easier on the ES side, like how in KBase revoking a token is trivial.
If that's not the case and you're saying that passwords and tokens are completely indistinguishable for this use case then we should just drop token support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, I haven't tested it out, I was thinking more from the side of what happens to the service. Ok I will remove it.
Refactor tests for wait_for_service to reduce duplication and improve clarity.
|
@MrCreosote Can we merge this? |
|
This one is still open #97 (comment) |
Updated the docstring to clarify test purpose.
Removed unused logging import and logger initialization.
Remove comment about calculating Elasticsearch auth header.
Replaced centralized auth header function with config-based authorization header retrieval.
Removed unused import of get_elasticsearch_auth_header.
Refactor authentication header retrieval in show_indexes function.
Refactor Elasticsearch authentication header retrieval.
src/es_client/query.py
Outdated
| params = {'allow_no_indices': 'true'} | ||
|
|
||
| resp = requests.post(url, data=json.dumps(options), params=params, headers=headers) | ||
| resp = requests.post(url, data=json.dumps(options), params=params, headers=headers, timeout=[120, 600]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added timeouts, maybe I should instead add an annotation for bandit to ignore these and not add timeouts, since I don't know what the real timeouts should be?
|
Closing in favor of #100 |
This PR adds Basic Authentication to the Elasticsearch client, allowing the service to connect to secured clusters.
The startup script has been updated to use the new authentication variables, resolving the connection issues when pointed at the elastic-next-spike instance. I've deployed and verified this on the next environment.
Next Steps:
Closes #90