[DRAFT] AIO PR #2641
Conversation
(cherry picked from commit 01d4159)
Co-authored-by: Patryk Czajka <patryk.czajka@snowflake.com>
(cherry picked from commit 54906d6)
Co-authored-by: Patryk Czajka <patryk.czajka@snowflake.com>
…2631) The failure was caused by a boto PythonDeprecationWarning. To avoid if/else logic for checking boto availability, I decided to check the suffixes of the warnings instead of their types.
…re for impersonation (#2510)
Co-authored-by: Peter Mansour <peter.mansour@snowflake.com>
Co-authored-by: Peter Mansour <peter.mansour@snowflake.com>
…n pandas.to_sql anyway
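The commit note above describes filtering warnings by the tail of their message text rather than by their class, so the test suite does not need to import boto just to reference the warning type. A minimal sketch of that idea, with made-up suffixes and names that are not the connector's actual test code:

```python
import warnings

# Illustrative suffix; the real suite would list the boto deprecation text.
IGNORED_SUFFIXES = ("is no longer supported by the Python core team.",)

def unexpected_warnings(caught):
    """Keep only warnings whose message does not end with an ignored suffix."""
    return [
        w for w in caught
        if not str(w.message).endswith(IGNORED_SUFFIXES)
    ]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.warn(
        "Python 3.8 is no longer supported by the Python core team.",
        DeprecationWarning,
    )
    warnings.warn("something actually unexpected", UserWarning)

leftover = unexpected_warnings(caught)
print([str(w.message) for w in leftover])
```

Because `str.endswith` accepts a tuple, new ignorable messages can be added without any type checks or conditional imports.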
.github/workflows/build_test.yml
Outdated
```diff
-    name: Test FIPS linux-3.8-${{ matrix.cloud-provider }}
+    name: Test FIPS linux-3.9-${{ matrix.cloud-provider }}
     needs: build
     runs-on: ubuntu-latest
     strategy:
       fail-fast: false
       matrix:
         cloud-provider: [aws]
     steps:
       - uses: actions/checkout@v4
       - name: Setup parameters file
         shell: bash
         env:
           PARAMETERS_SECRET: ${{ secrets.PARAMETERS_SECRET }}
         run: |
           gpg --quiet --batch --yes --decrypt --passphrase="$PARAMETERS_SECRET" \
             .github/workflows/parameters/public/parameters_${{ matrix.cloud-provider }}.py.gpg > test/parameters.py
       - name: Setup private key file
         shell: bash
         env:
           PYTHON_PRIVATE_KEY_SECRET: ${{ secrets.PYTHON_PRIVATE_KEY_SECRET }}
         run: |
           gpg --quiet --batch --yes --decrypt --passphrase="$PYTHON_PRIVATE_KEY_SECRET" \
             .github/workflows/parameters/public/rsa_keys/rsa_key_python_${{ matrix.cloud-provider }}.p8.gpg > test/rsa_key_python_${{ matrix.cloud-provider }}.p8
       - name: Download wheel(s)
         uses: actions/download-artifact@v4
         with:
-          name: manylinux_x86_64_py3.8
+          name: manylinux_x86_64_py3.9
           path: dist
       - name: Show wheels downloaded
         run: ls -lh dist
         shell: bash
       - name: Run tests
         run: ./ci/test_fips_docker.sh
         env:
-          PYTHON_VERSION: 3.8
+          PYTHON_VERSION: 3.9
           cloud_provider: ${{ matrix.cloud-provider }}
           PYTEST_ADDOPTS: --color=yes --tb=short
           TOX_PARALLEL_NO_SPINNER: 1
         shell: bash
       - uses: actions/upload-artifact@v4
         with:
           include-hidden-files: true
-          name: coverage_linux-fips-3.8-${{ matrix.cloud-provider }}
+          name: coverage_linux-fips-3.9-${{ matrix.cloud-provider }}
           path: |
             .coverage
             coverage.xml
       - uses: actions/upload-artifact@v4
         with:
           include-hidden-files: true
           name: junit_linux-fips-3.9-${{ matrix.cloud-provider }}
           path: |
             junit.*.xml
   test-lambda:
     name: Test Lambda linux-${{ matrix.python-version }}-${{ matrix.cloud-provider }}
     needs: build
     runs-on: ubuntu-latest
     strategy:
       fail-fast: false
       matrix:
         python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]
         # TODO: temporarily reduce number of jobs: SNOW-2311643
         # python-version: ["3.9", "3.10", "3.11", "3.12", "3.13"]
```
Check warning (Code scanning / CodeQL): Workflow does not contain permissions (Medium)
Copilot Autofix
The best way to fix the problem is to add a `permissions` block at the root of the workflow file (.github/workflows/build_test.yml), after the workflow `name` and before the jobs are defined. This block should minimally specify `contents: read`, so the default `GITHUB_TOKEN` provided to all jobs can only read repository contents, not write or modify anything. If any job requires elevated permissions, those can be granted individually at the job level, but based on the provided snippets, none of the jobs in this workflow appear to need more than `contents: read`.
Specifically, add:

```yaml
permissions:
  contents: read
```

directly after the workflow name at the top of .github/workflows/build_test.yml. No imports or definitions are required; just the YAML block as described.
```diff
@@ -1,4 +1,6 @@
 name: Build and Test
+permissions:
+  contents: read
 
 on:
   push:
```
```yaml
        python-version: ["3.13"]
        cloud-provider: [aws, azure, gcp]
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
      - name: Display Python version
        run: python -c "import sys; print(sys.version)"
      - name: Set up Java
        uses: actions/setup-java@v4 # for wiremock
        with:
          java-version: 11
          distribution: 'temurin'
          java-package: 'jre'
      - name: Fetch Wiremock
        shell: bash
        run: curl https://repo1.maven.org/maven2/org/wiremock/wiremock-standalone/3.11.0/wiremock-standalone-3.11.0.jar --output .wiremock/wiremock-standalone.jar
      - name: Setup parameters file
        shell: bash
        env:
          PARAMETERS_SECRET: ${{ secrets.PARAMETERS_SECRET }}
        run: |
          gpg --quiet --batch --yes --decrypt --passphrase="$PARAMETERS_SECRET" \
            .github/workflows/parameters/public/parameters_${{ matrix.cloud-provider }}.py.gpg > test/parameters.py
      - name: Setup private key file
        shell: bash
        env:
          PYTHON_PRIVATE_KEY_SECRET: ${{ secrets.PYTHON_PRIVATE_KEY_SECRET }}
        run: |
          gpg --quiet --batch --yes --decrypt --passphrase="$PYTHON_PRIVATE_KEY_SECRET" \
            .github/workflows/parameters/public/rsa_keys/rsa_key_python_${{ matrix.cloud-provider }}.p8.gpg > test/rsa_key_python_${{ matrix.cloud-provider }}.p8
      - name: Download wheel(s)
        uses: actions/download-artifact@v4
        with:
          name: ${{ matrix.os.download_name }}_py${{ matrix.python-version }}
          path: dist
      - name: Show wheels downloaded
        run: ls -lh dist
        shell: bash
      - name: Upgrade setuptools, pip and wheel
        run: python -m pip install -U setuptools pip wheel
      - name: Install tox
        run: python -m pip install tox>=4
      - name: Run tests
        run: python -m tox run -e aio
        env:
          PYTHON_VERSION: ${{ matrix.python-version }}
          cloud_provider: ${{ matrix.cloud-provider }}
          PYTEST_ADDOPTS: --color=yes --tb=short
          TOX_PARALLEL_NO_SPINNER: 1
        shell: bash
      - name: Combine coverages
        run: python -m tox run -e coverage --skip-missing-interpreters false
        shell: bash
      - uses: actions/upload-artifact@v4
        with:
          name: coverage_aio_${{ matrix.os.download_name }}-${{ matrix.python-version }}-${{ matrix.cloud-provider }}
          path: |
            .tox/.coverage
            .tox/coverage.xml
  test-unsupporeted-aio:
    name: Test unsupported asyncio ${{ matrix.os.download_name }}-${{ matrix.python-version }}
    runs-on: ${{ matrix.os.image_name }}
    strategy:
      fail-fast: false
      matrix:
        os:
          - image_name: ubuntu-latest
            download_name: manylinux_x86_64
        python-version: [ "3.9", ]
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}
```
Check warning (Code scanning / CodeQL): Workflow does not contain permissions (Medium)
Copilot Autofix
To fix this issue, the `permissions:` key should be set explicitly for the test-aio job, or globally at the workflow level if all jobs share the same minimal needs. For jobs that only need to check out code and upload/download artifacts (such as test-aio), setting `contents: read` is sufficient and recommended. This prevents `GITHUB_TOKEN` from having broader write privileges. The explicit block should look like:

```yaml
permissions:
  contents: read
```

The change should be made within .github/workflows/build_test.yml, either globally (above `jobs:`) or locally in the definition of the `test-aio:` job. If other jobs in the workflow might require different permissions, setting it per job is safer; otherwise, a global block is preferred for consistency. Since CodeQL specifically flagged line 449 (the start of the `test-aio` job), the precise minimal fix is to add `permissions: contents: read` to the job definition immediately below `test-aio:` (i.e., after line 449). No additional dependencies or code changes are required.
```diff
@@ -447,6 +447,8 @@
 
   test-aio:
     name: Test asyncio ${{ matrix.os.download_name }}-${{ matrix.python-version }}-${{ matrix.cloud-provider }}
+    permissions:
+      contents: read
     needs: build
     runs-on: ${{ matrix.os.image_name }}
     strategy:
```
```python
{
    k: v if k in AUTHENTICATION_REQUEST_KEY_WHITELIST else "******"
    for (k, v) in body["data"].items()
},
```
Check failure (Code scanning / CodeQL): Clear-text logging of sensitive information (High). Multiple expressions log sensitive data (password).
Copilot Autofix
To fix the risk of cleartext logging of sensitive information, we must guarantee that sensitive keys (such as "PASSCODE", "password", etc.) are never logged. The best approach is:

- Maintain a separate set of sensitive keys (e.g., `SENSITIVE_AUTH_KEYS`) that are explicitly redacted regardless of the whitelist.
- When constructing the log message, always redact the value for any key in `SENSITIVE_AUTH_KEYS`, even if it is also present in the whitelist.
- Double-check that `PASSCODE` (and similar sensitive keys) are always redacted.
- Importantly, make this redaction explicit and simple, so future maintainers cannot mistakenly expose sensitive values by merely editing the whitelist.

Implementation plan:

- Define a set (or list) called `SENSITIVE_AUTH_KEYS`, including `"PASSCODE"`, `"password"`, and any similar key.
- In the log message dictionary comprehension, for each key:
  - If the key is in `SENSITIVE_AUTH_KEYS`, log `"******"`.
  - Else if the key is in the existing whitelist, log its value.
  - Else, log `"******"`.
- Edit only the relevant log statement and local definitions in src/snowflake/connector/aio/auth/_auth.py.
- No external library is needed; use built-in Python set/dict logic.
```diff
@@ -43,6 +43,8 @@
 
 logger = logging.getLogger(__name__)
 
+# Always redact these keys, even if they are included in any whitelist
+SENSITIVE_AUTH_KEYS = {"PASSCODE", "PASSWORD", "password", "passcode"}
 
 class Auth(AuthSync):
     async def authenticate(
@@ -146,7 +148,7 @@
         logger.debug(
             "body['data']: %s",
             {
-                k: v if k in AUTHENTICATION_REQUEST_KEY_WHITELIST else "******"
+                k: "******" if k in SENSITIVE_AUTH_KEYS else (v if k in AUTHENTICATION_REQUEST_KEY_WHITELIST else "******")
                 for (k, v) in body["data"].items()
             },
         )
```
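As a standalone illustration of the double-redaction pattern in that suggestion: sensitive keys are masked even when the whitelist would allow them. The whitelist contents and the payload below are stand-ins, not the connector's real values.

```python
# Keys that must never appear in logs, regardless of any whitelist.
SENSITIVE_AUTH_KEYS = {"PASSCODE", "PASSWORD", "password", "passcode"}
# Illustrative whitelist; deliberately overlaps with the sensitive set.
AUTHENTICATION_REQUEST_KEY_WHITELIST = {"ACCOUNT_NAME", "LOGIN_NAME", "PASSCODE"}

def redact_for_logging(data: dict) -> dict:
    """Redact sensitive keys first, then fall back to the whitelist."""
    return {
        k: "******"
        if k in SENSITIVE_AUTH_KEYS
        else (v if k in AUTHENTICATION_REQUEST_KEY_WHITELIST else "******")
        for k, v in data.items()
    }

body = {"ACCOUNT_NAME": "my_account", "PASSCODE": "123456", "TOKEN": "abc"}
print(redact_for_logging(body))
# → {'ACCOUNT_NAME': 'my_account', 'PASSCODE': '******', 'TOKEN': '******'}
```

Checking the sensitive set before the whitelist is what makes the redaction robust to future whitelist edits.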
```python
received_redirected_request = input(
    "Enter the URL the OAuth flow redirected you to: "
)
code, state = self._parse_authorization_redirected_request(
```
Check failure (Code scanning / CodeQL): Clear-text logging of sensitive information (High). Multiple expressions log sensitive data (password, secret).
Copilot Autofix
To fix the issue, the code should avoid displaying the OAuth URL with sensitive parameters in cleartext. Instead, before printing the URL for manual use, sensitive parameters embedded in the URL (notably `client_id`, and potentially authorization codes or tokens if present) should be redacted or masked.
The optimal solution is to parse the URL, locate known sensitive query parameters (`client_id`, `client_secret`, possibly `code` and `token`), and replace their values with placeholders (e.g., `'***'`) before printing.
Edit only the affected region in src/snowflake/connector/auth/oauth_code.py where the URL is printed in `_ask_authorization_callback_from_user`, adding a small helper method for redaction and changing the print to output the sanitized version.
```diff
@@ -367,7 +367,7 @@
                 "We were unable to open a browser window for you, "
                 "please open the URL manually then paste the "
                 "URL you are redirected to into the terminal:\n"
-                f"{authorization_request}"
+                f"{self._redact_sensitive_url_params(authorization_request)}"
             )
             received_redirected_request = input(
                 "Enter the URL the OAuth flow redirected you to: "
@@ -389,6 +389,22 @@
         )
         return code, state
 
+    @staticmethod
+    def _redact_sensitive_url_params(url: str) -> str:
+        """Redact sensitive OAuth query parameters from URL."""
+        parsed_url = urllib.parse.urlparse(url)
+        params = urllib.parse.parse_qs(parsed_url.query, keep_blank_values=True)
+        # Redact known sensitive parameters
+        sensitive_keys = {"client_id", "client_secret", "code", "token"}
+        for key in sensitive_keys:
+            if key in params:
+                params[key] = ["***"]
+        redacted_query = urllib.parse.urlencode(params, doseq=True)
+        redacted_url = urllib.parse.urlunparse(
+            parsed_url._replace(query=redacted_query)
+        )
+        return redacted_url
     def _parse_authorization_redirected_request(
         self,
         url: str,
```
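The helper in that suggestion can be exercised standalone with only the standard library. The sensitive parameter names mirror the autofix; the sample URL is made up for illustration (note that `urlencode` may percent-encode the `***` placeholder, which is harmless for display purposes).

```python
import urllib.parse

# Parameter names considered sensitive, per the suggestion above.
SENSITIVE_QUERY_KEYS = {"client_id", "client_secret", "code", "token"}

def redact_sensitive_url_params(url: str) -> str:
    """Replace sensitive OAuth query-parameter values with a placeholder."""
    parsed = urllib.parse.urlparse(url)
    params = urllib.parse.parse_qs(parsed.query, keep_blank_values=True)
    for key in SENSITIVE_QUERY_KEYS:
        if key in params:
            params[key] = ["***"]
    redacted_query = urllib.parse.urlencode(params, doseq=True)
    return urllib.parse.urlunparse(parsed._replace(query=redacted_query))

url = "https://auth.example.com/authorize?client_id=abc123&state=xyz&code=s3cret"
print(redact_sensitive_url_params(url))
```

Parsing and re-serializing the query string, rather than string-replacing, keeps non-sensitive parameters such as `state` intact.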
```python
c[0][1].startswith(
    "https://test-account.snowflakecomputing.com:443"
)
```
Check failure (Code scanning / CodeQL): Incomplete URL substring sanitization (High, test): https://test-account.snowflakecomputing.com:443
Copilot Autofix
The best way to fix this issue is to parse the potentially returned URL (i.e., `c[0][1]`) using Python's `urllib.parse.urlparse` and then check the individual fields (scheme, host, port) for correctness. This avoids the pitfalls of "starts with" logic, which can be fooled by userinfo and other URL fields, and directly asserts correct URL composition. Specifically, for each call in `mocked_fetch.call_args_list`, the argument at `[0][1]` should be parsed as a URL, and the test should assert that the scheme is "https", the host is "test-account.snowflakecomputing.com", and the port is 443 (the absence of an explicit port in the netloc is also acceptable if the library omits the default 443).
This requires adding `from urllib.parse import urlparse` if it is not already imported in this file.
The assertion should thus change from a `startswith` check to an explicit dissection and validation of the URL fields.
```diff
@@ -22,6 +22,7 @@
 
 import snowflake.connector.aio
 from snowflake.connector import DatabaseError, OperationalError, ProgrammingError
+from urllib.parse import urlparse
 from snowflake.connector.aio import SnowflakeConnection
 from snowflake.connector.aio._description import CLIENT_NAME
 from snowflake.connector.compat import IS_WINDOWS
@@ -721,8 +722,12 @@
         )  # Skip tear down, there's only a mocked rest api
         assert any(
             [
-                c[0][1].startswith(
-                    "https://test-account.snowflakecomputing.com:443"
+                (
+                    (lambda u: (
+                        u.scheme == "https"
+                        and u.hostname == "test-account.snowflakecomputing.com"
+                        and (u.port == 443 or (u.port is None and u.netloc.endswith(":443")))
+                    ))(urlparse(c[0][1]))
                 )
                 for c in mocked_fetch.call_args_list
             ]
```
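To see why CodeQL flags the plain `startswith` check: a hostile URL can begin with the trusted prefix yet resolve to a different host, e.g. by smuggling the prefix into the userinfo part of the netloc. The URLs below are illustrative, not taken from the real test suite.

```python
from urllib.parse import urlparse

TRUSTED_HOST = "test-account.snowflakecomputing.com"

def is_trusted(url: str) -> bool:
    """Validate scheme, hostname, and port explicitly instead of startswith."""
    u = urlparse(url)
    return u.scheme == "https" and u.hostname == TRUSTED_HOST and u.port in (443, None)

good = "https://test-account.snowflakecomputing.com:443/session/v1"
# Begins with the trusted prefix, but everything before '@' is userinfo,
# so the actual host is evil.com.
evil = "https://test-account.snowflakecomputing.com:443@evil.com/steal"

print(good.startswith("https://test-account.snowflakecomputing.com:443"))  # True
print(evil.startswith("https://test-account.snowflakecomputing.com:443"))  # True - fooled
print(is_trusted(good), is_trusted(evil))
```

Field-by-field validation via `urlparse` accepts the legitimate URL and rejects the userinfo trick that defeats the substring check.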
```python
# azure_request_present = False
expected_token_prefix = "sig="
for line in caplog.text.splitlines():
    if "blob.core.windows.net" in line and expected_token_prefix in line:
```
Check failure (Code scanning / CodeQL): Incomplete URL substring sanitization (High, test): blob.core.windows.net
Copilot Autofix
To fix the problem, the code should parse potential URLs in log lines and verify that any host matches (or ends with) 'blob.core.windows.net', rather than checking for this substring anywhere in the log line. The best way is to extract candidate URLs from each log line with the `re` module, parse out the host portion with `urllib.parse.urlparse`, and check whether any host matches or ends with 'blob.core.windows.net'. Only those lines are subjected to further checks about `expected_token_prefix`. This keeps the test intent intact while avoiding the false positives inherent to pure substring checks.

Required changes:

- Add `import re` at the top (if not present).
- Add `from urllib.parse import urlparse` at the top (if not present).
- Refactor the for-loop on log lines:
  - Extract all candidate URLs from the line using a regex.
  - For each URL, parse it and check if its host matches or ends with 'blob.core.windows.net'.
  - Only run the sensitive-information assertion on such lines.

All changes are localized to the test file, specifically to the lines near the existing substring check.
```diff
@@ -10,6 +10,8 @@
 import sys
 import time
 from logging import getLogger
+import re
+from urllib.parse import urlparse
 
 import pytest
 
@@ -92,14 +94,17 @@
     # azure_request_present = False
     expected_token_prefix = "sig="
     for line in caplog.text.splitlines():
-        if "blob.core.windows.net" in line and expected_token_prefix in line:
-            # azure_request_present = True
-            # getattr is used to stay compatible with old driver - before SECRET_STARRED_MASK_STR was added
-            assert (
-                expected_token_prefix
-                + getattr(SecretDetector, "SECRET_STARRED_MASK_STR", "****")
-                in line
-            ), "connectionpool logger is leaking sensitive information"
+        # Find all potential URLs in the line
+        urls = re.findall(r'(https?://[^\s\'"<>]+)', line)
+        for url in urls:
+            host = urlparse(url).hostname
+            if host and host.endswith("blob.core.windows.net") and expected_token_prefix in url:
+                # getattr is used to stay compatible with old driver - before SECRET_STARRED_MASK_STR was added
+                assert (
+                    expected_token_prefix
+                    + getattr(SecretDetector, "SECRET_STARRED_MASK_STR", "****")
+                    in url
+                ), "connectionpool logger is leaking sensitive information"
 
 # TODO: disable the check for now - SNOW-2311540
 # assert (
```
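A minimal standalone sketch of that log-scanning approach: pull candidate URLs out of free-form log text, then trust only hostname matches. The sample log lines are made up for illustration.

```python
import re
from urllib.parse import urlparse

# Rough URL matcher for free-form log text, as in the autofix above.
URL_RE = re.compile(r'https?://[^\s\'"<>]+')

def azure_blob_urls(text: str) -> list:
    """Return URLs whose parsed hostname ends with blob.core.windows.net."""
    matches = []
    for url in URL_RE.findall(text):
        host = urlparse(url).hostname
        if host and host.endswith("blob.core.windows.net"):
            matches.append(url)
    return matches

log = (
    "GET https://acct.blob.core.windows.net/stage?sig=****\n"
    "mention of blob.core.windows.net in plain text is ignored\n"
    "GET https://blob.core.windows.net.evil.com/x?sig=leak\n"
)
print(azure_blob_urls(log))
# → ['https://acct.blob.core.windows.net/stage?sig=****']
```

Only the URL whose parsed hostname actually ends with the Azure Blob domain is returned; the bare substring mention and the look-alike host are both skipped.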
```python
private_key = rsa.generate_private_key(
    public_exponent=65537, key_size=1024, backend=default_backend()
)
```
Check failure (Code scanning / CodeQL): Use of weak cryptographic key (High, test): 1024
Copilot Autofix
AI 7 days ago
To fix the security issue, the RSA key generated on line 146 should be at least 2048 bits, per current recommendations. Change the line
private_key = rsa.generate_private_key(
public_exponent=65537, key_size=1024, backend=default_backend()
)so that key_size=2048 (or higher). This change preserves the intent and functionality of the test code with a secure key size. No other code modifications are necessary, and no additional imports or libraries are required since the rest of the key creation process remains unaffected.
```diff
@@ -144,7 +144,7 @@
 def create_x509_cert(hash_algorithm):
     # Generate a private key
     private_key = rsa.generate_private_key(
-        public_exponent=65537, key_size=1024, backend=default_backend()
+        public_exponent=65537, key_size=2048, backend=default_backend()
     )
 
     # Generate a public key
```
Please answer these questions before submitting your pull requests. Thanks!
What GitHub issue is this PR addressing? Make sure that there is an accompanying issue to your PR.
Fixes #NNNN
Fill out the following pre-review checklist:
Please describe how your code solves the related issue.
Please write a short description of how your code change solves the related issue.
(Optional) PR for stored-proc connector: