
feat: add AWS_USE_IMDS support for ambient IAM credential resolution on Bedrock #2307

Open
ira-at-work wants to merge 15 commits into The-PR-Agent:main from Agrematch:feature/imds-support

ira-at-work commented Apr 5, 2026

Summary

  • When AWS_USE_IMDS=true is set in the environment, PR-Agent resolves AWS credentials via boto3's standard provider chain instead of requiring static IAM keys. This covers all AWS compute contexts (EC2 instance roles, ECS/Fargate task roles, EKS with IRSA, Lambda runtime credentials) transparently — no code-path changes needed per compute type.
  • AWS_REGION_NAME is auto-resolved from the compute environment when not explicitly configured.
  • Static keys in [aws] config become an optional fallback: if the ambient credentials fail a Bedrock call, PR-Agent retries with static keys and logs a warning.
  • Added missing Bedrock model IDs for Sonnet 4.6 (-v1:0 versioned and regional variants) and completed regional coverage for Opus 4.5 and Opus 4.6.

Motivation

Users running PR-Agent on AWS compute (self-hosted GitHub Actions runners on EC2/ECS/EKS) must currently create long-lived static IAM keys and store them as GitHub Secrets. This is unnecessary secret management overhead and a credential-leakage risk. The instance/task/pod already has an IAM role — this change lets PR-Agent use it directly.

Changes

pr_agent/algo/ai_handlers/litellm_ai_handler.py

  • __init__: when AWS_USE_IMDS is truthy, resolve credentials via boto3.Session().get_credentials() (wrapped in try/except so IMDS timeouts fall through gracefully). Static keys in config are stashed as _aws_static_creds (including AWS_SESSION_TOKEN for STS-derived static creds).
  • _refresh_imds_credentials(): called before each Bedrock chat_completion to avoid stale credentials in long-lived processes (EC2 role tokens rotate every ~6 hours).
  • _activate_static_aws_fallback(): swaps env vars to static credentials (with correct session token handling) on first Bedrock API failure; tenacity retries the call with the new creds.
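A minimal sketch of the token-aware environment write that these helpers rely on, assuming a credentials snapshot that exposes `access_key`, `secret_key`, and `token` (as the frozen credentials quoted in the review below do). The function name and the `env` parameter are illustrative and not taken from the PR.

```python
import os
from types import SimpleNamespace


def write_frozen_creds_to_env(frozen, env=os.environ):
    """Copy one credential snapshot into env, clearing any stale session token."""
    env["AWS_ACCESS_KEY_ID"] = frozen.access_key
    env["AWS_SECRET_ACCESS_KEY"] = frozen.secret_key
    if frozen.token:
        env["AWS_SESSION_TOKEN"] = frozen.token
    else:
        # Long-lived keys carry no token; a leftover one would be sent along
        # with them and is likely to be rejected.
        env.pop("AWS_SESSION_TOKEN", None)


# Usage with a plain dict standing in for os.environ (hypothetical values).
demo_env = {"AWS_SESSION_TOKEN": "stale-token"}
write_frozen_creds_to_env(
    SimpleNamespace(access_key="FAKE-KEY", secret_key="FAKE-SECRET", token=None),
    env=demo_env,
)
```

Passing `env` explicitly keeps the sketch testable without mutating the real process environment.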

pr_agent/algo/__init__.py

  • Added bedrock/anthropic.claude-sonnet-4-6-v1:0 (versioned form was missing)
  • Added regional -v1:0 variants for Sonnet 4.6: us, au, eu, jp, apac, global
  • Added missing bedrock/anthropic.claude-opus-4-5-20251101-v1:0 base entry
  • Completed regional coverage for Opus 4.5 and Opus 4.6 (eu, au, jp, apac)

Docs

  • docs/docs/usage-guide/changing_a_model.md: new "Using IAM Role Credentials (Recommended on AWS Compute)" subsection with compute context table, IAM policy example, and minimal GitHub Actions workflow snippet.
  • docs/docs/installation/github.md: new "Using Amazon Bedrock" provider subsection under "Switching Models" with both static-key and IMDS workflow examples.
  • pr_agent/settings/.secrets_template.toml: comments clarifying when static keys are optional; added AWS_SESSION_TOKEN field.

Test plan

  • Set AWS_USE_IMDS=true with no static keys on an EC2 instance with a Bedrock-enabled IAM role → review runs without secrets
  • Set AWS_USE_IMDS=true on a machine with no IMDS (e.g. GitHub-hosted runner, no AWS creds) → boto3 exception caught, falls through to static keys with an error log
  • Set AWS_USE_IMDS=true + valid static keys, with IMDS creds that lack bedrock:InvokeModel → first call fails, warning logged, second call uses static keys and succeeds
  • Unset AWS_USE_IMDS with static keys configured → behavior identical to before this PR
  • Unset AWS_USE_IMDS with no keys → behavior identical to before this PR (no boto3 import)

Notes

  • AWS_USE_IMDS is env-var-only (not a configuration.toml key) to keep it ergonomic in GitHub Actions env: blocks.
  • IMDSv2 vs v1: boto3 defaults (IMDSv2 preferred, v1 fallback). No enforcement change.
  • The existing os.environ mutation pattern for credentials is preserved for consistency with the rest of the file; passing creds via per-call litellm kwargs would be a cleaner future improvement.
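For reference, the env-var check quoted in the review excerpts accepts `1`, `true`, and `yes` case-insensitively. A small sketch of that check as a standalone helper (the helper name `imds_enabled` and the injectable `env` argument are hypothetical):

```python
import os

_TRUTHY = ("1", "true", "yes")


def imds_enabled(env=os.environ):
    """Return True when AWS_USE_IMDS holds one of the accepted truthy strings."""
    return env.get("AWS_USE_IMDS", "").strip().lower() in _TRUTHY
```

Values such as `0`, an empty string, or an unset variable all leave IMDS mode off.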

🤖 Generated with Claude Code

Ira Abramov added 4 commits April 5, 2026 17:12
When AWS_USE_IMDS=true, PR-Agent resolves credentials via boto3's standard
provider chain (EC2 IMDS, ECS task metadata, EKS IRSA, Lambda runtime) so
operators running on AWS compute no longer need to store long-lived static
IAM keys. Static keys in [aws] config become an optional fallback used
automatically if the ambient credentials fail a Bedrock call.

Also resolves AWS_REGION_NAME from the compute environment when not
explicitly configured.

Docs updated with IAM role workflow example and per-provider setup section
in github.md.

- Add versioned bedrock/anthropic.claude-sonnet-4-6-v1:0 (bare versioned form
  was missing, unlike the equivalent Opus 4.6 entry)
- Add regional -v1:0 variants for Sonnet 4.6: us, au, eu, jp, apac, global
- Add apac variant for Sonnet 4.6 bare name
- Add missing bedrock/anthropic.claude-opus-4-5-20251101-v1:0 base entry
- Complete Opus 4.5 regional coverage: eu, au, jp, apac
- Complete Opus 4.6 regional coverage: eu, au, jp, apac -v1:0 variants

Three issues found in code review:

1. Wrap boto3 credential/region resolution in try/except so a timeout or
   missing endpoint falls through to static keys instead of crashing init.

2. Add _refresh_imds_credentials() called before each Bedrock chat_completion
   to avoid serving stale credentials from long-lived processes (EC2 instance
   role credentials rotate every ~6 hours).

3. Store AWS_SESSION_TOKEN in _aws_static_creds when present (STS-derived
   static creds) and restore/clear it correctly in _activate_static_aws_fallback,
   preventing silent auth failures when the fallback credentials require a token.

17 tests covering:
- boto3 creds written to os.environ on init
- AWS_SESSION_TOKEN set/cleared correctly from IMDS
- Region auto-resolved and not overwritten when already set
- Graceful handling when boto3 returns no creds or raises an exception
- boto3 never called when AWS_USE_IMDS is absent
- Static keys stashed for fallback (including session token)
- _refresh_imds_credentials called before each Bedrock chat_completion
- Fallback to static keys triggered on Bedrock APIError
- Fallback not triggered when no static creds are configured
- _activate_static_aws_fallback correctly restores/clears session token

seefood commented Apr 5, 2026

@naorpeled
Good evening! I tried to test this in my own repo, using the action from my own branch, but I think it is pulling pr-agent as a packaged Docker image and not really testing the branch's code. Is there a step I am missing for testing this live before it merges?

seefood commented Apr 5, 2026

/review

qodo-free-for-open-source-projects bot commented Apr 5, 2026

Code Review by Qodo

🐞 Bugs (8)   📘 Rule violations (6)   📎 Requirement gaps (0)
🐞 ≡ Correctness (3) ☼ Reliability (4) ➹ Performance (1) ⭐ New (1)
📘 ≡ Correctness (1) ⛨ Security (1) ⚙ Maintainability (4)


Action required

1. except Exception in IMDS 📘
Description
Credential resolution under AWS_USE_IMDS uses a broad except Exception, which can mask
unexpected bugs and makes failure modes harder to diagnose. The compliance checklist requires
targeted exception handling instead of blanket catches.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R61-77]

+            try:
+                creds = session.get_credentials()
+                if creds:
+                    self._aws_boto3_creds = creds  # store for refresh; avoids env-var re-read
+                    self._write_frozen_aws_creds_to_env(creds.get_frozen_credentials())
+                    self._aws_imds_mode = True
+                    get_logger().info("Using ambient AWS credentials from IMDS/task-role/IRSA")
+                else:
+                    get_logger().warning(
+                        "AWS_USE_IMDS is set but boto3 found no credentials; "
+                        "falling through to static keys"
+                    )
+            except Exception:
+                get_logger().exception(
+                    "AWS_USE_IMDS: failed to resolve credentials via boto3; "
+                    "falling through to static keys"
+                )
Evidence
The checklist forbids broad except Exception handlers; the new IMDS credential-resolution block
catches Exception without narrowing to expected boto3/botocore error types.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[61-77]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`AWS_USE_IMDS` credential resolution catches `Exception` broadly, which can hide real bugs and violates the targeted-exception requirement.
## Issue Context
This code path should fail gracefully for known AWS credential/metadata resolution errors (e.g., botocore credential/provider/endpoint issues) while still surfacing unexpected programming errors.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[61-77]


2. except Exception in refresh 📘
Description
_refresh_aws_imds_credentials() uses a broad except Exception, which can mask unexpected runtime
bugs and reduce diagnosability. The compliance checklist requires catching specific expected
exception types.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R256-264]

+        try:
+            if self._aws_boto3_creds is None:
+                get_logger().warning("IMDS credential refresh: no boto3 credentials object stored")
+                return False
+            self._write_frozen_aws_creds_to_env(self._aws_boto3_creds.get_frozen_credentials())
+            return True
+        except Exception:
+            get_logger().exception("IMDS credential refresh failed")
+            return False
Evidence
The checklist requires targeted exception handling; the new IMDS refresh helper catches all
exceptions via except Exception:.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[256-264]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`_refresh_aws_imds_credentials()` catches `Exception` broadly.
## Issue Context
This refresh should handle known credential-refresh failures (provider/metadata/expiry) explicitly while allowing unexpected errors to surface.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[256-264]


3. Non-Ruff multi-line imports 📘
Description
New multi-line imports use non-standard indentation/formatting that is likely to be reformatted by
Ruff/Black and can fail repo formatting checks. The compliance checklist requires Python changes to
conform to Ruff/isort formatting expectations.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R10-21]

+from tenacity import (retry, retry_if_exception_type,
+                      retry_if_not_exception_type, stop_after_attempt)
+
+from pr_agent.algo import (CLAUDE_EXTENDED_THINKING_MODELS,
+                           NO_SUPPORT_TEMPERATURE_MODELS,
+                           STREAMING_REQUIRED_MODELS,
+                           SUPPORT_REASONING_EFFORT_MODELS,
+                           USER_MESSAGE_ONLY_MODELS)
from pr_agent.algo.ai_handlers.base_ai_handler import BaseAiHandler
-from pr_agent.algo.ai_handlers.litellm_helpers import _handle_streaming_response, MockResponse, _get_azure_ad_token, \
-    _process_litellm_extra_body
+from pr_agent.algo.ai_handlers.litellm_helpers import (
+    MockResponse, _get_azure_ad_token, _handle_streaming_response,
+    _process_litellm_extra_body)
Evidence
The checklist requires Ruff/isort-compatible formatting; the added import blocks are not in standard
Ruff/Black format (hanging indentation and grouping).

AGENTS.md
pr_agent/algo/ai_handlers/litellm_ai_handler.py[10-21]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Multi-line imports in `litellm_ai_handler.py` are not formatted in the repository’s expected Ruff/Black style.
## Issue Context
CI may enforce Ruff formatting and import style, causing failures.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[10-21]


4. _frozen_creds not Ruff-formatted 📘
Description
The new test helper _frozen_creds uses a non-Black/Ruff multi-line function signature indentation
style. This is likely to be reformatted by Ruff/Black and may fail formatting checks.
Code

tests/unittest/test_litellm_imds.py[R85-88]

+def _frozen_creds(access_key="FAKE-KEY",
+                  secret_key="FAKE-SECRET",
+                  token=None):
+    frozen = MagicMock()
Evidence
The compliance checklist requires Python changes to conform to Ruff/formatter expectations; the
added function signature formatting is not in standard Ruff/Black style.

AGENTS.md
tests/unittest/test_litellm_imds.py[85-88]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The `_frozen_creds` helper signature formatting likely violates Ruff/Black formatting expectations.
## Issue Context
This is a new test file and should adhere to the same formatting rules as production code to avoid CI failures.
## Fix Focus Areas
- tests/unittest/test_litellm_imds.py[85-88]


5. Static token not exported 🐞
Description
When AWS_USE_IMDS is not set, the static-credentials path exports access key/secret/region but
never exports aws.AWS_SESSION_TOKEN, so STS-derived static credentials (that require a session
token) will fail.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R119-122]

+        elif get_settings().get("aws.AWS_ACCESS_KEY_ID"):
             assert get_settings().aws.AWS_SECRET_ACCESS_KEY and get_settings().aws.AWS_REGION_NAME, "AWS credentials are incomplete"
             os.environ["AWS_ACCESS_KEY_ID"] = get_settings().aws.AWS_ACCESS_KEY_ID
             os.environ["AWS_SECRET_ACCESS_KEY"] = get_settings().aws.AWS_SECRET_ACCESS_KEY
Evidence
The non-IMDS static path sets only AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY/AWS_REGION_NAME. The
secrets template explicitly documents AWS_SESSION_TOKEN as a supported field for temporary
credentials, but this code path ignores it.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[119-123]
pr_agent/settings/.secrets_template.toml[132-140]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
In the non-IMDS path (static AWS keys), `AWS_SESSION_TOKEN` from settings is ignored, breaking temporary/STS static credentials.
### Issue Context
IMDS mode already handles session token for both ambient creds and static fallback; only the non-IMDS static branch is missing it.
### Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[119-123]
### Implementation notes
- Read `aws.AWS_SESSION_TOKEN` from settings in this branch and set `os.environ["AWS_SESSION_TOKEN"]` when present.
- If absent, consider clearing `AWS_SESSION_TOKEN` from env to avoid a stale token interfering with static long-lived keys (match `_write_frozen_aws_creds_to_env` / `_activate_static_aws_fallback` behavior).


6. except Exception as e unused 📘
Description
The new except Exception as e: in the IMDS credential resolution does not use e, which will fail
Ruff (F841) and can break CI. This violates the repository lint/format tooling requirements.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R73-77]

+            except Exception as e:
+                get_logger().exception(
+                    "AWS_USE_IMDS: failed to resolve credentials via boto3; "
+                    "falling through to static keys"
+                )
Evidence
Ruff is configured with E/F checks enabled; the added except Exception as e: does not
reference e, which is an unused local variable and triggers lint failures.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[61-77]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`except Exception as e:` binds `e` but does not use it, triggering Ruff unused-variable lint.
## Issue Context
This is in the new IMDS credential resolution path.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[61-77]


7. Invalid async nullcontext 🐞
Description
LiteLLMAIHandler.chat_completion() uses async with (... else contextlib.nullcontext()), but
contextlib.nullcontext() is a synchronous context manager and cannot be used in an async with,
causing a runtime exception whenever _bedrock_imds is false. This breaks all non-Bedrock model
calls (and Bedrock calls when IMDS mode is off) and would also cause the new
test_refresh_not_called_for_non_bedrock_model to fail.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R386-390]

+        # Serialize env-var mutation + Bedrock call for IMDS mode to prevent concurrent
+        # requests from interleaving os.environ credentials during asyncio.gather usage.
+        _bedrock_imds = self._aws_imds_mode and 'bedrock/' in model
+        async with (self._aws_bedrock_lock if _bedrock_imds else contextlib.nullcontext()):
+            if _bedrock_imds and not self._aws_imds_fell_back:
Evidence
The new code conditionally selects either an asyncio.Lock() or contextlib.nullcontext() inside
an async with. The test suite explicitly calls chat_completion(model="gpt-4o"), which makes
_bedrock_imds false and therefore uses the contextlib.nullcontext() branch, triggering the
runtime failure.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[385-390]
tests/unittest/test_litellm_imds.py[419-436]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`chat_completion()` uses `async with (lock if cond else contextlib.nullcontext())`, but `contextlib.nullcontext()` is not an async context manager. This will raise at runtime whenever `_bedrock_imds` is false (e.g., non-Bedrock models like `gpt-4o`).
## Issue Context
The lock is only needed to serialize env-var mutation + Bedrock calls in IMDS mode; for all other cases this should be a no-op without using a synchronous context manager in an `async with`.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[385-390]
## Suggested implementation direction
Pick one of:
1) Restructure control flow:


8. IMDS refresh can stay stale 🐞
Description
_refresh_imds_credentials() re-resolves credentials via boto3 after the process has already written
AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY/AWS_SESSION_TOKEN into os.environ, so the refresh path can
effectively re-read the same env-sourced values and fail to pick up rotated role credentials in
long-lived processes.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R232-245]

+    def _refresh_imds_credentials(self):
+        """Refresh ambient AWS credentials from boto3 provider chain. Called before each Bedrock call
+        to avoid serving stale credentials from long-lived processes (EC2 roles rotate every ~6h)."""
+        try:
+            import boto3
+            creds = boto3.Session().get_credentials()
+            if creds:
+                frozen = creds.get_frozen_credentials()
+                os.environ["AWS_ACCESS_KEY_ID"] = frozen.access_key
+                os.environ["AWS_SECRET_ACCESS_KEY"] = frozen.secret_key
+                if frozen.token:
+                    os.environ["AWS_SESSION_TOKEN"] = frozen.token
+                elif "AWS_SESSION_TOKEN" in os.environ:
+                    del os.environ["AWS_SESSION_TOKEN"]
Evidence
In IMDS mode, the handler freezes resolved credentials into environment variables during __init__.
Later, the refresh function re-calls boto3.Session().get_credentials() without clearing those
environment variables first, so the refresh mechanism is not guaranteed to fetch new/rotated ambient
credentials (it can end up sourcing from the env it previously populated).

pr_agent/algo/ai_handlers/litellm_ai_handler.py[57-71]
pr_agent/algo/ai_handlers/litellm_ai_handler.py[232-245]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`_refresh_imds_credentials()` attempts to refresh ambient AWS credentials, but it re-resolves credentials via boto3 after the handler has already written AWS credentials into `os.environ`. This can cause the “refresh” to keep returning the same env-sourced credentials instead of picking up rotated role credentials.
### Issue Context
The IMDS implementation relies on mutating `os.environ` because the downstream Bedrock integration consumes env vars. To truly refresh, the boto3 resolution step must not be polluted by the already-set AWS_* env vars.
### Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[57-71]
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[232-249]
### Implementation direction (one of these)
1) **Store refreshable credentials from the initial boto3 session** *before* writing to env:
- In `__init__`, after `creds = session.get_credentials()`, store `self._aws_refreshable_creds = creds` (or store the `session`) and then write `creds.get_frozen_credentials()` into env.
- In `_refresh_imds_credentials`, use `self._aws_refreshable_creds.get_frozen_credentials()` to get updated values, and then write them into env.
2) **Temporarily clear AWS credential env vars during refresh**:
- In `_refresh_imds_credentials`, snapshot and `del` `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN` from `os.environ` before calling `boto3.Session().get_credentials()`.
- Restore by writing the newly resolved frozen credentials back into env.
Either approach ensures refresh can actually fetch new ambient credentials rather than reusing env values.
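
A sketch of option 1, assuming the stored credentials object returns fresh values from `get_frozen_credentials()` when the underlying role credentials rotate (the reviewer's option relies on this; whether it holds depends on the credential type boto3 returns). `RotatingCreds` is a hypothetical test double and `ImdsEnvRefresher` is an illustrative name, not code from the PR.

```python
from types import SimpleNamespace


class RotatingCreds:
    """Hypothetical stand-in for a refreshable credentials object."""

    def __init__(self, snapshots):
        self._snapshots = iter(snapshots)

    def get_frozen_credentials(self):
        # Each call yields the next snapshot, imitating a rotated credential.
        return next(self._snapshots)


class ImdsEnvRefresher:
    """Keeps the credentials object and re-freezes it on every refresh."""

    def __init__(self, creds, env):
        self._creds = creds  # stored up front, so refresh never re-reads env
        self._env = env

    def refresh(self):
        frozen = self._creds.get_frozen_credentials()
        self._env["AWS_ACCESS_KEY_ID"] = frozen.access_key
        self._env["AWS_SECRET_ACCESS_KEY"] = frozen.secret_key
        if frozen.token:
            self._env["AWS_SESSION_TOKEN"] = frozen.token
        else:
            self._env.pop("AWS_SESSION_TOKEN", None)


def _snap(n):
    return SimpleNamespace(
        access_key=f"FAKE-KEY-{n}",
        secret_key=f"FAKE-SECRET-{n}",
        token=f"FAKE-TOKEN-{n}",
    )


demo_env = {}
refresher = ImdsEnvRefresher(RotatingCreds([_snap(1), _snap(2)]), demo_env)
refresher.refresh()
first_key = demo_env["AWS_ACCESS_KEY_ID"]
refresher.refresh()
second_key = demo_env["AWS_ACCESS_KEY_ID"]
```

Because the refresher never consults the environment it writes to, the second refresh picks up the rotated snapshot instead of echoing the first one.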


9. IMDS ignores configured region 🐞
Description
When AWS_USE_IMDS is enabled and aws.AWS_REGION_NAME is set in settings (but not already in the
environment), the code skips region auto-resolution but also never exports the configured region
into AWS_REGION_NAME, leaving Bedrock calls without the expected region env var.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R82-94]

+            if not os.environ.get("AWS_REGION_NAME") and not get_settings().get("aws.AWS_REGION_NAME"):
+                try:
+                    region = session.region_name
+                    if region:
+                        os.environ["AWS_REGION_NAME"] = region
+                        get_logger().info(f"AWS region resolved from environment: {region}")
+                    else:
+                        get_logger().warning(
+                            "AWS_USE_IMDS: could not determine AWS region; "
+                            "set AWS_REGION_NAME explicitly"
+                        )
+                except Exception as e:
+                    get_logger().warning(f"AWS_USE_IMDS: failed to resolve region via boto3: {e}")
Evidence
In the AWS_USE_IMDS branch, region resolution only happens when BOTH the env var is missing and the
config key is missing. If the config key exists, this block is skipped, and there is no other
assignment of os.environ["AWS_REGION_NAME"] in the IMDS-success path; the config->env export exists
only in the non-IMDS static-key branch.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[82-94]
pr_agent/algo/ai_handlers/litellm_ai_handler.py[120-124]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
With `AWS_USE_IMDS=true`, if the user configured `aws.AWS_REGION_NAME` in settings and did not set `AWS_REGION_NAME` in the environment, the code skips region auto-detection but also never copies the configured region into `os.environ`. This can leave Bedrock calls without a region.
### Issue Context
The existing non-IMDS path exports config region to env. IMDS mode should preserve that behavior for region (even if static access keys are not configured).
### Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[82-95]
### Suggested fix
In the `AWS_USE_IMDS` branch, add logic such as:
- If `AWS_REGION_NAME` is not set in `os.environ` **and** `get_settings().get("aws.AWS_REGION_NAME")` is set, then assign `os.environ["AWS_REGION_NAME"] = get_settings().aws.AWS_REGION_NAME`.
- Otherwise, if both are missing, keep the current auto-resolve attempt + warning.
This ensures region is available for Bedrock regardless of whether static keys are present.


10. Hardcoded AWS keys in tests 📘
Description
The new tests embed AWS access key/secret-like strings (including an AKIA... access key and a
secret key in the canonical AWS format), which can be flagged as committed secrets and increases
credential-leakage risk. Test fixtures should avoid secret-looking values even when intended as
examples.
Code

tests/unittest/test_litellm_imds.py[R64-70]

+def _frozen_creds(access_key="AKIAIOSFODNN7EXAMPLE",
+                  secret_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
+                  token=None):
+    frozen = MagicMock()
+    frozen.access_key = access_key
+    frozen.secret_key = secret_key
+    frozen.token = token
Evidence
Compliance prohibits committing secrets/tokens anywhere in the repo, including tests. The added
_frozen_creds helper hardcodes AWS credential-shaped values.

AGENTS.md
tests/unittest/test_litellm_imds.py[64-70]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new test helper `_frozen_creds()` hardcodes AWS credential-shaped values (`AKIA...` access key and a canonical secret format). Even if they are examples, they can be treated as real secrets by scanners and violate the no-secrets policy.
## Issue Context
Tests should use clearly fake placeholders that do not match real credential patterns.
## Fix Focus Areas
- tests/unittest/test_litellm_imds.py[64-70]


11. Over-120 AWS_USE_IMDS log lines 📘
Description
New log statements exceed the configured 120-character line length, which can fail Ruff/formatting
checks and reduce readability. These strings should be wrapped or split across lines.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R70-72]

+                    get_logger().warning("AWS_USE_IMDS is set but boto3 found no credentials; falling through to static keys")
+            except Exception as e:
+                get_logger().error(f"AWS_USE_IMDS: failed to resolve credentials via boto3: {e}; falling through to static keys")
Evidence
The compliance checklist requires adherence to Ruff/isort conventions including 120-character lines.
The added AWS credential-resolution log lines are longer than 120 characters.

AGENTS.md
pr_agent/algo/ai_handlers/litellm_ai_handler.py[70-72]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Several newly added AWS_USE_IMDS logger calls exceed the 120-character line limit.
## Issue Context
Ruff line-length enforcement may fail CI, and the repo expects 120-char compliance.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[70-72]


12. No static creds on IMDS miss 🐞
Description
If AWS_USE_IMDS is set but boto3 returns no credentials (or throws), __init__ only logs “falling
through to static keys” and stashes static creds without exporting them to AWS_* env vars, so
Bedrock runs without credentials. The retry fallback is also gated on self._aws_imds_mode, which
remains False in this failure case, so static fallback never activates.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R54-96]

+        if os.environ.get("AWS_USE_IMDS", "").strip().lower() in ("1", "true", "yes"):
+            import boto3
+            session = boto3.Session()
+            try:
+                creds = session.get_credentials()
+                if creds:
+                    frozen = creds.get_frozen_credentials()
+                    os.environ["AWS_ACCESS_KEY_ID"] = frozen.access_key
+                    os.environ["AWS_SECRET_ACCESS_KEY"] = frozen.secret_key
+                    if frozen.token:
+                        os.environ["AWS_SESSION_TOKEN"] = frozen.token
+                    elif "AWS_SESSION_TOKEN" in os.environ:
+                        del os.environ["AWS_SESSION_TOKEN"]
+                    self._aws_imds_mode = True
+                    get_logger().info("Using ambient AWS credentials from IMDS/task-role/IRSA")
+                else:
+                    get_logger().warning("AWS_USE_IMDS is set but boto3 found no credentials; falling through to static keys")
+            except Exception as e:
+                get_logger().error(f"AWS_USE_IMDS: failed to resolve credentials via boto3: {e}; falling through to static keys")
+            if not os.environ.get("AWS_REGION_NAME") and not get_settings().get("aws.AWS_REGION_NAME"):
+                try:
+                    region = session.region_name
+                    if region:
+                        os.environ["AWS_REGION_NAME"] = region
+                        get_logger().info(f"AWS region resolved from environment: {region}")
+                    else:
+                        get_logger().warning("AWS_USE_IMDS: could not determine AWS region; set AWS_REGION_NAME explicitly")
+                except Exception as e:
+                    get_logger().warning(f"AWS_USE_IMDS: failed to resolve region via boto3: {e}")
+            if get_settings().get("aws.AWS_ACCESS_KEY_ID"):
+                if get_settings().aws.AWS_SECRET_ACCESS_KEY and get_settings().aws.AWS_REGION_NAME:
+                    self._aws_static_creds = {
+                        "AWS_ACCESS_KEY_ID": get_settings().aws.AWS_ACCESS_KEY_ID,
+                        "AWS_SECRET_ACCESS_KEY": get_settings().aws.AWS_SECRET_ACCESS_KEY,
+                        "AWS_REGION_NAME": get_settings().aws.AWS_REGION_NAME,
+                    }
+                    static_token = get_settings().get("aws.AWS_SESSION_TOKEN", None)
+                    if static_token:
+                        self._aws_static_creds["AWS_SESSION_TOKEN"] = static_token
+        elif get_settings().get("aws.AWS_ACCESS_KEY_ID"):
             assert get_settings().aws.AWS_SECRET_ACCESS_KEY and get_settings().aws.AWS_REGION_NAME, "AWS credentials are incomplete"
             os.environ["AWS_ACCESS_KEY_ID"] = get_settings().aws.AWS_ACCESS_KEY_ID
             os.environ["AWS_SECRET_ACCESS_KEY"] = get_settings().aws.AWS_SECRET_ACCESS_KEY
Evidence
In __init__, when AWS_USE_IMDS is set and boto3 finds no credentials (or raises), the code does not
set AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY from settings; it only logs and (optionally) populates
self._aws_static_creds. In chat_completion, both the pre-call refresh and the APIError-triggered
static fallback are conditional on self._aws_imds_mode being True, which never happens when boto3
returns None/throws, so no fallback path is available.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[54-97]
pr_agent/algo/ai_handlers/litellm_ai_handler.py[350-508]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
When `AWS_USE_IMDS` is enabled but boto3 cannot resolve ambient credentials (`get_credentials()` returns `None` or raises), the handler logs that it is “falling through to static keys” but does not actually set `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` / `AWS_REGION_NAME` from `[aws]` settings. Additionally, the runtime fallback logic is gated by `_aws_imds_mode`, so it never runs in this scenario.
## Issue Context
This breaks the documented behavior for non-AWS environments (or blocked IMDS) where users expect `AWS_USE_IMDS=true` to gracefully fall back to static keys.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[54-97]
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[350-508]
## Expected fix
- If boto3 credential resolution fails/returns none, and static keys are present in settings, set the AWS env vars immediately (same behavior as the non-IMDS branch), and log that static keys are being used.
- Optionally, add/adjust unit tests to cover: `AWS_USE_IMDS=true` + boto3 returns None/raises + static keys configured => env vars are set to static and Bedrock calls can proceed.
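
A minimal sketch of the expected fix, under stated assumptions: `export_aws_credentials` and `resolve_ambient` are hypothetical names that do not appear in the PR, and the resolver callable stands in for the boto3 lookup so the fallback logic can be exercised without AWS access. It only illustrates the ordering (ambient first, static keys exported immediately on a miss), not the handler's real structure.

```python
from typing import Callable, Mapping, MutableMapping, Optional, Tuple

# (access_key, secret_key, session_token); the token may be None
Creds = Tuple[str, str, Optional[str]]


def export_aws_credentials(
    resolve_ambient: Callable[[], Optional[Creds]],
    static_creds: Optional[Mapping[str, str]],
    env: MutableMapping[str, str],
) -> str:
    """Write credentials into env and report which source was used."""
    try:
        ambient = resolve_ambient()
    except Exception:
        # Sketch only: real code would catch the specific SDK errors it expects.
        ambient = None

    if ambient is not None:
        access_key, secret_key, token = ambient
        env["AWS_ACCESS_KEY_ID"] = access_key
        env["AWS_SECRET_ACCESS_KEY"] = secret_key
        if token:
            env["AWS_SESSION_TOKEN"] = token
        else:
            env.pop("AWS_SESSION_TOKEN", None)
        return "ambient"

    if static_creds:
        # Ambient lookup missed: export the static keys right away so that
        # the first Bedrock call already has usable credentials.
        env.update(static_creds)
        if "AWS_SESSION_TOKEN" not in static_creds:
            env.pop("AWS_SESSION_TOKEN", None)
        return "static"

    return "none"
```

In the handler itself, `env` would be `os.environ` and the return value could drive the log message that states which credential source is in use.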

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


13. Invalid raise openai.APIError usage 📘
Description
chat_completion() catches a broad exception and then does raise openai.APIError from e, which
will typically raise a TypeError because openai.APIError expects constructor arguments. This
breaks robust error handling and can hide the original failure mode behind a new exception.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R545-547]

+            except Exception as e:
+                get_logger().warning(f"Unknown error during LLM inference: {e}")
+                raise openai.APIError from e
Evidence
Robust error handling requires raising appropriate exceptions without introducing new errors; the
current re-raise pattern likely fails at runtime and also uses a broad except Exception.

Rule 3: Robust Error Handling
pr_agent/algo/ai_handlers/litellm_ai_handler.py[545-547]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`raise openai.APIError from e` is likely invalid (missing required constructor args), causing a new exception and obscuring the original error.
## Issue Context
This occurs in `LiteLLMAIHandler.chat_completion()` when an unexpected exception is caught.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[545-547]
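
To illustrate the failure mode described above without depending on the `openai` package, the sketch below uses `StrictAPIError`, a hypothetical stand-in for any exception class whose constructor requires arguments. Raising the bare class makes Python instantiate it with no arguments, so a `TypeError` surfaces instead of the intended error; a bare `raise` keeps the original exception.

```python
class StrictAPIError(Exception):
    """Hypothetical stand-in for an exception with required constructor args."""

    def __init__(self, message: str, request: object) -> None:
        super().__init__(message)
        self.request = request


def reraise_bare_class() -> None:
    try:
        raise ValueError("original failure")
    except Exception as e:
        # `raise SomeClass` calls SomeClass() with no arguments, which fails
        # here before the `from e` chaining is ever applied.
        raise StrictAPIError from e


def reraise_original() -> None:
    try:
        raise ValueError("original failure")
    except Exception:
        # A bare `raise` re-raises the caught exception unchanged.
        raise
```

Whether the handler should re-raise the original exception or wrap it in a properly constructed API error is a design choice for the PR author; the sketch only shows why the bare-class form is fragile.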




Remediation recommended

14. IMDS test response mock broken 🐞
Description
_mock_acompletion_response() assigns a lambda to mock.__getitem__, but _get_completion()
indexes the response via response["choices"]; this mock setup does not reliably provide a real
list under "choices", so the tests may not exercise the real success path (and can become
flaky/incorrect).
Code

tests/unittest/test_litellm_imds.py[R56-62]

+def _mock_acompletion_response():
+    mock = MagicMock()
+    mock.__getitem__ = lambda self, key: {
+        "choices": [{"message": {"content": "ok"}, "finish_reason": "stop"}]
+    }[key]
+    mock.dict.return_value = {"choices": [{"message": {"content": "ok"}, "finish_reason": "stop"}]}
+    return mock
Evidence
The unit test’s mock response attempts to emulate dict-style subscription by overwriting
__getitem__, while production code relies on response["choices"] and len(response["choices"])
to validate/parse the response. This mismatch can cause the handler to see MagicMock values instead
of concrete structures, weakening the test’s correctness.

tests/unittest/test_litellm_imds.py[56-62]
pr_agent/algo/ai_handlers/litellm_ai_handler.py[583-602]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The test helper `_mock_acompletion_response()` tries to support `response['choices']` by assigning a lambda to `mock.__getitem__`. The handler code uses `response['choices']` and expects a concrete list/dict structure; the current mock does not reliably model that access pattern.

## Issue Context
`LiteLLMAIHandler._get_completion()` validates `len(response['choices'])` and then reads `response['choices'][0]['message']['content']`, so the test response must behave like a mapping with real nested objects.

## Fix Focus Areas
- tests/unittest/test_litellm_imds.py[56-62]
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[583-602]

## Implementation notes
Update the mock to correctly support subscription, e.g.:
- Use `mock.__getitem__.side_effect = lambda k: {...}[k]` (instead of overwriting `__getitem__`), and ensure the returned `choices` value is a real list of dicts.
- Keep `mock.dict.return_value` as-is so `prepare_logs()` continues to work.
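
The suggested `side_effect` form could look like the sketch below. The helper name `make_mock_response` is hypothetical; the payload mirrors the one in the test file. With `MagicMock`, the pre-configured `__getitem__` is itself a mock, so its `side_effect` receives only the key.

```python
from unittest.mock import MagicMock


def make_mock_response() -> MagicMock:
    """Build a response mock that supports both response[...] and .dict()."""
    payload = {"choices": [{"message": {"content": "ok"}, "finish_reason": "stop"}]}
    mock = MagicMock()
    # Subscription returns the real nested list/dict objects from payload.
    mock.__getitem__.side_effect = lambda key: payload[key]
    # Kept so that code calling response.dict() still gets a plain dict.
    mock.dict.return_value = payload
    return mock
```

Because `response["choices"]` now yields a concrete list, `len(...)` and nested indexing behave as they would on a real mapping.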



15. Retry budget exceeded 🐞
Description
In chat_completion(), the Bedrock IMDS→static fallback performs an extra in-handler
_get_completion() call while the whole method is also tenacity-retried, so a failing request can
result in more than MODEL_RETRIES Bedrock attempts. This violates the retry contract and can amplify
load/cost during persistent failures.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R544-552]

+            except openai.APIError as e:
+                if _bedrock_imds and not self._aws_imds_fell_back and self._aws_static_creds:
+                    self._activate_static_aws_fallback()
+                    # Retry immediately while still holding the lock so that the
+                    # env-var swap is fully visible to this call. Letting @retry
+                    # handle the retry would release the lock between attempts,
+                    # allowing a concurrent coroutine to overwrite os.environ.
+                    resp, finish_reason, response_obj = await self._get_completion(**kwargs)
+                else:
Evidence
chat_completion() is decorated with tenacity stop_after_attempt(MODEL_RETRIES), but on
openai.APIError for Bedrock+IMDS it immediately retries _get_completion() after swapping to static
creds, which is an additional call outside tenacity’s attempt accounting; if that still fails,
tenacity can still perform its next attempt, increasing total calls beyond MODEL_RETRIES.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[26-27]
pr_agent/algo/ai_handlers/litellm_ai_handler.py[381-384]
pr_agent/algo/ai_handlers/litellm_ai_handler.py[544-552]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`chat_completion()` is decorated with tenacity retries, but it also performs an additional “fallback retry” inside the `except openai.APIError` block. This can exceed the configured retry budget (`MODEL_RETRIES`) during failures.
## Issue Context
The in-handler retry is used to keep the env-var swap + retry within the Bedrock lock, but it bypasses tenacity’s attempt counting.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[381-384]
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[544-552]
## What to change
- Refactor the retry logic so *all* attempts (including IMDS→static fallback) are counted within a single explicit retry loop.
- Example approach: remove the `@retry` decorator from `chat_completion()` and implement a bounded loop inside the Bedrock lock that:
1) refreshes IMDS creds (if enabled and not fallen back),
2) calls `_get_completion()`,
3) on APIError switches to static creds (once) and retries within the same bounded loop,
4) stops after `MODEL_RETRIES` total attempts.
- Ensure the resulting behavior cannot exceed `MODEL_RETRIES` total `_get_completion()` invocations per request.
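
The steps above can be sketched as a single bounded loop. Everything named here is an assumption for illustration: `complete_with_budget`, `TransientAPIError`, and the value of `MODEL_RETRIES` are stand-ins, and the Bedrock lock and IMDS refresh are left out so the attempt accounting stays visible.

```python
import asyncio
from typing import Awaitable, Callable

MODEL_RETRIES = 2  # assumed value for illustration only


class TransientAPIError(Exception):
    """Hypothetical stand-in for the API error the handler retries on."""


async def complete_with_budget(
    call: Callable[[], Awaitable[str]],
    activate_static_fallback: Callable[[], None],
    has_static_creds: bool,
    max_attempts: int = MODEL_RETRIES,
) -> str:
    """Run call() at most max_attempts times, fallback retry included."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    fell_back = False
    last_error = None
    for _ in range(max_attempts):
        try:
            return await call()
        except TransientAPIError as e:
            last_error = e
            if has_static_creds and not fell_back:
                # Switch credentials once; the next loop iteration is the
                # fallback retry and is counted like any other attempt.
                activate_static_fallback()
                fell_back = True
    raise last_error
```

In the real handler this loop would presumably run while holding the Bedrock lock, so the credential swap and the retry stay atomic with respect to other coroutines.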



16. Config read bypasses get_settings() 📘
Description
The new IMDS toggle reads AWS_USE_IMDS directly from os.environ while other configuration uses
get_settings(), creating inconsistent configuration access. This increases the chance of
divergence in behavior across environments and makes validation/normalization harder to centralize.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R56-60]

         elif 'OPENAI_API_KEY' not in os.environ:
             litellm.api_key = DUMMY_LITELLM_API_KEY
-        if get_settings().get("aws.AWS_ACCESS_KEY_ID"):
+        if os.environ.get("AWS_USE_IMDS", "").strip().lower() in ("1", "true", "yes"):
+            import boto3
+            session = boto3.Session()
Evidence
PR Compliance ID 19 requires consistent settings/config access and normalized/validated values. The
new AWS_USE_IMDS path bypasses get_settings() and directly inspects environment variables
in-line.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[56-60]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`AWS_USE_IMDS` is read via `os.environ.get(...)` instead of the repository’s standard config access (`get_settings().get(...)`), violating the consistency requirement.
## Issue Context
Dynaconf typically supports environment overrides, so `AWS_USE_IMDS` can still remain env-driven while being accessed through the same configuration mechanism.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[56-60]
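
One way to keep the normalization in a single place, whichever access path supplies the value, is a small helper like the sketch below. `is_enabled` is a hypothetical name; it accepts booleans as well as strings on the assumption that a settings layer may already have parsed the value, while a raw environment read always yields a string.

```python
def is_enabled(value: object) -> bool:
    """Normalize a flag that may arrive as a bool, a string, or None."""
    if isinstance(value, bool):
        return value
    if value is None:
        return False
    # Same truthy spellings as the inline check in the PR.
    return str(value).strip().lower() in ("1", "true", "yes")
```

The caller would pass in whatever the chosen configuration lookup returns, so the accepted spellings are defined once rather than inline at each read site.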



17. IMDS fallback creds not validated 🐞
Description
In AWS_USE_IMDS mode, if aws.AWS_ACCESS_KEY_ID is configured but AWS_SECRET_ACCESS_KEY or
AWS_REGION_NAME is missing, the static fallback is silently ignored, so fallback can never activate
and misconfiguration is only discovered later at call time.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R95-97]

+            if get_settings().get("aws.AWS_ACCESS_KEY_ID"):
+                if get_settings().aws.AWS_SECRET_ACCESS_KEY and get_settings().aws.AWS_REGION_NAME:
+                    static_creds = {
Evidence
The IMDS branch checks for aws.AWS_ACCESS_KEY_ID and then gates stashing static fallback on
secret+region being truthy, but provides no else-path for warning/assertion. The explicit
completeness assertion only exists in the non-IMDS path, so IMDS-mode fallback misconfiguration is
not surfaced early.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[95-119]
pr_agent/algo/ai_handlers/litellm_ai_handler.py[120-124]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
When `AWS_USE_IMDS` is enabled and the user provides partial static credentials (e.g., `aws.AWS_ACCESS_KEY_ID` but missing secret or region), the code silently skips stashing/using static creds. This makes the fallback path non-functional without clear feedback.
### Issue Context
Static creds are optional in IMDS mode, but if the user provides any of them (especially `AWS_ACCESS_KEY_ID`), it’s a strong signal they intend to use fallback.
### Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[95-119]
### Suggested fix
Add an explicit validation branch in IMDS mode:
- If `aws.AWS_ACCESS_KEY_ID` is set but either `aws.AWS_SECRET_ACCESS_KEY` or `aws.AWS_REGION_NAME` is missing, log a warning (or raise an assertion/error) indicating static fallback credentials are incomplete and will be ignored.
This preserves “static creds optional” while making misconfiguration diagnosable.
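
A sketch of such a validation branch, assuming the `[aws]` settings are available as a plain mapping. `collect_static_fallback` and `REQUIRED_KEYS` are hypothetical names, and the standard `logging` module stands in for the project's own logger.

```python
import logging
from typing import Dict, Mapping, Optional

logger = logging.getLogger(__name__)

REQUIRED_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION_NAME")


def collect_static_fallback(aws_settings: Mapping[str, Optional[str]]) -> Optional[Dict[str, str]]:
    """Return complete static creds, or None (with a warning if partial)."""
    if not aws_settings.get("AWS_ACCESS_KEY_ID"):
        # No access key configured: the user did not ask for a static fallback.
        return None
    missing = [key for key in REQUIRED_KEYS if not aws_settings.get(key)]
    if missing:
        logger.warning(
            "Static AWS fallback credentials are incomplete (missing: %s); ignoring them",
            ", ".join(missing),
        )
        return None
    creds = {key: str(aws_settings[key]) for key in REQUIRED_KEYS}
    token = aws_settings.get("AWS_SESSION_TOKEN")
    if token:
        creds["AWS_SESSION_TOKEN"] = str(token)
    return creds
```

Logging a warning rather than asserting keeps static credentials optional in IMDS mode while still surfacing the misconfiguration at startup.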



18. Broad except Exception for boto3 📘
Description
The new IMDS credential/region resolution catches Exception, which can mask unexpected bugs and
makes it harder to handle expected AWS SDK errors precisely. This violates the requirement to use
targeted exceptions and preserve debuggability.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R57-94]

+        if os.environ.get("AWS_USE_IMDS", "").strip().lower() in ("1", "true", "yes"):
+            import boto3
+            session = boto3.Session()
+            try:
+                creds = session.get_credentials()
+                if creds:
+                    frozen = creds.get_frozen_credentials()
+                    os.environ["AWS_ACCESS_KEY_ID"] = frozen.access_key
+                    os.environ["AWS_SECRET_ACCESS_KEY"] = frozen.secret_key
+                    if frozen.token:
+                        os.environ["AWS_SESSION_TOKEN"] = frozen.token
+                    elif "AWS_SESSION_TOKEN" in os.environ:
+                        del os.environ["AWS_SESSION_TOKEN"]
+                    self._aws_imds_mode = True
+                    get_logger().info("Using ambient AWS credentials from IMDS/task-role/IRSA")
+                else:
+                    get_logger().warning(
+                        "AWS_USE_IMDS is set but boto3 found no credentials; "
+                        "falling through to static keys"
+                    )
+            except Exception as e:
+                get_logger().exception(
+                    "AWS_USE_IMDS: failed to resolve credentials via boto3; "
+                    "falling through to static keys"
+                )
+            if not os.environ.get("AWS_REGION_NAME") and not get_settings().get("aws.AWS_REGION_NAME"):
+                try:
+                    region = session.region_name
+                    if region:
+                        os.environ["AWS_REGION_NAME"] = region
+                        get_logger().info(f"AWS region resolved from environment: {region}")
+                    else:
+                        get_logger().warning(
+                            "AWS_USE_IMDS: could not determine AWS region; "
+                            "set AWS_REGION_NAME explicitly"
+                        )
+                except Exception as e:
+                    get_logger().warning(f"AWS_USE_IMDS: failed to resolve region via boto3: {e}")
Evidence
Compliance requires avoiding broad exception handling; the IMDS path uses except Exception for
both credential and region resolution, rather than catching expected boto3/botocore exceptions.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[57-94]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The IMDS initialization code catches `Exception` broadly when calling boto3 for credentials and region detection, which can hide unexpected failures.
## Issue Context
This code path runs during handler initialization when `AWS_USE_IMDS` is enabled.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[57-94]



19. Mixed get_settings() access styles 📘
Description
The new AWS static-credential logic mixes get_settings().get("aws.X") with attribute access like
get_settings().aws.AWS_SECRET_ACCESS_KEY, which is inconsistent and risks subtle
configuration/key-path issues. This violates the requirement to use a single consistent settings
access pattern and normalize/validate config values.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R95-104]

+            if get_settings().get("aws.AWS_ACCESS_KEY_ID"):
+                if get_settings().aws.AWS_SECRET_ACCESS_KEY and get_settings().aws.AWS_REGION_NAME:
+                    static_creds = {
+                        "AWS_ACCESS_KEY_ID": get_settings().aws.AWS_ACCESS_KEY_ID,
+                        "AWS_SECRET_ACCESS_KEY": get_settings().aws.AWS_SECRET_ACCESS_KEY,
+                        "AWS_REGION_NAME": get_settings().aws.AWS_REGION_NAME,
+                    }
+                    static_token = get_settings().get("aws.AWS_SESSION_TOKEN", None)
+                    if static_token:
+                        static_creds["AWS_SESSION_TOKEN"] = static_token
Evidence
The checklist requires consistent settings access; the code checks one AWS field via
.get("aws...") but reads related fields via attribute access, mixing patterns in the same block.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[95-104]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The AWS credential block mixes Dynaconf access patterns (`get_settings().get("aws.KEY")` vs `get_settings().aws.KEY`).
## Issue Context
This logic determines whether static creds are complete and whether to stash/apply them.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[95-104]



20. Over-120 char error_msg line 📘
Description
A newly-touched f-string for error_msg appears to exceed the repository’s 120-character line
expectation, which may fail Ruff/formatting checks. This violates the lint/format compliance
requirements.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[400]

+                            error_msg = f"The image link is not [alive](img_path).\nPlease repost the original image as a comment, and send the question again with 'quote reply' (see [instructions](https://pr-agent-docs.codium.ai/tools/ask/#ask-on-images-using-the-pr-code-as-context))."
Evidence
The compliance checklist requires adherence to Ruff line-length conventions; the `error_msg =
f"..."` line contains a long embedded URL and message likely exceeding 120 chars.

AGENTS.md
pr_agent/algo/ai_handlers/litellm_ai_handler.py[400-400]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A long `error_msg` f-string is likely over the 120-character limit.
## Issue Context
This message is used when an image link returns 404.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[398-405]
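
A long message like this can be wrapped with parenthesized implicit string concatenation, which keeps each source line short without changing the resulting text. The function name `build_image_error` is hypothetical, and interpolating `img_path` into the link is an assumption; the snippet quoted above shows the literal text `(img_path)`.

```python
def build_image_error(img_path: str) -> str:
    """Build the dead-image-link message using short source lines."""
    return (
        # Assumption: the link target is the interpolated image path.
        f"The image link is not [alive]({img_path}).\n"
        "Please repost the original image as a comment, and send the question again "
        "with 'quote reply' (see [instructions]"
        "(https://pr-agent-docs.codium.ai/tools/ask/#ask-on-images-using-the-pr-code-as-context))."
    )
```

Adjacent string literals are joined at compile time, so there is no runtime cost compared with the single-line form.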



21. Blocking boto3 refresh 🐞
Description
IMDS credential refresh uses synchronous boto3 calls inside async chat_completion, so IMDS/STS
latency runs on the event-loop thread and can slow concurrent webhook processing.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R232-249]

+    def _refresh_imds_credentials(self):
+        """Refresh ambient AWS credentials from boto3 provider chain. Called before each Bedrock call
+        to avoid serving stale credentials from long-lived processes (EC2 roles rotate every ~6h)."""
+        try:
+            import boto3
+            creds = boto3.Session().get_credentials()
+            if creds:
+                frozen = creds.get_frozen_credentials()
+                os.environ["AWS_ACCESS_KEY_ID"] = frozen.access_key
+                os.environ["AWS_SECRET_ACCESS_KEY"] = frozen.secret_key
+                if frozen.token:
+                    os.environ["AWS_SESSION_TOKEN"] = frozen.token
+                elif "AWS_SESSION_TOKEN" in os.environ:
+                    del os.environ["AWS_SESSION_TOKEN"]
+            else:
+                get_logger().warning("IMDS credential refresh: boto3 returned no credentials")
+        except Exception as e:
+            get_logger().exception("IMDS credential refresh failed")
Evidence
The IMDS refresh function calls boto3 synchronously and is invoked from the async chat_completion
path for Bedrock models. In server mode, PR-Agent processes webhooks via async functions, so this
synchronous network-capable operation executes on the event loop thread.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[232-249]
pr_agent/algo/ai_handlers/litellm_ai_handler.py[377-384]
pr_agent/servers/github_app.py[38-54]
pr_agent/agent/pr_agent.py[54-117]

Agent prompt
The issue bel...
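
One common way to keep a synchronous call off the event-loop thread is `asyncio.to_thread` (Python 3.9+). The sketch below uses `blocking_refresh`, a hypothetical stand-in that sleeps instead of calling boto3, so it shows only the offloading pattern and not the PR's actual refresh logic.

```python
import asyncio
import time


def blocking_refresh() -> str:
    """Hypothetical stand-in for a synchronous credential refresh."""
    time.sleep(0.05)  # simulates IMDS/STS network latency
    return "refreshed"


async def refresh_off_loop() -> str:
    # Runs the blocking function in a worker thread; the event loop stays
    # free to schedule other coroutines while the refresh is in flight.
    return await asyncio.to_thread(blocking_refresh)
```

Note that any `os.environ` writes made by the worker are still process-wide, so a lock around the credential swap would remain necessary.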

Comment thread tests/unittest/test_litellm_imds.py Outdated
Comment thread pr_agent/algo/ai_handlers/litellm_ai_handler.py Outdated
Comment thread pr_agent/algo/ai_handlers/litellm_ai_handler.py
Ira Abramov and others added 2 commits April 5, 2026 19:13
- Replace AKIA-pattern example keys in tests with scanner-safe fakes
- Wrap log strings exceeding 120-char line limit
- Fix IMDS-miss bug: apply static keys to env when boto3 returns None/raises
  (previously _aws_imds_mode stayed False and static keys were never exported)
- Use get_logger().exception() to preserve stack traces on boto3 failures
- Add asyncio.Lock to serialize os.environ mutation + Bedrock call, preventing
  concurrent asyncio.gather calls from interleaving AWS credentials
- Add two tests covering IMDS-miss + static fallback paths

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Critical: do IMDS fallback retry inside the lock, not via @retry.
  Releasing the lock between attempts left a window for concurrent
  coroutines to overwrite os.environ before the retry ran.
- High: stash _aws_static_creds even when IMDS fails so runtime fallback
  state is consistent regardless of which init path was taken.
- Refactor: extract _static_aws_settings() test helper to eliminate 4
  copies of the inline static-credential settings setup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ira-at-work ira-at-work marked this pull request as ready for review April 5, 2026 16:20
@qodo-free-for-open-source-projects
Contributor

Review Summary by Qodo

Add AWS_USE_IMDS ambient credential support and expand Bedrock model coverage

✨ Enhancement 🧪 Tests


Walkthroughs

Description
• Add AWS_USE_IMDS support for ambient IAM credential resolution on Bedrock
  - Resolves credentials via boto3's standard provider chain (EC2 IMDS, ECS task metadata, EKS IRSA,
  Lambda)
  - Auto-resolves AWS_REGION_NAME from compute environment when not configured
  - Static keys become optional fallback with automatic retry on Bedrock API failure
• Add missing Bedrock model IDs for Sonnet 4.6 and Opus 4.5/4.6
  - Add versioned bedrock/anthropic.claude-sonnet-4-6-v1:0 entry
  - Add regional -v1:0 variants for Sonnet 4.6 (us, au, eu, jp, apac, global)
  - Complete regional coverage for Opus 4.5 and Opus 4.6 (eu, au, jp, apac)
• Harden IMDS credential handling with refresh and fallback logic
  - Wrap boto3 resolution in try/except to prevent init crashes
  - Call _refresh_imds_credentials() before each Bedrock call to avoid stale credentials
  - Correctly handle AWS_SESSION_TOKEN in fallback path for STS-derived credentials
• Add comprehensive unit tests for AWS_USE_IMDS credential resolution and fallback behavior
• Update documentation with IAM role workflow examples and Bedrock provider setup
Diagram
flowchart LR
  A["AWS_USE_IMDS env var"] -->|"true"| B["boto3 credential resolution"]
  B -->|"success"| C["IMDS credentials in env"]
  B -->|"failure"| D["Fall back to static keys"]
  C -->|"Bedrock call"| E["_refresh_imds_credentials"]
  E -->|"success"| F["Invoke Bedrock"]
  F -->|"auth failure"| G["_activate_static_aws_fallback"]
  G -->|"retry"| H["Invoke Bedrock with static creds"]
  A -->|"false/absent"| I["Use static keys only"]
  I -->|"Bedrock call"| F


File Changes

1. pr_agent/algo/ai_handlers/litellm_ai_handler.py ✨ Enhancement +295/-174

Implement IMDS credential resolution with fallback logic

• Add IMDS credential resolution in __init__ via boto3 when AWS_USE_IMDS=true
• Store IMDS mode flag and static credentials for fallback in instance variables
• Add _refresh_imds_credentials() method to refresh ambient credentials before Bedrock calls
• Add _activate_static_aws_fallback() method to swap to static credentials on API failure
• Wrap chat_completion() in async lock for Bedrock IMDS mode to prevent concurrent env-var
 mutations
• Add fallback retry logic on openai.APIError for Bedrock calls with IMDS credentials
• Handle AWS_SESSION_TOKEN correctly in both IMDS and static credential paths

pr_agent/algo/ai_handlers/litellm_ai_handler.py


2. pr_agent/algo/__init__.py ✨ Enhancement +17/-0

Expand Bedrock model ID coverage for Sonnet and Opus

• Add bedrock/anthropic.claude-sonnet-4-6-v1:0 versioned model entry
• Add bedrock/anthropic.claude-opus-4-5-20251101-v1:0 base entry
• Add regional -v1:0 variants for Sonnet 4.6: us, au, eu, jp, apac, global
• Add regional variants for Opus 4.5: eu, au, jp, apac
• Add regional -v1:0 variants for Opus 4.6: eu, au, jp, apac
• Add apac variant for Sonnet 4.6 bare name

pr_agent/algo/__init__.py


3. tests/unittest/test_litellm_imds.py 🧪 Tests +450/-0

Add unit tests for IMDS credential resolution

• Add 450 lines of comprehensive unit tests for AWS_USE_IMDS functionality
• Test credential resolution from boto3 and environment variable population
• Test AWS_SESSION_TOKEN set/clear behavior for IMDS and static credentials
• Test region auto-resolution from boto3 when not explicitly configured
• Test boto3 exception handling and graceful fallthrough to static keys
• Test static credential stashing for fallback scenarios
• Test _refresh_imds_credentials() called before Bedrock calls only
• Test fallback to static credentials on Bedrock API failure with retry
• Test session token handling in _activate_static_aws_fallback()

tests/unittest/test_litellm_imds.py


4. docs/docs/installation/github.md 📝 Documentation +30/-1

Add Bedrock provider setup documentation for GitHub Actions

• Add new "Using Amazon Bedrock" subsection under "Switching Models"
• Document static IAM credential workflow for Bedrock
• Document recommended IAM role credential workflow with AWS_USE_IMDS=true
• Include example GitHub Actions workflow snippets for both approaches
• Reference IAM policy requirements and model configuration documentation

docs/docs/installation/github.md


5. docs/docs/usage-guide/changing_a_model.md 📝 Documentation +39/-2

Add IAM role credential documentation for Bedrock

• Add new "Using IAM Role Credentials (Recommended on AWS Compute)" subsection
• Document AWS compute contexts (EC2, ECS, EKS, Lambda) and credential mechanisms
• Include table mapping compute contexts to credential resolution methods
• Provide minimal GitHub Actions workflow example with AWS_USE_IMDS=true
• Include IAM policy example for bedrock:InvokeModel permission
• Document fallback behavior when static keys are also configured

docs/docs/usage-guide/changing_a_model.md


6. pr_agent/settings/.secrets_template.toml 📝 Documentation +5/-0

Update AWS secrets template with IMDS guidance

• Add clarifying comments about when static AWS keys are optional
• Document AWS_USE_IMDS=true as preferred approach on AWS compute
• Add AWS_SESSION_TOKEN field for STS-derived temporary credentials
• Explain that static keys serve as optional fallback when IMDS is enabled

pr_agent/settings/.secrets_template.toml



@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Apr 5, 2026

Code Review by Qodo

🐞 Bugs (4) 📘 Rule violations (6) 📎 Requirement gaps (0) 🎨 UX Issues (0)



Action required

1. Invalid raise openai.APIError usage 📘 Rule violation ≡ Correctness ⭐ New
Description
chat_completion() catches a broad exception and then does raise openai.APIError from e, which
will typically raise a TypeError because openai.APIError expects constructor arguments. This
breaks robust error handling and can hide the original failure mode behind a new exception.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R545-547]

+            except Exception as e:
+                get_logger().warning(f"Unknown error during LLM inference: {e}")
+                raise openai.APIError from e
Evidence
Robust error handling requires raising appropriate exceptions without introducing new errors; the
current re-raise pattern likely fails at runtime and also uses a broad except Exception.

Rule 3: Robust Error Handling
pr_agent/algo/ai_handlers/litellm_ai_handler.py[545-547]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`raise openai.APIError from e` is likely invalid (missing required constructor args), causing a new exception and obscuring the original error.

## Issue Context
This occurs in `LiteLLMAIHandler.chat_completion()` when an unexpected exception is caught.

## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[545-547]



2. Hardcoded AWS keys in tests 📘 Rule violation ⛨ Security
Description
The new tests embed AWS access key/secret-like strings (including an AKIA... access key and a
secret key in the canonical AWS format), which can be flagged as committed secrets and increases
credential-leakage risk. Test fixtures should avoid secret-looking values even when intended as
examples.
Code

tests/unittest/test_litellm_imds.py[R64-70]

+def _frozen_creds(access_key="AKIAIOSFODNN7EXAMPLE",
+                  secret_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
+                  token=None):
+    frozen = MagicMock()
+    frozen.access_key = access_key
+    frozen.secret_key = secret_key
+    frozen.token = token
Evidence
Compliance prohibits committing secrets/tokens anywhere in the repo, including tests. The added
_frozen_creds helper hardcodes AWS credential-shaped values.

AGENTS.md
tests/unittest/test_litellm_imds.py[64-70]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new test helper `_frozen_creds()` hardcodes AWS credential-shaped values (`AKIA...` access key and a canonical secret format). Even if they are examples, they can be treated as real secrets by scanners and violate the no-secrets policy.
## Issue Context
Tests should use clearly fake placeholders that do not match real credential patterns.
## Fix Focus Areas
- tests/unittest/test_litellm_imds.py[64-70]
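
A sketch of a scanner-safe variant of the helper, assuming placeholder values of our own choosing (`FAKE_ACCESS_KEY`, `FAKE_SECRET_KEY`, and the name `frozen_creds` are hypothetical). The placeholders avoid the `AKIA` prefix and the 40-character secret shape that secret scanners commonly match.

```python
from unittest.mock import MagicMock

# Deliberately not shaped like real AWS credentials.
FAKE_ACCESS_KEY = "test-access-key-id"
FAKE_SECRET_KEY = "test-secret-access-key"


def frozen_creds(access_key=FAKE_ACCESS_KEY, secret_key=FAKE_SECRET_KEY, token=None):
    """Build a mock with the attributes read from frozen credentials."""
    frozen = MagicMock()
    frozen.access_key = access_key
    frozen.secret_key = secret_key
    frozen.token = token
    return frozen
```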



3. Over-120 AWS_USE_IMDS log lines 📘 Rule violation ⚙ Maintainability
Description
New log statements exceed the configured 120-character line length, which can fail Ruff/formatting
checks and reduce readability. These strings should be wrapped or split across lines.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R70-72]

+                    get_logger().warning("AWS_USE_IMDS is set but boto3 found no credentials; falling through to static keys")
+            except Exception as e:
+                get_logger().error(f"AWS_USE_IMDS: failed to resolve credentials via boto3: {e}; falling through to static keys")
Evidence
The compliance checklist requires adherence to Ruff/isort conventions including 120-character lines.
The added AWS credential-resolution log lines are longer than 120 characters.

AGENTS.md
pr_agent/algo/ai_handlers/litellm_ai_handler.py[70-72]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Several newly added AWS_USE_IMDS logger calls exceed the 120-character line limit.
## Issue Context
Ruff line-length enforcement may fail CI, and the repo expects 120-char compliance.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[70-72]
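
One possible way to wrap such a message, sketched with the standard `logging` module standing in for `get_logger()`. Adjacent string literals are joined at compile time, so the logged text stays identical:

```python
import logging

logger = logging.getLogger("imds_example")  # stand-in for get_logger()

# Implicit concatenation keeps each source line short without changing the message.
NO_CREDS_MSG = (
    "AWS_USE_IMDS is set but boto3 found no credentials; "
    "falling through to static keys"
)


def warn_no_credentials() -> str:
    """Log the wrapped message and return it so callers can inspect the text."""
    logger.warning(NO_CREDS_MSG)
    return NO_CREDS_MSG
```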



4. No static creds on IMDS miss 🐞 Bug ≡ Correctness
Description
If AWS_USE_IMDS is set but boto3 returns no credentials (or throws), __init__ only logs “falling
through to static keys” and stashes static creds without exporting them to AWS_* env vars, so
Bedrock runs without credentials. The retry fallback is also gated on self._aws_imds_mode, which
remains False in this failure case, so static fallback never activates.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R54-96]

+        if os.environ.get("AWS_USE_IMDS", "").strip().lower() in ("1", "true", "yes"):
+            import boto3
+            session = boto3.Session()
+            try:
+                creds = session.get_credentials()
+                if creds:
+                    frozen = creds.get_frozen_credentials()
+                    os.environ["AWS_ACCESS_KEY_ID"] = frozen.access_key
+                    os.environ["AWS_SECRET_ACCESS_KEY"] = frozen.secret_key
+                    if frozen.token:
+                        os.environ["AWS_SESSION_TOKEN"] = frozen.token
+                    elif "AWS_SESSION_TOKEN" in os.environ:
+                        del os.environ["AWS_SESSION_TOKEN"]
+                    self._aws_imds_mode = True
+                    get_logger().info("Using ambient AWS credentials from IMDS/task-role/IRSA")
+                else:
+                    get_logger().warning("AWS_USE_IMDS is set but boto3 found no credentials; falling through to static keys")
+            except Exception as e:
+                get_logger().error(f"AWS_USE_IMDS: failed to resolve credentials via boto3: {e}; falling through to static keys")
+            if not os.environ.get("AWS_REGION_NAME") and not get_settings().get("aws.AWS_REGION_NAME"):
+                try:
+                    region = session.region_name
+                    if region:
+                        os.environ["AWS_REGION_NAME"] = region
+                        get_logger().info(f"AWS region resolved from environment: {region}")
+                    else:
+                        get_logger().warning("AWS_USE_IMDS: could not determine AWS region; set AWS_REGION_NAME explicitly")
+                except Exception as e:
+                    get_logger().warning(f"AWS_USE_IMDS: failed to resolve region via boto3: {e}")
+            if get_settings().get("aws.AWS_ACCESS_KEY_ID"):
+                if get_settings().aws.AWS_SECRET_ACCESS_KEY and get_settings().aws.AWS_REGION_NAME:
+                    self._aws_static_creds = {
+                        "AWS_ACCESS_KEY_ID": get_settings().aws.AWS_ACCESS_KEY_ID,
+                        "AWS_SECRET_ACCESS_KEY": get_settings().aws.AWS_SECRET_ACCESS_KEY,
+                        "AWS_REGION_NAME": get_settings().aws.AWS_REGION_NAME,
+                    }
+                    static_token = get_settings().get("aws.AWS_SESSION_TOKEN", None)
+                    if static_token:
+                        self._aws_static_creds["AWS_SESSION_TOKEN"] = static_token
+        elif get_settings().get("aws.AWS_ACCESS_KEY_ID"):
           assert get_settings().aws.AWS_SECRET_ACCESS_KEY and get_settings().aws.AWS_REGION_NAME, "AWS credentials are incomplete"
           os.environ["AWS_ACCESS_KEY_ID"] = get_settings().aws.AWS_ACCESS_KEY_ID
           os.environ["AWS_SECRET_ACCESS_KEY"] = get_settings().aws.AWS_SECRET_ACCESS_KEY
Evidence
In __init__, when AWS_USE_IMDS is set and boto3 finds no credentials (or raises), the code does not
set AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY from settings; it only logs and (optionally) populates
self._aws_static_creds. In chat_completion, both the pre-call refresh and the APIError-triggered
static fallback are conditional on self._aws_imds_mode being True, which never happens when boto3
returns None/throws, so no fallback path is available.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[54-97]
pr_agent/algo/ai_handlers/litellm_ai_handler.py[350-508]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
When `AWS_USE_IMDS` is enabled but boto3 cannot resolve ambient credentials (`get_credentials()` returns `None` or raises), the handler logs that it is “falling through to static keys” but does not actually set `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` / `AWS_REGION_NAME` from `[aws]` settings. Additionally, the runtime fallback logic is gated by `_aws_imds_mode`, so it never runs in this scenario.
## Issue Context
This breaks the documented behavior for non-AWS environments (or blocked IMDS) where users expect `AWS_USE_IMDS=true` to gracefully fall back to static keys.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[54-97]
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[350-508]
## Expected fix
- If boto3 credential resolution fails/returns none, and static keys are present in settings, set the AWS env vars immediately (same behavior as the non-IMDS branch), and log that static keys are being used.
- Optionally, add/adjust unit tests to cover: `AWS_USE_IMDS=true` + boto3 returns None/raises + static keys configured => env vars are set to static and Bedrock calls can proceed.
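
The expected behavior could be sketched roughly as follows. This is not the handler's code: `apply_static_if_ambient_missing` is a hypothetical helper, and a plain dict stands in for `os.environ` so the logic can be tested in isolation:

```python
from typing import Mapping, MutableMapping, Optional

REQUIRED_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION_NAME")


def apply_static_if_ambient_missing(
    ambient_resolved: bool,
    static_creds: Optional[Mapping[str, str]],
    env: MutableMapping[str, str],
) -> bool:
    """Export static keys into env when ambient resolution failed. Returns True if applied."""
    if ambient_resolved or not static_creds:
        return False
    if not all(static_creds.get(key) for key in REQUIRED_KEYS):
        return False  # incomplete static config; leave env untouched
    for key in REQUIRED_KEYS:
        env[key] = static_creds[key]
    return True
```

A unit test for the scenario above would then assert that the env mapping holds the static values whenever ambient resolution returns nothing.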




Remediation recommended

5. Broad except Exception for boto3 📘 Rule violation ⛨ Security ⭐ New
Description
The new IMDS credential/region resolution catches Exception, which can mask unexpected bugs and
makes it harder to handle expected AWS SDK errors precisely. This violates the requirement to use
targeted exceptions and preserve debuggability.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R57-94]

+        if os.environ.get("AWS_USE_IMDS", "").strip().lower() in ("1", "true", "yes"):
+            import boto3
+            session = boto3.Session()
+            try:
+                creds = session.get_credentials()
+                if creds:
+                    frozen = creds.get_frozen_credentials()
+                    os.environ["AWS_ACCESS_KEY_ID"] = frozen.access_key
+                    os.environ["AWS_SECRET_ACCESS_KEY"] = frozen.secret_key
+                    if frozen.token:
+                        os.environ["AWS_SESSION_TOKEN"] = frozen.token
+                    elif "AWS_SESSION_TOKEN" in os.environ:
+                        del os.environ["AWS_SESSION_TOKEN"]
+                    self._aws_imds_mode = True
+                    get_logger().info("Using ambient AWS credentials from IMDS/task-role/IRSA")
+                else:
+                    get_logger().warning(
+                        "AWS_USE_IMDS is set but boto3 found no credentials; "
+                        "falling through to static keys"
+                    )
+            except Exception as e:
+                get_logger().exception(
+                    "AWS_USE_IMDS: failed to resolve credentials via boto3; "
+                    "falling through to static keys"
+                )
+            if not os.environ.get("AWS_REGION_NAME") and not get_settings().get("aws.AWS_REGION_NAME"):
+                try:
+                    region = session.region_name
+                    if region:
+                        os.environ["AWS_REGION_NAME"] = region
+                        get_logger().info(f"AWS region resolved from environment: {region}")
+                    else:
+                        get_logger().warning(
+                            "AWS_USE_IMDS: could not determine AWS region; "
+                            "set AWS_REGION_NAME explicitly"
+                        )
+                except Exception as e:
+                    get_logger().warning(f"AWS_USE_IMDS: failed to resolve region via boto3: {e}")
Evidence
Compliance requires avoiding broad exception handling; the IMDS path uses except Exception for
both credential and region resolution, rather than catching expected boto3/botocore exceptions.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[57-94]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The IMDS initialization code catches `Exception` broadly when calling boto3 for credentials and region detection, which can hide unexpected failures.

## Issue Context
This code path runs during handler initialization when `AWS_USE_IMDS` is enabled.

## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[57-94]



6. Mixed get_settings() access styles 📘 Rule violation ⚙ Maintainability ⭐ New
Description
The new AWS static-credential logic mixes get_settings().get("aws.X") with attribute access like
get_settings().aws.AWS_SECRET_ACCESS_KEY, which is inconsistent and risks subtle
configuration/key-path issues. This violates the requirement to use a single consistent settings
access pattern and normalize/validate config values.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R95-104]

+            if get_settings().get("aws.AWS_ACCESS_KEY_ID"):
+                if get_settings().aws.AWS_SECRET_ACCESS_KEY and get_settings().aws.AWS_REGION_NAME:
+                    static_creds = {
+                        "AWS_ACCESS_KEY_ID": get_settings().aws.AWS_ACCESS_KEY_ID,
+                        "AWS_SECRET_ACCESS_KEY": get_settings().aws.AWS_SECRET_ACCESS_KEY,
+                        "AWS_REGION_NAME": get_settings().aws.AWS_REGION_NAME,
+                    }
+                    static_token = get_settings().get("aws.AWS_SESSION_TOKEN", None)
+                    if static_token:
+                        static_creds["AWS_SESSION_TOKEN"] = static_token
Evidence
The checklist requires consistent settings access; the code checks one AWS field via
.get("aws...") but reads related fields via attribute access, mixing patterns in the same block.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[95-104]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The AWS credential block mixes Dynaconf access patterns (`get_settings().get("aws.KEY")` vs `get_settings().aws.KEY`).

## Issue Context
This logic determines whether static creds are complete and whether to stash/apply them.

## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[95-104]
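
A sketch of a single access pattern, assuming a dotted-key getter with the same shape as `get_settings().get`. The helper name `read_static_aws_creds` and the `get` parameter are hypothetical:

```python
from typing import Any, Callable, Dict, Optional

AWS_STATIC_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION_NAME")


def read_static_aws_creds(get: Callable[[str, Any], Any]) -> Optional[Dict[str, str]]:
    """Read every AWS field through the same dotted-key getter.

    Returns None when any required field is missing or empty.
    """
    creds = {key: get(f"aws.{key}", None) for key in AWS_STATIC_KEYS}
    if not all(creds.values()):
        return None
    token = get("aws.AWS_SESSION_TOKEN", None)
    if token:
        creds["AWS_SESSION_TOKEN"] = token
    return creds
```

Routing every lookup through one getter also gives a single place to validate completeness before anything is stashed or exported.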



7. Over-120 char error_msg line 📘 Rule violation ⚙ Maintainability ⭐ New
Description
A newly-touched f-string for error_msg appears to exceed the repository’s 120-character line
expectation, which may fail Ruff/formatting checks. This violates the lint/format compliance
requirements.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[400]

+                            error_msg = f"The image link is not [alive](img_path).\nPlease repost the original image as a comment, and send the question again with 'quote reply' (see [instructions](https://pr-agent-docs.codium.ai/tools/ask/#ask-on-images-using-the-pr-code-as-context))."
Evidence
The compliance checklist requires adherence to Ruff line-length conventions; the `error_msg =
f"..."` line contains a long embedded URL and message likely exceeding 120 chars.

AGENTS.md
pr_agent/algo/ai_handlers/litellm_ai_handler.py[400-400]
Best Practice: Learned patterns

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A long `error_msg` f-string is likely over the 120-character limit.

## Issue Context
This message is used when an image link returns 404.

## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[398-405]



8. Blocking boto3 refresh 🐞 Bug ➹ Performance ⭐ New
Description
IMDS credential refresh uses synchronous boto3 calls inside async chat_completion, so IMDS/STS
latency runs on the event-loop thread and can slow concurrent webhook processing.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R232-249]

+    def _refresh_imds_credentials(self):
+        """Refresh ambient AWS credentials from boto3 provider chain. Called before each Bedrock call
+        to avoid serving stale credentials from long-lived processes (EC2 roles rotate every ~6h)."""
+        try:
+            import boto3
+            creds = boto3.Session().get_credentials()
+            if creds:
+                frozen = creds.get_frozen_credentials()
+                os.environ["AWS_ACCESS_KEY_ID"] = frozen.access_key
+                os.environ["AWS_SECRET_ACCESS_KEY"] = frozen.secret_key
+                if frozen.token:
+                    os.environ["AWS_SESSION_TOKEN"] = frozen.token
+                elif "AWS_SESSION_TOKEN" in os.environ:
+                    del os.environ["AWS_SESSION_TOKEN"]
+            else:
+                get_logger().warning("IMDS credential refresh: boto3 returned no credentials")
+        except Exception as e:
+            get_logger().exception("IMDS credential refresh failed")
Evidence
The IMDS refresh function calls boto3 synchronously and is invoked from the async chat_completion
path for Bedrock models. In server mode, PR-Agent processes webhooks via async functions, so this
synchronous network-capable operation executes on the event loop thread.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[232-249]
pr_agent/algo/ai_handlers/litellm_ai_handler.py[377-384]
pr_agent/servers/github_app.py[38-54]
pr_agent/agent/pr_agent.py[54-117]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`_refresh_imds_credentials()` performs synchronous boto3 credential resolution and is called from the async Bedrock request path, which can block the event loop during IMDS/STS I/O.

## Issue Context
This code runs in FastAPI/async webhook processing and PR-Agent also parallelizes some AI calls; event-loop blocking increases latency and reduces throughput.

## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[232-249]
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[377-384]

## Implementation direction
- Make refresh non-blocking by running boto3 credential fetching in a worker thread (e.g., `await asyncio.to_thread(...)`) and only applying the resulting env mutation while holding the Bedrock/shared lock.
- Alternatively, remove per-call refresh and rely on botocore’s refreshable credentials without copying values into `os.environ` on every request (if compatible with the Bedrock/litellm integration).
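
The first direction might look roughly like this. It is a sketch under stated assumptions: `fetch_ambient_credentials` is a hypothetical blocking function standing in for the boto3 lookup (simulated here with `time.sleep`), and a plain dict stands in for `os.environ`:

```python
import asyncio
import time
from typing import Dict, Optional


def fetch_ambient_credentials() -> Optional[Dict[str, str]]:
    """Hypothetical blocking fetch standing in for the boto3 credential lookup."""
    time.sleep(0.05)  # simulates IMDS/STS latency
    return {"AWS_ACCESS_KEY_ID": "FAKE-KEY", "AWS_SECRET_ACCESS_KEY": "FAKE-SECRET"}


async def refresh_credentials(env: Dict[str, str], lock: asyncio.Lock) -> bool:
    """Run the blocking fetch in a worker thread, then mutate env only under the lock."""
    creds = await asyncio.to_thread(fetch_ambient_credentials)
    if not creds:
        return False
    async with lock:
        env.update(creds)
    return True


async def main() -> Dict[str, str]:
    env: Dict[str, str] = {}
    lock = asyncio.Lock()
    # The two fetches run in worker threads, so the event loop stays free meanwhile.
    results = await asyncio.gather(refresh_credentials(env, lock), refresh_credentials(env, lock))
    assert all(results)
    return env
```

Running `asyncio.run(main())` exercises both refreshes concurrently. Only the short `env.update` step is serialized.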



9. Fallback too broad 🐞 Bug ≡ Correctness ⭐ New
Description
Any Bedrock openai.APIError (not just credential/permission failures) triggers a swap to static
credentials and sets _aws_imds_fell_back=True, so non-credential errors can permanently switch
subsequent Bedrock calls away from ambient credentials.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R534-542]

+            except openai.APIError as e:
+                if _bedrock_imds and not self._aws_imds_fell_back and self._aws_static_creds:
+                    self._activate_static_aws_fallback()
+                    # Retry immediately while still holding the lock so that the
+                    # env-var swap is fully visible to this call. Letting @retry
+                    # handle the retry would release the lock between attempts,
+                    # allowing a concurrent coroutine to overwrite os.environ.
+                    resp, finish_reason, response_obj = await self._get_completion(**kwargs)
+                else:
Evidence
The fallback logic catches all openai.APIError and activates static credentials, setting
_aws_imds_fell_back=True. The refresh path is gated by _aws_imds_fell_back, so once set, IMDS
refresh is skipped for later calls regardless of the root cause of the initial APIError.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[251-261]
pr_agent/algo/ai_handlers/litellm_ai_handler.py[380-383]
pr_agent/algo/ai_handlers/litellm_ai_handler.py[531-544]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Static-credential fallback is triggered for any `openai.APIError` during Bedrock calls and permanently disables IMDS refresh (`_aws_imds_fell_back=True`). This can switch to static credentials even on non-auth failures and keep using them afterward.

## Issue Context
The intended behavior is fallback when ambient credentials are insufficient (e.g., AccessDenied), not when the Bedrock call fails for unrelated reasons.

## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[251-261]
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[380-383]
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[534-542]

## Implementation direction
- Trigger fallback only for credential/permission-related failures (e.g., inspect the exception’s status/code/message if available).
- Avoid making fallback permanent on the first failure: either
 - apply static creds only for the immediate retry and then restore IMDS creds, or
 - set `_aws_imds_fell_back=True` only after confirming the failure is auth-related (and/or after repeated auth failures).
- Ensure the decision logic is covered by tests for a non-auth `APIError` path (should not flip to static).
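
A heuristic for the first bullet could be sketched as below. It assumes the raised exception either exposes a `status_code` attribute or carries an AWS error code in its message. Neither is confirmed for every error LiteLLM surfaces, the marker list is illustrative rather than exhaustive, and `FakeAPIError` is a hypothetical stand-in used only to exercise the predicate:

```python
AUTH_STATUS_CODES = {401, 403}
AUTH_MARKERS = ("accessdenied", "unrecognizedclient", "expiredtoken", "invalidsignature")


def is_auth_failure(exc: Exception) -> bool:
    """Treat an error as credential-related only if its status or message says so."""
    status = getattr(exc, "status_code", None)  # assumption: present on status errors
    if status in AUTH_STATUS_CODES:
        return True
    message = str(exc).lower()
    return any(marker in message for marker in AUTH_MARKERS)


class FakeAPIError(Exception):
    """Hypothetical stand-in for the exception raised by the Bedrock call."""

    def __init__(self, message: str, status_code=None):
        super().__init__(message)
        self.status_code = status_code
```

With a predicate like this, the `except` branch would activate the static fallback only when `is_auth_failure(e)` is true and re-raise otherwise.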



10. Global AWS env races 🐞 Bug ☼ Reliability
Description
_refresh_imds_credentials and _activate_static_aws_fallback mutate process-wide os.environ during
async chat_completion, but PR-Agent can run multiple chat_completion calls concurrently. This can
cause intermittent Bedrock requests to observe credentials/region swapped by another in-flight task.
Code

pr_agent/algo/ai_handlers/litellm_ai_handler.py[R205-234]

+    def _refresh_imds_credentials(self):
+        """Refresh ambient AWS credentials from boto3 provider chain. Called before each Bedrock call
+        to avoid serving stale credentials from long-lived processes (EC2 roles rotate every ~6h)."""
+        try:
+            import boto3
+            creds = boto3.Session().get_credentials()
+            if creds:
+                frozen = creds.get_frozen_credentials()
+                os.environ["AWS_ACCESS_KEY_ID"] = frozen.access_key
+                os.environ["AWS_SECRET_ACCESS_KEY"] = frozen.secret_key
+                if frozen.token:
+                    os.environ["AWS_SESSION_TOKEN"] = frozen.token
+                elif "AWS_SESSION_TOKEN" in os.environ:
+                    del os.environ["AWS_SESSION_TOKEN"]
+            else:
+                get_logger().warning("IMDS credential refresh: boto3 returned no credentials")
+        except Exception as e:
+            get_logger().warning(f"IMDS credential refresh failed: {e}")
+
+    def _activate_static_aws_fallback(self):
+        """Swap process env to static credentials for Bedrock fallback after IMDS failure."""
+        os.environ["AWS_ACCESS_KEY_ID"] = self._aws_static_creds["AWS_ACCESS_KEY_ID"]
+        os.environ["AWS_SECRET_ACCESS_KEY"] = self._aws_static_creds["AWS_SECRET_ACCESS_KEY"]
+        os.environ["AWS_REGION_NAME"] = self._aws_static_creds["AWS_REGION_NAME"]
+        if "AWS_SESSION_TOKEN" in self._aws_static_creds:
+            os.environ["AWS_SESSION_TOKEN"] = self._aws_static_creds["AWS_SESSION_TOKEN"]
+        elif "AWS_SESSION_TOKEN" in os.environ:
+            del os.environ["AWS_SESSION_TOKEN"]
+        self._aws_imds_fell_back = True
+        get_logger().warning("Bedrock call failed with ambient (IMDS) credentials; retrying with static credentials")
Evidence
The handler updates AWS credential env vars inside _refresh_imds_credentials and
_activate_static_aws_fallback, and triggers these from chat_completion. Separately, the codebase
parallelizes AI calls using asyncio.gather, enabling concurrent chat_completion executions that can
interleave these global env mutations.

pr_agent/algo/ai_handlers/litellm_ai_handler.py[205-234]
pr_agent/algo/ai_handlers/litellm_ai_handler.py[350-352]
pr_agent/tools/pr_code_suggestions.py[701-706]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
AWS credentials are stored in `os.environ` and are refreshed/swapped during `chat_completion`. Since `chat_completion` is async and the codebase uses `asyncio.gather()` for parallel calls, concurrent invocations can race and overwrite each other’s AWS env vars mid-request.
## Issue Context
This can manifest as intermittent Bedrock auth failures or wrong-account/region usage when parallel calls are enabled.
## Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[205-234]
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[350-352]
- pr_agent/tools/pr_code_suggestions.py[701-706]
## Suggested mitigation
Choose one:
1) **Preferred**: avoid using `os.environ` for per-request AWS creds; instead pass credentials/session explicitly to the Bedrock client/provider used by LiteLLM.
2) **Minimal change**: add an `asyncio.Lock` on the handler and wrap the Bedrock path so that refresh/fallback + the actual Bedrock completion execute under the same lock (serializing Bedrock calls in IMDS mode to prevent env interleaving).
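
The minimal-change option can be illustrated with a small simulation. A module-level dict stands in for `os.environ`, and `asyncio.sleep` stands in for the Bedrock round-trip. All names here are hypothetical:

```python
import asyncio
from typing import Dict, List

ENV: Dict[str, str] = {}  # stand-in for os.environ


async def call_with_creds(key: str, lock: asyncio.Lock, seen: List[bool]) -> None:
    """Set credentials and 'call Bedrock' under one lock so no other task can swap them."""
    async with lock:
        ENV["AWS_ACCESS_KEY_ID"] = key
        await asyncio.sleep(0.01)  # simulated network call; the loop may switch tasks here
        seen.append(ENV["AWS_ACCESS_KEY_ID"] == key)


async def main() -> List[bool]:
    lock = asyncio.Lock()
    seen: List[bool] = []
    await asyncio.gather(*(call_with_creds(f"FAKE-KEY-{i}", lock, seen) for i in range(5)))
    return seen
```

Because the write and the simulated call share one critical section, every task observes the key it wrote. Without the lock, tasks resuming after the `await` could see a key written by another task. The cost is that Bedrock calls in this mode run one at a time.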




Comment thread pr_agent/algo/ai_handlers/litellm_ai_handler.py
…eholders

TESTFAKEACCESSKEYID00 / TestFakeSecretKeyNotRealForTestingOnly00 match
real AWS key length patterns and can trigger secret-scanning tools.
Replace with clearly fake FAKE-KEY / FAKE-SECRET values.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ira-at-work
Author

Addressing the qodo bot findings:

Issue 1 (credential-shaped test values): Fixed in 9420cdc. TESTFAKEACCESSKEYID00 / TestFakeSecretKeyNotRealForTestingOnly00 matched real AWS key lengths (20/40 chars). Replaced with FAKE-KEY / FAKE-SECRET.

Issue 2 (over-120 line length): False positive. The longest line in the new IMDS block is 103 characters (line 82: if not os.environ.get("AWS_REGION_NAME") and not get_settings().get("aws.AWS_REGION_NAME"):). All lines are under the configured 120-char ruff limit.

Issue 3 (static creds not applied on IMDS miss): False positive. Lines 108–119 of __init__ explicitly set AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_REGION_NAME from [aws] settings when self._aws_imds_mode is False (i.e., when IMDS resolution fails or returns no credentials). The fallback path is functional.

Issue 4 (raise openai.APIError from e): Pre-existing code that my diff shifted in line number but did not write. The - and + in the diff reflect a move, not a new addition. Out of scope for this PR.

@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Apr 5, 2026

Persistent review updated to latest commit 9420cdc

Comment thread pr_agent/algo/ai_handlers/litellm_ai_handler.py Outdated
Comment thread pr_agent/algo/ai_handlers/litellm_ai_handler.py Outdated
1. IMDS credential refresh was creating a new boto3.Session() which
   would re-read the AWS_* env vars we set in __init__, returning the
   same stale credentials instead of going back to IMDS/task-role.
   Fix: store the original boto3 credentials object (_aws_boto3_creds)
   before writing to env; _refresh_imds_credentials now calls
   get_frozen_credentials() on that stored object, which honours
   boto3's own RefreshableCredentials TTL logic.

2. When aws.AWS_REGION_NAME was configured in settings but
   AWS_REGION_NAME was absent from the environment, the IMDS branch
   skipped auto-resolution but also never exported the configured
   region, leaving Bedrock calls without a region.
   Fix: if env region is absent and a settings region is present,
   export it immediately; only fall back to session.region_name when
   neither source has a region.

Tests added for both bugs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ira-at-work
Author

Addressing the second pass of bot findings:

ID:3037074690 (stale IMDS refresh): Valid. _refresh_imds_credentials() was creating boto3.Session() which would re-read the AWS_* env vars already set in __init__, returning the same values instead of going to IMDS. Fixed in 9e00f35: the original boto3 credentials object is now stored as self._aws_boto3_creds before any env writes; the refresh method calls get_frozen_credentials() on that stored object, which honours boto3's RefreshableCredentials TTL and refresh logic. boto3.Session is no longer called during refresh.

ID:3037074691 (configured region not exported): Valid. The and not get_settings().get("aws.AWS_REGION_NAME") guard in the region block meant a configured aws.AWS_REGION_NAME was silently ignored — neither written to env nor used. Fixed in 9e00f35: if AWS_REGION_NAME is absent from env and a settings region is present, it is now exported immediately. Auto-resolve via session.region_name only runs when both sources are absent.

ID:3037062418 (raise openai.APIError from e): Pre-existing code (present in main at lines 426 and 456). My commits shifted its line number but did not introduce it. Out of scope for this PR.
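
The region precedence described for ID:3037074691 above can be sketched as a small function. This is an illustration rather than the committed code: `resolve_region` and `detect_region` are hypothetical names, with `detect_region` standing in for reading `session.region_name` and a dict standing in for `os.environ`:

```python
from typing import Callable, MutableMapping, Optional


def resolve_region(
    env: MutableMapping[str, str],
    settings_region: Optional[str],
    detect_region: Callable[[], Optional[str]],
) -> Optional[str]:
    """Precedence: existing env var, then configured setting, then auto-detection."""
    if env.get("AWS_REGION_NAME"):
        return env["AWS_REGION_NAME"]
    # detect_region is only invoked when no settings region is configured.
    region = settings_region or detect_region()
    if region:
        env["AWS_REGION_NAME"] = region
    return region
```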

…l; fail-fast on refresh error

- Extract repeated frozen-creds-to-env-var writing into a static helper
  _write_frozen_creds_to_env(), used by both __init__ and _refresh_imds_credentials.

- _refresh_imds_credentials now returns bool (True = success, False = failure)
  so callers can react without relying on stale env state.

- chat_completion call site: if refresh returns False and static creds are
  available, activate static fallback immediately (before the Bedrock call)
  rather than making a doomed round-trip and waiting for the APIError handler.

Tests added for: refresh returning False on None guard, refresh returning False
on exception, and early static fallback activation on refresh failure.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Apr 5, 2026

Persistent review updated to latest commit d4265cc

Comment thread pr_agent/algo/ai_handlers/litellm_ai_handler.py Outdated
Comment thread pr_agent/algo/ai_handlers/litellm_ai_handler.py
get_logger().exception() captures the active traceback automatically;
binding the variable triggers Ruff F841.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ira-at-work
Author

Third-pass bot findings:

ID:3037085791 (unused except Exception as e): Fixed in 5ddd51d. get_logger().exception() captures the active traceback itself; e was never needed.

ID:3037085793 (async nullcontext): False positive for this project. contextlib.nullcontext gained __aenter__/__aexit__ in Python 3.10. The project requires python >= 3.12 (pyproject.toml), so async with ... contextlib.nullcontext() is valid. test_refresh_not_called_for_non_bedrock_model passes against it.

ID:3037074690, ID:3037074691 (stale IMDS refresh, region not exported): Already addressed in commits 9e00f35 and d4265ccf from the previous round.

@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Apr 5, 2026

Persistent review updated to latest commit 5ddd51d

@ira-at-work
Author

@naorpeled fully tested on our runners, and worked nicely with a few Bedrock models I tried.

Thanks in advance!

@naorpeled
Member

/agentic_review

@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Apr 11, 2026

Persistent review updated to latest commit 5ddd51d

Member

@naorpeled naorpeled left a comment


LGTM 🙏
Once my comments are addressed I'll gladly merge this

Sorry for the delayed response and thanks for your patience!

self.streaming_required_models = STREAMING_REQUIRED_MODELS

@staticmethod
def _write_frozen_creds_to_env(frozen) -> None:
Member


nit:
let's rename to _write_frozen_aws_creds_to_env 🙏

elif "AWS_SESSION_TOKEN" in os.environ:
del os.environ["AWS_SESSION_TOKEN"]

def _refresh_imds_credentials(self) -> bool:
Member


nit:
let's rename to _refresh_aws_imds_credentials 🙏

_write_frozen_creds_to_env → _write_frozen_aws_creds_to_env
_refresh_imds_credentials  → _refresh_aws_imds_credentials

Requested by maintainer review (naorpeled) on PR The-PR-Agent#2307.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Apr 12, 2026

Persistent review updated to latest commit 1a21ab2

Comment on lines +119 to 122
elif get_settings().get("aws.AWS_ACCESS_KEY_ID"):
assert get_settings().aws.AWS_SECRET_ACCESS_KEY and get_settings().aws.AWS_REGION_NAME, "AWS credentials are incomplete"
os.environ["AWS_ACCESS_KEY_ID"] = get_settings().aws.AWS_ACCESS_KEY_ID
os.environ["AWS_SECRET_ACCESS_KEY"] = get_settings().aws.AWS_SECRET_ACCESS_KEY
Contributor


Action required

1. Static token not exported 🐞 Bug ≡ Correctness

When AWS_USE_IMDS is not set, the static-credentials path exports access key/secret/region but
never exports aws.AWS_SESSION_TOKEN, so STS-derived static credentials (that require a session
token) will fail.
Agent Prompt
### Issue description
In the non-IMDS path (static AWS keys), `AWS_SESSION_TOKEN` from settings is ignored, breaking temporary/STS static credentials.

### Issue Context
IMDS mode already handles session token for both ambient creds and static fallback; only the non-IMDS static branch is missing it.

### Fix Focus Areas
- pr_agent/algo/ai_handlers/litellm_ai_handler.py[119-123]

### Implementation notes
- Read `aws.AWS_SESSION_TOKEN` from settings in this branch and set `os.environ["AWS_SESSION_TOKEN"]` when present.
- If absent, consider clearing `AWS_SESSION_TOKEN` from env to avoid a stale token interfering with static long-lived keys (match `_write_frozen_aws_creds_to_env` / `_activate_static_aws_fallback` behavior).
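
A minimal sketch of the set/clear pattern described in the notes above. The helper name `export_static_aws_creds` and the injectable `env` mapping are hypothetical (they are not part of the PR), and writing the region to `AWS_REGION_NAME` is an assumption about the surrounding code:

```python
import os
from typing import MutableMapping, Optional


def export_static_aws_creds(
    access_key_id: str,
    secret_access_key: str,
    region_name: str,
    session_token: Optional[str] = None,
    env: Optional[MutableMapping[str, str]] = None,
) -> None:
    """Write static AWS credentials to the environment (hypothetical helper)."""
    # Default to the real process environment; tests can pass a plain dict.
    env = os.environ if env is None else env
    env["AWS_ACCESS_KEY_ID"] = access_key_id
    env["AWS_SECRET_ACCESS_KEY"] = secret_access_key
    env["AWS_REGION_NAME"] = region_name  # assumed variable name
    if session_token:
        # STS-derived credentials are only valid together with their token.
        env["AWS_SESSION_TOKEN"] = session_token
    else:
        # Long-lived keys: drop any stale token left by an earlier provider.
        env.pop("AWS_SESSION_TOKEN", None)
```

Passing a dict as `env` keeps the behavior testable without touching the real process environment.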


Ira Abramov and others added 2 commits April 12, 2026 13:45
Static keys configured without AWS_USE_IMDS never set or cleared
AWS_SESSION_TOKEN, breaking STS-derived credentials that require a
session token. Apply the same set/clear pattern used by
_write_frozen_aws_creds_to_env and _activate_static_aws_fallback.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When IMDS resolution fails and static credentials without a session
token are activated, a session token previously written by IMDS would
remain in the process environment. Add the missing elif clear to match
the pattern in _write_frozen_aws_creds_to_env and
_activate_static_aws_fallback.

Also align the _static_aws_settings test fixture so AWS_SESSION_TOKEN
appears on the .aws attribute object when provided, consistent with
how the other three credential fields are exposed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
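
For illustration only, a settings stub along the lines of the fixture change might look like the sketch below. The function name `make_static_aws_settings` and its shape are assumptions, not the PR's actual `_static_aws_settings` fixture. The point is that the optional session token is exposed through both the `.get("aws.X")` lookup and the `.aws.X` attribute path, and is absent from both when not provided:

```python
from types import SimpleNamespace
from typing import Optional


def make_static_aws_settings(
    access_key_id: str,
    secret_access_key: str,
    region_name: str,
    session_token: Optional[str] = None,
) -> SimpleNamespace:
    """Build a settings stub for tests (hypothetical, not the PR's fixture)."""
    values = {
        "AWS_ACCESS_KEY_ID": access_key_id,
        "AWS_SECRET_ACCESS_KEY": secret_access_key,
        "AWS_REGION_NAME": region_name,
    }
    if session_token is not None:
        # Only expose the token when the test supplies one.
        values["AWS_SESSION_TOKEN"] = session_token

    def get(key: str, default=None):
        # Support dotted lookups such as "aws.AWS_ACCESS_KEY_ID".
        section, _, name = key.partition(".")
        if section.lower() != "aws":
            return default
        return values.get(name, default)

    return SimpleNamespace(aws=SimpleNamespace(**values), get=get)
```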
@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Apr 12, 2026

Persistent review updated to latest commit f42f1d6

@naorpeled
Member

/agentic_review

@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Apr 15, 2026

Persistent review updated to latest commit f42f1d6

Comment thread pr_agent/algo/ai_handlers/litellm_ai_handler.py
Comment thread pr_agent/algo/ai_handlers/litellm_ai_handler.py
Comment thread pr_agent/algo/ai_handlers/litellm_ai_handler.py
Comment thread tests/unittest/test_litellm_imds.py Outdated
Ira Abramov and others added 2 commits April 15, 2026 10:52
…ed var and test exceptions

- Replace bare `except Exception` with `except botocore.exceptions.BotoCoreError` in
  both the __init__ credential resolution block and _refresh_aws_imds_credentials,
  so unexpected programming errors surface instead of being swallowed
- Update matching test side_effect values from plain Exception to
  CredentialRetrievalError so they exercise the now-narrowed catch clause
- Fix _frozen_creds helper signature to hanging-indent style
- Remove unused `mock_boto3` alias (ruff F841)

Resolves review comments from The-PR-Agent#2307

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…or tests

botocore.exceptions.ClientError is not a BotoCoreError subclass (by design:
it represents a successful HTTP roundtrip with a service-level error), so it
escaped both IMDS catch sites. This matters for AssumeRole/IRSA credential
providers which call STS and can raise AccessDenied or ExpiredTokenException.

- Extend both catches to (BotoCoreError, ClientError) with explanatory comment
- Add test_imds_sts_client_error_does_not_crash for __init__ path
- Add test_refresh_returns_false_on_sts_client_error for refresh path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
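
The effect of the narrowed and then extended catch can be sketched as follows. `resolve_or_none` and the three failing resolvers are hypothetical stand-ins for the PR's credential resolution code, and the `ImportError` fallback classes exist only so the sketch runs where botocore is not installed:

```python
try:
    from botocore.exceptions import BotoCoreError, ClientError
except ImportError:
    # Stand-ins (assumption) so the sketch runs without botocore installed;
    # like the real classes, neither derives from the other.
    class BotoCoreError(Exception):
        pass

    class ClientError(Exception):
        pass


def resolve_or_none(resolver):
    """Call resolver(); return None when credential resolution fails."""
    try:
        return resolver()
    except (BotoCoreError, ClientError):
        # ClientError is listed explicitly: it is not a BotoCoreError
        # subclass, so catching BotoCoreError alone would let it escape.
        return None


def failing_imds():
    # Client-side failure, e.g. the metadata endpoint is unreachable.
    raise BotoCoreError()


def failing_sts():
    # Service-level failure returned by STS after a successful round trip.
    raise ClientError(
        {"Error": {"Code": "AccessDenied", "Message": "not authorized"}},
        "AssumeRole",
    )


def buggy_resolver():
    # A programming error that should surface rather than be swallowed.
    raise TypeError("programming error")
```

With this shape, `resolve_or_none(failing_imds)` and `resolve_or_none(failing_sts)` both return `None`, while `resolve_or_none(buggy_resolver)` still raises `TypeError`, which is the behavior the narrowed catch is meant to preserve.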
@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Apr 15, 2026

Persistent review updated to latest commit 0acbbc7
