
Update uv.lock for version 0.0.116 https://claude.ai/code/session_01YQE91SdLeXwBXy835P9UwA #432

Open
vitali87 wants to merge 6 commits into main from claude/uv-run-tests-mtAJA

Conversation

Owner

@vitali87 vitali87 commented Mar 7, 2026

Summary

Type of Change

  • Bug fix
  • New feature
  • Performance improvement
  • Refactoring (no functional changes)
  • Documentation
  • CI/CD or tooling
  • Dependencies

Related Issues

Test Plan

  • Unit tests pass (make test-parallel or uv run pytest -n auto -m "not integration")
  • New tests added
  • Integration tests pass (make test-integration, requires Docker)
  • Manual testing (describe below)

Checklist

  • PR title follows Conventional Commits format
  • All pre-commit checks pass (make pre-commit)
  • No hardcoded strings in non-config/non-constants files
  • No # type: ignore, cast(), Any, or object type hints
  • No new comments or docstrings (code should be self-documenting)

claude added 6 commits March 7, 2026 08:41
Introduces cross-language/cross-service call graph tracking by:
- Adding Service and ApiEndpoint node types to the graph schema
- Adding EXPOSES_ENDPOINT, HANDLES_ENDPOINT, and CALLS_ENDPOINT relationships
- Parsing OpenAPI (v2/v3) specs and protobuf service definitions to discover API endpoints
- Detecting HTTP client calls (requests, axios, fetch, HttpClient, etc.) across 6 languages
- Matching HTTP calls to API endpoints via URL path fuzzy matching
- Linking handler functions to endpoints via operationId matching
- Integrating as Pass 3b in the GraphUpdater pipeline

This enables querying cross-service dependencies like:
"Which modules call the UserService API?" or
"What endpoints does this microservice expose?"

https://claude.ai/code/session_01Y3DTmZcZuDfNnVLz2DASaf
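
The URL path fuzzy matching mentioned in the commit message above could be sketched roughly as follows. This is an illustration only: the helper names `normalize_path` and `paths_match` and the `{}` wildcard token are assumptions, not the PR's actual `_normalize_url_path` / `_paths_match_fuzzy` implementations, which are not shown on this page.

```python
import re

# Hypothetical wildcard token; the PR's real normalization may differ.
PARAM = "{}"

# Recognize :id, <id>, and {id} style path parameters.
_PARAM_RE = re.compile(r"^(?::\w+|<[^>]+>|\{[^}]+\})$")


def normalize_path(path: str) -> list[str]:
    """Split a URL path into segments, replacing parameters with PARAM."""
    segments = [s for s in path.strip("/").split("/") if s]
    return [PARAM if _PARAM_RE.match(s) else s for s in segments]


def paths_match(call_path: str, endpoint_path: str) -> bool:
    """Segment-wise comparison where a parameter matches any single segment."""
    a, b = normalize_path(call_path), normalize_path(endpoint_path)
    if len(a) != len(b):
        return False
    return all(x == y or PARAM in (x, y) for x, y in zip(a, b))
```

Under these assumptions, a call to `/users/42` would match an endpoint declared as `/users/{id}`, while paths with a different number of segments would not match.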
Massively expands test coverage for all 3 new modules:

test_api_schema_parser.py (93 tests, 9 classes):
- APIEndpoint/GRPCMethod/ServiceSpec dataclass tests (creation, defaults, frozen, equality, hashing)
- is_openapi_file: all extensions (.yaml/.yml/.json), all patterns (openapi/swagger/api-spec/api_spec), case insensitivity, rejection of non-matching files
- is_proto_file: .proto detection, case insensitivity, rejection
- _load_yaml_or_json: valid JSON/YAML, invalid files, nonexistent files, empty files, nested data, PyYAML unavailable fallback
- parse_openapi_spec: OpenAPI 3.0, Swagger 2.0, all 7 HTTP methods, missing keys, non-dict data, non-dict paths/path_items/operations, non-HTTP methods skipped, missing optional fields, title fallback to stem, large specs (200 endpoints), YAML specs, source_path, qualified_name format
- parse_proto_file: simple/multiple services, stream responses, field values, qualified names, whitespace variations, comments, empty/missing services, many RPCs, source_path
- _sanitize_service_name: spaces, special chars, empty, unicode, dots, slashes, hyphens, underscores, long names

test_http_call_detector.py (79 tests, 5 classes):
- HTTPCallSite dataclass (creation, frozen, equality, inequality)
- _extract_url_path: full HTTPS/HTTP URLs, path-only, colon/:id/angle/<id>/curly/{id} params, mixed styles, empty, trailing slash, host-only, deep paths, ports, relative paths, hyphens, underscores, dots, numeric segments, localhost, IP addresses, multiple params
- _normalize_http_method: all 7 standard methods, all UNKNOWN mappings (request/send/fetch/do/execute/exchange/newrequest/newcall), Spring aliases (getForObject/postForObject), case insensitivity, empty string, unknown method
- _find_http_calls_in_text: requests.get/post/put/delete/patch/head/options, single/double quotes, multiline source, multiple calls, alias matching, no URL, no match, caller_qn propagation, line numbers, httpx/axios modules, URL params, spaces before paren, empty source, regex escaping, Java RestTemplate, Go http.Get, path-only URLs

test_cross_service_linker.py (82 tests, 11 classes):
- _normalize_url_path: simple, strip slashes, params, multiple params, root, empty, complex paths
- _paths_match_fuzzy: exact match, param wildcards, different lengths, different segments, mixed static/param, root, single segment, many segments, empty
- CrossServiceLinker init and services property
- discover_api_specs: OpenAPI, proto, both, subdirectories, no specs, invalid specs, multiple files, directory skipping
- _register_service: service nodes, endpoint nodes, relationships, gRPC, storage
- _register_endpoint: all properties, path indexing, multiple same path
- _register_grpc_method: properties, relationships
- link_http_calls: exact/fuzzy match, method filter, UNKNOWN method, empty/root URL, multiple calls, relationship properties, mixed match/no-match
- link_handler_functions: function/method match by operationId, type labels, no match, empty opId, multiple matches, deep QN extraction
- _match_call_to_endpoint: exact, method filter, UNKNOWN, empty/root, no match, fuzzy with params, method mismatch fallback
- Integration: full OpenAPI workflow, full proto workflow, mixed REST+gRPC

https://claude.ai/code/session_01Y3DTmZcZuDfNnVLz2DASaf
Adds exhaustive cross-language test coverage for _find_http_calls_in_text
across all 6 supported languages and every module/method in HTTP_CLIENT_PATTERNS:

Python (5 libraries, 27 tests):
- requests: get/post/put/delete/patch/head/options/request
- httpx: get/post/put/delete/patch/request + client alias
- aiohttp: get/post/delete via session alias
- urllib3: request
- urllib.request: get

JavaScript (6 libraries, 19 tests):
- axios: get/post/put/delete/patch/head/request
- got: get/post/delete
- superagent: get/post/put
- undici: request/fetch
- node-fetch: fetch/get
- fetch: standalone (no dot pattern)

TypeScript (4 tests):
- Same modules as JS: axios get/post, got get, superagent delete

Java (5 libraries, 11 tests):
- HttpClient: send
- RestTemplate: getForObject/postForObject/exchange/execute
- WebClient: get/post
- OkHttpClient: newCall
- Retrofit: execute

Go (2 libraries, 7 tests):
- net/http: Get/Post/Do/NewRequest
- resty: Get/Post/Do

Rust (3 libraries, 12 tests):
- reqwest: get/post/put/delete/send/request
- hyper: get/request/send
- surf: get/post/delete

Meta-tests (5 tests):
- Constants validation: all 6 languages present, all modules/methods match
- Cross-language URL extraction consistency
- Path parameter normalization across param styles

https://claude.ai/code/session_01Y3DTmZcZuDfNnVLz2DASaf
Tests realistic microservice architectures where services in different
languages call each other's API endpoints:

TestCrossLanguagePythonCallingJavaService (6 tests):
- Python requests/httpx/aiohttp calling Java-defined OpenAPI endpoints
- GET, POST, DELETE, parameterized paths, nested resources
- Multiple Python HTTP libraries hitting the same Java endpoint

TestCrossLanguageGoCallingPythonService (5 tests):
- Go net/http and resty calling Python FastAPI/Flask endpoints
- Parameterized endpoints, UNKNOWN method from http.Do()

TestCrossLanguageTypeScriptCallingGoService (6 tests):
- TypeScript axios/got/superagent calling Go backend API
- PATCH, DELETE with path params, nested order items

TestCrossLanguageJavaCallingRustService (4 tests):
- Java RestTemplate/WebClient/HttpClient/OkHttpClient calling Rust auth service
- UNKNOWN method fallback matching

TestCrossLanguageRustCallingTypeScriptService (4 tests):
- Rust reqwest/hyper calling TypeScript notification service
- Parameterized DELETE, batch endpoints

TestCrossLanguageMultiServiceMesh (8 tests):
- 4-service mesh: Java user-svc + Go order-svc + Rust payment-svc + TS notif-svc
- Python gateway calling all 4 services (10 calls linked)
- Go->Java, Java->Rust, Rust->TS, TS->Go cross-service calls
- Handler functions across all 4 languages (11 matched)
- Full mesh: 8 cross-service calls + 4 handlers in one test
- Library field tracks source language (requests/net/http/RestTemplate/reqwest/axios)

TestCrossLanguageProtoServices (5 tests):
- gRPC proto service + REST gateway coexisting
- Python/Go calling REST gateway that fronts gRPC service
- Handler linking to both REST and gRPC definitions

https://claude.ai/code/session_01Y3DTmZcZuDfNnVLz2DASaf
…ions and TS generics

- Strip trailing underscores/hyphens from sanitized service names so all-special-char
  inputs correctly fall back to "unknown_service"
- Fix HTTP call regex to capture all quoted args and skip HTTP method strings (GET, POST, etc.)
  when extracting URLs, fixing urllib3 and Go net/http detection
- Support TypeScript generic syntax (e.g. axios.get<User[]>(...)) in HTTP call pattern matching
- Update test expectation for trailing underscore stripping

https://claude.ai/code/session_01YQE91SdLeXwBXy835P9UwA
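
To illustrate the TypeScript generics fix described in this commit, the sketch below uses a simplified form of the call pattern quoted later in the review (literal module name only, without the catch-all identifier branch). The helper name `build_call_pattern` is hypothetical.

```python
import re


def build_call_pattern(module: str, method: str) -> re.Pattern[str]:
    """Match module.method(...) with an optional <...> generic before the paren."""
    return re.compile(
        rf"{re.escape(module)}\.{re.escape(method)}\s*(?:<[^>]*>)?\s*\(([^)]*)\)"
    )


pattern = build_call_pattern("axios", "get")

plain = pattern.search('axios.get("/users")')
generic = pattern.search('axios.get<User[]>("/users")')
# Nested generics contain a ">" before the closing one, so <[^>]*> stops early.
nested = pattern.search('axios.get<Array<User>>("/users")')
```

With this form, `plain` and `generic` both capture the argument text, while `nested` is `None`: the `<[^>]*>` group cannot span nested angle brackets, which may be worth a follow-up test.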
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant new capability to model and understand inter-service communication within a codebase. It enables the system to automatically discover API definitions from OpenAPI and Protobuf specifications, identify where these APIs are called from different parts of the codebase, and link these calls to the defined endpoints and their corresponding handler functions. This enhancement provides a clearer, more structured view of how different services interact, which is crucial for architectural analysis, impact assessment, and understanding complex microservice environments.

Highlights

  • New Graph Entities for API Modeling: Introduced new NodeLabel enums (SERVICE, API_ENDPOINT) and RelationshipType enums (EXPOSES_ENDPOINT, HANDLES_ENDPOINT, CALLS_ENDPOINT) to represent services, API endpoints, and their interactions within the code graph.
  • Cross-Service API Linking Feature: Added a comprehensive feature for discovering API specifications (OpenAPI, Protobuf), detecting HTTP/gRPC calls in source code across multiple languages, and linking these calls to their corresponding API endpoints and handler functions in the graph.
  • API Schema Parsing: Implemented a new module (api_schema_parser.py) capable of parsing OpenAPI (YAML/JSON) and Protobuf files to extract service and endpoint definitions, including HTTP methods, URL paths, operation IDs, and gRPC method details.
  • HTTP Call Detection: Developed a new module (http_call_detector.py) to identify HTTP calls within various programming languages (Python, JavaScript, TypeScript, Java, Go, Rust) by recognizing common HTTP client libraries and their method patterns, and extracting relevant call details like URL patterns and HTTP methods.
  • Integration into Graph Update Process: Integrated the new API discovery and linking logic into the main graph update workflow (graph_updater.py), ensuring that cross-service communication is automatically mapped and represented in the code graph.
  • Extensive Test Coverage: Added new, detailed unit tests for API schema parsing, HTTP call detection, and the cross-service linking logic, including scenarios for various languages and complex microservice interactions.
Changelog
  • codebase_rag/constants.py
    • Added new NodeLabel enums: SERVICE and API_ENDPOINT.
    • Added new RelationshipType enums: EXPOSES_ENDPOINT, HANDLES_ENDPOINT, and CALLS_ENDPOINT.
    • Introduced new constants for API schema file extensions, OpenAPI spec keys, Protobuf service definition keys, HTTP client library patterns for various languages, API endpoint property keys, and API protocol values.
  • codebase_rag/graph_updater.py
    • Imported HTTPCallSite and detect_http_calls_in_source from the new http_call_detector module.
    • Added a new _process_cross_service_calls method to orchestrate API specification discovery and linking of HTTP calls and handler functions.
    • Integrated the _process_cross_service_calls method into the main run method, ensuring it executes as part of the graph update process.
  • codebase_rag/parsers/api_schema_parser.py
    • Added a new file defining data structures (APIEndpoint, GRPCMethod, ServiceSpec) for representing API endpoints and services.
    • Implemented functions (is_openapi_file, is_proto_file) to identify API specification files.
    • Developed parsing logic (parse_openapi_spec, parse_proto_file) to extract detailed information from OpenAPI and Protobuf definitions.
  • codebase_rag/parsers/cross_service_linker.py
    • Added a new file containing the CrossServiceLinker class, responsible for discovering API specifications, registering services and endpoints in the graph, and linking detected HTTP calls and handler functions to these API endpoints.
  • codebase_rag/parsers/factory.py
    • Imported CrossServiceLinker.
    • Added _cross_service_linker to the __slots__ of ProcessorFactory.
    • Added a new property cross_service_linker to ProcessorFactory to provide a singleton instance of CrossServiceLinker.
  • codebase_rag/parsers/http_call_detector.py
    • Added a new file defining the HTTPCallSite dataclass to store details about detected HTTP calls.
    • Implemented detect_http_calls_in_source to scan source code using Tree-sitter and regex for HTTP client library calls.
    • Developed helper functions (_find_http_calls_in_text, _extract_url_path, _normalize_http_method) to extract URL patterns and normalize HTTP methods from detected calls.
  • codebase_rag/tests/test_api_schema_parser.py
    • Added a new file with unit tests for APIEndpoint, GRPCMethod, ServiceSpec dataclasses.
    • Included tests for is_openapi_file, is_proto_file, _load_yaml_or_json, parse_openapi_spec, parse_proto_file, and _sanitize_service_name.
  • codebase_rag/tests/test_cross_service_linker.py
    • Added a new file with unit tests for _normalize_url_path, _paths_match_fuzzy, and the CrossServiceLinker class.
    • Included comprehensive integration tests covering discovery, registration, and linking of API specs and calls across multiple languages and microservice scenarios.
  • codebase_rag/tests/test_http_call_detector.py
    • Added a new file with unit tests for HTTPCallSite dataclass.
    • Included tests for _extract_url_path, _normalize_http_method, and _find_http_calls_in_text.
    • Provided extensive cross-language tests to ensure HTTP call detection works correctly for Python, JavaScript, TypeScript, Java, Go, and Rust client libraries.
  • uv.lock
    • Updated the version of code-graph-rag from 0.0.115 to 0.0.116.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature | Command | Description
Code Review | /gemini review | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state.
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help | /gemini help | Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant new feature for discovering and linking cross-service API calls. It adds parsers for OpenAPI and Protobuf specifications, detects HTTP call sites across multiple languages, and creates new nodes (Service, ApiEndpoint) and relationships (CALLS_ENDPOINT, HANDLES_ENDPOINT, etc.) in the graph. The implementation is well-structured and includes a comprehensive suite of new tests. The overall approach is solid and the feature is a great addition. I have a couple of suggestions to improve the robustness of the regex-based parsing for protobuf files and multi-line HTTP calls.

Comment on lines +157 to +159
rpc_pattern = re.compile(
    r"rpc\s+(\w+)\s*\(\s*(\w+)\s*\)\s*returns\s*\(\s*(stream\s+)?(\w+)\s*\)"
)
Contributor

high

The regex pattern for parsing RPC definitions assumes that request and response types are simple, single-word identifiers (\w+). This will fail to parse RPCs that use qualified type names, such as google.protobuf.Empty or my.package.MyRequest. To make this more robust, the pattern should allow for dots in the type names.

Suggested change
- rpc_pattern = re.compile(
-     r"rpc\s+(\w+)\s*\(\s*(\w+)\s*\)\s*returns\s*\(\s*(stream\s+)?(\w+)\s*\)"
- )
+ rpc_pattern = re.compile(
+     r"rpc\s+(\w+)\s*\(\s*([\w.]+)\s*\)\s*returns\s*\(\s*(stream\s+)?([\w.]+)\s*\)"
+ )
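
The difference between the two patterns can be checked with a short sketch like the one below. The proto line is an illustrative example and is not taken from the PR's test data; `NARROW` and `WIDE` are just local names for the original and suggested patterns.

```python
import re

# Original pattern: request/response types must be single \w+ identifiers.
NARROW = re.compile(
    r"rpc\s+(\w+)\s*\(\s*(\w+)\s*\)\s*returns\s*\(\s*(stream\s+)?(\w+)\s*\)"
)
# Suggested pattern: dots are allowed, so qualified type names are accepted.
WIDE = re.compile(
    r"rpc\s+(\w+)\s*\(\s*([\w.]+)\s*\)\s*returns\s*\(\s*(stream\s+)?([\w.]+)\s*\)"
)

# Illustrative proto line with qualified type names.
line = "rpc Ping (google.protobuf.Empty) returns (stream my.pkg.Pong);"

narrow_match = NARROW.search(line)
wide_match = WIDE.search(line)
```

Here `narrow_match` is `None`, while `wide_match` captures the method name, both qualified types, and the `stream` marker. Neither pattern accepts a `stream` keyword on the request side, so client-streaming RPCs would still be skipped.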

Comment on lines +61 to +113
def _find_http_calls_in_text(
    source_text: str,
    module_name: str,
    method_name: str,
    module_qn: str,
    file_path: str,
) -> list[HTTPCallSite]:
    results: list[HTTPCallSite] = []

    escaped_mod = re.escape(module_name)
    escaped_method = re.escape(method_name)

    # Match patterns like: requests.get("url"), httpx.post(url), axios.get(url)
    # Also match: client.get("url") where client was imported from the module
    # Allow optional TypeScript generics between method name and opening paren
    pattern = re.compile(
        rf"(?:{escaped_mod}|[a-zA-Z_]\w*)\.{escaped_method}\s*(?:<[^>]*>)?\s*\("
        rf"([^)]*)\)",
        re.MULTILINE,
    )

    # Extract all quoted strings from an argument list
    _quoted_re = re.compile(r"""["']([^"']*)["']""")

    # HTTP method names that should be skipped when looking for URLs
    _http_methods = {"GET", "POST", "PUT", "DELETE", "PATCH", "HEAD", "OPTIONS"}

    lines = source_text.split("\n")
    for line_num, line in enumerate(lines, start=1):
        for match in pattern.finditer(line):
            args_text = match.group(1) or ""
            # Find the first quoted string that looks like a URL (not an HTTP method)
            url = ""
            for qm in _quoted_re.finditer(args_text):
                candidate = qm.group(1)
                if candidate.upper() not in _http_methods:
                    url = candidate
                    break
            url_path = _extract_url_path(url)
            http_method = _normalize_http_method(method_name)

            results.append(
                HTTPCallSite(
                    caller_qualified_name=module_qn,
                    http_method=http_method,
                    url_pattern=url_path,
                    library=module_name,
                    line_number=line_num,
                    file_path=file_path,
                )
            )

    return results
Contributor

medium

The current implementation for detecting HTTP calls iterates through the source code line by line. This means it will not detect calls that are formatted to span multiple lines, which is a common pattern for improving readability. To handle these cases, the regex pattern should be applied to the entire source text at once, rather than on a per-line basis. The line number for each match can then be calculated from the match's starting position in the full text.

def _find_http_calls_in_text(
    source_text: str,
    module_name: str,
    method_name: str,
    module_qn: str,
    file_path: str,
) -> list[HTTPCallSite]:
    results: list[HTTPCallSite] = []

    escaped_mod = re.escape(module_name)
    escaped_method = re.escape(method_name)

    # Match patterns like: requests.get("url"), httpx.post(url), axios.get(url)
    # Also match: client.get("url") where client was imported from the module
    # Allow optional TypeScript generics between method name and opening paren
    pattern = re.compile(
        rf"(?:{escaped_mod}|[a-zA-Z_]\w*)\.{escaped_method}\s*(?:<[^>]*>)?\s*\("
        rf"([^)]*)\)",
        re.MULTILINE,
    )

    # Extract all quoted strings from an argument list
    _quoted_re = re.compile(r"""["']([^"']*)["']""")

    # HTTP method names that should be skipped when looking for URLs
    _http_methods = {"GET", "POST", "PUT", "DELETE", "PATCH", "HEAD", "OPTIONS"}

    for match in pattern.finditer(source_text):
        line_num = source_text.count('\n', 0, match.start()) + 1
        args_text = match.group(1) or ""
        # Find the first quoted string that looks like a URL (not an HTTP method)
        url = ""
        for qm in _quoted_re.finditer(args_text):
            candidate = qm.group(1)
            if candidate.upper() not in _http_methods:
                url = candidate
                break
        url_path = _extract_url_path(url)
        http_method = _normalize_http_method(method_name)

        results.append(
            HTTPCallSite(
                caller_qualified_name=module_qn,
                http_method=http_method,
                url_pattern=url_path,
                library=module_name,
                line_number=line_num,
                file_path=file_path,
            )
        )

    return results
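
The effect of scanning the whole text instead of individual lines can be seen in a reduced sketch. It assumes a simplified pattern with a literal `requests.get` prefix, and `find_calls` is a hypothetical helper rather than part of the PR.

```python
import re

# Simplified pattern: literal module name only. Arguments may span lines
# because [^)]* also matches newline characters.
PATTERN = re.compile(r"requests\.get\s*\(([^)]*)\)")

source = (
    "x = 1\n"
    "resp = requests.get(\n"
    '    "https://api.example.com/users",\n'
    "    timeout=5,\n"
    ")\n"
)


def find_calls(text: str) -> list[tuple[int, str]]:
    """Return (1-based line number, raw argument text) for each match."""
    return [
        (text.count("\n", 0, m.start()) + 1, m.group(1))
        for m in PATTERN.finditer(text)
    ]


# Per-line scanning never sees the closing parenthesis on the same line.
per_line = [hit for line in source.split("\n") for hit in find_calls(line)]
# Whole-text scanning finds the call and derives the line from the offset.
whole_text = find_calls(source)
```

In this sketch `per_line` is empty and `whole_text` reports the call on line 2. Because `[^)]*` stops at the first closing parenthesis, arguments that themselves contain a call (for example `requests.get(build_url(), ...)`) would still be truncated.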

@greptile-apps
Contributor

greptile-apps bot commented Mar 7, 2026

Greptile Summary

This PR introduces cross-service API detection as a new analysis pass ("Pass 3b") in the graph-building pipeline, despite the title only mentioning a uv.lock update. Three new source files are added — an OpenAPI/Protobuf schema parser, an HTTP call-site detector, and a linker that creates SERVICE, API_ENDPOINT, EXPOSES_ENDPOINT, HANDLES_ENDPOINT, and CALLS_ENDPOINT graph relationships — along with thorough unit tests.

Key findings:

  • PR title does not follow Conventional Commits format — "Update uv.lock for version 0.0.116" does not describe the actual changes and lacks a type prefix (e.g. feat:). The required regex is ^(build|chore|ci|docs|feat|fix|perf|p?refactor|revert|style|test)(\([a-zA-Z0-9_-]+\))?!?: .+$.
  • Critical false-positive in HTTP detection regex (http_call_detector.py:77) — the [a-zA-Z_]\w* catch-all alternative causes the detector to match any .get(), .post(), etc. call in a file, not just calls on HTTP client objects, flooding the graph with spurious CALLS_ENDPOINT edges.
  • Hardcoded property-key strings (cross_service_linker.py:136-137) — "library" and "line_number" should be KEY_* constants in constants.py.
  • Inline comments without (H) marker — numerous comments in both new implementation files violate the no-inline-comments policy.
  • Loose dict type for HTTP_CLIENT_PATTERNS in constants.py should use a TypedDict per the strict-typing standard.
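
The title finding can be reproduced by applying the regex quoted in the first bullet. The replacement title below is a hypothetical example for illustration, not a title proposed anywhere in this PR.

```python
import re

# Regex quoted in the review finding above.
TITLE_RE = re.compile(
    r"^(build|chore|ci|docs|feat|fix|perf|p?refactor|revert|style|test)"
    r"(\([a-zA-Z0-9_-]+\))?!?: .+$"
)

current_title = "Update uv.lock for version 0.0.116"
# Hypothetical replacement title, for illustration only.
candidate_title = "feat(parsers): add cross-service API linking"

current_ok = TITLE_RE.match(current_title) is not None
candidate_ok = TITLE_RE.match(candidate_title) is not None
```

The current title fails because it has no type prefix followed by `: `, whereas a title of the second shape would pass the check.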

Confidence Score: 2/5

  • Not safe to merge — the HTTP detection regex produces broad false positives that would corrupt the graph with spurious relationships, and the PR title violates the required Conventional Commits convention.
  • The false-positive logic bug in _find_http_calls_in_text is a real functional defect that would affect every repository analysed (any .get() or .post() call anywhere would be flagged as an HTTP client call). The hardcoded strings and comment-policy violations are policy issues that the pre-commit hooks should also catch. The PR title mismatch with Conventional Commits is a process violation.
  • codebase_rag/parsers/http_call_detector.py (regex false-positive), codebase_rag/parsers/cross_service_linker.py (hardcoded strings), codebase_rag/constants.py (loose dict type)

Important Files Changed

Filename Overview
codebase_rag/parsers/http_call_detector.py New file that detects HTTP call sites via regex. The detection regex has a critical false-positive issue: the [a-zA-Z_]\w* catch-all alternative matches any object method (e.g. dict.get()), not just HTTP client calls. Also contains multiple inline comments without (H) prefix.
codebase_rag/parsers/cross_service_linker.py New file that registers API services/endpoints in the graph and links HTTP calls to endpoints. Logic is sound, but uses two hardcoded property-key strings ("library", "line_number") and contains inline comments without the required (H) prefix.
codebase_rag/parsers/api_schema_parser.py New file parsing OpenAPI/Swagger JSON/YAML specs and Protobuf files into ServiceSpec dataclasses. Logic is clean; minor style note that all logger.* calls use hardcoded strings rather than importing from logs.py.
codebase_rag/parsers/factory.py Adds a lazily-initialised cross_service_linker property to ProcessorFactory, following the same pattern as existing processors. Clean change.
codebase_rag/graph_updater.py Adds a new "Pass 3b" in the analysis pipeline that discovers API specs, detects HTTP calls across the AST cache, and links them. The log message "--- Pass 3b: Cross-service API linking ---" is a hardcoded string; it should live in logs.py.
codebase_rag/constants.py Adds new NodeLabel/RelationshipType enum members, KEY_* property constants, and HTTP_CLIENT_PATTERNS. The HTTP_CLIENT_PATTERNS uses a loose dict[str, dict[str, list[str]]] type that should be a TypedDict.
codebase_rag/tests/test_api_schema_parser.py Comprehensive test file for the new API schema parser. Tests cover all major paths, edge cases, and happy paths with good parametrisation.
codebase_rag/tests/test_cross_service_linker.py Tests for the cross-service linker. Coverage looks solid for registration and matching logic.
codebase_rag/tests/test_http_call_detector.py Tests for HTTP call detection. Given the overly-broad regex in the production code, some tests may pass while hiding false-positive behaviour in real codebases.

Sequence Diagram

sequenceDiagram
    participant GU as GraphUpdater
    participant CSL as CrossServiceLinker
    participant ASP as api_schema_parser
    participant HCD as http_call_detector
    participant ING as Ingestor (Neo4j)

    GU->>CSL: discover_api_specs()
    CSL->>ASP: is_openapi_file() / is_proto_file()
    ASP-->>CSL: ServiceSpec (endpoints / grpc_methods)
    CSL->>ING: ensure_node_batch(SERVICE)
    CSL->>ING: ensure_node_batch(API_ENDPOINT)
    CSL->>ING: ensure_relationship_batch(EXPOSES_ENDPOINT)

    GU->>HCD: detect_http_calls_in_source(file, AST, language, module_qn)
    HCD-->>GU: list[HTTPCallSite]

    GU->>CSL: link_http_calls(all_http_calls)
    CSL->>CSL: _match_call_to_endpoint() [exact + fuzzy path match]
    CSL->>ING: ensure_relationship_batch(CALLS_ENDPOINT)

    GU->>CSL: link_handler_functions(function_registry_items)
    CSL->>ING: ensure_relationship_batch(HANDLES_ENDPOINT)

Last reviewed commit: 61f8ca9

Comment on lines +76 to +80
pattern = re.compile(
    rf"(?:{escaped_mod}|[a-zA-Z_]\w*)\.{escaped_method}\s*(?:<[^>]*>)?\s*\("
    rf"([^)]*)\)",
    re.MULTILINE,
)
Contributor

Overly broad regex matches any object method, not just HTTP clients

The alternation (?:{escaped_mod}|[a-zA-Z_]\w*) allows any arbitrary identifier before the dot. This means when scanning a Python file for requests.get, the pattern will also match my_dict.get("key"), config.get("host"), os.environ.get("VAR"), and any other .get() call in the codebase — not just HTTP client calls.

For example, with module_name="requests" and method_name="get", the compiled pattern is:

(?:requests|[a-zA-Z_]\w*)\.get\s*...\(([^)]*)\)

This matches os.environ.get("BASE_URL") because os is [a-zA-Z_]\w*. Every .get(), .post(), .delete(), etc. call in the entire codebase will be reported as an HTTP call site, flooding the graph with false CALLS_ENDPOINT relationships.

To reduce false positives the pattern should require that the identifier before the dot was previously matched as an import from one of the target modules (e.g. import requests → accept requests.get), or at minimum require the module name to appear literally: (?:{escaped_mod})\.{escaped_method} without the catch-all [a-zA-Z_]\w* branch.
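
The false positives described here can be demonstrated with a small comparison. Both helpers are hypothetical and omit the TypeScript generics part for brevity; the `\b` anchor in the stricter variant is an addition of this sketch, not something the review proposes.

```python
import re


def broad_pattern(module: str, method: str) -> re.Pattern[str]:
    """Pattern as quoted in the review: any identifier may precede the dot."""
    return re.compile(
        rf"(?:{re.escape(module)}|[a-zA-Z_]\w*)\.{re.escape(method)}\s*\(([^)]*)\)"
    )


def literal_pattern(module: str, method: str) -> re.Pattern[str]:
    """Stricter variant: only the literal module name is accepted."""
    return re.compile(
        rf"\b{re.escape(module)}\.{re.escape(method)}\s*\(([^)]*)\)"
    )


source = (
    'base = os.environ.get("BASE_URL")\n'
    'limit = config.get("limit")\n'
    'resp = requests.get("https://api.example.com/users")\n'
)

broad_hits = [m.group(0) for m in broad_pattern("requests", "get").finditer(source)]
literal_hits = [m.group(0) for m in literal_pattern("requests", "get").finditer(source)]
```

The broad pattern reports all three `.get()` calls, while the literal pattern reports only the `requests.get(...)` call. The trade-off is that the literal form no longer detects aliased clients such as `client.get(...)`, which is why the review also suggests tracking imports.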

Comment on lines +134 to +138
properties={
    cs.KEY_HTTP_METHOD: call.http_method,
    "library": call.library,
    "line_number": call.line_number,
},
Contributor

Hardcoded string literals should be constants

"library" and "line_number" are raw string literals used as property keys in a non-constants file. Per the project coding standards, almost no string literals should appear in files outside constants.py/logs.py/etc. These should be defined in constants.py alongside the other KEY_* property constants (e.g. KEY_HTTP_METHOD) and imported via the cs alias.

Suggested change
- properties={
-     cs.KEY_HTTP_METHOD: call.http_method,
-     "library": call.library,
-     "line_number": call.line_number,
- },
+ properties={
+     cs.KEY_HTTP_METHOD: call.http_method,
+     cs.KEY_LIBRARY: call.library,
+     cs.KEY_LINE_NUMBER: call.line_number,
+ },

Add KEY_LIBRARY = "library" and KEY_LINE_NUMBER = "line_number" in constants.py.

Rule Used: ## Technical Requirements

Agentic Framework

-... (source)

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +73 to +75
# Match patterns like: requests.get("url"), httpx.post(url), axios.get(url)
# Also match: client.get("url") where client was imported from the module
# Allow optional TypeScript generics between method name and opening paren

Inline comments without (H) marker

Multiple inline comments in both new files lack the required (H) prefix. Per the project's comment policy, all inline comments must either be (H)-prefixed, type-annotation comments, or appear before any code.

All occurrences in codebase_rag/parsers/http_call_detector.py:

  • http_call_detector.py:73-75 — pattern description block
  • http_call_detector.py:82 # Extract all quoted strings from an argument list
  • http_call_detector.py:85 # HTTP method names that should be skipped when looking for URLs
  • http_call_detector.py:92 # Find the first quoted string that looks like a URL (not an HTTP method)
  • http_call_detector.py:120 # Strip protocol and host to get just the path
  • http_call_detector.py:123-126 — normalize path parameter comments
  • http_call_detector.py:128-129 — f-string and concatenation comments

All occurrences in codebase_rag/parsers/cross_service_linker.py:

  • cross_service_linker.py:103 # Index by normalized path for matching
  • cross_service_linker.py:186 # Exact path match
  • cross_service_linker.py:195 # Fuzzy path match (ignore path parameters)
  • cross_service_linker.py:209 # Replace path parameters with a placeholder

Either prefix each with # (H) or remove them entirely if the code is self-documenting.
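One way to find such comments mechanically is sketched below with the standard `tokenize` module. `unmarked_comments` is a hypothetical helper, and treating `# type:` as the "type-annotation comment" form is an assumption about the policy, not something stated in the review.

```python
import io
import tokenize

# (H) Assumed allow-list: (H)-marked comments and `# type:` comments
ALLOWED_PREFIXES = ("# (H)", "# type:")
# (H) Token types that do not count as code for the "before any code" rule
NON_CODE = (tokenize.NL, tokenize.NEWLINE, tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER)


def unmarked_comments(source: str) -> list[tuple[int, str]]:
    findings: list[tuple[int, str]] = []
    seen_code = False
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            # (H) Comments above the first code token are allowed as-is
            if seen_code and not tok.string.startswith(ALLOWED_PREFIXES):
                findings.append((tok.start[0], tok.string))
        elif tok.type not in NON_CODE:
            seen_code = True
    return findings
```

Each finding is a `(line_number, comment_text)` pair, so the result can be compared directly against a list like the one above.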

Rule Used: ## Technical Requirements

Agentic Framework

-... (source)

Prompt To Fix With AI
This is a comment left during a code review.
Path: codebase_rag/parsers/http_call_detector.py
Line: 73-75

Comment:
**Inline comments without `(H)` marker**

Multiple inline comments in both new files lack the required `(H)` prefix. Per the project's comment policy, all inline comments must either be `(H)`-prefixed, type-annotation comments, or appear before any code.

All occurrences in `codebase_rag/parsers/http_call_detector.py`:
- `http_call_detector.py:73-75` — pattern description block
- `http_call_detector.py:82` `# Extract all quoted strings from an argument list`
- `http_call_detector.py:85` `# HTTP method names that should be skipped when looking for URLs`
- `http_call_detector.py:92` `# Find the first quoted string that looks like a URL (not an HTTP method)`
- `http_call_detector.py:120` `# Strip protocol and host to get just the path`
- `http_call_detector.py:123-126` — normalize path parameter comments
- `http_call_detector.py:128-129` — f-string and concatenation comments

All occurrences in `codebase_rag/parsers/cross_service_linker.py`:
- `cross_service_linker.py:103` `# Index by normalized path for matching`
- `cross_service_linker.py:186` `# Exact path match`
- `cross_service_linker.py:195` `# Fuzzy path match (ignore path parameters)`
- `cross_service_linker.py:209` `# Replace path parameters with a placeholder`

Either prefix each with `# (H)` or remove them entirely if the code is self-documenting.

**Rule Used:** ## Technical Requirements

### Agentic Framework
-... ([source](https://app.greptile.com/review/custom-context?memory=d4240b05-b763-467a-a6bf-94f73e8b6859))

How can I resolve this? If you propose a fix, please make it concise.


PROTO_RPC_KEYWORD = "rpc"

# (H) HTTP client library patterns per language for detecting cross-service calls
HTTP_CLIENT_PATTERNS: dict[str, dict[str, list[str]]] = {

Loose dict type should be a TypedDict

dict[str, dict[str, list[str]]] is a loose, untyped shape. Per the project's strict-typing standard, this should be expressed as a TypedDict so the "modules" and "methods" keys are known to the type checker:

# In types_defs.py or constants.py
from typing import TypedDict

class HTTPClientPatterns(TypedDict):
    modules: list[str]
    methods: list[str]

HTTP_CLIENT_PATTERNS: dict[str, HTTPClientPatterns] = { ... }

This also means the patterns["modules"] / patterns["methods"] accesses in http_call_detector.py (lines 41–42) become type-safe.
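A runnable sketch of the typed table and one lookup, assuming illustrative entries. The `"python"` row and the `modules_for` helper are hypothetical; the real contents of `HTTP_CLIENT_PATTERNS` are not shown in this review.

```python
from typing import TypedDict


class HTTPClientPatterns(TypedDict):
    modules: list[str]
    methods: list[str]


# (H) Illustrative entry only; the real table lives in constants.py
HTTP_CLIENT_PATTERNS: dict[str, HTTPClientPatterns] = {
    "python": {"modules": ["requests", "httpx"], "methods": ["get", "post"]},
}


def modules_for(language: str) -> list[str]:
    # (H) A type checker knows "modules" is a valid key and yields list[str]
    patterns = HTTP_CLIENT_PATTERNS.get(language)
    return patterns["modules"] if patterns is not None else []
```

With the `TypedDict` in place, a misspelled key such as `patterns["module"]` is reported by the type checker instead of surfacing as a `KeyError` at run time.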

Rule Used: ## Technical Requirements

Agentic Framework

-... (source)

Prompt To Fix With AI
This is a comment left during a code review.
Path: codebase_rag/constants.py
Line: 1631

Comment:
**Loose dict type should be a TypedDict**

`dict[str, dict[str, list[str]]]` is a loose, untyped shape. Per the project's strict-typing standard, this should be expressed as a TypedDict so the `"modules"` and `"methods"` keys are known to the type checker:

```python
# In types_defs.py or constants.py
from typing import TypedDict

class HTTPClientPatterns(TypedDict):
    modules: list[str]
    methods: list[str]

HTTP_CLIENT_PATTERNS: dict[str, HTTPClientPatterns] = { ... }
```

This also means the `patterns["modules"]` / `patterns["methods"]` accesses in `http_call_detector.py` (lines 41–42) become type-safe.

**Rule Used:** ## Technical Requirements

### Agentic Framework
-... ([source](https://app.greptile.com/review/custom-context?memory=d4240b05-b763-467a-a6bf-94f73e8b6859))

How can I resolve this? If you propose a fix, please make it concise.

@codecov-commenter

⚠️ Please install the Codecov app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 98.52876% with 33 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| codebase_rag/parsers/http_call_detector.py | 73.52% | 18 Missing ⚠️ |
| codebase_rag/graph_updater.py | 42.85% | 12 Missing ⚠️ |
| codebase_rag/parsers/api_schema_parser.py | 98.29% | 2 Missing ⚠️ |
| codebase_rag/parsers/cross_service_linker.py | 99.05% | 1 Missing ⚠️ |


@sonarqubecloud

sonarqubecloud bot commented Mar 7, 2026
