fix: tighten types and move textcase to module scope in custom delimiter loading by Copilot · Pull Request #277 · knitli/codeweaver

Copilot · 2026-03-25T13:51:13Z

Review feedback on the custom delimiter loading implementation flagged loose typing (list[object]) and inline textcase imports on hot paths.

`delimiter.py`

- `_load_custom_delimiters()` return type: `list[object]` → `list[Delimiter]`; the local variable is typed accordingly, which removes the `# type: ignore[assignment]` at the call site in `_load_delimiters_for_language`
- `import textcase`, `import logging`, `ConfigLanguage`, and `SemanticSearchLanguage` moved to module scope; `logger = logging.getLogger(__name__)` promoted to module level

`selector.py`

- `import textcase` moved to module scope
- `textcase: object` parameter removed from `_match_custom_ext_pair`; callers no longer pass the module around, and all `# type: ignore[attr-defined]` comments are dropped
- Fixed enum normalization for `delim_lang`: it was `textcase.snake(str(delim_lang))`, which produces `config_language_bash` for `ConfigLanguage.BASH`; it now uses `.variable`, consistent with the `_normalize` helper:

# before — wrong for enum types
return textcase.snake(str(delim_lang))  # type: ignore[attr-defined]

# after
if isinstance(delim_lang, SemanticSearchLanguage | ConfigLanguage):
    return textcase.snake(delim_lang.variable)
return textcase.snake(str(delim_lang))

isinstance union syntax intentionally kept — CodeWeaver/ruff preferred idiom.
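
To make the difference concrete, here is a minimal, self-contained sketch. The `ConfigLanguage` enum, its `variable` property, and the `snake` function are simplified stand-ins (assumptions) for the real CodeWeaver enum and `textcase.snake`; only the shape of the before/after logic is taken from this PR.

```python
import re
from enum import Enum


def snake(text: str) -> str:
    """Simplified stand-in for textcase.snake (assumption, not the real implementation)."""
    # Split lower->upper transitions, then collapse each non-alphanumeric run to "_".
    text = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", text)
    return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_").lower()


class ConfigLanguage(Enum):
    """Hypothetical miniature of the real enum."""

    BASH = "bash"

    @property
    def variable(self) -> str:
        # Assumption: `.variable` yields the bare language name.
        return self.value


def normalize_before(delim_lang: object) -> str:
    # str(ConfigLanguage.BASH) is "ConfigLanguage.BASH", so the class name leaks in.
    return snake(str(delim_lang))


def normalize_after(delim_lang: object) -> str:
    if isinstance(delim_lang, ConfigLanguage):
        return snake(delim_lang.variable)
    return snake(str(delim_lang))


before = normalize_before(ConfigLanguage.BASH)
after = normalize_after(ConfigLanguage.BASH)
print(before)
print(after)
```

Under these assumptions, the old path yields a name prefixed with the enum class, while the new path yields only the language name. Plain strings take the same route in both versions.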

Summary by Sourcery

Tighten typing and module-level setup for custom delimiter loading and selection, and correct language normalization when using enum-based custom delimiters.

Bug Fixes:

Fix language normalization for enum-based custom delimiter languages so that their snake_case names are derived from the enum variable instead of its string representation.

Enhancements:

Narrow the return type of custom delimiter loading to use the Delimiter model explicitly and remove the need for type-ignore annotations.
Promote textcase, logging, and language-related imports and logger initialization to module scope in delimiter and selector modules to avoid repeated imports and parameter plumbing.
Simplify custom extension matching by no longer threading the textcase module through function parameters and relying on the shared module-level import instead.
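
As a rough illustration of the first two enhancements, the sketch below pairs a module-level logger with a loader whose return type is `list[Delimiter]`. The `DelimiterPattern` and `Delimiter` classes here are hypothetical stand-ins with made-up fields, and `load_custom_delimiters` is a simplified name; the real `Delimiter.from_pattern()` signature may differ.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass

# Created once at import time rather than on every call.
logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class DelimiterPattern:
    """Hypothetical user-facing pattern read from settings."""

    start: str
    end: str


@dataclass(frozen=True)
class Delimiter:
    """Hypothetical concrete delimiter used by the chunker."""

    start: str
    end: str

    @classmethod
    def from_pattern(cls, pattern: DelimiterPattern) -> Delimiter:
        # Assumption: one pattern maps to exactly one delimiter.
        return cls(start=pattern.start, end=pattern.end)


def load_custom_delimiters(patterns: list[DelimiterPattern]) -> list[Delimiter]:
    """Convert patterns to delimiters with a precise element type."""
    delimiters: list[Delimiter] = []
    for pattern in patterns:
        delimiters.append(Delimiter.from_pattern(pattern))
    logger.debug("Loaded %d custom delimiters", len(delimiters))
    return delimiters


# The assignment type-checks without any `# type: ignore[assignment]`.
loaded: list[Delimiter] = load_custom_delimiters([DelimiterPattern("{", "}")])
```

Because the declared return type matches what the call site expects, a type checker has nothing to complain about, which is why the ignore marker becomes unnecessary.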

…ter loading Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com> Agent-Logs-Url: https://github.com/knitli/codeweaver/sessions/0cc04151-09d9-4c2f-9eb2-9a1f233a187f

sourcery-ai · 2026-03-25T14:44:21Z

Reviewer's Guide

Tightens typing in custom delimiter loading, promotes imports/logging to module scope to reduce hot-path overhead, and fixes enum normalization when resolving languages from custom delimiter extensions.

Class diagram for tightened delimiter loading types

classDiagram
    class Delimiter {
    }

    class SemanticSearchLanguage {
        string variable
    }

    class ConfigLanguage {
        string variable
    }

    class Governor {
        object settings
    }

    class Settings {
        list~object~ custom_delimiters
    }

    class ChunkerDelimiterModule {
        +Governor _governor
        +list~Delimiter~ _load_custom_delimiters(normalized_language str, language str)
        +list~Delimiter~ _load_delimiters_for_language(language str)
    }

    ChunkerDelimiterModule --> Governor : uses
    Governor --> Settings : has
    ChunkerDelimiterModule --> Delimiter : returns
    ChunkerDelimiterModule ..> SemanticSearchLanguage : normalizes
    ChunkerDelimiterModule ..> ConfigLanguage : normalizes

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Tighten typing for custom delimiter loading and remove ignore markers. | Change `_load_custom_delimiters` return type from `list[object]` to `list[Delimiter]` and type its local delimiters list accordingly. Remove the `type: ignore[assignment]` at the `_load_delimiters_for_language` call site now that types align. Keep the `_normalize` helper using the `SemanticSearchLanguage \| ConfigLanguage` union `isinstance` pattern for enum handling. | `src/codeweaver/engine/chunker/delimiter.py` |
| Move imports and logger creation to module scope to avoid repeated work on hot paths. | Hoist imports for `textcase`, `logging`, `ConfigLanguage`, and `SemanticSearchLanguage` from inside `_load_custom_delimiters` to the module level. Promote `logger = logging.getLogger(__name__)` to module level instead of creating it inside the function. Remove inline `textcase` import in `_load_delimiters_for_language`; rely on the module-scope import. | `src/codeweaver/engine/chunker/delimiter.py` |
| Simplify custom extension matching API and fix enum normalization in selector. | Move `textcase` import to module scope, stop threading a `textcase` parameter through `_match_custom_ext_pair`, and update `_detect_language_from_custom_ext` caller accordingly. Update `delim_lang` normalization to use `delim_lang.variable` when it is a `SemanticSearchLanguage` or `ConfigLanguage` enum, matching the `_normalize` helper semantics. Drop `type: ignore[attr-defined]` comments by using correctly-typed enum handling and direct `textcase.snake` calls. | `src/codeweaver/engine/chunker/selector.py` |

sourcery-ai

Hey - I've found 2 issues, and left some high level feedback:

The SemanticSearchLanguage | ConfigLanguage normalization logic using textcase.snake now appears in both delimiter._normalize and _match_custom_ext_pair; consider extracting this into a shared helper to avoid divergence if the enum shape changes again.
Now that textcase is imported at module scope, you may want to handle the case where it is an optional dependency (e.g., by guarding import errors or providing a fallback) if consumers can run without it.
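
If `textcase` did turn out to be optional for some consumers (this PR does not say that it is), a guarded module-scope import could look roughly like the sketch below. The fallback `snake` is a simplified approximation written for this example, not the library's algorithm, and `normalize_language_name` is a hypothetical wrapper.

```python
import re

try:
    # Prefer the real library when it is installed.
    from textcase import snake
except ImportError:

    def snake(text: str) -> str:
        """Rough fallback (assumption): approximates snake_case for simple inputs."""
        text = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", text)
        return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_").lower()


def normalize_language_name(name: str) -> str:
    """Hypothetical call site that no longer cares where `snake` came from."""
    return snake(name)


print(normalize_language_name("my language"))
```

The import still happens once at module load, so the hot path stays free of repeated imports either way.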

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The `SemanticSearchLanguage | ConfigLanguage` normalization logic using `textcase.snake` now appears in both `delimiter._normalize` and `_match_custom_ext_pair`; consider extracting this into a shared helper to avoid divergence if the enum shape changes again.
- Now that `textcase` is imported at module scope, you may want to handle the case where it is an optional dependency (e.g., by guarding import errors or providing a fallback) if consumers can run without it.

## Individual Comments

### Comment 1
<location path="src/codeweaver/engine/chunker/delimiter.py" line_range="1292-1295" />
<code_context>

-        logger = logging.getLogger(__name__)
-
         def _normalize(lang: object) -> str:
             if isinstance(lang, SemanticSearchLanguage | ConfigLanguage):
                 return textcase.snake(lang.variable)
             return textcase.snake(str(lang))

-        delimiters: list[object] = []
</code_context>
<issue_to_address>
**suggestion:** Consider centralizing language normalization logic used across delimiter loading and selector matching.

This helper duplicates the normalization logic in `_match_custom_ext_pair` (handling `SemanticSearchLanguage | ConfigLanguage` via `lang.variable`, otherwise `str(lang)`). Please consider extracting a shared utility (e.g., in a small internal helper module) so the chunker and selector stay in sync if one of them changes.
</issue_to_address>

### Comment 2
<location path="src/codeweaver/engine/chunker/selector.py" line_range="255-264" />
<code_context>
                 continue
             delim_lang = getattr(custom_delim, "language", None)
             if delim_lang is not None:
-                return textcase.snake(str(delim_lang))  # type: ignore[attr-defined]
+                if isinstance(delim_lang, SemanticSearchLanguage | ConfigLanguage):
+                    return textcase.snake(delim_lang.variable)
+                return textcase.snake(str(delim_lang))
             pair_lang = getattr(pair, "language", None)
             if pair_lang is not None:
                 if isinstance(pair_lang, SemanticSearchLanguage | ConfigLanguage):
                     return pair_lang
-                return textcase.snake(str(pair_lang))  # type: ignore[attr-defined]
+                return textcase.snake(str(pair_lang))
         return None

</code_context>
<issue_to_address>
**question:** The return shape for `language` is asymmetric between `custom_delim.language` and `pair.language`, which may be surprising for callers.

Here, `delim_lang` of type `SemanticSearchLanguage | ConfigLanguage` is converted to a normalized `str`, while `pair_lang` of the same types is returned as the enum instance. Given the shared return type `SemanticSearchLanguage | ConfigLanguage | str | None`, this inconsistency can complicate caller logic. Consider either returning normalized strings in both cases or returning enum instances in both and handling normalization at a higher layer. If you keep this behavior, documenting the intent or tightening the return type would help clarify the contract.
</issue_to_address>

sourcery-ai · 2026-03-25T14:44:22Z

src/codeweaver/engine/chunker/delimiter.py

        def _normalize(lang: object) -> str:
            if isinstance(lang, SemanticSearchLanguage | ConfigLanguage):
                return textcase.snake(lang.variable)
            return textcase.snake(str(lang))


suggestion: Consider centralizing language normalization logic used across delimiter loading and selector matching.

This helper duplicates the normalization logic in _match_custom_ext_pair (handling SemanticSearchLanguage | ConfigLanguage via lang.variable, otherwise str(lang)). Please consider extracting a shared utility (e.g., in a small internal helper module) so the chunker and selector stay in sync if one of them changes.

src/codeweaver/engine/chunker/selector.py

Copilot

Pull request overview

This PR tightens typing and reduces repeated hot-path imports in the chunker custom-delimiter loading and selection logic.

Changes:

Tighten _load_custom_delimiters() typing (list[object] → list[Delimiter]) and remove the downstream # type: ignore[assignment].
Move textcase (and related imports/logger) to module scope in delimiter.py and selector.py.
Adjust enum normalization for custom delimiter language detection in ChunkerSelector.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/codeweaver/engine/chunker/selector.py | Moves `textcase` import to module scope and updates custom extension→language normalization. |
| src/codeweaver/engine/chunker/delimiter.py | Promotes imports/logger to module scope and tightens return types for custom delimiter loading. |

src/codeweaver/engine/chunker/selector.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Adam Poulemanos <89049923+bashandbone@users.noreply.github.com>

…and new-language detection (#275)

* feat: Implement delimiter conversion for custom delimiters from settings

  Complete delimiter family system integration by implementing proper conversion from DelimiterPattern to Delimiter objects for custom delimiters loaded from user settings.

  Changes:
  - Remove TODO placeholder at delimiter.py:1289-1290
  - Initialize delimiters list before custom delimiter processing
  - Add conversion loop using Delimiter.from_pattern() for custom delimiters
  - Maintain consistent pattern with family delimiter conversion

  The keyword matching system blocker (documented in previous analysis) has been resolved - _match_keyword_delimiters() was implemented at delimiter.py:417, enabling proper handling of delimiters with empty end strings and keyword-based structure detection.

  Resolves #99

  Co-authored-by: Adam Poulemanos <bashandbone@users.noreply.github.com>

* fix: address review feedback on custom delimiter loading

  Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
  Agent-Logs-Url: https://github.com/knitli/codeweaver/sessions/66279302-fae6-4887-8c87-7b6eadfac587

* fix: remove unused type: ignore comment in test

  Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
  Agent-Logs-Url: https://github.com/knitli/codeweaver/sessions/66279302-fae6-4887-8c87-7b6eadfac587

* fix: tighten types and move textcase to module scope in custom delimiter loading (#277)

  * Initial plan

  * fix: tighten types and move textcase to module scope in custom delimiter loading

    Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
    Agent-Logs-Url: https://github.com/knitli/codeweaver/sessions/0cc04151-09d9-4c2f-9eb2-9a1f233a187f

  * Update src/codeweaver/engine/chunker/selector.py

    Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
    Signed-off-by: Adam Poulemanos <89049923+bashandbone@users.noreply.github.com>

  ---------

  Signed-off-by: Adam Poulemanos <89049923+bashandbone@users.noreply.github.com>
  Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
  Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com>
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/codeweaver/engine/chunker/selector.py

  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
  Signed-off-by: Adam Poulemanos <89049923+bashandbone@users.noreply.github.com>

---------

Signed-off-by: Adam Poulemanos <89049923+bashandbone@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Adam Poulemanos <bashandbone@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Initial plan

87bc81f

Copilot AI assigned Copilot and bashandbone Mar 25, 2026

Copilot AI mentioned this pull request Mar 25, 2026

feat: Implement custom delimiter loading with override, type safety, and new-language detection #275

Merged

Copilot started work on behalf of bashandbone March 25, 2026 13:51

fix: tighten types and move textcase to module scope in custom delimi…

2eaecdd

…ter loading Co-authored-by: bashandbone <89049923+bashandbone@users.noreply.github.com> Agent-Logs-Url: https://github.com/knitli/codeweaver/sessions/0cc04151-09d9-4c2f-9eb2-9a1f233a187f

Copilot AI changed the title ~~[WIP] Implement custom delimiter loading with override and type safety~~ fix: tighten types and move textcase to module scope in custom delimiter loading Mar 25, 2026

Copilot AI requested a review from bashandbone March 25, 2026 14:02

Copilot finished work on behalf of bashandbone March 25, 2026 14:02

bashandbone approved these changes Mar 25, 2026

bashandbone marked this pull request as ready for review March 25, 2026 14:42

Copilot AI review requested due to automatic review settings March 25, 2026 14:42

Copilot started reviewing on behalf of bashandbone March 25, 2026 14:43

sourcery-ai bot reviewed Mar 25, 2026

Copilot AI reviewed Mar 25, 2026

src/codeweaver/engine/chunker/selector.py (outdated, resolved)

Update src/codeweaver/engine/chunker/selector.py

29f23d2

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Adam Poulemanos <89049923+bashandbone@users.noreply.github.com>

bashandbone merged commit e0be205 into claude/issue-99-20260319-0103 Mar 25, 2026
6 checks passed

bashandbone deleted the copilot/sub-pr-275 branch March 25, 2026 19:05

bashandbone merged 3 commits into claude/issue-99-20260319-0103 from copilot/sub-pr-275

3 participants
