Watch not honoring .gitignore #59 #60

Open
MayurVirkar wants to merge 2 commits into harshkedia177:main from MayurVirkar:main

@MayurVirkar

Linked Issue: #59

watch_repo() passed no filter to watchfiles.awatch(), so the watcher received raw filesystem events for every file, including build/, dist/, node_modules/, mysql_data/, and anything else listed in .gitignore. During a build, this produced a continuous flood of events that starved the event loop.

Changes

src/axon/core/ingestion/watcher.py

  • _make_watch_filter(repo_path, gitignore_patterns) — new function returning a watchfiles-compatible callable that chains watchfiles.DefaultFilter with should_ignore(). Events for ignored paths are dropped before they ever reach the async loop.
  • watch_repo() — passes watch_filter=_make_watch_filter(repo_path, gitignore) to watchfiles.awatch().
  • _reindex_files() deleted-file branch — added missing should_ignore() guard before calling storage.remove_nodes_by_file() (defense-in-depth; the filter above means ignored paths won't reach here in normal operation).
# before — watches everything, processes changes to build/ on every compile
async for changes in watchfiles.awatch(repo_path, ...):
    ...

# after — ignored paths never surface as events
async for changes in watchfiles.awatch(
    repo_path,
    watch_filter=_make_watch_filter(repo_path, gitignore),
    ...
):
    ...
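
For readers following along, here is a minimal sketch of how such a chained filter might look. It is not the code from this PR: should_ignore_stub is a hypothetical stand-in for axon's should_ignore() (its real signature and matching rules are not shown here), and default_filter is passed in as a plain callable so the sketch runs without watchfiles installed. In the PR that role is played by watchfiles.DefaultFilter.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import Callable, Sequence

# A watchfiles-style filter takes (change, path) and returns True to keep the event.
WatchFilter = Callable[[object, str], bool]


def should_ignore_stub(relative_path: str, patterns: Sequence[str]) -> bool:
    """Hypothetical stand-in for should_ignore(): matches path components only."""
    parts = Path(relative_path).parts
    for pattern in patterns:
        name = pattern.rstrip("/")  # treat "build/" and "build" alike
        if any(fnmatch(part, name) for part in parts):
            return True
    return False


def make_watch_filter_sketch(
    repo_path: Path,
    patterns: Sequence[str],
    default_filter: WatchFilter,
) -> WatchFilter:
    """Chain a default filter with an ignore check (assumed shape, not the PR's code)."""

    def _filter(change: object, path_str: str) -> bool:
        # Step 1: the default filter gets the first veto.
        if not default_filter(change, path_str):
            return False
        # Step 2: paths outside the repo cannot be matched against patterns.
        try:
            relative = str(Path(path_str).relative_to(repo_path))
        except ValueError:
            return True  # already accepted by the default filter, so allow
        # Step 3: drop the event if the relative path is ignored.
        return not should_ignore_stub(relative, patterns)

    return _filter
```

With watchfiles installed, passing default_filter=watchfiles.DefaultFilter() should give the chaining described above, since DefaultFilter instances are callable with (change, path).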

tests/core/test_watcher.py

  • TestWatchFilter (6 tests): covers _make_watch_filter for source files (allowed), __pycache__ / node_modules (blocked via default patterns), custom gitignore patterns (build/), and paths outside the repo root.
  • test_skips_deleted_files_in_ignored_dirs: verifies the deleted-file path in _reindex_files() also respects ignore rules.
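
The deleted-file guard that the second test exercises can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: RecordingStorage is a hypothetical test double (only the method name remove_nodes_by_file() comes from the PR), is_ignored_stub stands in for should_ignore(), and handle_deleted_sketch only approximates the deleted-file branch of _reindex_files(), whose real signature is not shown.

```python
from pathlib import Path
from typing import Iterable, List, Set


class RecordingStorage:
    """Hypothetical test double that records which files were removed."""

    def __init__(self) -> None:
        self.removed: List[str] = []

    def remove_nodes_by_file(self, relative_path: str) -> None:
        self.removed.append(relative_path)


def is_ignored_stub(relative_path: str, ignored_dirs: Set[str]) -> bool:
    """Stand-in for should_ignore(): true if any path component is an ignored dir."""
    return any(part in ignored_dirs for part in Path(relative_path).parts)


def handle_deleted_sketch(
    repo_path: Path,
    deleted_paths: Iterable[Path],
    storage: RecordingStorage,
    ignored_dirs: Set[str],
) -> None:
    """Remove nodes for deleted files, skipping anything under an ignored dir."""
    for path in deleted_paths:
        relative = path.relative_to(repo_path).as_posix()
        if is_ignored_stub(relative, ignored_dirs):
            continue  # the guard: ignored paths never reach storage
        storage.remove_nodes_by_file(relative)
```

A test in this style would delete one file under an ignored directory and one source file, then assert that only the source file reached the storage layer.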

Copilot AI and others added 2 commits March 24, 2026 05:53
- Add _make_watch_filter() that wraps watchfiles.DefaultFilter with
  should_ignore() so the watcher never surfaces events from gitignored
  paths (build/, dist/, node_modules/, mysql_data/, etc.)
- Pass watch_filter= to watchfiles.awatch() in watch_repo()
- Fix deleted-file branch in _reindex_files() to also check
  should_ignore() for defense-in-depth
- Add TestWatchFilter (6 tests) and test_skips_deleted_files_in_ignored_dirs

Co-authored-by: MayurVirkar <4211478+MayurVirkar@users.noreply.github.com>
Agent-Logs-Url: https://github.com/MayurVirkar/axon/sessions/0662b817-76b3-4785-b511-bfd7a87c8b41
fix: honor .gitignore in watch mode
Copilot AI review requested due to automatic review settings March 24, 2026 05:59

Copilot AI left a comment

Pull request overview

This PR fixes Axon’s watch mode to honor .gitignore (and existing default ignore rules) by filtering filesystem events before they hit the async processing loop, preventing event floods from build artifacts and other ignored directories.

Changes:

  • Added _make_watch_filter() to combine watchfiles.DefaultFilter with should_ignore() so ignored paths never surface as watch events.
  • Updated watch_repo() to pass the new watch_filter into watchfiles.awatch().
  • Added defense-in-depth in _reindex_files() to skip delete-events for ignored paths, plus new tests around the filter and delete-path handling.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/axon/core/ingestion/watcher.py | Adds a gitignore-aware watchfiles filter and applies it to awatch; also guards delete-event handling against ignored paths. |
| tests/core/test_watcher.py | Introduces tests for the new watch filter and ensures ignored deleted paths don’t trigger storage removals. |

Comment on lines +352 to +357
# Write a .gitignore that excludes "build/" directory.
(tmp_repo / ".gitignore").write_text("build/\n", encoding="utf-8")
gitignore_patterns = ["build/"]
watch_filter = _make_watch_filter(tmp_repo, gitignore_patterns)
build_file = str(tmp_repo / "build" / "output.py")
assert watch_filter(watchfiles.Change.added, build_file) is False

Copilot AI Mar 24, 2026

test_blocks_gitignore_pattern uses build/, but build is already part of DEFAULT_IGNORE_PATTERNS (see axon.config.ignore.DEFAULT_IGNORE_PATTERNS). That means this test would still pass even if _make_watch_filter ignored the provided gitignore_patterns, so it doesn’t actually validate .gitignore handling. Consider using a pattern that is not in the default ignore set (e.g. mysql_data/ or generated/) to ensure the gitignore-specific path is being exercised.

Suggested change
- # Write a .gitignore that excludes "build/" directory.
- (tmp_repo / ".gitignore").write_text("build/\n", encoding="utf-8")
- gitignore_patterns = ["build/"]
- watch_filter = _make_watch_filter(tmp_repo, gitignore_patterns)
- build_file = str(tmp_repo / "build" / "output.py")
- assert watch_filter(watchfiles.Change.added, build_file) is False
+ # Write a .gitignore that excludes "mysql_data/" directory (not in default ignores).
+ (tmp_repo / ".gitignore").write_text("mysql_data/\n", encoding="utf-8")
+ gitignore_patterns = ["mysql_data/"]
+ watch_filter = _make_watch_filter(tmp_repo, gitignore_patterns)
+ ignored_file = str(tmp_repo / "mysql_data" / "output.py")
+ assert watch_filter(watchfiles.Change.added, ignored_file) is False
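
To make the concern concrete, here is a small sketch of a deliberately broken filter that discards its gitignore_patterns argument. The names broken_filter and DEFAULT_DIRS are hypothetical, and DEFAULT_DIRS is only an assumed subset of the real DEFAULT_IGNORE_PATTERNS (build, node_modules and __pycache__ are the entries this thread mentions as defaults).

```python
from pathlib import Path
from typing import Callable, Sequence

# Assumed subset of the default ignore set, for illustration only.
DEFAULT_DIRS = {"build", "node_modules", "__pycache__"}


def broken_filter(
    repo_path: Path, gitignore_patterns: Sequence[str]
) -> Callable[[object, str], bool]:
    """Deliberately buggy: never consults gitignore_patterns."""

    def _filter(change: object, path_str: str) -> bool:
        parts = Path(path_str).relative_to(repo_path).parts
        # Only the defaults are checked; the caller's patterns are ignored.
        return not any(part in DEFAULT_DIRS for part in parts)

    return _filter


repo = Path("/repo")

# The build/ case is blocked by the defaults, so the original test would pass
# even against this broken filter.
build_result = broken_filter(repo, ["build/"])("added", "/repo/build/output.py")

# The mysql_data/ case is let through, so a test asserting False would fail
# and expose the bug.
mysql_result = broken_filter(repo, ["mysql_data/"])("added", "/repo/mysql_data/output.py")
```

In this sketch build_result is False and mysql_result is True, which is why only a pattern outside the default set actually exercises the gitignore-specific path.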

try:
    relative = str(Path(path_str).relative_to(repo_path))
except ValueError:
    return True  # outside repo_path — leave the decision to watchfiles

Copilot AI Mar 24, 2026

The comment "outside repo_path — leave the decision to watchfiles" is misleading here: once default_filter has already returned True, returning True does not defer anything to watchfiles; it unconditionally allows the event. Update the comment (or the behavior) to match the actual semantics so future readers don’t assume an additional decision point exists.

Suggested change
- return True  # outside repo_path — leave the decision to watchfiles
+ return True  # outside repo_path — already accepted by DefaultFilter, so allow

