extract: --source fs walker does not respect isSyncable prefix exclusions #202

@jamebobob

Bumped into this last night setting up a quarantine namespace (_pending/) for ambient captures from a signal-detector-style plugin. Extended isSyncable() in src/core/sync.ts to exclude the prefix and sync behaved. Then I ran gbrain extract all --source fs to check the graph layer and got "154 pages walked" on a brain that only has 61 authoritative pages. Spent a while convinced we had a quarantine leak before noticing extract uses a different walker.

src/commands/extract.ts::walkMarkdownFiles() skips leading-dot directories and leading-underscore files, but not leading-underscore directories. So _pending/originals/foo.md gets walked even when sync correctly skips it.

Nothing is getting corrupted. Extract writes don't land as real links because auto-link only creates edges between endpoints that exist in the DB, and quarantined content isn't indexed. But the page count is misleading and extract is doing work on content I asked it not to touch.

One-line fix, matching the file-level behavior at dir level:

for (const entry of readdirSync(d)) {
  if (entry.startsWith('.')) continue;
  if (entry.startsWith('_')) continue;  // new: also skips leading-underscore directories such as _pending/
  // ...
}

Or, less fragile long-term: lift a shared shouldWalk(path) helper out of isSyncable and have both walkers call it, so they can't drift apart again.
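For the shared-helper route, here is a rough sketch of what shouldWalk could look like. This is not taken from the gbrain source: the body, the segment-based rule, and the assumption that paths are brain-relative with forward slashes are all mine. It also treats leading-dot files as skipped, which goes slightly beyond what the current walker is described as doing.

```typescript
// Hypothetical shared predicate. The name shouldWalk comes from the
// proposal above; the body is an assumption, not existing gbrain code.
// Assumes relPath is relative to the brain root and uses '/' separators.
function shouldWalk(relPath: string): boolean {
  // Drop empty segments so "notes//foo.md" and "notes/sub/" behave sanely.
  const segments = relPath.split('/').filter((s) => s.length > 0);
  if (segments.length === 0) return false;

  // Reject the path if ANY segment (directory or file) starts with '.' or '_'.
  // This covers _pending/originals/foo.md as well as notes/_draft.md.
  return segments.every(
    (seg) => !seg.startsWith('.') && !seg.startsWith('_'),
  );
}

// Example calls; both the sync filter and the extract walker would ask
// the same question before touching a path.
const candidates = [
  'notes/foo.md',
  '_pending/originals/foo.md',
  'notes/_draft.md',
  '.git/config',
];
const walked = candidates.filter(shouldWalk);
console.log(walked);
```

If isSyncable() has extra rules beyond prefix checks (file extensions, size limits, and so on), those would stay in isSyncable and it would call shouldWalk first, so the directory-level exclusions live in exactly one place.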

Related:
