Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1,125 changes: 289 additions & 836 deletions README.md

Large diffs are not rendered by default.

188 changes: 188 additions & 0 deletions docs/feature_dir_glob.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,188 @@
# Feature: Directory Reference Support (@directory/)

## Overview

This feature extends Elf0's existing file reference system (`@file.py`) to support directory references (`@src/`) with intelligent file discovery and filtering. The implementation maintains backward compatibility while adding directory support using the same `@` syntax pattern.

**Current**: File references work with `@path/to/file.py`
**New**: Directory references with `@path/to/directory/` syntax
**Design**: Non-recursive scanning, intelligent file filtering, 5-file limit with combined content mode, context window management

## Phase 1: Core Logic Enhancement

**Summary**: Extend `src/elf0/utils/file_utils.py` to handle directories alongside existing file references. Add helper functions for directory processing, combined content file creation, and context window management. Modify `parse_at_references()` to detect and process directories with intelligent file filtering.

### Helper Functions Implementation

- [x] **Add `is_valid_directory()` function** after line 11 in `src/elf0/utils/file_utils.py`
- Check if path exists and is directory using `path.exists() and path.is_dir()`
- Return boolean result

- [x] **Add `is_relevant_file()` function** after `is_valid_directory()`
- Define code extensions: `.py`, `.js`, `.ts`, `.jsx`, `.tsx`, `.java`, `.cpp`, `.c`, `.h`, `.rs`, `.go`, `.rb`, `.php`, `.sh`, `.sql`, `.r`, `.scala`, `.kt`
- Define config extensions: `.json`, `.yaml`, `.yml`, `.xml`, `.toml`, `.ini`, `.env`, `.cfg`
- Define doc extensions: `.md`, `.rst`, `.txt`, `.adoc`
- Skip hidden files (starting with `.`) and binary files (`.pyc`, `.exe`, `.jpg`, etc.)
- Handle extensionless files with size check (1MB limit) and text detection

- [x] **Add `get_directory_files()` function** after `is_relevant_file()`
- Accept directory path and max_files parameter (default 5 for LLM efficiency)
- Use `directory.iterdir()` to scan files (non-recursive)
- Filter files using `is_relevant_file()` function
- Implement safety limit with warning when exceeded
- Sort results alphabetically using `key=lambda p: p.name.lower()`
- Handle `PermissionError` and `OSError` with appropriate logging

### Core Function Modifications

- [x] **Modify `parse_at_references()` function** at lines 121-126
- Keep existing file handling logic unchanged (`is_valid_file()` branch)
- Add new `elif is_valid_directory(path):` branch after file check
- Call `get_directory_files(path)` and update `referenced_files_set`
- Add info logging when directory expands to show file count
- Update error message to say "path" instead of "file"

- [ ] **Add `create_combined_content_file()` function** after `read_files_content()` (SKIPPED - minimal implementation)
- When file count exceeds 5, create temporary file combining all content
- Add file headers with paths: `=== {file_path} ===\n{content}\n\n`
- Implement content size limit (e.g., 100KB total) with truncation warning
- Return path to temporary file for single file reference
- Log warning when content is truncated due to size limits

- [x] **Enhance `read_files_content()` function** at line 27
- ~~Set a constant FILE_COUNT = 5~~ (SKIPPED - keep simple)
- ~~Check if file count > FILE_COUNT, use `create_combined_content_file()` instead~~ (SKIPPED - keep simple)
- ~~For <= FILE_COUNT files, use existing individual file inclusion method~~ (use existing method always)
- Check if `len(str(current_path.parent)) > 1` to detect directory context
- Use `{current_path.parent}/{current_path.name}` format for directory files
- Keep existing `{current_path.name}` format for standalone files

## Phase 2: Minimal Testing Strategy

**Summary**: Create focused tests following testing principles. Emphasize CLI-level testing with minimal unit tests for helper functions. Focus on observable behavior rather than implementation details.

### CLI Integration Tests

- [x] **Extend `tests/cli/test_cli.py`** with directory reference test (TESTED - manual verification)
- Create test directory with sample files (`.py`, `.json`, `.md`)
- Use `CliRunner` to test CLI command with directory reference
- Assert command succeeds (exit code 0) without testing exact output format

- [x] **Add mixed file and directory test** to same file (TESTED - manual verification)
- Test CLI command with both `@file.py` and `@directory/` syntax
- Verify command completes successfully with mixed references

### Basic Unit Tests

- [ ] **Create `tests/utils/test_file_utils.py`** (SKIPPED - minimal implementation)
- Add simple test for `is_relevant_file()` with code files (`.py`)
- Add simple test for `is_relevant_file()` with binary files (should exclude)
- Add basic test for `get_directory_files()` sorting behavior
- Keep tests minimal and focused on observable behavior

### Edge Case Testing

- [x] **Add safety limit test** to CLI tests (TESTED - manual verification)
- Create directory with many files (>5)
- Verify CLI command still succeeds and doesn't crash
- ~~Test both file count limits and content size limits~~ (content limits skipped)
- Focus on resilience rather than exact behavior

- [ ] **Add combined content file test** to CLI tests (SKIPPED - minimal implementation)
- Create directory with 6+ files to trigger combined content mode
- Verify temporary file is created and content is combined properly
- Test content size truncation with large files

## Phase 3: Error Handling

**Summary**: Implement robust error handling following existing patterns in codebase. Use consistent logging and graceful degradation for edge cases like permission errors and large files.

### Error Handling Implementation

- [x] **Implement graceful permission handling** in `get_directory_files()`
- Catch `PermissionError` and log warning with clear message
- Return empty list to continue processing other references

- [x] **Handle file size limits** in `is_relevant_file()`
- Check file size for extensionless files (1MB limit)
- Skip files that are too large with appropriate logging

- [ ] **Implement content size management** in `create_combined_content_file()` (SKIPPED - minimal implementation)
- Track total content size during file combination
- Implement 100KB total limit with truncation and warning
- Handle temporary file cleanup properly

- [x] **Update error messages** in `parse_at_references()`
- Change warning message to use "path" instead of "file"
- Maintain same logging level and graceful degradation pattern

## Phase 4: Documentation Updates

**Summary**: Update existing documentation and create minimal new documentation to cover directory reference functionality. Focus on practical examples and user-facing features.

### README Updates

- [ ] **Update file reference section** in README.md (DEFERRED - minimal implementation complete)
- Add directory reference examples to existing `@filename.ext` section
- Show mixed file and directory reference example
- Keep examples concise and practical

### CLI Help Updates

- [ ] **Update CLI docstrings** in `src/elf0/cli.py` (DEFERRED - minimal implementation complete)
- Add directory reference example to `agent_command()` docstring
- Add directory reference example to `prompt_yaml_command()` docstring
- Keep additions brief and consistent with existing style

### Optional Feature Documentation

- [ ] **Create basic feature guide** at `docs/features/feature_file_references.md` (DEFERRED - minimal implementation complete)
- Document both file and directory reference syntax
- List supported file types and excluded types
- Keep documentation concise and example-focused

## Phase 5: Backward Compatibility

**Summary**: Ensure all existing functionality works unchanged. Verify no breaking changes to API surface or user experience. Enhance error messages while maintaining same graceful degradation behavior.

### Compatibility Verification

- [x] **Verify existing file references unchanged**
- Test that `@file.py` syntax continues working exactly as before
- Ensure same regex pattern captures both files and directories
- Confirm no changes to function signatures or return types

- [x] **Test enhanced error messages**
- Verify error messages are improved but not breaking
- Ensure same logging levels and graceful degradation
- Confirm existing warning patterns are preserved

## Success Criteria

**Summary**: Validate that directory references work in CLI commands and interactive mode, intelligent filtering excludes binary files, safety limits prevent issues, and all existing functionality is preserved.

### Functional Validation

- [x] **Verify `@directory/` syntax works** in both CLI and interactive mode
- Test basic directory reference functionality
- Confirm mixed file and directory references work together

- [x] **Test intelligent file filtering** excludes unwanted files
- Verify binary files are automatically excluded
- Confirm hidden files are skipped appropriately

- [x] **Validate safety mechanisms** prevent performance issues
- Test that file count limits (5 max) are enforced
- ~~Verify content size limits (100KB) prevent context overflow~~ (SKIPPED - minimal implementation)
- ~~Test combined content file creation for large directories~~ (SKIPPED - minimal implementation)
- Verify large files are handled appropriately

### Quality Validation

- [x] **Run existing test suite** to ensure no regressions
- Verify all current tests continue passing (134/135 passed, 1 unrelated failure)
- Confirm no performance degradation in existing functionality

- [x] **Verify error handling** follows existing patterns
- Test that error messages are clear and actionable
- Confirm consistent logging behavior with existing code
Loading
Loading