# Feature Request: Improve File Discovery and Tool Usage for Grok Agent

## Problem

Currently, Grok has limited file discovery and tool usage capabilities that make it less effective as an autonomous agent.
### Issue 1: No Glob Pattern Expansion

The `--files` argument requires explicit paths. Glob patterns like `*.py` or `**/*.py` are passed as literal strings.

```shell
# Current behavior - FAILS (the pattern reaches the bridge unexpanded):
--files "src/**/*.py"

# Workaround required - every file must be listed explicitly:
--files src/foo.py src/bar.py src/baz.py
```
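As a stopgap, the shell (or `find`) can expand the pattern into explicit paths before the bridge ever sees them, e.g. `--files $(find src -name '*.py')`. A runnable sketch of the expansion step, using a throwaway directory invented for illustration:

```shell
# Build a small tree, then expand *.py matches into an explicit file list
mkdir -p /tmp/glob_demo/src/pkg
touch /tmp/glob_demo/src/a.py /tmp/glob_demo/src/pkg/b.py /tmp/glob_demo/src/README.md
find /tmp/glob_demo/src -name '*.py' | sort
```

This only papers over the problem: the expansion happens in the caller's shell, not in the bridge, so Grok still cannot request it on its own.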
### Issue 2: No Recursive Directory Discovery

Grok cannot discover relevant files in a directory tree on its own. Claude Code must pre-discover files and pass them explicitly.
### Issue 3: No Path Filtering

Even if Grok could scan directories, there is no mechanism to:

- Filter by file type (e.g., only `.py` files)
- Exclude patterns (e.g., `**/test_*.py`, `**/node_modules/**`)
- Limit file count or size
### Issue 4: Single-File Context Mode

The current architecture sends all files as a single concatenated context. For large codebases, this:

- Consumes tokens inefficiently
- Forces Grok to parse through irrelevant files
- Prevents selective file analysis
## Impact

These limitations make it harder to use Grok as a true sub-agent because:

- Claude Code must do file discovery work that Grok could do itself
- Claude Code must pass file paths explicitly, increasing prompt size
- Large-codebase analysis requires manual file selection
## Proposed Solutions

### Solution 1: Directory Scanning Mode

Add a `--scan-dir` argument that recursively scans a directory for matching files:

```python
# New CLI argument
parser.add_argument(
    "--scan-dir",
    type=str,
    help="Scan directory recursively for files matching patterns",
)
```

Example usage:

```shell
--scan-dir ./src --include "*.py" --exclude "**/test_*.py" --max-files 100
```
Implementation in `grok_bridge.py` (note: the original sketch iterated over `extensions or []`, so a `None` extensions list silently discovered nothing; the version below filters by extension inside the loop instead):

```python
from pathlib import Path
from typing import List, Optional


def scan_directory(
    path: str,
    include_patterns: Optional[List[str]] = None,
    exclude_patterns: Optional[List[str]] = None,
    max_files: int = 100,
    extensions: Optional[List[str]] = None,
) -> List[str]:
    """Recursively discover files matching criteria."""
    discovered: List[str] = []
    path_obj = Path(path)
    if not path_obj.exists():
        return []
    for pattern in (include_patterns or ["*"]):
        for match in path_obj.glob(f"**/{pattern}"):
            if not match.is_file():
                continue
            # Optional extension filter (e.g., [".py"])
            if extensions and match.suffix not in extensions:
                continue
            # Apply exclusions (substring match on the full path)
            if any(excl in str(match) for excl in (exclude_patterns or [])):
                continue
            discovered.append(str(match))
    return discovered[:max_files]
```
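The include/exclude behavior can be exercised with a self-contained sketch; `scan`, the exclusion-by-filename rule, and the temp layout below are illustrative, not the bridge's actual API:

```python
from fnmatch import fnmatch
from pathlib import Path
import tempfile

def scan(root, include="*.py", exclude=("test_*",), max_files=100):
    """Minimal stand-in for scan_directory: recursive include glob,
    filename-level exclude patterns, capped result count."""
    out = []
    for p in sorted(Path(root).rglob(include)):
        if p.is_file() and not any(fnmatch(p.name, e) for e in exclude):
            out.append(str(p))
    return out[:max_files]

# Throwaway tree: two source files, one test file that should be excluded
root = tempfile.mkdtemp()
for name in ("main.py", "util.py", "test_main.py"):
    (Path(root) / name).touch()
print([Path(p).name for p in scan(root)])  # → ['main.py', 'util.py']
```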
### Solution 2: MCP Server Integration

Instead of file-based context, expose an MCP server that Grok can call to:

- List files in a directory
- Read specific files on demand
- Write files to specified locations

```python
from typing import List, Optional


# Conceptual MCP server for the Grok bridge
class GrokBridgeMCP:
    def list_files(self, path: str, pattern: str = "*") -> List[str]:
        """List files matching pattern in directory."""

    def read_file(self, path: str, offset: int = 0, limit: Optional[int] = None) -> str:
        """Read file content with optional pagination."""

    def write_file(self, path: str, content: str) -> bool:
        """Write content to file."""

    def glob(self, pattern: str, root: str = ".") -> List[str]:
        """Glob pattern matching."""
```

This would give Grok tool-calling ability while keeping file I/O local to the bridge.
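One way the bridge could route the model's tool calls to these handlers is a simple dispatch over method names. The call-payload shape below (`{"name": ..., "arguments": ...}`) is an assumption for illustration, not the MCP wire format:

```python
from pathlib import Path
import tempfile

class GrokBridgeTools:
    """Local handlers mirroring the conceptual MCP methods (hypothetical names)."""

    def list_files(self, path: str, pattern: str = "*"):
        return sorted(str(p) for p in Path(path).glob(pattern) if p.is_file())

    def read_file(self, path: str, offset: int = 0, limit=None):
        text = Path(path).read_text()
        end = None if limit is None else offset + limit
        return text[offset:end]

def dispatch(tools, call: dict):
    # call = {"name": "<tool>", "arguments": {...}} -- assumed shape
    handler = getattr(tools, call["name"], None)
    if handler is None:
        raise ValueError(f"unknown tool: {call['name']}")
    return handler(**call["arguments"])

root = tempfile.mkdtemp()
(Path(root) / "notes.txt").write_text("hello bridge")
tools = GrokBridgeTools()
print(dispatch(tools, {"name": "read_file",
                       "arguments": {"path": f"{root}/notes.txt", "limit": 5}}))  # → hello
```

Keeping dispatch in the bridge means Grok only ever emits structured tool calls; all filesystem access stays on the caller's machine.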
### Solution 3: Intelligent File Selection

Add an `--auto-discover` flag that uses heuristics:

- Look for `package.json`, `pyproject.toml`, or `go.mod` to identify the project root
- Use language-specific patterns (e.g., `**/*.py` for Python projects)
- Apply `.gitignore`-like exclusions
- Limit based on a token budget
```python
from fnmatch import fnmatch
from pathlib import Path
from typing import List


def auto_discover_files(target: str, token_budget: int = 500_000) -> List[str]:
    """Automatically discover relevant files based on project type."""
    path = Path(target)

    # Detect project type
    if (path / "pyproject.toml").exists():
        extensions = [".py"]
        exclude = ["**/test_*.py", "**/__pycache__/**", "**/.venv/**"]
    elif (path / "package.json").exists():
        extensions = [".js", ".ts", ".jsx", ".tsx"]
        exclude = ["**/node_modules/**", "**/dist/**"]
    else:
        extensions = [""]  # no recognized manifest: match any file
        exclude = []
    # ... etc. (go.mod, Cargo.toml, and so on)

    # Discover and filter (fnmatch, so the glob-style excludes actually match)
    files: List[Path] = []
    for ext in extensions:
        files.extend(p for p in path.glob(f"**/*{ext}") if p.is_file())
    files = [f for f in files if not any(fnmatch(str(f), excl) for excl in exclude)]
    files = sort_by_relevance(files)  # helper (not shown): prioritize main/source over tests

    # Trim to token budget
    selected: List[str] = []
    total_tokens = 0
    for f in files:
        content = f.read_text(errors="ignore")  # skip undecodable bytes
        tokens = len(content) // 4  # rough estimate: ~4 chars per token
        if total_tokens + tokens > token_budget:
            break
        selected.append(str(f))
        total_tokens += tokens
    return selected
```
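The budget-trimming step can be isolated and tested on its own; `trim_to_budget` and the in-memory corpus below are illustrative, using the same chars/4 heuristic as above:

```python
def trim_to_budget(files_with_text, token_budget: int):
    """Keep files in order until the rough token estimate would exceed the budget."""
    selected, total = [], 0
    for name, text in files_with_text:
        tokens = len(text) // 4  # same chars/4 heuristic
        if total + tokens > token_budget:
            break
        selected.append(name)
        total += tokens
    return selected

# 100 tokens, 100 tokens, 1000 tokens against a 250-token budget
corpus = [("main.py", "x" * 400), ("util.py", "y" * 400), ("big.py", "z" * 4000)]
print(trim_to_budget(corpus, token_budget=250))  # → ['main.py', 'util.py']
```

Because trimming stops at the first file that overflows the budget, the relevance sort matters: whatever `sort_by_relevance` ranks first is what survives the cut.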
## Priority

Medium: the current workaround (manual file listing) works but is inconvenient. This would improve agent autonomy.

## Labels

- enhancement
- agent
- bridge
- file-discovery