Advanced CLI tool for LLM-optimized repository analysis with word counting, binary detection, multi-key sorting, and structured YAML output. Perfect for AI-powered code analysis and project overview generation.

CedarVerse/cedarmapper

CedarMapper: Advanced Repository Analysis for LLM Consumption

CedarMapper is a CLI tool that walks a repository's directory tree and reports per-file and per-directory statistics (word counts, sizes, modification times, binary detection) in a form suited to LLM consumption. It supports multi-key sorting, tree views, optional analytics columns, and structured YAML output.

✨ Key Features: Repository Analysis & Code Statistics

  • 🎯 Structured YAML Output with front matter for programmatic consumption
  • 📊 Advanced Analytics Columns: node depth, file/directory counts, average statistics
  • 🔄 13-Key Advanced Sorting: comprehensive multi-key sorting capabilities
  • ✂️ Git-Style Line Limiting: top-N results with familiar -N syntax
  • ⚡ Performance Optimized: Linux tool integration and conditional computation
  • 🌳 Rich Tree Views: numbered indentation and prefix separation
  • 🔍 Binary File Detection: intelligent content analysis and caching

🚀 Quick Start: CLI Tool Installation & Usage

Installation

pip install cedarmapper

Basic Usage

# Basic repository overview
cedarmapper ls .

# Advanced analytics with all new columns
cedarmapper ls src/ --tree --node-depth --file-count --dir-count --avg-words --avg-size

# Structured YAML output for LLM consumption
cedarmapper ls src/render/ --yaml

# Find largest directories quickly
cedarmapper ls . --sort "-sf" --max-depth 3

# Top 10 largest files (git-style syntax)
cedarmapper ls . -10 --sort "-s"

🎯 Structured YAML Output: LLM-Optimized Data Format

Generate machine-readable repository analysis with comprehensive metadata:

$ cedarmapper ls src/cedarmapper/render/ --yaml

Output:

---
metadata:
  command: cedarmapper ls --yaml
  timestamp: '2025-11-28T15:39:55.681996'
  root_path: /home/ecc/IdeaProjects/cedarmapper/src/cedarmapper/render
summary:
  total_files: 12
  total_directories: 1
  total_bytes: 84846
  total_words: 3824
  binary_files: 6
  text_files: 6
---
entries:
- name: render
  path: ./
  type: dir
  size_bytes: 84846
  mtime: '2025-11-28T15:17:22.909272'
  depth: 0
  word_count: 3824
  file_count: 0
  dir_count: 0
  avg_words_per_file: 0
  avg_size_per_file: 0
- name: flat.py
  path: flat.py
  type: file
  size_bytes: 7847
  mtime: '2025-11-28T15:14:42.546386'
  depth: 1
  word_count: 703
  is_binary: false
- name: utils.py
  path: utils.py
  type: file
  size_bytes: 10546
  mtime: '2025-11-28T15:17:16.915884'
  depth: 1
  word_count: 989
  is_binary: false
# ... more entries

Perfect for:

  • LLM repository analysis workflows
  • Automated documentation generation
  • CI/CD pipeline integration
  • Code review automation
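As one possible way to use this output in an LLM workflow, the sketch below condenses already-parsed entries into a short text summary. It assumes the YAML has been loaded into Python dictionaries with the field names shown above (for example with a YAML parser such as PyYAML); the function name `summarize_entries` and the sample data are illustrative and not part of CedarMapper.

```python
from typing import Any


def summarize_entries(entries: list[dict[str, Any]], top_n: int = 5) -> str:
    """Return a compact, prompt-friendly summary of the wordiest text files."""
    # Keep only text files; directories and binary files are skipped.
    text_files = [
        e for e in entries
        if e.get("type") == "file" and not e.get("is_binary", False)
    ]
    # Highest word counts first, since those files carry the most content.
    text_files.sort(key=lambda e: e.get("word_count", 0), reverse=True)
    lines = [
        f"{e['path']}: {e['word_count']} words, {e['size_bytes']} bytes"
        for e in text_files[:top_n]
    ]
    return "\n".join(lines)


# Sample entries copied from the YAML output above.
entries = [
    {"name": "render", "path": "./", "type": "dir",
     "size_bytes": 84846, "word_count": 3824},
    {"name": "flat.py", "path": "flat.py", "type": "file",
     "size_bytes": 7847, "word_count": 703, "is_binary": False},
    {"name": "utils.py", "path": "utils.py", "type": "file",
     "size_bytes": 10546, "word_count": 989, "is_binary": False},
]
summary = summarize_entries(entries)
print(summary)
```

Sorting by word count before truncating keeps the summary within a fixed budget while favoring the files with the most text.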

📊 Advanced Analytics: Project Statistics & Directory Analysis

Display comprehensive statistics about your repository structure:

$ cedarmapper ls src/ --tree --node-depth --file-count --dir-count --avg-words --avg-size --max-depth 2

Output:

  Words         Size Depth  Files  Dirs    Avg-W     Avg-S            Modified Path
    0      0     1                       6852       149893 2025-11-28T15:20:49 src/
    0      1     4     12.0   149,893    6852       149893 2025-11-28T15:20:49 └── cedarmapper/
    0      2     1    823.0    16,772    1646        33545 2025-11-28T15:20:49    ├── cli/
    0      6     1    637.3    14,141    3824        84846 2025-11-28T15:17:22    ├── render/
    0      4     1    342.5     7,783    1370        31131 2025-11-28T15:08:10    ├── core/
------- ------------ ----- ------ ----- -------- --------- ------------------- --------------------
   6852       149893                                       2025-11-28T15:20:49 TOTAL

Columns Explained:

  • Depth: Node depth in directory tree
  • Files: Number of files in directory
  • Dirs: Number of subdirectories
  • Avg-W: Average words per file (text files only)
  • Avg-S: Average size per file in bytes
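As a worked illustration of the Avg-W column, the sketch below divides a directory's total word count by its number of text files. The figures for render/ come from the YAML summary shown earlier (3824 words, 6 text files); the helper name `average_per_file` is hypothetical and only mirrors the column description, not CedarMapper's internals.

```python
def average_per_file(total: float, file_count: int) -> float:
    """Return total / file_count, or 0.0 for a directory with no files."""
    if file_count == 0:
        return 0.0
    return total / file_count


# render/: 3824 words across 6 text files.
avg_words = average_per_file(3824, 6)
print(f"{avg_words:.1f}")
```

The result, rounded to one decimal place, matches the 637.3 reported for render/ in the sample output.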

🔄 13-Key Advanced Sorting

Comprehensive multi-key sorting for any analysis need:

Sort Keys Reference

| Key | Description | Example |
| --- | --- | --- |
| w | Word count | --sort -w (most words first) |
| s | File size | --sort -s (largest first) |
| d | Modification date | --sort -d (newest first) |
| i | Node depth | --sort i (shallowest first) |
| n | File/directory name | --sort n (alphabetical) |
| p | Full path | --sort p (path alphabetical) |
| f | File count (dirs only) | --sort -f (most files first) |
| r | Directory count (dirs only) | --sort -r (most subdirs first) |
| a | Average words per file | --sort -a (highest avg first) |
| z | Average size per file | --sort -z (largest avg first) |
| b | Binary flag (files only) | --sort b (text files first) |

Advanced Sorting Examples

# Largest items first, ties broken by file count (ascending)
cedarmapper ls . --sort "-sf" --max-depth 3

# Complex multi-key: depth asc, size desc, date asc
cedarmapper ls . --sort "i-sd" -15

# Find content-heavy files (text files first, then most words first)
cedarmapper ls . --sort "b-w" --max-depth 4

# Quick repository overview: depth, then size descending
cedarmapper ls . --sort "i-s" -10
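To make the spec syntax concrete, here is a minimal sketch of how a string such as "i-s" could be parsed and applied, assuming a leading "-" flips only the key letter that follows it. This is an illustration of the technique (repeated stable sorts, last key first), not CedarMapper's actual implementation; the `FIELDS` mapping covers only a few keys and its field names are assumptions.

```python
from typing import Any

# Hypothetical mapping from sort-key letter to entry field (subset only).
FIELDS = {"w": "word_count", "s": "size_bytes", "i": "depth", "n": "name"}


def parse_sort_spec(spec: str) -> list[tuple[str, bool]]:
    """Parse a spec such as "i-s" into [(field, descending), ...]."""
    keys = []
    descending = False
    for ch in spec:
        if ch == "-":
            descending = True  # applies to the next key letter only
            continue
        if ch not in FIELDS:
            raise ValueError(f"unknown sort key: {ch!r}")
        keys.append((FIELDS[ch], descending))
        descending = False
    if descending:
        raise ValueError("sort spec ends with a dangling '-'")
    return keys


def sort_entries(entries: list[dict[str, Any]], spec: str) -> list[dict[str, Any]]:
    """Sort by several keys using repeated stable sorts, last key first."""
    result = list(entries)
    for field, descending in reversed(parse_sort_spec(spec)):
        result.sort(key=lambda e: e[field], reverse=descending)
    return result


entries = [
    {"name": "a.py", "depth": 1, "size_bytes": 100, "word_count": 10},
    {"name": "b.py", "depth": 0, "size_bytes": 50, "word_count": 5},
    {"name": "c.py", "depth": 1, "size_bytes": 300, "word_count": 30},
]
ordered = [e["name"] for e in sort_entries(entries, "i-s")]
print(ordered)
```

Because Python's sort is stable, sorting by the least significant key first and the most significant key last yields the same order as a single multi-key comparison, while allowing a different direction per key.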

✂️ Git-Style Line Limiting

Get top-N results with familiar git-style syntax:

# Show top 10 largest items
cedarmapper ls . -10 --sort "-s"

# Tree view with limited results
cedarmapper ls . --tree -10 --max-depth 1

Output:

  Words         Size            Modified Path
  94294     14947634 2025-11-28T15:25:22 cedarmapper/
   5222       278743 2025-11-28T15:25:22 ├── .git/
   6852       149893 2025-11-28T15:20:49 ├── src/
    163         6775 2025-11-28T15:19:37 ├── .pytest_cache/
   3820        49719 2025-11-28T15:19:37 ├── coverage.xml
  44960       791406 2025-11-28T15:19:37 ├── htmlcov/
      -        98304 2025-11-28T15:19:36 ├── .coverage
    333         3136 2025-11-28T15:05:48 ├── pyproject.toml
   7066        60627 2025-11-28T14:59:57 ├── planning/
   4601       225001 2025-11-27T21:13:44 ├── tests/
------- ------------ ------------------- --------------------
  94294     14947634 2025-11-28T15:25:22 TOTAL
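For readers curious how a git-style -N token can coexist with ordinary options, the sketch below separates it from the rest of the argument list before normal option parsing. The function name `extract_line_limit` is hypothetical and the approach is an assumption about one reasonable design, not a description of CedarMapper's parser.

```python
import re
from typing import Optional

# A dash followed only by digits, e.g. "-10"; "-s" does not match.
LIMIT_PATTERN = re.compile(r"-(\d+)")


def extract_line_limit(argv: list[str]) -> tuple[Optional[int], list[str]]:
    """Split a git-style -N token from the remaining arguments."""
    limit = None
    rest = []
    for arg in argv:
        match = LIMIT_PATTERN.fullmatch(arg)
        if match:
            limit = int(match.group(1))  # the last -N given wins
        else:
            rest.append(arg)
    return limit, rest


limit, rest = extract_line_limit(["ls", ".", "-10", "--sort", "-s"])
print(limit, rest)
```

Pulling the numeric token out first means the remaining arguments can be handed to a standard option parser unchanged.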

⚡ Performance Features: High-Speed Directory Analysis

Linux Tool Integration

CedarMapper automatically leverages Linux tools for maximum performance:

  • file command for intelligent binary detection
  • wc command for ultra-fast word counting
  • stat command for rapid file size analysis

Conditional Computation

Only compute what you need for optimal performance:

  • Count calculations only when count columns are displayed
  • Average calculations only when average columns are requested
  • Depth computation only when needed for sorting
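The idea of conditional computation can be sketched as a table of per-column functions where only the requested columns are ever evaluated. The column names, the `COLUMN_FUNCS` table, and the directory structure below are all assumptions made for illustration; CedarMapper's internal design may differ.

```python
from typing import Any, Callable

# Hypothetical column names mapped to the function that computes each one.
COLUMN_FUNCS: dict[str, Callable[[dict[str, Any]], Any]] = {
    "file_count": lambda d: len(d["files"]),
    "avg_words": lambda d: (
        sum(f["words"] for f in d["files"]) / len(d["files"])
        if d["files"] else 0.0
    ),
}


def compute_columns(directory: dict[str, Any], requested: list[str]) -> dict[str, Any]:
    """Compute only the requested columns; all other columns are skipped."""
    return {name: COLUMN_FUNCS[name](directory) for name in requested}


directory = {"files": [{"words": 703}, {"words": 989}]}
row = compute_columns(directory, ["file_count"])
print(row)
```

Here the average is never calculated because only the file count was requested, which is the saving the bullets above describe.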

Performance Examples

# Fast analysis (skip word counting)
cedarmapper ls . --skip-word-count

# Quick overview with word counts enabled
cedarmapper ls . --show-word-count

# Analysis targeting specific directories
cedarmapper ls src/ --max-depth 2 --file-count --dir-count

🔧 Practical Workflows: Real-World Developer Tools Usage

Repository Overview for LLMs

# Complete structured analysis for AI consumption
cedarmapper ls . --yaml > repo_overview.yaml

# Quick project statistics
cedarmapper ls . --node-depth --file-count --dir-count --avg-words --avg-size

Code Review Preparation

# Find large, recently changed files (largest first, then newest)
cedarmapper ls . --sort "-s-d" -20

# Identify code-heavy directories
cedarmapper ls . --sort "-af" --max-depth 3

# Binary file analysis
cedarmapper ls . --sort "b" --max-depth 4

Documentation Analysis

# Focus on content-rich directories (highest average words, then most words)
cedarmapper ls docs/ --sort "-a-w" --max-depth 2

# Average file size analysis for documentation planning
cedarmapper ls docs/ --avg-size --avg-words --tree

Performance Auditing

# Identify largest files and directories
cedarmapper ls . --sort "-s" -10

# Find directories with many files
cedarmapper ls . --sort "-f" --max-depth 3

# Analyze file size distribution
cedarmapper ls . --sort "z" --avg-size --max-depth 2

📊 Feature Comparison: CedarMapper vs Other Code Analysis Tools

| Feature | CedarMapper | tree | fd/find | llm-repo-tools |
| --- | --- | --- | --- | --- |
| LLM-Optimized | ✅ | ❌ | ❌ | ✅ |
| YAML Output | ✅ | ❌ | ❌ | ✅ |
| Word Counting | ✅ | ❌ | ❌ | ✅ |
| Binary Detection | ✅ | ❌ | ❌ | ✅ |
| 13 Sort Keys | ✅ | ❌ | ❌ | 3-6 |
| Statistical Columns | ✅ | ❌ | ❌ | ❌ |
| Line Limiting | ✅ | ❌ | ✅ | ❌ |
| Linux Tool Integration | ✅ | ❌ | ❌ | ❌ |
| Conditional Computation | ✅ | ❌ | ❌ | ❌ |
| Multi-Key Sorting | ✅ | ❌ | ❌ | Limited |

🛠️ Installation & Development

Installation

# Install from PyPI
pip install cedarmapper

# Development installation
pip install -e ".[dev]"

Development Setup

# Clone repository
git clone https://github.com/your-username/cedarmapper.git
cd cedarmapper

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
make install

Testing

# Run full test suite (55 tests passing)
make test

# Quick tests without coverage
make quick

# Run specific test file
make test-file FILE=tests/test_sort.py

# View coverage report
make coverage-open

Code Quality

# Auto-format code
make format

# Run linting and type checking
make lint

# Full CI pipeline
make ci

📋 Command Reference

Core Options

| Option | Short | Description |
| --- | --- | --- |
| --max-depth N | -d N | Limit display depth (0=root only) |
| --tree | -t | Tree-like nested output |
| --numbered-indent | -n | Numbered depth prefixes |
| --tree-only | -T | Show only paths (tree-only mode) |
| --short | -S | Tree-only + word count |

Analytics Options

| Option | Description |
| --- | --- |
| --node-depth | Show node depth column |
| --skip-node-depth | Hide node depth column |
| --file-count | Show file count column |
| --dir-count | Show directory count column |
| --avg-words | Show average words per file |
| --avg-size | Show average size per file |

Performance Options

| Option | Description |
| --- | --- |
| --skip-word-count | Skip word counting for speed |
| --show-word-count | Force word count display |
| --follow-symlinks | Follow symbolic links |

Output Options

| Option | Short | Description |
| --- | --- | --- |
| --yaml | -Y | Structured YAML output |
| --date-format FORMAT | | Date format: 'seconds', 'day', 's', 'd' |
| --skip-date | | Hide date column |
| --skip-header | | Hide column headers |
| --skip-totals | | Hide totals footer |

Sorting & Limiting

| Option | Short | Description |
| --- | --- | --- |
| --sort SPEC | | Sort specification (see sort keys) |
| --line-limit N | -N | Limit output to the top N results |

❓ Troubleshooting & FAQ

Common Questions

Q: How do I get a quick overview of my repository?

cedarmapper ls . --node-depth --file-count --dir-count --max-depth 2

Q: How do I find the largest files in my project?

cedarmapper ls . --sort "-s" -10

Q: How do I generate input for an LLM?

cedarmapper ls . --yaml > repo_analysis.yaml

Q: How do I list text (non-binary) files first, with the most words at the top?

cedarmapper ls . --sort "b-w" --max-depth 3

Performance Tips

  • Use --skip-word-count for very large repositories
  • Apply --max-depth to limit analysis scope
  • Use line limiting (-N) for quick overviews
  • Consider YAML output for automated workflows

Linux Tool Requirements

CedarMapper uses the following Linux tools when they are available and automatically falls back to Python implementations when they are not:

  • file command for binary detection
  • wc command for word counting
  • stat command for file metadata
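The fallback behavior described above can be sketched for word counting as follows: try `wc -w` if it is on the PATH, and otherwise count whitespace-separated tokens in Python. The function name `count_words` is hypothetical, and this is only an illustration of the pattern under those assumptions, not CedarMapper's code.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def count_words(path: Path) -> int:
    """Count words with `wc -w` when available, else in pure Python."""
    wc = shutil.which("wc")
    if wc is not None:
        try:
            result = subprocess.run(
                [wc, "-w", str(path)],
                capture_output=True, text=True, check=True,
            )
            # `wc -w FILE` prints "<count> <file>"; take the first token.
            return int(result.stdout.split()[0])
        except (subprocess.SubprocessError, OSError, ValueError, IndexError):
            pass  # fall through to the Python implementation
    return len(path.read_text(errors="ignore").split())


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("alpha beta gamma\ndelta\n")
    words = count_words(sample)
print(words)
```

Both paths count whitespace-delimited tokens, so for plain text like the sample they agree regardless of which one runs.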

📄 License

Apache License 2.0 - See LICENSE file for details

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Add tests for new functionality
  5. Ensure all tests pass (make test)
  6. Submit a pull request

CedarMapper: Transform repository complexity into clear, actionable insights for AI-powered development workflows.
