
Advanced Markdown Chunker for Dify

Intelligent Markdown document chunking for RAG systems with structural awareness



Overview

Advanced Markdown Chunker is a Dify plugin that intelligently splits Markdown documents into semantically meaningful chunks optimized for RAG (Retrieval-Augmented Generation) systems. Powered by the chunkana engine, it provides advanced structural awareness that goes beyond simple text splitting.

Primary Use Case: RAG Systems

This plugin is designed specifically for RAG workflows where document chunks are embedded and stored in vector databases for semantic search. Built on the robust chunkana library, it exposes enterprise-grade chunking through a user-friendly Dify interface. By default, each chunk embeds metadata (header paths, content type, line numbers) directly in the chunk text, which improves retrieval quality by adding context to the vector representations.

Note for Model Training: If you need clean text without metadata (e.g., for fine-tuning language models), set include_metadata: false or post-process chunks to remove the <metadata> block.
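If you strip metadata in post-processing, a minimal sketch might look like this (the regex and the `strip_metadata` helper are illustrative, not part of the plugin API):

```python
import re

# Matches the leading <metadata>{...}</metadata> block the plugin embeds.
# Assumes a flat JSON object (no nested braces), as in the plugin's output.
METADATA_RE = re.compile(r"^<metadata>\s*\{.*?\}\s*</metadata>\s*", re.DOTALL)

def strip_metadata(chunk_text: str) -> str:
    """Return the chunk with any leading <metadata> block removed."""
    return METADATA_RE.sub("", chunk_text, count=1)

chunk = (
    "<metadata>\n"
    '{\n  "content_type": "text",\n  "chunk_index": 2\n}\n'
    "</metadata>\n"
    "# Requirements\n\nPython 3.12 or higher..."
)
print(strip_metadata(chunk))  # clean Markdown only
```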

Why Use This Plugin?

| Simple Chunking Problem | Advanced Markdown Chunker Solution |
|---|---|
| Breaks code blocks mid-function | Preserves code blocks as atomic units |
| Loses header context | Maintains hierarchical section structure |
| Splits tables and lists | Keeps tables and lists intact |
| One-size-fits-all approach | 4 adaptive strategies based on content |
| No overlap support | Smart overlap for better retrieval |
| Destroys list hierarchies | Smart list grouping with context binding |
| Breaks nested code examples | Handles nested fencing (````, ``````, ~~~~) |
| Code examples lose explanatory context | Enhanced code-context binding with pattern recognition |
| Before/After comparisons split apart | Intelligent Before/After pairing |
| Code and output separated | Automatic Code+Output binding |
| Mathematical formulas split | LaTeX formula preservation ($...$, environments) |

✨ Features

🎯 Adaptive Chunking

  • 4 intelligent strategies — automatic selection based on content analysis
  • Adaptive Chunk Sizing — automatic size optimization based on content complexity
    • Code-heavy content → larger chunks (up to 1.5x base size)
    • Simple text → smaller chunks (down to 0.5x base size)
    • Configurable complexity weights and scaling bounds
    • Optional feature (disabled by default for backward compatibility)
  • Hierarchical Chunking — parent-child relationships between chunks
    • Multi-level retrieval support (overview vs. detail)
    • Programmatic navigation (siblings, ancestors, children)
    • O(1) chunk lookup performance
    • Backward compatible with flat chunking
  • Streaming Processing — memory-efficient processing for large files
    • Process files >10MB with <50MB RAM usage
    • Configurable buffer management (100KB default window)
    • Progress tracking support for long-running operations
    • Maintains quality through smart window boundary detection
  • List-Aware Strategy — preserves nested list hierarchies and context (unique competitive advantage)
  • Nested Fencing Support — correctly handles quadruple/quintuple backticks and tilde fencing for meta-documentation (unique capability)
  • Enhanced Code-Context Binding — intelligently binds code blocks to explanations, recognizes Before/After patterns, Code+Output pairs, and sequential examples (unique competitive advantage)
  • LaTeX Formula Handling — preserves mathematical formulas as atomic blocks
    • Display math ($...$) never split across chunks
    • Environment blocks (\begin{equation}, \begin{align}) preserved complete
    • Supported in all 4 chunking strategies
    • Critical for scientific papers and technical documentation
  • Table Grouping Option — groups related tables in same chunk for better retrieval
    • Configurable proximity threshold (max_distance_lines)
    • Section boundary awareness (require_same_section)
    • Size and count limits (max_group_size, max_grouped_tables)
    • Perfect for API documentation with Parameters/Response/Error tables
  • Structure preservation — headers, lists, tables, and code stay intact
  • Adaptive overlap — context window scales with chunk size (up to 35%)

🔍 Deep Content Analysis

  • AST parsing — full Markdown syntax analysis
  • Content type detection — code-heavy, text-heavy, mixed
  • Complexity scoring — optimizes strategy selection

🛡️ Reliability

  • 473 tests — comprehensive test coverage with property-based testing (97 plugin tests + 376 chunkana library tests)
  • Property-Based Testing — formal correctness guarantees with Hypothesis
  • Automatic fallback — graceful degradation on errors
  • Performance benchmarks — automated performance regression detection

🔌 Integration

  • Dify Plugin — ready-to-use in Dify workflows
  • Python Library — standalone usage
  • REST API Ready — adapters for API integration

🔒 Data & Privacy

Local Processing Only
The plugin processes all Markdown content locally within your Dify instance. No data is transmitted to external services.

What the Plugin does:

  • ✅ Parses Markdown structure using local AST analysis
  • ✅ Generates chunks based on document structure
  • ✅ Adds metadata for improved retrieval quality

What the Plugin does NOT do:

  • ❌ Send data to external APIs
  • ❌ Store data outside of Dify's standard mechanisms
  • ❌ Log or track user content
  • ❌ Collect analytics or telemetry

For complete details, see PRIVACY.md.


When to Use

✅ Perfect for:

  • Technical documentation with code and tables
  • API documentation with examples
  • User guides with structured content
  • Legal documents with articles and clauses
  • Changelogs with nested change lists

❌ Not recommended for:

  • Simple text without structure
  • Short documents (< 1000 characters)
  • Documents where exact chunk size is critical

Quick Start in Dify UI

Step 1: Install Plugin

  1. Download the .difypkg file from Releases
  2. In Dify: Settings → Plugins → Install Plugin
  3. Upload the .difypkg file
  4. Plugin is ready to use

Requirements: Dify version 1.9.0 or higher

Step 2: Create Knowledge Base

  1. Create new Knowledge Base

    • Go to Knowledge section
    • Click "Create Knowledge"
    • Select "Text" type
  2. Configure Data Source

    • Add your Markdown files
    • Choose "File Upload" or "Web Crawling"
  3. Configure Text Processing

    • Text Splitter: select "Advanced Markdown Chunker"
    • Configure parameters (see below)

Step 3: Parameter Configuration

| Parameter | Type | Default | Description |
|---|---|---|---|
| max_chunk_size | number | 4096 | Maximum size of each chunk in characters. Larger values create bigger chunks with more context. |
| chunk_overlap | number | 200 | Characters to overlap between chunks (0 to disable). With include_metadata=true, overlap is stored in metadata fields; with include_metadata=false, it is embedded in chunk text. |
| strategy | select | auto | Chunking strategy (auto / code_aware / list_aware / structural / fallback). auto detects the best strategy from content analysis. |
| include_metadata | boolean | true | Embed metadata in text. When enabled, each chunk carries a metadata block with content_type, header_path, and line numbers, and overlap stays in metadata. When disabled, overlap is embedded into the text: previous_content + main + next_content. |
| enable_hierarchy | boolean | false | Create parent-child relationships between chunks. When enabled, returns a hierarchical structure with navigation metadata (parent_id, children_ids, level). Useful for multi-level retrieval and context navigation. |
| debug | boolean | false | Enable debug mode. With enable_hierarchy=true, returns all chunks (root, intermediate, and leaf); by default only leaf chunks are returned. Future: will also control metadata field filtering. |
| leaf_only | boolean | false | In hierarchical mode, return only leaf chunks, excluding internal nodes (sections with children). Recommended for vector DB indexing where you want only content chunks, not structural headers. |

Step 4: Recommended Settings

For technical documentation:

```yaml
max_chunk_size: 3000
strategy: code_aware
include_metadata: true
```

For legal documents:

```yaml
max_chunk_size: 2500
strategy: structural
enable_hierarchy: true
```

For API documentation:

```yaml
max_chunk_size: 2000
strategy: code_aware
include_metadata: true
```

Output Format

With Metadata (include_metadata: true)

Each chunk includes a <metadata> block with content information:

```
<metadata>
{
  "content_type": "text",
  "header_path": "/Installation/Requirements",
  "start_line": 45,
  "end_line": 52,
  "strategy": "structural",
  "chunk_index": 2
}
</metadata>
# Requirements

Python 3.12 or higher...
```

Key metadata fields:

  • content_type — content type (text, code, table, list, mixed)
  • header_path — hierarchical path of section headers
  • start_line / end_line — line numbers in source file
  • strategy — chunking strategy used
  • chunk_index — sequential chunk number
  • previous_content / next_content — overlap context from adjacent chunks
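To work with the embedded block programmatically, the JSON can be parsed back out. A minimal sketch (`parse_chunk` is an illustrative helper, not part of the chunkana API; it assumes a flat JSON object as shown above):

```python
import json
import re

def parse_chunk(chunk_text: str) -> tuple[dict, str]:
    """Split a chunk into (metadata dict, Markdown body)."""
    m = re.match(r"<metadata>\s*(\{.*?\})\s*</metadata>\s*", chunk_text, re.DOTALL)
    if not m:
        return {}, chunk_text
    return json.loads(m.group(1)), chunk_text[m.end():]

chunk = (
    "<metadata>\n"
    '{"content_type": "text", "header_path": "/Installation/Requirements", "chunk_index": 2}\n'
    "</metadata>\n"
    "# Requirements\n\nPython 3.12 or higher..."
)
meta, body = parse_chunk(chunk)
print(meta["header_path"])  # /Installation/Requirements
```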

Without Metadata (include_metadata: false)

Chunks contain only clean Markdown content with embedded overlap:

```
...end of previous chunk...

# Requirements

Python 3.12 or higher...

...start of next chunk...
```

Chunking Strategies

The system automatically selects the optimal strategy based on content analysis:

| Strategy | Priority | Activation Conditions | Best For |
|---|---|---|---|
| Code-Aware | 1 (highest) | code ≥ 30% OR has code blocks/tables | Technical docs, API docs |
| List-Aware | 2 | lists > 40% OR list count ≥ 5 | Changelogs, feature lists |
| Structural | 3 | ≥ 3 headers with hierarchy | Documentation, guides |
| Fallback | 4 (default) | Always applicable | Simple text, mixed content |
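The selection rules above can be expressed as a simple priority cascade. This is a simplified illustration of the documented thresholds, not the actual chunkana implementation:

```python
def select_strategy(code_ratio: float, has_code_or_tables: bool,
                    list_ratio: float, list_count: int,
                    header_count: int, has_hierarchy: bool) -> str:
    """Pick a strategy using the documented activation conditions, in priority order."""
    if code_ratio >= 0.30 or has_code_or_tables:
        return "code_aware"
    if list_ratio > 0.40 or list_count >= 5:
        return "list_aware"
    if header_count >= 3 and has_hierarchy:
        return "structural"
    return "fallback"

print(select_strategy(0.45, True, 0.0, 0, 1, False))   # code_aware
print(select_strategy(0.0, False, 0.5, 6, 1, False))   # list_aware
print(select_strategy(0.0, False, 0.1, 1, 4, True))    # structural
```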

Understanding chunk_overlap

Chunk overlap controls how many characters of context are shared between consecutive chunks to preserve semantic continuity.

Behavior depends on include_metadata:

| include_metadata | Overlap Behavior |
|---|---|
| true (default) | Overlap stored in metadata fields previous_content / next_content. Chunk content stays clean. |
| false | Overlap embedded directly into chunk text: previous_content + "\n" + main + "\n" + next_content |
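The embedded-text mode amounts to a plain string assembly (a sketch of the documented format, not plugin code; empty neighbors are simply omitted):

```python
def embed_overlap(previous_content: str, main: str, next_content: str) -> str:
    """Assemble chunk text as previous + "\n" + main + "\n" + next, skipping empty parts."""
    parts = [p for p in (previous_content, main, next_content) if p]
    return "\n".join(parts)

print(embed_overlap("...end of previous chunk...",
                    "# Requirements",
                    "...start of next chunk..."))
```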

⚙️ Configuration

Basic Parameters

```python
from chunkana import ChunkConfig

config = ChunkConfig(
    # Size limits
    max_chunk_size=4096,      # Maximum chunk size (chars)
    min_chunk_size=512,       # Minimum chunk size

    # Overlap (adaptive sizing)
    overlap_size=200,         # Base overlap size (0 = disabled)
                              # Actual max = min(overlap_size, chunk_size * 0.35)

    # Behavior
    preserve_atomic_blocks=True,  # Keep code blocks and tables intact
    extract_preamble=True,        # Extract content before first header

    # Strategy selection thresholds
    code_threshold=0.3,           # Code ratio for CodeAwareStrategy
    structure_threshold=3,        # Min headers for StructuralStrategy
    list_ratio_threshold=0.40,    # List ratio for ListAwareStrategy
    list_count_threshold=5,       # Min list blocks for ListAwareStrategy

    # Code-Context Binding (NEW)
    enable_code_context_binding=True,   # Enable enhanced code-context binding
    max_context_chars_before=500,       # Max chars for backward explanation search
    max_context_chars_after=300,        # Max chars for forward explanation search
    related_block_max_gap=5,            # Max line gap for related block detection
    bind_output_blocks=True,            # Auto-bind output blocks to code
    preserve_before_after_pairs=True,   # Keep Before/After pairs together

    # Adaptive Chunk Sizing (NEW)
    use_adaptive_sizing=False,          # Enable adaptive chunk sizing
    adaptive_config=None,               # AdaptiveSizeConfig instance (see below)

    # Override
    strategy_override=None,   # Force specific strategy (code_aware/list_aware/structural/fallback)
)
```

Table Grouping Configuration

Group related tables in the same chunk for better retrieval quality:

```python
from chunkana import ChunkConfig, MarkdownChunker, TableGroupingConfig

# Enable table grouping
config = ChunkConfig(
    group_related_tables=True,
    table_grouping_config=TableGroupingConfig(
        max_distance_lines=10,     # Max lines between tables to group
        max_grouped_tables=5,      # Max tables per group
        max_group_size=5000,       # Max chars for grouped content
        require_same_section=True  # Only group within same header section
    )
)

chunker = MarkdownChunker(config)
chunks = chunker.chunk(api_docs)

# Grouped table chunks have metadata:
# - is_table_group: True
# - table_group_count: number of tables in group
```

When to Use:

  • ✅ API documentation with Parameters/Response/Error tables
  • ✅ Data reports with related comparison tables
  • ✅ Technical specs with multiple related tables
  • ❌ Documents where tables are independent

Adaptive Chunk Sizing Configuration

Enable automatic size optimization based on content complexity:

```python
from chunkana import AdaptiveSizeConfig, ChunkConfig, MarkdownChunker

# Enable with default settings
config = ChunkConfig(
    use_adaptive_sizing=True,
    adaptive_config=AdaptiveSizeConfig(
        base_size=1500,           # Base chunk size for medium complexity
        min_scale=0.5,            # Minimum scaling factor (0.5x = 750 chars)
        max_scale=1.5,            # Maximum scaling factor (1.5x = 2250 chars)

        # Complexity weights (must sum to 1.0)
        code_weight=0.4,          # Weight for code ratio
        table_weight=0.3,         # Weight for table ratio
        list_weight=0.2,          # Weight for list ratio
        sentence_length_weight=0.1,  # Weight for average sentence length
    )
)

chunker = MarkdownChunker(config)
chunks = chunker.chunk(text)

# Chunks now have adaptive sizing metadata:
# - adaptive_size: calculated optimal size
# - content_complexity: complexity score (0.0-1.0)
# - size_scale_factor: applied scale factor
```

Quick Enable with Profile:

```python
# Use pre-configured adaptive sizing profile
config = ChunkConfig.with_adaptive_sizing()
chunker = MarkdownChunker(config)
```

How It Works:

  1. Content Analysis - Calculates code ratio, table ratio, list ratio, avg sentence length
  2. Complexity Scoring - Weighted sum of factors produces score 0.0-1.0
  3. Size Calculation - optimal_size = base_size * (min_scale + complexity * scale_range)
  4. Chunk Application - Chunks respect calculated size while preserving atomic blocks

Behavior:

  • Code-heavy documents (high complexity) → larger chunks (up to 1.5x base size)
  • Simple text (low complexity) → smaller chunks (down to 0.5x base size)
  • Mixed content → size close to base
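The size calculation in step 3 can be verified with a few lines of standalone arithmetic (an illustration of the documented formula, not chunkana code):

```python
def adaptive_size(base_size: int, complexity: float,
                  min_scale: float = 0.5, max_scale: float = 1.5) -> int:
    """optimal_size = base_size * (min_scale + complexity * scale_range)."""
    scale_range = max_scale - min_scale
    return int(base_size * (min_scale + complexity * scale_range))

print(adaptive_size(1500, 1.0))  # code-heavy: 2250 (1.5x base)
print(adaptive_size(1500, 0.0))  # simple text: 750 (0.5x base)
print(adaptive_size(1500, 0.5))  # mixed content: 1500 (base)
```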

Configuration Profiles

| Profile | Use Case | Max Size | Overlap |
|---|---|---|---|
| for_dify_rag() | RAG systems | 4096 | 200 |
| for_code_heavy() | Technical documentation | 3072 | 150 |
| for_search_indexing() | Search indexing | 2048 | 100 |
| minimal() | Fine-grained chunking | 1024 | 50 |

Overlap Handling

Two modes for overlap handling:

Metadata mode (include_metadata: true):

  • Overlap stored in previous_content / next_content fields
  • Main chunk content stays clean
  • Perfect for RAG systems with vector representations

Embedded text mode (include_metadata: false):

  • Overlap physically embedded into chunk text
  • Format: previous + "\n" + main + "\n" + next
  • Suitable for sliding window processing

Troubleshooting

Frequently Asked Questions

Q: Why are chunks too large or too small?
A: Adjust max_chunk_size. For technical docs we recommend 2000-4000 characters; for simple text, 1000-2000.

Q: Code is split in the middle of functions.
A: Ensure the code_aware strategy is used (it is activated automatically when code blocks are present).

Q: Lists are broken incorrectly.
A: For documents with many lists, use the list_aware strategy or auto.

Q: Metadata interferes with results.
A: Set include_metadata: false to get clean text.

Q: I need only content chunks, without structural headers.
A: Use enable_hierarchy: true together with leaf_only: true.

Markdown Limitations

For best results, follow these recommendations:

  • Headers: use #, ##, ### (not "visual" headers without #)
  • Lists: a./b. often not recognized as ordered list — use 1./2.
  • Tables: use GitHub-flavored markdown format
  • Code: use triple backticks with language specification

Configuration Recipes

Legal documents:

```yaml
strategy: structural
enable_hierarchy: true
include_metadata: true
```

API documentation:

```yaml
strategy: code_aware
max_chunk_size: 2500
```

General documentation:

```yaml
strategy: auto
include_metadata: true
```

Usage Examples

Detailed examples with input files, configurations, and results are available in the examples/ folder:

  • examples/inputs/ — sample input files
  • examples/configs/ — configurations for each example
  • examples/outputs/ — reference results

📚 API Reference

MarkdownChunker

Main class for chunking Markdown documents.

```python
from chunkana import MarkdownChunker, ChunkConfig

# Create with default settings
chunker = MarkdownChunker()

# Create with custom configuration
config = ChunkConfig(
    max_chunk_size=2048,
    overlap_size=100,
    strategy_override="code_aware"
)
chunker = MarkdownChunker(config)
```

Core Methods

`chunk(text: str, **kwargs) -> List[Chunk]`

```python
# Simple chunking
chunks = chunker.chunk(markdown_text)

# With analysis
result = chunker.chunk(markdown_text, include_analysis=True)
print(f"Strategy used: {result.strategy_used}")
```

`chunk_hierarchical(text: str, **kwargs) -> HierarchicalResult`

```python
# Hierarchical chunking
result = chunker.chunk_hierarchical(markdown_text)

# Navigate hierarchy
root = result.get_chunk(result.root_id)
children = result.get_children(result.root_id)
leaf_chunks = result.get_flat_chunks()
```

`chunk_file_streaming(file_path: str, config: StreamingConfig) -> Iterator[Chunk]`

```python
# Streaming processing for large files
streaming_config = StreamingConfig(buffer_size=100_000)
for chunk in chunker.chunk_file_streaming("large_doc.md", streaming_config):
    process_chunk(chunk)
```

Configuration Profiles

```python
# Pre-configured profiles
config = ChunkConfig.for_code_heavy()       # For code documentation
config = ChunkConfig.for_dify_rag()         # For RAG systems in Dify
config = ChunkConfig.for_search_indexing()  # For search indexing
config = ChunkConfig.with_adaptive_sizing() # With adaptive sizing
```

Chunk

Class representing a single document chunk.

```python
class Chunk:
    content: str              # Text content of the chunk
    start_line: int           # Starting line in source document
    end_line: int             # Ending line in source document
    size: int                 # Size in characters
    content_type: str         # Content type (text, code, table, list, mixed)
    strategy: str             # Strategy used
    metadata: Dict[str, Any]  # Additional metadata
```

Metadata Fields

  • chunk_index — sequential chunk number
  • header_path — hierarchical path of headers
  • code_language — programming language (for code blocks)
  • previous_content / next_content — overlap context
  • adaptive_size — calculated optimal size (when adaptive sizing enabled)
  • content_complexity — complexity score 0.0-1.0
  • code_role — code block role (example, setup, output, before, after, error)
  • has_related_code — whether chunk contains related code blocks
  • code_relationship — relationship type (before_after, code_output, sequential)

Helper Functions

```python
from chunkana import chunk_text, chunk_file

# Direct text chunking
chunks = chunk_text("# My Document\n\nContent...")

# Chunk from file
chunks = chunk_file("README.md")
```

🏗️ Architecture

Component Overview

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Dify Plugin   │───▶│  Chunkana Engine │───▶│   Strategies    │
│   (Adapter)     │    │   (Core Logic)   │    │   (Algorithms)  │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Input/Output   │    │   AST Parser     │    │  Content Types  │
│   Validation    │    │   & Analysis     │    │   Detection     │
└─────────────────┘    └──────────────────┘    └─────────────────┘
```

Processing Flow

  1. Input Validation — parameter and content validation
  2. AST Parsing — Markdown parsing into syntax tree
  3. Content Analysis — content type and complexity detection
  4. Strategy Selection — automatic or forced algorithm selection
  5. Chunking — applying selected strategy
  6. Post-processing — adding metadata and overlaps
  7. Output Formatting — preparing result for Dify

Chunking Strategies

CodeAwareStrategy

  • Goal: Preserve code blocks and tables
  • Algorithm: Detects fenced block boundaries, groups related code
  • Activation: code_ratio ≥ 30% OR presence of code blocks/tables

ListAwareStrategy

  • Goal: Preserve list hierarchies
  • Algorithm: Analyzes list nesting, groups by levels
  • Activation: list_ratio > 40% OR list_count ≥ 5

StructuralStrategy

  • Goal: Split by headers
  • Algorithm: Uses header hierarchy as chunk boundaries
  • Activation: ≥3 headers with hierarchy

FallbackStrategy

  • Goal: Universal chunking
  • Algorithm: Sentence-based splitting with size consideration
  • Activation: Always applicable as fallback

Adaptive Capabilities

Adaptive Sizing

`optimal_size = base_size * (min_scale + complexity * scale_range)`
  • Analyzes content complexity (code, tables, lists)
  • Scales chunk size from 0.5x to 1.5x base size
  • Preserves atomic blocks regardless of size

Smart Overlap

`max_overlap = min(overlap_size, chunk_size * 0.35)`
  • Adaptive overlap limit up to 35% of chunk size
  • Context-dependent placement (in metadata or text)
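A standalone sketch of the overlap cap (an illustration of the documented rule, not the chunkana implementation):

```python
def max_overlap(overlap_size: int, chunk_size: int) -> int:
    """Effective overlap: min(overlap_size, chunk_size * 0.35)."""
    return min(overlap_size, int(chunk_size * 0.35))

print(max_overlap(200, 4096))  # 200 (35% cap not reached)
print(max_overlap(200, 400))   # 140 (35% cap applies)
```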

⚡ Performance

Benchmarks

Test Environment: Windows 11, Intel Core i7, 16GB RAM, SSD

| Document Size | Processing Time | Memory | Chunks |
|---|---|---|---|
| 10KB (article) | 15ms | 12MB | 3-5 |
| 100KB (manual) | 45ms | 14MB | 25-35 |
| 1MB (API docs) | 180ms | 18MB | 180-220 |
| 10MB (large documentation) | 1.2s | 35MB | 1500-2000 |

Optimizations

Streaming Processing:

  • Files >10MB processed using <50MB RAM
  • 100KB window buffering with smart boundary detection
  • Progress tracking support for long-running operations

AST Caching:

  • Reuse parsed tree for different configurations
  • Incremental analysis for large documents

Memory:

  • Base usage: 12.3MB + 0.14MB per KB input
  • Streaming mode: fixed usage regardless of file size
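The linear memory model above can be applied directly to estimate usage for a given input size (an estimate only; actual figures depend on content):

```python
def estimated_memory_mb(input_kb: float) -> float:
    """Apply the documented model: 12.3 MB base + 0.14 MB per KB of input."""
    return 12.3 + 0.14 * input_kb

print(round(estimated_memory_mb(100), 1))  # a 100 KB document: ~26.3 MB
```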

Performance Monitoring

Note: Performance data is based on benchmarks from docs/research/07_benchmark_results.md. Actual performance may vary depending on system configuration, document complexity, and content type.



🧪 Development

Testing

The project uses pytest for testing. The test suite is optimized and includes:

Test Structure:

  • tests/plugin/ — 97 Dify plugin tests
  • tests/chunkana/ — 376 chunkana library tests
  • Property-based tests — formal correctness guarantees with Hypothesis
  • Benchmarks — automated performance regression detection

Running Tests:

```shell
# All tests
make test

# Plugin tests only
pytest tests/plugin/ -v

# Property-based tests only
pytest tests/ -k "property" -v

# With coverage
pytest --cov=. --cov-report=html
```

Test Categories:

  • Unit tests — individual components and functions
  • Integration tests — component interactions
  • Property tests — universal correctness properties
  • Performance tests — performance regressions
  • Golden tests — reference inputs/outputs

Development Setup

```shell
# Clone repository
git clone https://github.com/asukhodko/dify-markdown-chunker.git
cd dify-markdown-chunker

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install development dependencies
pip install -r requirements.txt

# Verify installation
make test
```

Dependencies

Core:

  • chunkana>=2.1.7 — chunking engine
  • dify_plugin==0.5.0b15 — Dify integration

Development:

  • pytest>=8.0.0 — testing
  • hypothesis>=6.0.0 — property-based testing
  • pytest-cov — code coverage
  • black — code formatting
  • flake8 — linting

Building Plugin

```shell
# Build .difypkg file
make package

# Verify before building
make verify

# Clean build artifacts
make clean
```

Contributing

  1. Fork the repository
  2. Create feature branch (git checkout -b feature/amazing-feature)
  3. Commit changes (git commit -m 'Add amazing feature')
  4. Push to branch (git push origin feature/amazing-feature)
  5. Open Pull Request

PR Requirements:

  • All tests must pass
  • Code coverage must not decrease
  • Code must follow style (black, flake8)
  • Documentation must be updated

Compatibility

Tested on:

  • Dify versions 1.9.0, 1.9.1, 1.9.2
  • Python 3.12+
  • Windows 11, macOS 14+, Ubuntu 22.04+

Expected compatibility:

  • Dify versions 1.9.x and higher
  • Python 3.12 and higher

License

MIT License — see LICENSE


Author: Aleksandr Sukhodko (@asukhodko)
Repository: https://github.com/asukhodko/dify-markdown-chunker