feat: add processing events and callbacks system #217
Jah-yee wants to merge 2 commits into HKUDS:main
Made-with: Cursor
…hread-safe dispatch Made-with: Cursor
Summary
This PR adds a lightweight event and callbacks system for document processing, so users can observe progress, collect metrics, and hook into key stages of the pipeline without changing core logic.
Motivation
For large batches, it is currently hard to see what RAG-Anything is doing: there are no structured events for logging or metrics, and no safe extension points for custom side effects. A small callbacks layer gives monitoring and observability tooling a single, well-defined place to hook in.
Changes
raganything/callbacks.py
- ProcessingEvent: dataclass for immutable, structured events (e.g. parse_start, parse_complete, text_insert, multimodal phases, query/batch stages).
- ProcessingCallback: base class with multiple overridable hooks, all accepting a ProcessingEvent plus **kwargs for forward compatibility.
- MetricsCallback: tracks document/block counts, errors, and durations, and exposes a read-only metrics snapshot.
- CallbackManager: registers and unregisters callbacks, dispatches events, optionally keeps a bounded event log, and isolates callback exceptions so a failing callback does not affect others.
tests/test_callbacks.py
Testing
Ran pytest locally, including tests/test_callbacks.py; all tests passed.
Thanks for your work on RAG-Anything. If you'd prefer different hook names or a different event structure, I'm glad to adjust this design.
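For reviewers who want a concrete picture of the pieces listed under Changes, here is a minimal, self-contained sketch of the pattern. It is not the code in raganything/callbacks.py: only the class names ProcessingEvent, ProcessingCallback, and CallbackManager and the event names parse_start and parse_complete come from this description. The fields (doc_id, timestamp), the on_<event_type> hook naming, and the register, unregister, dispatch, event_log, and max_log_size names are assumptions made for illustration.

```python
import logging
import threading
import time
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Deque, List, Optional

logger = logging.getLogger(__name__)


@dataclass(frozen=True)  # frozen=True makes instances immutable after creation
class ProcessingEvent:
    event_type: str                  # e.g. "parse_start", "parse_complete"
    doc_id: Optional[str] = None     # hypothetical field, for illustration only
    timestamp: float = field(default_factory=time.time)


class ProcessingCallback:
    """Base class with no-op hooks; subclasses override only what they need."""

    def on_parse_start(self, event: ProcessingEvent, **kwargs: Any) -> None:
        pass

    def on_parse_complete(self, event: ProcessingEvent, **kwargs: Any) -> None:
        pass


class CallbackManager:
    def __init__(self, max_log_size: int = 0) -> None:
        self._callbacks: List[ProcessingCallback] = []
        self._lock = threading.Lock()
        # A deque with maxlen silently drops the oldest entry when full.
        # max_log_size <= 0 disables the event log entirely.
        self._log: Optional[Deque[ProcessingEvent]] = (
            deque(maxlen=max_log_size) if max_log_size > 0 else None
        )

    def register(self, callback: ProcessingCallback) -> None:
        with self._lock:
            self._callbacks.append(callback)

    def unregister(self, callback: ProcessingCallback) -> None:
        with self._lock:
            if callback in self._callbacks:
                self._callbacks.remove(callback)

    @property
    def event_log(self) -> List[ProcessingEvent]:
        with self._lock:
            return list(self._log) if self._log is not None else []

    def dispatch(self, event: ProcessingEvent, **kwargs: Any) -> None:
        with self._lock:
            if self._log is not None:
                self._log.append(event)
            callbacks = list(self._callbacks)  # snapshot; hooks run unlocked
        for callback in callbacks:
            hook = getattr(callback, f"on_{event.event_type}", None)
            if hook is None:
                continue  # this callback has no hook for the event type
            try:
                hook(event, **kwargs)
            except Exception:
                # Isolate the failure so the remaining callbacks still run.
                logger.exception("callback %r failed on %s", callback, event.event_type)


class CountingCallback(ProcessingCallback):
    def __init__(self) -> None:
        self.parsed = 0

    def on_parse_complete(self, event: ProcessingEvent, **kwargs: Any) -> None:
        self.parsed += 1


class BrokenCallback(ProcessingCallback):
    def on_parse_complete(self, event: ProcessingEvent, **kwargs: Any) -> None:
        raise RuntimeError("boom")


manager = CallbackManager(max_log_size=2)
counter = CountingCallback()
manager.register(BrokenCallback())  # registered first, fails on every parse_complete
manager.register(counter)
for doc in ("a.pdf", "b.pdf", "c.pdf"):
    manager.dispatch(ProcessingEvent("parse_start", doc_id=doc))
    manager.dispatch(ProcessingEvent("parse_complete", doc_id=doc))

print(counter.parsed)          # 3: the broken callback did not block the counter
print(len(manager.event_log))  # 2: only the two most recent events are kept
```

Taking a snapshot of the callback list under the lock and then invoking hooks outside it is one way to let a callback register or unregister others without deadlocking. Whether the PR makes the same trade-off is not stated here.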