TabPro is a Python-based tool for efficient processing of tabular data.

Supported file formats:
- CSV
- TSV
- Excel
- JSON
- JSON Lines
- Bidirectional conversion between all supported formats
Table Conversion
- Convert between different formats
- Customize output format settings
- Filter and transform data
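
To make the conversion idea concrete, here is a minimal sketch of turning CSV text into JSON Lines while keeping only selected columns. It uses only the Python standard library and is not TabPro's implementation; the function name `csv_to_jsonl` and the `pick` parameter are hypothetical names chosen for this illustration.

```python
import csv
import io
import json


def csv_to_jsonl(csv_text, pick=None):
    """Convert CSV text to JSON Lines, optionally keeping only the columns in `pick`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        if pick is not None:
            # Keep only the requested columns, in the requested order.
            row = {key: row[key] for key in pick}
        lines.append(json.dumps(row))
    return "\n".join(lines)


# Hypothetical sample data for the illustration.
sample = "id,name,city\n1,Ada,London\n2,Linus,Helsinki\n"
print(csv_to_jsonl(sample, pick=["id", "name"]))
```

Note that `csv.DictReader` yields every value as a string, so `id` appears as `"1"` rather than `1` in the output.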
Table Merging
- Merge tables based on common columns
- Handle multiple table merging
- Support for staging and version control
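
As a rough illustration of merging on common columns, the sketch below overlays rows from a newer table onto rows of an older table that share the same key values. It is an assumption-laden example, not TabPro's merge logic: `merge_rows` is a hypothetical helper, staging is not modeled, and duplicate keys simply resolve to the last row seen.

```python
def merge_rows(previous, new, keys):
    """Overlay `new` rows onto `previous` rows that share the same key values."""
    # Index the previous rows by their key tuple (copying so inputs stay untouched).
    merged = {tuple(row[k] for k in keys): dict(row) for row in previous}
    for row in new:
        key = tuple(row[k] for k in keys)
        # Update a matching row in place, or add the row if the key is new.
        merged.setdefault(key, {}).update(row)
    return list(merged.values())


# Hypothetical sample data for the illustration.
previous = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Linus", "city": "Helsinki"},
]
new = [
    {"id": 2, "city": "Portland"},
    {"id": 3, "name": "Grace", "city": "New York"},
]
for row in merge_rows(previous, new, keys=["id"]):
    print(row)
```

Fields missing from a newer row (such as `name` for `id` 2) are kept from the older row, which is one possible merge policy among several.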
Table Aggregation
- Data aggregation based on grouping
- Statistical calculations
- Duplicate detection
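
The grouping and duplicate-detection ideas can be sketched with `collections.Counter`. This is only an illustration under simple assumptions (rows as dictionaries, one grouping column); `count_values` and `find_duplicates` are hypothetical names, not part of TabPro.

```python
from collections import Counter


def count_values(rows, key):
    """Count how often each value of `key` occurs across the rows."""
    return Counter(row[key] for row in rows)


def find_duplicates(rows, key):
    """Return the values of `key` that occur more than once, in first-seen order."""
    counts = count_values(rows, key)
    return [value for value, n in counts.items() if n > 1]


# Hypothetical sample data for the illustration.
rows = [{"city": "London"}, {"city": "Helsinki"}, {"city": "London"}]
print(count_values(rows, "city"))
print(find_duplicates(rows, "city"))
```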
Table Sorting
- Sort by multiple columns
- Custom sort order
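
Sorting by several columns with a different direction per column can be sketched by relying on the stability of Python's sort: apply the sorts from the least significant key to the most significant one. The helper `sort_rows` and its `(column, reverse)` pairs are assumptions made for this example, not TabPro's interface.

```python
from operator import itemgetter


def sort_rows(rows, keys):
    """Sort rows by `keys`, a list of (column, reverse) pairs, most significant first."""
    result = list(rows)
    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last yields a correct multi-column order.
    for column, reverse in reversed(keys):
        result.sort(key=itemgetter(column), reverse=reverse)
    return result


# Hypothetical sample data for the illustration.
rows = [
    {"dept": "ops", "age": 30},
    {"dept": "dev", "age": 25},
    {"dept": "dev", "age": 41},
]
# Department ascending, then age descending within each department.
for row in sort_rows(rows, [("dept", False), ("age", True)]):
    print(row)
```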
Table Comparison
- Detect differences between tables
- Data consistency checking
- Detailed comparison reports
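
A comparison of two tables can be sketched as a keyed diff: match rows by a key column, then report keys present on only one side and fields whose values differ. The function `compare_rows` and the report layout below are hypothetical and chosen for illustration; they do not describe TabPro's actual report format.

```python
def compare_rows(left, right, key):
    """Compare two tables row by row, matching rows on the `key` column."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    report = {
        "only_in_left": sorted(left_by_key.keys() - right_by_key.keys()),
        "only_in_right": sorted(right_by_key.keys() - left_by_key.keys()),
        "changed": {},
    }
    for k in sorted(left_by_key.keys() & right_by_key.keys()):
        a, b = left_by_key[k], right_by_key[k]
        # Record (left value, right value) for every field that differs.
        diff = {
            field: (a.get(field), b.get(field))
            for field in sorted(a.keys() | b.keys())
            if a.get(field) != b.get(field)
        }
        if diff:
            report["changed"][k] = diff
    return report


# Hypothetical sample data for the illustration.
left = [{"id": 1, "city": "London"}, {"id": 2, "city": "Helsinki"}]
right = [{"id": 2, "city": "Portland"}, {"id": 3, "city": "New York"}]
print(compare_rows(left, right, key="id"))
```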
Requirements
- Python 3.10 or higher
- pip (Python package installer)
Installation
pip install tabpro

Usage
tabpro [command] [options]

Table Conversion (convert)
tabpro convert [options] <input_file1> [<input_file2>...] --output <output_file>
# or
tabpro-convert ...
convert-tables ...
Options:
- --output-file-filtered-out, --output-filtered-out, -f: Path to the output file for filtered-out rows
- --config, -c: Path to the configuration file
- --pick-columns, --pick: Pick specific columns
- --do-actions, --actions, --do: Actions to perform on the data
- --ignore-file-rows, --ignore-rows, --ignore: Ignore specific rows
- --no-header: Treat CSV/TSV data as having no header row
Table Merging (merge)
tabpro merge [options] --previous <previous_file1> [<previous_file2> ...] --new <modification_file1> [<modification_file2> ...] --keys <key1> [<key2> ...]
# or
tabpro-merge ...
merge-tables ...
Options:
- --allow-duplicate-conventional-keys: Allow duplicate keys in previous files
- --allow-duplicate-modification-keys: Allow duplicate keys in modification files
- --output-base-data-file: Path to output base data file
- --output-modified-data-file: Path to output modified data file
- --output-remaining-data-file: Path to output remaining data file
- --merge-fields: Fields to merge
- --merge-staging: Merge staging fields from modification files
- --use-staging: Use staging fields files
Table Aggregation (aggregate)
tabpro aggregate [options] <input_file> --output <aggregated_json_path>
# or
tabpro-aggregate ...
aggregate-tables ...
Options:
- --keys-to-show-duplicates: Keys to show duplicates
- --keys-to-show-all-count: Keys to show all count
- --keys-to-expand: Keys to expand
- --show-count-threshold, --count-threshold, -C: Show count threshold (default: 50)
- --show-count-max-length, --count-max-length, -L: Show count max length (default: 100)
Table Sorting (sort)
tabpro sort [options] <input_file1> [<input_file2> ...] --sort-keys <key1> [<key2> ...] --output <output_file>
# or
tabpro-sort ...
sort-tables ...
Options:
- --output-file, --output, -O: Path to output file
- --reverse, -R: Reverse the sort order
Table Comparison (compare)
tabpro compare [options] <input_file1> <input_file2> --query <query_key1> [<query_key2> ...] --output <output_file>
# or
tabpro-compare ...
tabpro-diff ...
compare-tables ...
Options:
- --compare-keys, --compare, -C: Keys for comparison
Common Options
- --verbose, -v: Enable verbose logging
- --version, -V: Show version information
Highlights
- Simple and user-friendly command-line interface
- Flexible data processing options
- Handles large datasets efficiently
- Extensible design