Meta Fields Extractor
- Description
- Getting Started
- Prerequisites
- Installation
- Usage
- Project Structure
- Contributing
- License
- Acknowledgements
The Script Meta-Fields Extractor is a Python-based tool designed to extract metadata (field names, data types, and example values) from a variety of file formats, including:
- CSV
- Excel (.xls, .xlsx)
- JSON
- XML
- Parquet
- QVD (version 1.1)
The tool processes data files located in the inputs directory and generates metadata reports saved in the outputs directory.
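To make the idea of "field names, data types, and example values" concrete, here is a minimal sketch for the CSV case using only the standard library. It is not taken from data_info_extractor.py: the function names `infer_type` and `extract_csv_metadata` are hypothetical, and inferring the type from the first data row only is a simplifying assumption.

```python
import csv
import io


def infer_type(value):
    """Guess a simple type name for one text cell (hypothetical helper)."""
    for caster, name in ((int, "int"), (float, "float")):
        try:
            caster(value)
            return name
        except ValueError:
            continue
    return "str"


def extract_csv_metadata(handle):
    """Return one dict per column: field name, inferred type, example value.

    Assumption: the first data row is representative of the whole column.
    """
    reader = csv.DictReader(handle)
    first_row = next(reader, None)  # None when the file has no data rows
    metadata = []
    for field in reader.fieldnames or []:
        example = first_row.get(field) if first_row else None
        has_value = example not in (None, "")
        metadata.append({
            "field": field,
            "type": infer_type(example) if has_value else "unknown",
            "example": example,
        })
    return metadata


# In-memory sample so the sketch runs without any files on disk.
sample = io.StringIO("id,name,price\n1,Widget,9.99\n2,Gadget,12.50\n")
report = extract_csv_metadata(sample)
for entry in report:
    print(entry)
```

A real extractor would likely sample more than one row and handle the other formats through dedicated readers; this sketch only shows the shape of the result.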
Ensure you have the following installed:
- Python 3.6+
- pip (Python package manager)
- Clone this repository:
  git clone <repository_url>
  cd script-meta-fields-extractor
- Install the required Python libraries:
  pip install -r requirements.txt
- Set up the folder structure:
  - Ensure the inputs folder exists and place your data files inside it.
  - The script will automatically create the outputs folder if it doesn't exist.
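The folder setup described above could be checked with a few lines of `pathlib`. This is a sketch under assumptions, not the script's actual code: `prepare_folders` is a hypothetical helper, and raising an error when inputs is missing is one possible policy.

```python
import tempfile
from pathlib import Path


def prepare_folders(base_dir):
    """Check that inputs/ exists and create outputs/ if it is missing."""
    base = Path(base_dir)
    inputs_dir = base / "inputs"
    outputs_dir = base / "outputs"
    if not inputs_dir.is_dir():
        raise FileNotFoundError(f"Missing input folder: {inputs_dir}")
    # exist_ok=True makes repeated runs safe when outputs/ is already there.
    outputs_dir.mkdir(exist_ok=True)
    return inputs_dir, outputs_dir


# Demonstration in a throwaway directory so nothing in the project is touched.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "inputs").mkdir()
    inputs_dir, outputs_dir = prepare_folders(tmp)
    outputs_created = outputs_dir.is_dir()
    print(outputs_created)
```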
- Place your data files (e.g., sample.csv, data.json) in the inputs folder.
- Run the script:
  python data_info_extractor.py
- Follow the prompts to select a file for analysis.
- The tool will display the metadata (field names, types, examples) in the terminal and save the output to the outputs directory.
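The select-then-save flow in the steps above might look roughly like the following. Everything here is illustrative: the helper names, the numbered menu, the JSON report format, and the `<name>_metadata.json` file name are assumptions, since the actual prompt and report layout of data_info_extractor.py are not described in this README.

```python
import json
import tempfile
from pathlib import Path

# Extensions for the formats listed in the description.
SUPPORTED = {".csv", ".xls", ".xlsx", ".json", ".xml", ".parquet", ".qvd"}


def list_candidates(inputs_dir):
    """Return supported data files in inputs_dir, sorted by path."""
    return sorted(
        p for p in Path(inputs_dir).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED
    )


def choose_file(candidates, answer):
    """Map a 1-based menu answer (as typed by the user) to a file path."""
    index = int(answer) - 1
    if not 0 <= index < len(candidates):
        raise ValueError(f"Choice out of range: {answer}")
    return candidates[index]


def save_report(metadata, source, outputs_dir):
    """Write metadata as JSON to outputs_dir (hypothetical report format)."""
    outputs = Path(outputs_dir)
    outputs.mkdir(exist_ok=True)
    target = outputs / f"{source.stem}_metadata.json"
    target.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return target


# Demonstration in a temporary directory with one supported file.
with tempfile.TemporaryDirectory() as tmp:
    inputs_dir = Path(tmp) / "inputs"
    inputs_dir.mkdir()
    (inputs_dir / "sample.csv").write_text("id,name\n1,Widget\n", encoding="utf-8")
    (inputs_dir / "notes.txt").write_text("ignored", encoding="utf-8")

    candidates = list_candidates(inputs_dir)
    for number, path in enumerate(candidates, start=1):
        print(f"{number}. {path.name}")

    # An interactive run would read this answer with input() instead.
    chosen = choose_file(candidates, "1")
    example_metadata = [{"field": "id", "type": "int", "example": "1"}]
    saved = save_report(example_metadata, chosen, Path(tmp) / "outputs")
    saved_name = saved.name
    saved_text = saved.read_text(encoding="utf-8")
```

Keeping the menu answer as a plain argument rather than calling `input()` inside `choose_file` makes the selection logic easy to test without a terminal.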
SCRIPT-META-FIELDS-EXTRACTOR/
├── inputs/                   # Input folder containing data files (CSV, JSON, etc.)
│   ├── sample.csv
│   ├── sample.json
│   ├── sample.parquet
│   ├── sample.xls
│   └── sample.xml
├── outputs/                  # Output folder for processed metadata reports
│   └── .gitignore            # Ignores unnecessary files
├── data_info_extractor.py    # Main Python script for metadata extraction
├── LICENSE                   # License information
├── README.md                 # Project documentation
└── requirements.txt          # Python dependencies
We welcome contributions to improve this project! To contribute:
- Fork the repository.
- Create a new branch (feature/your-feature-name).
- Commit your changes with clear and concise messages.
- Push to your branch and open a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
- OpenAI's o1 for assisting with the refactoring and creating this README.