Anaburiak/CSV-parser

📋 CSV-parser

A Python-based CSV data analysis pipeline that parses, cleans, validates, and analyzes tabular data. The project generates a structured, human-readable report with key statistics and insights.

Project Goals

  • Practice working with real-world CSV data
  • Build a full data pipeline using pure Python
  • Understand data validation, cleaning, and analysis
  • Generate automated analytical reports

Dataset

  • Source: Student Social Media & Relationships dataset
  • Format: CSV
  • Each row represents one student's anonymized survey response

Pipeline Overview

  1. Load CSV data into Python dictionaries
  2. Validate structure and detect missing values
  3. Clean and convert data types
  4. Perform statistical analysis
  5. Generate a formatted text report
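
Steps 1 and 2 can be sketched with the standard library alone. This is a minimal illustration, not the code in main.py: the function names `load_rows` and `find_missing`, the column names, and the inline sample data are all assumptions made for the example.

```python
import csv
import io


def load_rows(handle):
    """Read CSV text from an open file-like object into a list of dicts."""
    return list(csv.DictReader(handle))


def find_missing(rows, required):
    """Return (row_number, column) pairs where a required value is absent or blank."""
    problems = []
    for number, row in enumerate(rows, start=1):
        for column in required:
            value = row.get(column)
            if value is None or not value.strip():
                problems.append((number, column))
    return problems


# Hypothetical sample standing in for the real dataset file.
SAMPLE = "Student_ID,Country,Daily_Hours\n1,Poland,4.5\n2,,3.0\n3,India,\n"

rows = load_rows(io.StringIO(SAMPLE))
missing = find_missing(rows, ["Student_ID", "Country", "Daily_Hours"])
print(missing)
```

With a real file, `load_rows` would receive a handle opened with `open(path, newline="", encoding="utf-8")`, which is how the `csv` module documentation recommends opening CSV files.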

🚀 Key Features

  • CSV parsing using the Python standard library
  • Data validation and missing-value detection
  • Safe type conversion with error handling
  • Statistical analysis (averages, top-N values)
  • Grouping by categorical fields (country)
  • Automated report generation
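
The conversion, grouping, and top-N features listed above could look roughly like the following. It is a sketch under stated assumptions rather than the project's actual implementation: the helper names `to_float`, `average_by`, and `top_n`, the column names, and the sample rows are hypothetical.

```python
from collections import defaultdict


def to_float(text):
    """Convert text to float; return None instead of raising on bad input."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def average_by(rows, group_key, value_key):
    """Average a numeric column per group, skipping rows that fail conversion."""
    buckets = defaultdict(list)
    for row in rows:
        value = to_float(row.get(value_key))
        if value is not None:
            buckets[row[group_key]].append(value)
    return {group: sum(vals) / len(vals) for group, vals in buckets.items()}


def top_n(mapping, n):
    """Return the n (key, value) pairs with the largest values."""
    return sorted(mapping.items(), key=lambda item: item[1], reverse=True)[:n]


# Hypothetical rows, shaped like the output of csv.DictReader.
rows = [
    {"Country": "Poland", "Daily_Hours": "4.0"},
    {"Country": "Poland", "Daily_Hours": "6.0"},
    {"Country": "India", "Daily_Hours": "3.0"},
    {"Country": "India", "Daily_Hours": "n/a"},
]

averages = average_by(rows, "Country", "Daily_Hours")
for country, hours in top_n(averages, 2):
    # Fixed-width columns keep the text report aligned.
    print(f"{country:<10} {hours:>5.2f}")
```

Returning `None` from `to_float` lets the analysis step decide how to treat unparseable values (here they are skipped) instead of aborting the whole run on the first bad cell.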

Technologies Used

  • Python 3.10+
  • csv (standard library)
  • collections (defaultdict)

How to Run

  1. Clone the repository
  2. Ensure Python 3.10+ is installed
  3. Run:
    python main.py
    

Project Status

I built this as a learning mini-project after the first month of my Machine Learning self-study plan. It went through several iterations, during which I corrected logical and analytical errors.

Possible Improvements

  • Refactor into multiple modules
  • Add pandas-based implementation
  • Add visualizations
  • Extend to machine learning tasks

About

CSV Data Analysis Pipeline - A Python tool for parsing, cleaning, and analyzing CSV datasets with automated report generation. Handles data validation, type conversion, statistical analysis, and produces human-readable reports.
