
Commodity Price Analysis (BD)

Collects and analyzes daily commodity prices in Bangladesh. The repo includes a scraper that pulls product listings, CSV storage for the collected data, notebooks for quick visualizations, and a SwiftUI app for richer charts.

What this repo contains

  • Automated price scraping from data/URLList.csv
  • Normalized CSV datasets under data/price_data/
  • Data cleanup utility for prices like 100+20
  • Python notebooks for basic charts
  • A SwiftUI visualization app in visualization/app/VisualReport/

Data layout

  • data/URLList.csv: list of category URLs to scrape
    Columns: url_name, url_type, url_value, provider_name
  • data/price_data/<category>/<category>.csv: collected price history
  • data/log.txt: scraper logs
  • data/debug/: saved HTML when rendering fails (created at runtime)
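Given the URLList.csv columns above, loading the configured category URLs can be sketched like this (an illustrative reader using an inline sample, not the repo's actual loader; the sample URL and provider are placeholders):

```python
import csv
from io import StringIO

# Illustrative sketch of reading data/URLList.csv. The column names come
# from the layout above; the sample row below is a made-up placeholder.
SAMPLE = """url_name,url_type,url_value,provider_name
rice,category,https://example.com/rice,example-shop
"""

def load_url_list(fp):
    """Return one dict per configured category URL."""
    return list(csv.DictReader(fp))

rows = load_url_list(StringIO(SAMPLE))
```

Each row then tells the scraper which category page to fetch and which provider it belongs to.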

CSV columns generated by the scraper: product_name, weight_raw, weight_value, weight_unit, price_raw, price, discount_price_raw, discount_price, date, time
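The paired raw/parsed columns (weight_raw alongside weight_value and weight_unit) imply a split step roughly like this hypothetical helper (the scraper's actual parsing rules may differ):

```python
import re

# Hypothetical helper splitting a raw weight string such as "500 gm" into
# a numeric value and a unit. The real scraper's rules may differ.
def split_weight(weight_raw):
    match = re.match(r"\s*([\d.]+)\s*([A-Za-z]+)", weight_raw)
    if not match:
        return None, None
    return float(match.group(1)), match.group(2).lower()

value, unit = split_weight("500 gm")  # -> (500.0, "gm")
```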

Scripts

Scrape prices

script/DataScrapper.py scrapes each URL and appends data to the matching CSV.

Install dependencies:

python3 -m pip install --upgrade pip
python3 -m pip install beautifulsoup4 lxml playwright
python3 -m playwright install --with-deps chromium

Run the scraper:

python3 script/DataScrapper.py

Fix raw price values

script/PriceFixer.py updates rows where price_raw contains + and writes the computed total into the price column.

Run the fixer:

python3 script/PriceFixer.py
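The fix described above can be sketched as summing the +-separated parts of the raw value (an illustration of the idea, not PriceFixer.py itself):

```python
# Sketch of the "100+20" cleanup idea: when a raw price contains "+",
# sum the parts to get the total. Illustrative, not the actual script.
def fix_price(price_raw):
    if "+" in price_raw:
        return sum(float(part) for part in price_raw.split("+"))
    return float(price_raw)

fix_price("100+20")  # -> 120.0
```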

Notebooks and visuals

  • visualization/notebook/: Jupyter notebooks for quick charts
  • visualization/22_23_24_Nov_Dec/ and visualization/LastTwoMonth/: exported charts
  • visualization/app/VisualReport/: SwiftUI app with line and bar charts

Web app

The web app at https://commoditypricetrackerbd.vercel.app/ provides interactive visualizations built on the datasets in this repository.

Automation

The GitHub Actions workflow python-code-runner.yml runs the scraper daily, scheduled at 8:00 AM Bangladesh time, and opens a pull request with the new data.
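Bangladesh time is UTC+6 and GitHub Actions cron triggers run in UTC, so an 8:00 AM BDT run corresponds to 02:00 UTC. A schedule block along these lines is assumed (the actual python-code-runner.yml may differ):

```yaml
# Hypothetical sketch of the daily trigger; GitHub Actions cron uses UTC,
# so 8:00 AM Bangladesh time (UTC+6) is 02:00 UTC.
on:
  schedule:
    - cron: "0 2 * * *"
```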

Notes

  • The scraper first tries static HTML and falls back to headless Chromium rendering if products are not detected.
  • If you extend data/URLList.csv, a new category folder and CSV will be created automatically on the next run.
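The static-first, render-on-miss behaviour in the first note can be sketched with stubbed callables (fetch_static, fetch_rendered, and parse_products are hypothetical names, not the scraper's actual functions):

```python
# Sketch of the fallback strategy described above: try the static HTML
# first, and only fall back to headless Chromium rendering when no
# products are detected. All three callables are hypothetical stand-ins.
def scrape_category(url, fetch_static, fetch_rendered, parse_products):
    html = fetch_static(url)
    products = parse_products(html)
    if not products:
        html = fetch_rendered(url)  # headless Chromium fallback
        products = parse_products(html)
    return products
```

In the repo this is also where the failing HTML would be saved under data/debug/ for inspection.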

License

See LICENSE.
