This project is designed to scrape data from a website that provides information about electric vehicle charging stations, including station type, location, and current status (available, occupied, not working). The purpose of this project is to collect and analyze data to estimate the company's financial performance based on charging rates and station occupancy times.
The scraper performs the following tasks:
- Fetches the webpage with station statuses every minute.
- Parses and validates the data according to the required fields.
- Stores the data in a TimescaleDB (Postgres-based) database.
Additionally, the scraper logs the date when a new station appears on the list.
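The new-station logging described above can be sketched as a small helper that remembers when each station ID was first seen. This is a minimal illustration, not the project's actual code: the function name `record_new_stations`, the in-memory `first_seen` dictionary, and the station IDs are all assumptions made for the example (the real scraper presumably keeps this information in the database).

```python
from datetime import datetime, timezone


def record_new_stations(first_seen, station_ids, now=None):
    """Add unseen station IDs to first_seen and return the IDs that were new.

    first_seen maps a station ID to the datetime of its first appearance.
    """
    now = now or datetime.now(timezone.utc)
    # dict.fromkeys removes duplicates while keeping the original order.
    new_ids = [sid for sid in dict.fromkeys(station_ids) if sid not in first_seen]
    for sid in new_ids:
        first_seen[sid] = now
    return new_ids


# Two polls one minute apart; the second contains a station not seen before.
first_seen = {}
t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
t1 = datetime(2024, 1, 1, 12, 1, tzinfo=timezone.utc)
record_new_stations(first_seen, ["A-1", "A-2"], now=t0)
added = record_new_stations(first_seen, ["A-1", "A-2", "B-7"], now=t1)
print(added)  # ['B-7']
```

Stations already in `first_seen` keep their original timestamp, so repeated polls do not overwrite the recorded appearance date.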
The project uses the following technologies:
- Python: The core programming language for this project.
- BeautifulSoup4: For parsing HTML content.
- SQLAlchemy: ORM for database interaction.
- Alembic: For database migrations.
- TimescaleDB: A PostgreSQL-based database optimized for time-series data.
- Asyncio: For asynchronous data storage.
- Docker: For containerizing the application and database.
We chose TimescaleDB, an extension of PostgreSQL, due to its optimized handling of time-series data. This allows efficient storage and querying of timestamped data, which is crucial for analyzing station statuses over time.
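To illustrate what the time-series optimization looks like in practice, the sketch below collects the SQL that a migration might run. `create_hypertable` and `time_bucket` are TimescaleDB functions; the table name `station_status`, its columns, and the helper `migration_statements` are assumptions made for this example, and the project's real schema may differ.

```python
# Hypothetical table and column names; the project's real schema may differ.
CREATE_TABLE = """
CREATE TABLE station_status (
    time        TIMESTAMPTZ NOT NULL,
    station_id  TEXT        NOT NULL,
    status      TEXT        NOT NULL
);
"""

# create_hypertable converts the plain table into a hypertable
# that TimescaleDB partitions by the "time" column.
CREATE_HYPERTABLE = "SELECT create_hypertable('station_status', 'time');"

# time_bucket groups rows into fixed-width intervals, here one hour.
# With one sample per minute, occupied_samples approximates occupied minutes.
OCCUPIED_SAMPLES_PER_HOUR = """
SELECT time_bucket('1 hour', time) AS bucket,
       station_id,
       count(*) FILTER (WHERE status = 'occupied') AS occupied_samples
FROM station_status
GROUP BY bucket, station_id
ORDER BY bucket, station_id;
"""


def migration_statements():
    """Return the DDL statements in the order a migration would run them."""
    return [CREATE_TABLE.strip(), CREATE_HYPERTABLE]


for statement in migration_statements():
    print(statement)
```

In an Alembic migration, each of these statements could be passed to `op.execute(...)` inside `upgrade()`.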
The workflow consists of the following stages:
- Data Collection: The scraper fetches the webpage every minute.
- Data Parsing: Extracts relevant information using BeautifulSoup4.
- Data Validation: Ensures data integrity and formats it for storage.
- Data Storage: Asynchronously stores data in TimescaleDB using SQLAlchemy.
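The stages above can be sketched as one asynchronous cycle that is repeated every minute. This is a simplified illustration under stated assumptions: `StationReading`, `validate`, `run_cycle`, and `run_forever` are hypothetical names, the field names are guesses based on the data described in this README, and fetching and parsing (BeautifulSoup4) and storage (SQLAlchemy) are represented by the injected callables `fetch_rows` and `store` rather than implemented here.

```python
import asyncio
from dataclasses import dataclass
from datetime import datetime, timezone

# Statuses listed in this README; the site's exact wording is an assumption.
VALID_STATUSES = {"available", "occupied", "not working"}
REQUIRED_FIELDS = ("station_id", "station_type", "location", "status")


@dataclass(frozen=True)
class StationReading:
    time: datetime
    station_id: str
    station_type: str
    location: str
    status: str


def validate(raw, now):
    """Turn one parsed row (a dict) into a StationReading, or raise ValueError."""
    missing = [key for key in REQUIRED_FIELDS if not raw.get(key)]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    status = raw["status"].strip().lower()
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {raw['status']!r}")
    return StationReading(
        time=now,
        station_id=raw["station_id"].strip(),
        station_type=raw["station_type"].strip(),
        location=raw["location"].strip(),
        status=status,
    )


async def run_cycle(fetch_rows, store):
    """One pass: fetch parsed rows, validate them, store the valid ones."""
    now = datetime.now(timezone.utc)
    readings = []
    for raw in await fetch_rows():
        try:
            readings.append(validate(raw, now))
        except ValueError:
            continue  # skip malformed rows instead of aborting the cycle
    await store(readings)
    return readings


async def run_forever(fetch_rows, store, interval=60.0):
    """Repeat run_cycle, sleeping `interval` seconds between passes."""
    while True:
        await run_cycle(fetch_rows, store)
        await asyncio.sleep(interval)


async def _demo():
    async def fake_fetch():
        return [
            {"station_id": "A-1", "station_type": "DC", "location": "Main St", "status": "Occupied"},
            {"station_id": "A-2", "station_type": "AC", "location": "", "status": "available"},
        ]

    stored = []

    async def fake_store(readings):
        stored.extend(readings)

    await run_cycle(fake_fetch, fake_store)
    return stored


demo_readings = asyncio.run(_demo())
print([r.station_id for r in demo_readings])  # ['A-1']
```

The second demo row is dropped because its location is empty. Note that `run_forever` sleeps after each pass, so the real period is the interval plus the time one cycle takes.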
The collected data will be analyzed using Microsoft Power BI to evaluate financial performance metrics such as revenue based on charging station occupancy times.
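The arithmetic behind such a revenue estimate can be sketched in a few lines. The analysis itself is done in Power BI; the Python below only illustrates the idea, and it rests on assumptions: a flat hourly rate (`rate_per_hour` is a made-up parameter, not a real tariff), the status string `"occupied"`, and the rule that each interval between two samples is attributed to the status of the earlier sample.

```python
from datetime import datetime, timedelta


def occupied_time(samples, max_gap=timedelta(minutes=5)):
    """Sum the time one station spent occupied.

    samples: time-ordered (timestamp, status) pairs for a single station.
    Each interval between consecutive samples is attributed to the status
    of the earlier sample; gaps longer than max_gap (for example, scraper
    downtime) are ignored.
    """
    total = timedelta()
    for (t0, status), (t1, _) in zip(samples, samples[1:]):
        gap = t1 - t0
        if status == "occupied" and gap <= max_gap:
            total += gap
    return total


def estimated_revenue(samples, rate_per_hour):
    """Occupied time multiplied by a flat hourly rate (an assumption)."""
    return occupied_time(samples).total_seconds() * rate_per_hour / 3600


start = datetime(2024, 1, 1, 12, 0)
statuses = ["available", "occupied", "occupied", "occupied", "available"]
samples = [(start + timedelta(minutes=i), s) for i, s in enumerate(statuses)]

print(occupied_time(samples))            # 0:03:00
print(estimated_revenue(samples, 20.0))  # 1.0
```

Occupancy time is only a proxy for delivered energy, so an estimate of this kind is an approximation rather than an exact revenue figure.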
- Clone the repository:
git clone https://github.com/Srmrlt/ChargeStationParser.git
cd ChargeStationParser
- Environment Setup:
Copy the example environment file to create your own environment variables file:
cp .env.example .env
Edit the .env file to suit your environment settings, such as database credentials.
- Build and Run with Docker Compose:
Use Docker Compose to build and run the services defined in the docker-compose.yml file:
docker compose up --build -d
This command will start all the required services in detached mode.
- Stop the Services:
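To stop all services, use the following Docker Compose command. The -v flag also removes the volumes declared in the Compose file, which may include the stored database data; omit it to keep that data: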
docker compose down -v