- Create and activate a virtual environment:
# Create virtual environment
python -m venv venv
# Activate virtual environment
# On macOS/Linux:
source venv/bin/activate
# On Windows:
# .\venv\Scripts\activate
- Install required packages:
pip install -r requirements.txt
- Create a directory named dataset and put the ad_events.csv, campaigns.csv and users.csv files in it.
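Before starting the containers, it can help to confirm that the three CSV files are in place. The check below is a minimal sketch and not part of the repository: `missing_dataset_files` and `REQUIRED_FILES` are hypothetical names, and the `dataset` path is assumed to be relative to the directory you run it from.

```python
from pathlib import Path

# File names come from the setup step above; the helper itself is hypothetical.
REQUIRED_FILES = ("ad_events.csv", "campaigns.csv", "users.csv")


def missing_dataset_files(dataset_dir="dataset"):
    """Return the required CSV names that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list from `missing_dataset_files()` means all three files were found.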
docker-compose up -d
What the docker-compose.yml file does:
- Starts a MySQL container named set_mysql.
- Creates a DB called set_db.
- Runs all .sql files in ./ddl_scripts as initialization scripts in alphabetical order.
- Persists MySQL data in a volume so it's not lost when the container is stopped or deleted.
- Creates a volume for the CSV files that are loaded into the DB.
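Since the initialization scripts run in alphabetical order, a script that depends on tables from another must sort after it. The order can be previewed with a short sketch. It assumes file names are compared as plain strings, which may differ from the container's ordering for names that mix upper and lower case; `init_script_order` is a hypothetical helper, not something the repository provides.

```python
from pathlib import Path


def init_script_order(script_dir="ddl_scripts", pattern="*.sql"):
    """List init scripts sorted by file name, approximating the run order."""
    return sorted(path.name for path in Path(script_dir).glob(pattern))
```

Zero-padded numeric prefixes such as 01_ and 02_ keep this ordering predictable; without padding, a name starting with 10_ sorts before one starting with 2_.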
Stop and remove all containers, networks, and volumes defined in docker-compose.yml:
docker-compose down -v
# Using default connection parameters
python run.py
# Or specify custom connection parameters
python run.py --host localhost --database set_db --user set_user --password set_password
The script will:
- Create the dataset_normalized directory if it doesn't exist
- Process each dataset in sequence
- Save the normalized data to the output directory
- Transform the data so that it matches the SQL tables
- Save these files to the Docker volume
- Insert the data into the DB
- Remove unnecessary CSV files
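The command-line handling and the first step of that list could look roughly like the sketch below. It is an illustration rather than the actual run.py: the flag names come from the command above, but the default values, `parse_args` and `ensure_output_dir` are assumptions.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the connection flags; the defaults here are assumed, not confirmed."""
    parser = argparse.ArgumentParser(
        description="Normalize the CSV datasets and load them into MySQL"
    )
    parser.add_argument("--host", default="localhost")
    parser.add_argument("--database", default="set_db")
    parser.add_argument("--user", default="set_user")
    parser.add_argument("--password", default="set_password")
    return parser.parse_args(argv)


def ensure_output_dir(path="dataset_normalized"):
    """Create the output directory if it does not exist and return its path."""
    out = Path(path)
    out.mkdir(parents=True, exist_ok=True)
    return out
```

Calling `parse_args([])` yields the defaults, and passing `exist_ok=True` to `mkdir` makes `ensure_output_dir` safe to call on every run.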
python py_scripts/mysql/performance_report.py
docker-compose up -d
What the docker-compose.yml file does:
- Starts MySQL and MongoDB containers.
- Creates a MongoDB database called set_db.
- Runs all .js files in ./mongo_init as initialization scripts in alphabetical order.
- Persists MongoDB data in a volume so it's not lost when the container is stopped or deleted.
I used CSV files to insert data into MongoDB. To do this, run this script:
python py_scripts/mongo_db/insert_data_to_mongo.py
python py_scripts/mongo_db/performance_report_mongo.py
docker-compose up -d
What the docker-compose.yml file does:
- Starts MySQL, MongoDB and Cassandra containers.
- Creates a MongoDB database called set_db.
- Runs all .js files in ./mongo_init as initialization scripts in alphabetical order.
- Persists MongoDB data in a volume so it's not lost when the container is stopped or deleted.
I've added CQL queries to cql_scripts/create_schemas.cql to create the schemas.
You can run this Python script to create them:
python py_scripts/cassandra/create_schemas.py
Run this script to insert data from the CSV files into the Cassandra tables:
python py_scripts/cassandra/insert_data.py
Run this script to run the queries that answer the key business questions:
python py_scripts/cassandra/performance_report.py
docker-compose up -d
What the docker-compose.yml file does:
- Starts MySQL, MongoDB, Cassandra and Redis containers.
- Runs the API on http://0.0.0.0:8000/
I've added the app/main.py file with all the code needed for the API. It works with
the MySQL DB and a Redis cache.
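One common way to combine a database with a cache is the cache-aside pattern: look in the cache first and query the database only on a miss. Whether app/main.py does exactly this is an assumption; the sketch below only illustrates the idea. `DictCache` is an in-memory stand-in for a Redis client, and `get_with_cache` and `load_from_db` are hypothetical names.

```python
import json


class DictCache:
    """In-memory stand-in for a Redis client (string values, no expiry)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def get_with_cache(cache, key, load_from_db):
    """Cache-aside read: return (value, was_cache_hit)."""
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached), True  # hit: the database is not queried
    value = load_from_db()  # miss: fall back to the database
    cache.set(key, json.dumps(value))  # store for subsequent requests
    return value, False
```

With a real Redis client the cached entries would usually get an expiry time so that stale results eventually drop out; this stand-in omits that.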
Run this script to test the API and collect performance statistics:
python py_scripts/api/test_api.py
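For a rough idea of how such performance statistics can be gathered, the sketch below times repeated calls and summarizes the latencies. It is not the contents of test_api.py: `fetch`, `measure_latency` and `summarize` are hypothetical helpers, and the 95th percentile uses the nearest-rank method.

```python
import statistics
import time
from urllib.request import urlopen


def fetch(url):
    """Issue a GET request and read the whole response body."""
    with urlopen(url, timeout=5) as response:
        response.read()


def measure_latency(call, repeats=20):
    """Invoke `call` repeatedly and return each duration in milliseconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


def summarize(durations_ms):
    """Reduce a non-empty list of durations to summary statistics."""
    ordered = sorted(durations_ms)
    # Nearest-rank 95th percentile: ceil(0.95 * n) - 1, in integer arithmetic.
    p95_index = (95 * len(ordered) + 99) // 100 - 1
    return {
        "count": len(ordered),
        "mean_ms": statistics.mean(ordered),
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        "max_ms": ordered[-1],
    }
```

With the containers running, `summarize(measure_latency(lambda: fetch("http://localhost:8000/")))` would time the root URL; that the API answers on that path is an assumption. Running the measurement twice should show the effect of the Redis cache on repeated requests.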