A curated list of experimental data pipelines used to build datasets across domains.
- Python 3.x+
- Jupyter Lab
- Web scraping
- Consuming RESTful APIs
- Pandas
- Polars
The project uses Pip to keep track of its dependencies. To install it, you can follow the instructions here.
Once Pip has been installed, you can run the following commands to set up the project on your local machine:
git clone git@github.com:nathanbaleeta/data-engineering-notebooks.git
cd data-engineering-notebooks
python3 -m venv venv
source venv/bin/activate
pip install --quiet pandas requests jupyterlab
pip freeze > requirements.txt
pip install -r requirements.txt
jupyter lab
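To confirm the environment is ready before opening a notebook, a quick check along these lines may help. This is a minimal sketch that assumes the packages named in the install command (pandas, requests, jupyterlab) are the ones the notebooks need. The helper name `missing_packages` is hypothetical and not part of the repository.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in the
    current environment (hypothetical helper, not part of the repo)."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed package list, taken from the pip install command in the setup steps.
    required = ["pandas", "requests", "jupyterlab"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Run it with the virtual environment activated. If anything is reported missing, re-running the pip install step inside the same environment should resolve it.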