An introduction on how to build effective data pipelines for machine learning projects.

acceleratescience/data-school-Autumn23

Introduction to Data Pipelines Autumn 2023

Welcome to the 2-day course on data pipelines.

Setup

Throughout the course, all content will be delivered using macOS and VS Code. If you're using Windows or Linux, or if you prefer another IDE, that's no problem.

To minimize time spent debugging setup and package issues, please download and install Python and VS Code using the instructions on their respective websites.

You'll also need to install the Python and Jupyter extensions in VS Code. It would also be a good idea to set up a GitHub account if you don't already have one.

Next, download and install Poetry using the instructions on its website.
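Once everything is installed, a quick check like the one below can confirm that the tools are visible from your terminal. This is a minimal sketch, not part of the course material: the command names `poetry` and `code` are assumptions (`code` is the VS Code command-line launcher, which is only available if it was added to your PATH), and `find_tools` is a hypothetical helper name.

```python
import shutil
import sys

# Command names are assumptions: "code" is the VS Code command-line launcher,
# which is only on PATH if it was added during or after installation.
TOOLS = ["poetry", "code"]


def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def main():
    # Report which Python interpreter is running this script.
    print(f"Python {sys.version.split()[0]} at {sys.executable}")
    for name, path in find_tools(TOOLS).items():
        print(f"{name}: {path if path else 'not found on PATH'}")


if __name__ == "__main__":
    main()
```

A tool reported as not found may still be installed; it may simply not be on your PATH yet, in which case opening a new terminal often helps.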

Next, go to the GitHub repo for the course and get a copy of everything. You can either clone the repo or download it as a zip file; it's up to you. To download, click the green "<> Code" drop-down and select "Download ZIP".

We'll be using virtual environments for each of the labs, but the requirements.txt file lists the packages we'll be using, in case you want to get familiar with them before Monday.
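If you'd like a quick list of the package names without the version pins, something like the following should work. It is a rough sketch that assumes requirements.txt uses the usual one-package-per-line format and that you run it from the repository folder; `package_names` is a hypothetical helper, and unusual entries such as direct URLs are not handled.

```python
import re
from pathlib import Path


def package_names(text):
    """Return the package names listed in requirements-style text."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The name is everything before the first version specifier,
        # extra, or environment marker.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


if __name__ == "__main__":
    path = Path("requirements.txt")  # assumes the current folder is the repo
    if path.exists():
        for name in package_names(path.read_text()):
            print(name)
    else:
        print("requirements.txt not found; run this from the repository folder")
```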
