983 changes: 983 additions & 0 deletions Final Task/A4_200698.ipynb

Large diffs are not rendered by default.

219 changes: 219 additions & 0 deletions Resources/README.md
@@ -0,0 +1,219 @@
Let’s start with the installation of Python’s IDE, Jupyter. Alternatively, you can use Google Colab. With Colab you will not need to install any library on your local machine: everything runs on Google’s servers, which makes it well suited to machine learning work. Look it up and explore it; if you feel comfortable with it, go with it.
Here’s a quick view of Google Colab - https://www.youtube.com/watch?v=oCngVVBSsmA


Installing Jupyter Notebook
Open the command prompt and install using the following command.

To get comfortable with the Jupyter notebook, you can just skim through this article https://www.dataquest.io/blog/jupyter-notebook-tutorial/.
Also, you can use this Jupyter cheatsheet (given below) to get a hold of it.
https://www.datacamp.com/community/blog/jupyter-notebook-cheat-sheet


Getting started with Python
You need not have any prior programming knowledge to start using Python. All we expect from you is seriousness while learning a new language. Honestly, working with Python will seem intuitive, but don't ever SKIP anything. Also, try to run every line of code in Jupyter/Colab to gain confidence in Python and get familiar with how the Jupyter/Colab notebook works.
Start learning Python with https://www.w3schools.com/python/default.asp
Complete every topic in the list in order, up to Python Modules.
Start learning libraries (next section)



Python Libraries
Python is considered the best language for Machine Learning and many other domains because of the extensive support it offers through its libraries. Think of them as a piece of code that someone else wrote to ease your work, and you just need to call that code in a single line and get your job done. Isn’t that fun?
There are thousands of libraries to explore in Python depending upon your work and domain, but primarily we will be dealing with the three most commonly used ones. In the later part of your journey, learning a library will become your everyday task xD.
So let's get started!
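
To make the "single line" idea concrete, here is a small sketch that computes an average twice: once by hand and once with Python's built-in statistics module. The list of marks is made up for the example.

```python
import statistics

marks = [72, 85, 90, 64]  # made-up data for illustration

# Without a library: write the logic yourself.
total = 0
for m in marks:
    total += m
manual_mean = total / len(marks)

# With a library: one call does the job.
library_mean = statistics.mean(marks)

print(manual_mean, library_mean)
```

Both values come out the same; the library version is simply less code to write and maintain.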



Numpy or Numerical Python (All about array manipulation)
Before you start learning and working with any software or library, it is a good habit to go through its official documentation first. The documentation is the most accurate source and is the best guide when you run into bugs. To learn Numpy, it is highly recommended to go through its docs.
Documentation: https://numpy.org/devdocs/user/quickstart.html (Must do)
Tutorial: https://www.w3schools.com/python/numpy/default.asp (A quick read)
Practice: https://www.machinelearningplus.com/python/101-numpy-exercises-python/ (Must do. Consider discussing these problems with your friends and in groups)
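
To give a feel for what "array manipulation" means before you open the docs, here is a minimal sketch of a few everyday NumPy operations. The numbers are arbitrary; only standard NumPy calls are used.

```python
import numpy as np

a = np.arange(6)            # six integers: 0, 1, 2, 3, 4, 5
m = a.reshape(2, 3)         # same data viewed as 2 rows, 3 columns
doubled = m * 2             # element-wise arithmetic, no loop needed
col_sums = m.sum(axis=0)    # sum down each column
evens = a[a % 2 == 0]       # boolean mask keeps only the even values

print(m.shape, col_sums, evens)
```

Try changing the shape or the mask condition in a notebook cell to see how the results change.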

Pandas (Python’s data analysis library)
Starter video to get the feel and motivation to learn pandas
https://www.youtube.com/watch?v=dcqPhpY7tWk (Quick watch)
Attached below is a notebook on Pandas which you must do completely to get comfortable with the library
https://drive.google.com/file/d/1E9BIQjJxVRiWTuPOe_AJsPRj3aQX3c0I/view?usp=sharing (Must do)
If you ever find difficulty or want to search for something specific in Pandas, as always, refer to their official documentation.
Documentation - https://pandas.pydata.org/docs/user_guide/index.html
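
As a quick taste of what the notebook covers, here is a minimal sketch of building a small table, filtering rows, and grouping. The names, branches, and scores are made up for illustration.

```python
import pandas as pd

# Made-up table for illustration.
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen", "Dia"],
    "branch": ["CSE", "EE", "CSE", "EE"],
    "score": [82, 67, 91, 74],
})

high = df[df["score"] > 70]                            # keep rows with score above 70
avg_by_branch = df.groupby("branch")["score"].mean()   # average score per branch

print(high)
print(avg_by_branch)
```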

Matplotlib (All about graphs)
It is the most commonly used plotting library in Python. It is easy to use and offers a wide range of functions for better visualization of your data. We will only go through the most basic plots for now. Here's an excellent tutorial to get you started:
https://matplotlib.org/2.0.2/users/pyplot_tutorial.html (MUST DO).

You can also watch these videos to get a good grasp https://youtube.com/playlist?list=PLeo1K3hjS3uu4Lr8_kro2AqaO6CFYgKOl.
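
For reference, a basic line plot can look like the sketch below. It uses the `fig, ax` style rather than the `plt.plot` shortcuts shown in the linked tutorial; both are part of Matplotlib, and the sine curve is just an arbitrary example.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)   # 100 evenly spaced points

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("A basic line plot")
ax.legend()
# In Jupyter/Colab the figure renders below the cell; in a script, call plt.show().
```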



Machine Learning begins now!
First of all, a little bit of motivation to get started with this domain. It won't be long, but it will get you going at your best pace!

Sundar Pichai’s short speech (2.5 minutes watch)
https://www.youtube.com/watch?v=5cFUZ03Sbhc
https://machinelearningmastery.com/why-get-into-machine-learning/ (5 min read)
Tadeo Corradi’s (researcher) TEDx video (10 minutes, can be watched at 1.5x)
https://www.youtube.com/watch?v=4qCzxo2wPCw

So yea, let's get started with the journey.
In the workshop, we defined the basic workflow of a machine learner: first we create something at random, then we iterate, iterate, and iterate again until we find the best-suited results. Out of those iterations come specific algorithms that ML enthusiasts commonly use. Let us go through them one by one.
At the end of this document, we will be providing resources to some relevant course materials and notes to start from scratch in a structured manner, but it is highly recommended to follow the steps as mentioned.

Linear regression (first read the article carefully, then watch the video at your own pace)
-https://towardsdatascience.com/linear-regression-detailed-view-ea73175f6e86
-https://www.youtube.com/watch?v=1-OGRohmH2s
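
The "start at random, then iterate" workflow can be sketched for linear regression as follows. This is a minimal illustration using plain gradient descent on the mean squared error, not necessarily the exact method used in the linked article; the data, learning rate, and number of steps are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data from y = 3x + 2 plus a little noise (made up for this sketch).
x = rng.uniform(-1, 1, size=100)
y = 3 * x + 2 + rng.normal(0, 0.1, size=100)

# Step 1: start from random parameters.
w, b = rng.normal(size=2)

# Step 2: iterate, nudging w and b down the gradient of the mean squared error.
lr = 0.1
for _ in range(1000):
    error = (w * x + b) - y
    w -= lr * 2 * np.mean(error * x)
    b -= lr * 2 * np.mean(error)

print(round(w, 2), round(b, 2))  # should land near 3 and 2
```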

Logistic regression (first read the article carefully, then watch the video at your own pace)
-https://towardsdatascience.com/introduction-to-logistic-regression-66248243c148
-https://www.youtube.com/watch?v=yIYKR4sgzI8
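
Here is a minimal sketch of logistic regression trained with gradient descent on the log loss. The two clusters of points, the learning rate, and the number of steps are assumptions made up for the example.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Two made-up 1-D clusters: class 0 around -2, class 1 around +2.
x = np.concatenate([rng.normal(-2, 1, 50), rng.normal(2, 1, 50)])
y = np.concatenate([np.zeros(50), np.ones(50)])

w, b = 0.0, 0.0
lr = 0.1
for _ in range(500):
    p = sigmoid(w * x + b)            # predicted probability of class 1
    w -= lr * np.mean((p - y) * x)    # gradient of the log loss w.r.t. w
    b -= lr * np.mean(p - y)          # gradient of the log loss w.r.t. b

pred = (sigmoid(w * x + b) >= 0.5).astype(int)
accuracy = np.mean(pred == y)
print(accuracy)
```

The only change from linear regression is the sigmoid wrapped around the linear part, which turns the output into a probability.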

Decision tree classifier
The attached article is the best visualization you can have for a DT.
http://www.r2d3.us/visual-intro-to-machine-learning-part-1/ (Must)

To get more comfortable with the mechanism, consider watching this at your pace.
https://www.youtube.com/watch?v=7VeUPuFGJHk
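
A decision tree is built by repeatedly picking the split that leaves the purest groups. The sketch below finds one such split on a single feature using Gini impurity; the elevation values and city labels are made up, loosely inspired by the r2d3 article, and a real tree would repeat this step recursively over many features.

```python
import numpy as np

def gini(labels):
    """Gini impurity: 0 means the group is pure (all one class)."""
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(x, y):
    """Try each midpoint between sorted feature values; keep the purest split."""
    values = np.unique(x)
    best_t, best_score = None, np.inf
    for t in (values[:-1] + values[1:]) / 2:
        left, right = y[x <= t], y[x > t]
        # Weighted impurity of the two child nodes.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Made-up data: elevation in metres and the city each home is in.
elevation = np.array([2, 5, 8, 12, 60, 73, 80, 95])
city = np.array(["NY", "NY", "NY", "NY", "SF", "SF", "SF", "SF"])

threshold, impurity = best_split(elevation, city)
print(threshold, impurity)
```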

Random forest classifier
https://williamkoehrsen.medium.com/random-forest-simple-explanation-377895a60d2d
https://www.youtube.com/watch?v=v6VJ2RO66Ag

KNN
-https://medium.com/swlh/k-nearest-neighbor-ca2593d7a3c4 (only up to ‘Brute force’)
-https://www.youtube.com/watch?v=HVXime0nQeI
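
The brute-force version of KNN fits in a few lines: measure the distance from the query to every training point, take the k closest, and let them vote. The points and labels below are made up, and `knn_predict` is a hypothetical helper name for this sketch.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, query, k=3):
    """Brute-force KNN: measure the distance to every training point."""
    distances = np.linalg.norm(X_train - query, axis=1)  # Euclidean distance
    nearest = np.argsort(distances)[:k]                  # indices of the k closest
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]                    # majority label

# Made-up 2-D points with two classes.
X_train = np.array([[1, 1], [1, 2], [2, 1], [6, 6], [7, 6], [6, 7]])
y_train = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X_train, y_train, np.array([2, 2])))
print(knn_predict(X_train, y_train, np.array([6, 5])))
```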

A visualization to Neural networks (deep learning)
https://www.youtube.com/watch?v=aircAruvnKk


[Advice] It's highly recommended to discuss all these algorithms with friends, in groups, and on the Discord server (if needed) to get the most out of the given material.


Listed below are some courses and materials to start your DS journey in a very structured way. All the prerequisites and basic understanding needed to work through the material below are already covered in this document. Make sure to follow everything step by step.



Lecture notes for Machine Learning (Stanford’s CS229) - For readers
https://sgfin.github.io/files/notes/CS229_Lecture_Notes.pdf

Lecture videos for Machine Learning (Stanford’s CS229) - For watchers
https://www.youtube.com/playlist?list=PLoROMvodv4rMiGQp3WXShtMGgzqpfVfbU

Book (for enthusiastic readers, this is a great book to get started)
https://drive.google.com/file/d/1hFPxorU1AMDXE_02HBwM0hAfcwsdOCeo/view?usp=sharing

Andrew Ng’s Deep Learning Specialisation (Coursera course)
It is a highly recommended course series for Deep Learning. It is a set of 5 paid courses, but you can get them for free by applying for Financial Aid; for help with this, you can contact any of the ICG secretaries.
You can get started with this course series if you are familiar with Python and have a basic understanding of ML algorithms (the ones covered in this document).

Course link: https://www.coursera.org/specializations/deep-learning

[Advice] Consider avoiding Stanford’s Machine Learning course by Andrew Ng (on Coursera) because it's outdated and uses Octave for programming (a pain).


That's it for now. We will update the Discord server with the next steps in your ML journey.
Feel free to ping us for any doubts and keep the server active!
All the best, keep learning 👍

As decided, we'll be covering some basic stuff before we move on to Deep Learning. Here are resources for the same:

Basic Python: https://www.w3schools.com/python/ (you might be familiar with most of it by now, but it might be handy to have something in case you forget anything)
Numpy: https://www.w3schools.com/python/numpy/default.asp (go only through the Basic part, i.e., skip Random & ufunc)
Pandas: https://www.w3schools.com/python/pandas/default.asp (go only through the Basic part, Cleaning Data won't be required as of now)
Matplotlib: https://www.w3schools.com/python/matplotlib_intro.asp

You don't need to memorize the syntax. You just need to know what utilities exist and where you can use them. You'll get used to the important syntax as we proceed.

Next, we move on to the basics of Machine Learning. The following article is a great introduction to it. It covers a lot of ground but does not go into much detail. So, if you don't get something, feel free to ask here or look it up on Google.
Machine Learning —Fundamentals: https://towardsdatascience.com/machine-learning-basics-part-1-a36d38c7916

We'll be starting with Deep Learning next week. Since our aim is to classify images, we won't go deep into how we deal with traditional data. But if you want, you can always dig into anything that interests you; it may help 🙂

We hope you have gone through this week's reading material.

Here is your first assignment: https://colab.research.google.com/drive/1lHTeY0ieI9TWcR88yhQV1EwsX7n5PSJq?usp=sharing

You are supposed to make a copy of this notebook and work on that copy. All other details are in the notebook. The details on how to submit assignments would be conveyed to you soon.

Deadline: 5th June 2022

In the second week, we will be learning about Deep Learning and TensorFlow.
These are a few short articles that will explain what Deep Learning is:
https://medium.com/free-code-camp/want-to-know-how-deep-learning-works-heres-a-quick-guide-for-everyone-1aedeca88076

https://towardsdatascience.com/an-introduction-to-deep-learning-af63448c122c

https://www.analyticsvidhya.com/blog/2021/05/beginners-guide-to-artificial-neural-network/

https://medium.com/the-theory-of-everything/understanding-activation-functions-in-neural-networks-9491262884e0

https://medium.com/towards-data-science/overfitting-vs-underfitting-a-conceptual-explanation-d94ee20ca7f9

Kindly give these a thorough read.


Furthermore, to learn about TensorFlow, kindly refer to the following links:

https://www.tensorflow.org/tutorials/keras/regression
https://www.tensorflow.org/tutorials
(Beginner Quickstart and Keras Basics)

If you have finished Assignment 1, you can start with Assignment 2.
The deadline for Assignment 2 is Sunday, 12th June.
https://colab.research.google.com/drive/1ptX8WQ4CFpsuB3FySAo-Ofzinil-gfw_?usp=sharing

We hope you are working on Assignment 2. Here are the resources for the next week. You could try to read them before tomorrow's meeting.
We now dive into the heart of the project: CNNs.

Simple Introduction to Convolutional Neural Networks: https://towardsdatascience.com/simple-introduction-to-convolutional-neural-networks-cdf8d3077bac
This article covers all the basic concepts related to CNNs. Go through it. As suggested earlier, google the concepts you don't get (the article does not go into much detail). You will find plenty of resources. And it goes without saying, if you have any specific doubts, you can ask us here.
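
To see what a convolution layer actually computes, here is a minimal NumPy sketch that slides a 3x3 kernel over a tiny image. The image and the edge-detecting kernel are made up for illustration; real CNN layers learn their kernels and use much faster implementations.

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2-D convolution as used in CNNs (technically cross-correlation)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the kernel with the patch under it and sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 5x5 image: dark (0) on the left, bright (1) on the right.
image = np.array([[0, 0, 0, 1, 1]] * 5, dtype=float)

# A simple vertical-edge detector.
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

feature_map = conv2d(image, kernel)
print(feature_map)
```

The feature map is zero where the image is flat and non-zero where the dark-to-bright edge falls under the kernel.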

CNN in TensorFlow (intro): https://www.tensorflow.org/tutorials/images/cnn & https://www.tensorflow.org/tutorials/images/classification
This will help you in your next assignment. You can skim through it, and just know what utilities TensorFlow provides. You will get a better idea when you implement it in the assignment.
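
As a rough idea of what those utilities look like, here is a minimal Keras sketch of a small CNN. The input size (32x32 RGB), the number of classes (10), and the layer sizes are assumptions for illustration, not the architecture expected in the assignment; `build_cnn` is a hypothetical helper name.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_cnn(input_shape=(32, 32, 3), num_classes=10):
    """Stack of convolution + pooling blocks followed by dense layers."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes),  # raw scores (logits), one per class
    ])

model = build_cnn()
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
model.summary()
```

Training would then be a call to `model.fit` on your images and integer labels.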

A few of you have reached out to me asking for other resources for TensorFlow.
https://youtu.be/tPYj3fFJGjk
You can follow this YouTube video. Refer to Modules 3 and 4. You can refer to Modules 1 and 2 as well, but those relate to the Week 1 topics.

Refer to this link for help and try to understand the code:
https://www.tensorflow.org/tutorials/keras/regression

If you have any doubts, you can contact us.

https://keras.io/api/optimizers/
Refer to this for Gradient Descent in TensorFlow

Here is your next assignment: https://colab.research.google.com/drive/1asGptVW1kSOD-sm44G3PJr-Ad9lPwQFR?usp=sharing

Before you start it, it is preferable to look at some of the popular CNN architectures to get an idea of what has worked for people over the years. Then, you are to design your own model.
Here is a video that will help: https://youtu.be/dZVkygnKh1M

This week you'll be covering Transfer Learning and Data Augmentation. This will be crucial for our final task of classification. Here are the resources you need to go through:

Introduction to Data Augmentation: https://youtu.be/JI8saFjK84o
Data Augmentation: https://www.tensorflow.org/tutorials/images/data_augmentation
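
The idea behind data augmentation is to create slightly altered copies of each training image. The TensorFlow tutorial does this with layers such as `tf.keras.layers.RandomFlip` and `tf.keras.layers.RandomRotation`; the sketch below is a plain NumPy stand-in for the same idea, with a random array in place of a real photo and `augment` as a hypothetical helper name.

```python
import numpy as np

def augment(image, rng):
    """Return a randomly flipped and rotated copy of an H x W x C image."""
    if rng.random() < 0.5:
        image = np.fliplr(image)          # horizontal flip
    k = rng.integers(0, 4)                # 0, 1, 2 or 3 quarter turns
    return np.rot90(image, k, axes=(0, 1))

rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))           # stand-in for a real photo

# Eight augmented variants of the same image.
batch = np.stack([augment(image, rng) for _ in range(8)])
print(batch.shape)
```

Every variant contains the same pixels in a different arrangement, so the label stays the same while the model sees more variety.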

Introduction to Transfer Learning: https://youtu.be/FQM13HkEfBk
Transfer Learning and Fine Tuning: https://www.tensorflow.org/tutorials/images/transfer_learning
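
In outline, transfer learning means reusing a pretrained network as a frozen feature extractor and training only a small new head on top. Here is a minimal Keras sketch of that idea. The choice of MobileNetV2, the 160x160 input size, and `num_classes=2` are assumptions for illustration; `build_transfer_model` is a hypothetical helper name.

```python
import tensorflow as tf

def build_transfer_model(num_classes=2, weights="imagenet"):
    """Frozen pretrained base plus a new trainable classification head."""
    # Pretrained feature extractor without its original classification head.
    base = tf.keras.applications.MobileNetV2(
        input_shape=(160, 160, 3), include_top=False, weights=weights
    )
    base.trainable = False  # freeze the pretrained layers

    inputs = tf.keras.Input(shape=(160, 160, 3))
    # MobileNetV2 expects pixel values scaled from [0, 255] to [-1, 1].
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1)(inputs)
    x = base(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(num_classes)(x)  # new trainable head
    return tf.keras.Model(inputs, outputs)

# model = build_transfer_model()  # downloads the ImageNet weights on first use
```

Fine tuning, covered in the same tutorial, goes one step further by unfreezing some of the top layers of the base and training them with a low learning rate.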

Great work on the last assignment!
Now we move on to our final task: Classifying the dataset we decided upon. You'll be using Transfer Learning to do the same.

Here is the link to the dataset: https://drive.google.com/drive/folders/11TgD1-ouxCP6bd4HWnVvezo5H6c1xS0a?usp=sharing
You are to make a copy of the folder Mask_Dataset (the above link).

Then, you are to use a copy of this notebook to write code: https://colab.research.google.com/drive/1-StHlJj6XIfC0fptgdCWJ_OT_11cDP2G?usp=sharing
The rest of the instructions are given in the notebook. Run the pre-written code as it is. If you have any doubts regarding it, feel free to post them here.

Deadline: 13th July

We know Y21s have their endsems coming up, therefore we have kept it after your exams. It shouldn't take you that much time.
We are thinking about keeping an optional assignment as well (Y20s could probably do it, more about that later).

https://www.tensorflow.org/tutorials/load_data/images
Refer to the above site on how to load and split this data
Binary file added Resources/week1.pdf
Binary file not shown.
Binary file added Resources/week3.pdf
Binary file not shown.