Commit 248ebb9

course materials
1 parent 7b3fd44 commit 248ebb9

File tree

152 files changed

+81450
-0
lines changed

colab_utils.py

Lines changed: 479 additions & 0 deletions
Large diffs are not rendered by default.

data/100KUsedCar/car_prices.zip

1.1 MB
Binary file not shown.

data/100KUsedCar/readme.txt

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+Original dataset available at https://www.kaggle.com/datasets/adityadesai13/used-car-dataset-ford-and-mercedes
+
+# About Dataset
+If you download/use the data set I'd appreciate an up vote, cheers.
+
+# Updated
+Scraped data of used car listings. 100,000 listings, which have been separated into files corresponding to each car manufacturer. I collected the data to make a tool to predict how much my friend should sell his old car for compared to other stuff on the market, and then just extended the data set. Then made a more general car value regression model.
+
+# previous version
+Picked two fairly common cars on the British market for analysis (Ford Focus and Mercedes C Class). The hope is to find info such as: when is the ideal time to sell certain cars (i.e. at what age and mileage are there significant drops in resale value). Also can make comparisons between the two, and make a classifier for a Ford or Mercedes car. Can easily add more makes and models, so comment for any request e.g. if you want a big data set of all Mercedes makes and models.
+
+# Content
+The cleaned data set contains information on price, transmission, mileage, fuel type, road tax, miles per gallon (mpg), and engine size. I've removed duplicate listings and cleaned the columns, but have included a notebook showing the process and the original data for anyone who wants to check/improve my work.
+
+# Inspiration
+It'd be cool to have some insights and visualisations of the data. Also, am open to ideas on how to expand the data set.

data/AGNews/agnews.zip

11.3 MB
Binary file not shown.

data/AGNews/readme.txt

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+Original dataset available at https://github.com/mhjabreel/CharCnn_Keras/tree/master/data/ag_news_csv
+
+AG's News Topic Classification Dataset
+
+Version 3, Updated 09/09/2015
+
+
+ORIGIN
+
+AG is a collection of more than 1 million news articles. News articles have been gathered from more than 2000 news sources by ComeToMyHead in more than 1 year of activity. ComeToMyHead is an academic news search engine which has been running since July, 2004. The dataset is provided by the academic community for research purposes in data mining (clustering, classification, etc), information retrieval (ranking, search, etc), xml, data compression, data streaming, and any other non-commercial activity. For more information, please refer to the link http://www.di.unipi.it/~gulli/AG_corpus_of_news_articles.html .
+
+The AG's news topic classification dataset is constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the dataset above. It is used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).
+
+
+DESCRIPTION
+
+The AG's news topic classification dataset is constructed by choosing the 4 largest classes from the original corpus. Each class contains 30,000 training samples and 1,900 testing samples. The total number of training samples is 120,000 and testing 7,600.
+
+The file classes.txt contains a list of classes corresponding to each label.
+
+The files train.csv and test.csv contain all the training samples as comma-separated values. There are 3 columns in them, corresponding to class index (1 to 4), title and description. The title and description are escaped using double quotes ("), and any internal double quote is escaped by 2 double quotes (""). New lines are escaped by a backslash followed by an "n" character, that is "\n".
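
The CSV layout described in this readme (three columns, doubled double quotes, newlines stored as a backslash followed by "n") can be read with Python's standard `csv` module. The following is only a minimal sketch: the function name `read_ag_news` and the two sample rows are hypothetical, made up for illustration, and are not taken from the dataset or from this commit.

```python
import csv
import io

# Hypothetical two-row sample written in the layout the readme describes.
# "\\n" in this source is a literal backslash followed by "n", not a newline.
SAMPLE = (
    '"3","Title with ""quotes""","First line\\nsecond line"\n'
    '"1","World headline","Plain description"\n'
)


def read_ag_news(fileobj):
    """Yield (class_index, title, description) tuples from an AG News style CSV stream."""
    # doublequote=True makes the reader turn "" inside a quoted field into ".
    reader = csv.reader(fileobj, quotechar='"', doublequote=True)
    for class_index, title, description in reader:
        # Undo the readme's newline escaping: backslash + "n" becomes a real newline.
        yield (
            int(class_index),
            title.replace("\\n", "\n"),
            description.replace("\\n", "\n"),
        )


rows = list(read_ag_news(io.StringIO(SAMPLE)))
for row in rows:
    print(row)
```

To read an extracted file instead of the in-memory sample, one would pass a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` points at train.csv or test.csv. The internal layout of agnews.zip is not shown in this diff, so the exact path is left open here.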

data/AutoMPG/auto-mpg.data

Lines changed: 398 additions & 0 deletions
Large diffs are not rendered by default.

data/AutoMPG/readme.txt

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+Original dataset available at https://archive.ics.uci.edu/dataset/9/auto+mpg
+
+# Information
+
+## Additional Information
+This dataset is a slightly modified version of the dataset provided in the StatLib library. In line with the use by Ross Quinlan (1993) in predicting the attribute "mpg", 8 of the original instances were removed because they had unknown values for the "mpg" attribute. The original dataset is available in the file "auto-mpg.data-original".
+
+"The data concerns city-cycle fuel consumption in miles per gallon, to be predicted in terms of 3 multivalued discrete and 5 continuous attributes." (Quinlan, 1993)

data/ConcreteCrack/readme.txt

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
+Original dataset available at https://data.mendeley.com/datasets/5y9wdsg2zt/2
+
+# Concrete Crack Images for Classification
+
+Published: 23 July 2019 | Version 2 | DOI: 10.17632/5y9wdsg2zt.2
+Contributor: Çağlar Fırat Özgenel
+
+# Description
+The dataset contains concrete images having cracks. The data is collected from various METU Campus Buildings.
+The dataset is divided into two classes, negative and positive crack images, for image classification.
+Each class has 20000 images, for a total of 40000 images of 227 x 227 pixels with RGB channels.
+The dataset is generated from 458 high-resolution images (4032x3024 pixels) with the method proposed by Zhang et al. (2016).
+High-resolution images have variance in terms of surface finish and illumination conditions.
+No data augmentation in terms of random rotation or flipping is applied.
+
+If you use this dataset please cite:
+2018 – Özgenel, Ç.F., Gönenç Sorguç, A. “Performance Comparison of Pretrained Convolutional Neural Networks on Crack Detection in Buildings”, ISARC 2018, Berlin.
+
+Zhang, L., Yang, F., Zhang, Y. D., & Zhu, Y. J. (2016). Road Crack Detection Using Deep Convolutional Neural Network. In 2016 IEEE International Conference on Image Processing (ICIP). http://doi.org/10.1109/ICIP.2016.7533052

data/Fruits360/FOMO.tar.gz

12.1 MB
Binary file not shown.

data/Fruits360/images.zip

226 KB
Binary file not shown.
