Decision-Tree

This Jupyter notebook contains my implementation of a decision tree and a random forest.

I calculate the information gain of each candidate split using an entropy function, and I choose the best split as the one with the highest information gain.
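As a minimal sketch of that calculation (not the notebook's actual code), the entropy and information gain of a split could be computed as follows. The function names `entropy` and `information_gain` are hypothetical, and labels are assumed to be given as plain Python lists.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0  # convention: an empty set carries no uncertainty
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    n = len(parent)
    children = (len(left) * entropy(left) + len(right) * entropy(right)) / n
    return entropy(parent) - children


# A balanced binary set split into two pure halves
labels = [0, 0, 1, 1]
gain = information_gain(labels, labels[:2], labels[2:])
print(gain)
```

A split that leaves both children as mixed as the parent has a gain of zero, which is why the gain can be used to rank candidate splits.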

The key part of growing the tree is the following pseudocode:

Let S <- {1, 2, ..., n} be the set of sample point indices

GrowTree(S)
  if (y_i = C for all i in S, for some class C) then {
    return new leaf(C)    // the node is pure, so predict class C
  } else {
    choose best splitting feature j and splitting value k
    S_left = {i in S : X_ij < k}
    S_right = {i in S : X_ij >= k}
    return new node(j, k, GrowTree(S_left), GrowTree(S_right))
  }
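The pseudocode above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the notebook's implementation: the names `best_split`, `grow_tree`, and `predict` are hypothetical, trees are represented as plain tuples, `X` is a list of feature rows, and `S` is assumed to be non-empty. The sketch also adds one case the pseudocode does not cover: when no split has positive information gain, it returns a majority-class leaf so the recursion always terminates.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values()) if n else 0.0


def best_split(X, y, S):
    """Return the (j, k) with the highest positive information gain, or None."""
    parent = entropy([y[i] for i in S])
    best, best_gain = None, 0.0
    for j in range(len(X[0])):
        for k in sorted({X[i][j] for i in S}):
            left = [y[i] for i in S if X[i][j] < k]
            right = [y[i] for i in S if X[i][j] >= k]
            if not left or not right:
                continue  # a split must send samples to both sides
            children = (len(left) * entropy(left) + len(right) * entropy(right)) / len(S)
            gain = parent - children
            if gain > best_gain:
                best, best_gain = (j, k), gain
    return best


def grow_tree(X, y, S):
    """Recursively grow a tree over the sample indices in S (assumed non-empty)."""
    labels = [y[i] for i in S]
    if len(set(labels)) == 1:
        return ("leaf", labels[0])  # pure node: predict that class
    split = best_split(X, y, S)
    if split is None:
        # Assumption beyond the pseudocode: fall back to the majority class
        return ("leaf", Counter(labels).most_common(1)[0][0])
    j, k = split
    S_left = [i for i in S if X[i][j] < k]
    S_right = [i for i in S if X[i][j] >= k]
    return ("node", j, k, grow_tree(X, y, S_left), grow_tree(X, y, S_right))


def predict(tree, x):
    """Follow the splits from the root down to a leaf and return its class."""
    while tree[0] == "node":
        _, j, k, left, right = tree
        tree = left if x[j] < k else right
    return tree[1]


X = [[1.0], [2.0], [3.0], [4.0]]
y = [0, 0, 1, 1]
tree = grow_tree(X, y, list(range(len(y))))
print(tree)
```

Trying every observed feature value as a threshold keeps the sketch short, but it is quadratic in the number of samples per feature; a real implementation would typically sort each feature once and update the class counts incrementally.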

# Validation

I used the Titanic data to test my implementation of the decision tree and the random forest. For the decision tree, the training accuracy is 0.84 and the validation accuracy is 0.8. For the random forest, the training accuracy is 0.78 and the validation accuracy is 0.83.
