A decision tree classifier written from scratch in Python, based on the CART (Classification and Regression Tree) machine learning algorithm.
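CART grows the tree by repeatedly splitting the data on the attribute and value that best separate the classes; the classic CART criterion for scoring candidate splits is Gini impurity. A minimal sketch of that criterion, illustrative rather than this repository's exact code:

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class proportions."""
    total = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())

def split_gain(parent, left, right):
    """Reduction in weighted Gini impurity from splitting parent into left/right."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) \
             + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted
```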
The `driver.py` file shows how to use the model by applying it to an example dataset of owl measurements. Visualising the model trained on this dataset produces the following output:
```
Is attribute 3 >= 3.0:
└────Y: Is attribute 4 >= 1.8:
     └────Y: Is attribute 3 >= 4.9:
          └────Y: <SnowyOwl:100.0>
          └────N: Is attribute 1 >= 3.2:
               └────Y: <BarnOwl:100.0>
               └────N: <SnowyOwl:100.0>
     └────N: Is attribute 3 >= 5.0:
          └────Y: Is attribute 4 >= 1.6:
               └────Y: Is attribute 2 >= 7.2:
                    └────Y: <SnowyOwl:100.0>
                    └────N: <BarnOwl:100.0>
               └────N: <SnowyOwl:100.0>
          └────N: Is attribute 2 >= 5.0:
               └────Y: <BarnOwl:100.0>
               └────N: <SnowyOwl:100.0>
└────N: <LongEaredOwl:100.0>
```
Each question (node) has two answers (children) leading from it. Each child is either a further question or a leaf, which lists the `<class:certainty>` pairs.
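One natural way to represent such a tree is a small node class in which an internal node holds its question and two children, while a leaf holds the class distribution. The sketch below uses illustrative names, not necessarily this implementation's own, together with a recursive printer that reproduces the format above:

```python
class Node:
    """Either a question with two children, or a leaf with predictions."""

    def __init__(self, attribute=None, threshold=None,
                 yes_branch=None, no_branch=None, predictions=None):
        self.attribute = attribute      # index of the attribute being tested
        self.threshold = threshold      # value the question compares against
        self.yes_branch = yes_branch    # child followed when the test passes
        self.no_branch = no_branch      # child followed otherwise
        self.predictions = predictions  # leaf only: {class: certainty}

    def is_leaf(self):
        return self.predictions is not None

def visualise(node, indent=""):
    """Recursively print the tree in the format shown above."""
    if node.is_leaf():
        print(", ".join(f"<{cls}:{certainty}>"
                        for cls, certainty in node.predictions.items()))
        return
    print(f"Is attribute {node.attribute} >= {node.threshold}:")
    print(f"{indent}└────Y: ", end="")
    visualise(node.yes_branch, indent + "     ")
    print(f"{indent}└────N: ", end="")
    visualise(node.no_branch, indent + "     ")
```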
Testing the model with 10-fold cross-validation produces scores like:
```
Round 1, Accuracy is 0.9556
Round 2, Accuracy is 0.9778
Round 3, Accuracy is 0.9111
Round 4, Accuracy is 0.9333
Round 5, Accuracy is 0.9333
Round 6, Accuracy is 0.9333
Round 7, Accuracy is 0.9778
Round 8, Accuracy is 0.9333
Round 9, Accuracy is 0.9778
Round 10, Accuracy is 0.9556
CART classifier has an accuracy of 94.89%, +/- 4.71%
```
The scores vary slightly each time cross-validation is run, both because instances are randomly allocated to folds (subsets) and because the algorithm's classification is probability-based.
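A sketch of how such k-fold cross-validation can be wired up, assuming a classifier object with `train` and `classify` methods (the method names here are assumptions, not necessarily this repository's API):

```python
import random
import statistics

def cross_validate(make_model, instances, k=10):
    """Shuffle instances into k folds; train on k-1 folds and test on the rest."""
    rows = list(instances)
    random.shuffle(rows)                       # random fold allocation
    folds = [rows[i::k] for i in range(k)]
    accuracies = []
    for i, test_fold in enumerate(folds):
        training = [row for j, fold in enumerate(folds) if j != i
                    for row in fold]
        model = make_model()
        model.train(training)                  # assumed training method
        correct = sum(model.classify(row[:-1]) == row[-1] for row in test_fold)
        accuracies.append(correct / len(test_fold))
        print(f"Round {i + 1}, Accuracy is {accuracies[-1]:.4f}")
    mean, spread = statistics.mean(accuracies), statistics.stdev(accuracies)
    print(f"CART classifier has an accuracy of {mean:.2%}, +/- {spread:.2%}")
```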
The implementation works with both numerical and text data. It expects training data in tabular form: each row is an instance, each column is an attribute, and the final column is the class.
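A split test of the kind shown in the tree above typically compares with `>=` for numeric attributes and with equality for text attributes. A minimal sketch of how one row could be routed at a question node (illustrative code with 0-based attribute indices, not the repository's own):

```python
def matches(row, attribute, value):
    """Decide which branch a row follows at a question node."""
    observed = row[attribute]
    if isinstance(observed, (int, float)):
        return observed >= value    # numeric attribute: threshold test
    return observed == value        # text attribute: equality test

# One instance: attribute columns first, class label in the final column.
row = [4.9, 3.0, 1.4, 0.2, "LongEaredOwl"]
features, label = row[:-1], row[-1]
print(matches(features, 2, 3.0))  # 1.4 >= 3.0 is False -> follow the "N" branch
```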