Decision Tree based Classifiers for Large Datasets

Anilu Franco-Arcega, Jesús Ariel Carrasco-Ochoa, Guillermo Sánchez-Díaz, José Francisco Martínez-Trinidad


This paper presents four algorithms for building decision trees from large datasets. They overcome some restrictions of the most recent state-of-the-art algorithms. Three of them are designed for datasets described exclusively by numeric attributes, and the fourth handles mixed datasets. The proposed algorithms process all the training instances without storing the whole dataset in main memory. In addition, they are faster than the most recent algorithms for building decision trees from large datasets and reach competitive accuracy rates.


Decision trees, supervised classification, large datasets.
