Supervised Learning Applied to Real Estate Price Classification in Bogotá, Colombia
Abstract
Machine learning models have gained relevance in many fields, and real estate price prediction is a key application for both property sellers and potential buyers. This article implements a machine learning workflow on a Bogotá real estate dataset, evaluating three classic models (kNN, Decision Trees, and Logistic Regression) and two ensemble methods (Random Forest and AdaBoost). The CRISP-DM data mining methodology was adapted into four phases: business and data understanding; data preparation; modeling; and evaluation and deployment. In experiments conducted with the Orange tool, the two ensemble models achieved the best performance, with AdaBoost obtaining the highest accuracy, precision, and recall, reaching 0.720. Property type and number of rooms were identified as the most relevant attributes. This study serves as a reference for the real estate sector, providing a decision-making tool based on current machine learning techniques.
Keywords
Real estate market, Orange visual programming, price estimation, ensemble methods, machine learning