Machine Learning–Driven Carbon Monoxide Prediction: A Case Study Using the UCI Air Quality Dataset
DOI:
https://doi.org/10.00000/12ses145
Keywords:
Air Quality Prediction, Carbon Monoxide (CO) Concentration, Machine Learning Regression, Gradient Boosted Trees (GBT), KNIME Analytics Platform
Abstract
Accurate prediction of air pollution is essential for mitigating its adverse effects on human health, particularly with respect to carbon monoxide (CO) exposure. This paper presents a machine learning–based approach for forecasting CO concentration using the UCI Air Quality dataset, which consists of hourly sensor measurements collected from an urban area in Italy. Multiple regression models, including Linear Regression, Decision Trees, Random Forest, and Gradient Boosted Trees (GBT), were implemented and systematically evaluated. To capture diurnal variation in pollution levels, a temporal feature (Hour) was extracted from the timestamp data and incorporated into the models. All preprocessing, feature engineering, and model development were conducted using the KNIME Analytics Platform. Experimental results show that GBT augmented with the Hour feature achieved the highest predictive accuracy, with an R² score of 0.921, while Random Forest performed poorly on this dataset. A comparative analysis with prior studies based on Delhi air quality data highlights the dataset-dependent nature of model performance. The findings underscore the importance of rigorous data preprocessing and temporal feature engineering in improving air pollution prediction accuracy.
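The Hour feature and the R² metric mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' KNIME workflow. It assumes that timestamps arrive as separate `dd/mm/yyyy` date and `hh.mm.ss` time strings and that missing readings are flagged with the sentinel value -200, which is how the raw UCI file is commonly described. The helper names `extract_hour` and `r2_score` and the sample rows are hypothetical and introduced only for illustration.

```python
from datetime import datetime

# Assumption: the raw file marks missing sensor readings with -200.
MISSING = -200.0


def extract_hour(date_str, time_str):
    """Return the hour of day (0-23) from 'dd/mm/yyyy' and 'hh.mm.ss' strings."""
    stamp = datetime.strptime(f"{date_str} {time_str}", "%d/%m/%Y %H.%M.%S")
    return stamp.hour


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical rows (date, time, CO reading); not taken from the dataset.
rows = [
    ("10/03/2004", "18.00.00", 2.6),
    ("10/03/2004", "19.00.00", 2.0),
    ("10/03/2004", "20.00.00", -200.0),  # flagged as missing, dropped below
    ("11/03/2004", "08.00.00", 3.1),
]

# Drop missing readings, then derive the Hour feature for each remaining row.
clean = [(extract_hour(d, t), co) for d, t, co in rows if co != MISSING]
hours = [h for h, _ in clean]
targets = [co for _, co in clean]
```

In a full pipeline, `hours` would be added as an extra input column alongside the sensor readings before fitting a regressor, and `r2_score` would be applied to held-out predictions to obtain a figure comparable to the reported score.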
References
Sinha, P., & Singh, A. (2022). Air Quality Index Forecasting Using Machine Learning Algorithms: Case Study of Indian Cities. International Journal for Research Trends and Innovation (IJRTI), 7(7), 2278–2290. Retrieved from https://www.ijrti.org/papers/IJRTI2207152.pdf
Air Quality Prediction and Analysis Using Machine Learning Models (Delhi Case Study), 2021/2022.
UCI Machine Learning Repository. (n.d.). Air Quality Data Set. Retrieved from https://archive.ics.uci.edu/ml/datasets/Air+Quality
KNIME. (n.d.). KNIME Analytics Platform. Retrieved from https://www.knime.com