Evaluation of Machine Learning in Predicting Air Quality Index

  • Abdullah Sani Abdul Rahman
  • Aizal Yusrina Idris
  • Suhaimi Abdul Rahman


Environmental pollution poses significant health risks, and Malaysia faces a critical air pollution problem driven by rapid urbanization and industrialization. The Air Quality Index (AQI) is a standard measure of air pollution, and machine learning methods have shown promise in predicting AQI levels accurately. However, research on applying intelligent approaches to predict AQI in Malaysia remains limited. This research investigates the impact of various AQI components, including Particulate Matter 2.5 (PM2.5), Nitrogen Dioxide (NO2), Carbon Monoxide (CO), and Ozone (O3), using data from 125 random locations across Malaysia, from the northern to the southern regions. Three machine learning algorithms are used: Generalized Linear Model, Decision Tree, and Support Vector Machine. The results show that PM2.5 has the most significant impact on AQI levels among all components analyzed, and that all three algorithms achieve high prediction accuracy, with R² above 90% and low prediction errors (MAE and RMSE both below 2). This research provides insight into predicting AQI levels with machine learning approaches and highlights the critical role of PM2.5 in determining AQI levels in Malaysia. The findings can help authorities obtain rapid and accurate information to manage air pollution in the country effectively.
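
As an illustration of the three evaluation metrics named in the abstract (MAE, RMSE, and R²), the following minimal Python sketch computes them from their standard definitions. The function names and the observed and predicted AQI values are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more than MAE."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: share of variance explained."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical AQI values, made up for this example only.
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 59.0, 69.0, 82.0]

print("MAE :", mae(observed, predicted))
print("RMSE:", rmse(observed, predicted))
print("R2  :", r2(observed, predicted))
```

MAE and RMSE are expressed in the same units as the AQI itself, so "below 2" means the predictions are typically within two index points of the observed value, while R² is unitless and is often reported as a percentage.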


How to Cite
ABDUL RAHMAN, Abdullah Sani; IDRIS, Aizal Yusrina; ABDUL RAHMAN, Suhaimi. Evaluation of Machine Learning in Predicting Air Quality Index. Mathematical Sciences and Informatics Journal, [S.l.], v. 4, n. 1, p. 1-10, May 2023. ISSN 2735-0703. Available at: <https://myjms.mohe.gov.my/index.php/mij/article/view/21889>. Date accessed: 10 June 2023. doi: https://doi.org/10.24191/mij.v4i1.21889.
