Web Scraping and Regression Analysis based on Machine Learning for COVID-19 with Rapid Software Platform

  • Aizal Yusrina Idris
  • Razan Bamoallem
  • Mohamad Harith Azfar Mohamad Hatta

Abstract

Since the onset of the global COVID-19 pandemic, experts from different domains, including scientists, clinicians, and healthcare professionals, have been exploring technologies to manage COVID-19 data. Up-to-date and accurate data collection is critical for making effective and efficient decisions on every aspect of the emergency and its consequences. Although many of these experts are not trained data scientists, the key skills needed to obtain recent COVID-19 data are web data extraction and analysis. While a large body of literature is available in academic databases, it is difficult to find a report that presents the fundamental methods for implementing web data analysis in a simple way with a rapid software platform. This paper demonstrates a simple framework in which data are extracted from the web (web scraping) and then analyzed in a rapid software platform. The Python scripting language is used as a simple tool for web scraping, while RapidMiner is the rapid software platform used for data visualization and analysis. Simple linear regression, a machine learning approach, has been implemented in RapidMiner to predict COVID-19 deaths from the collected data. This paper will be useful for academicians and industry practitioners who want to conduct more robust data analysis and to tackle more challenging problems, such as big data analytics, in any domain.
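The two steps named in the abstract, extracting tabular data from a web page and fitting a simple linear regression, can be sketched in a few lines of Python. This is only an illustrative sketch and not the authors' implementation: the paper performs the regression in RapidMiner, whereas plain Python is used here as a stand-in. The HTML snippet, the column names (day, cases, deaths), the choice of cases as the predictor, and the helper names scrape_table and fit_simple_linear_regression are all assumptions, and the numbers are synthetic placeholders rather than real COVID-19 figures.

```python
from html.parser import HTMLParser

# Hypothetical HTML standing in for a downloaded page. The values are
# synthetic placeholders, NOT real COVID-19 figures.
SAMPLE_HTML = """
<table>
  <tr><th>day</th><th>cases</th><th>deaths</th></tr>
  <tr><td>1</td><td>100</td><td>3</td></tr>
  <tr><td>2</td><td>200</td><td>5</td></tr>
  <tr><td>3</td><td>300</td><td>7</td></tr>
  <tr><td>4</td><td>400</td><td>9</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collects the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Return the table as a list of dicts keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, (float(v) for v in row))) for row in body]


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


records = scrape_table(SAMPLE_HTML)
cases = [r["cases"] for r in records]
deaths = [r["deaths"] for r in records]

# Assumption: deaths are modelled as a linear function of cases.
slope, intercept = fit_simple_linear_regression(cases, deaths)
predicted = intercept + slope * 500  # predicted deaths at 500 cases
print(f"deaths = {intercept:.3f} + {slope:.3f} * cases")
```

In practice the HTML string would come from a live page, and the parsed records could be written to a CSV file for import into RapidMiner. The standard-library parser is used here only to keep the sketch free of external dependencies.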

Published
2022-05-27
How to Cite
IDRIS, Aizal Yusrina; BAMOALLEM, Razan; MOHAMAD HATTA, Mohamad Harith Azfar. Web Scraping and Regression Analysis based on Machine Learning for COVID-19 with Rapid Software Platform. Mathematical Sciences and Informatics Journal, [S.l.], v. 3, n. 1, p. 75-85, may 2022. ISSN 2735-0703. Available at: <https://myjms.mohe.gov.my/index.php/mij/article/view/18278>. Date accessed: 26 mar. 2023. doi: https://doi.org/10.24191/mij.v3i1.18278.
Section
Articles
