Abstract
In recent years, a range of contaminants has significantly degraded water quality, which directly affects both ecosystems and human health. The water quality index (WQI) serves as an indicator for effective water management, and the ability to forecast and simulate water quality has become crucial in the fight against water pollution. The goal of this study is to build a reliable model that classifies the index value according to water quality requirements and predicts water quality using recent machine learning (ML) models. The data were gathered from several sampling points distributed across rivers in India, Iraq, and Malaysia. The water quality index is calculated from 32 variables that affect water quality, such as temperature, dissolved oxygen, pH, alkalinity, hardness, chloride, and coliform. The datasets are constructed from pre-processed data; pre-processing includes normalisation, outlier identification, and the resolution of class imbalance. Water quality index classes are assigned using machine learning methods such as XGBoost, Naive Bayes, SVM, and AdaBoost, whereas water quality is predicted using an RF regressor, M5 Model Tree, DT regressor, and EML regressor on samples from the Malaysian, Indian, and Iraqi rivers. XGBoost classifies the water quality index with 93% accuracy, 92% precision, and 97% recall. For water quality (WQ) prediction, the M5 Model Tree performs much better than the other regression models. The developed models give promising results for both the classification of water quality indexes and the prediction of water quality.
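The pre-processing steps named above (normalisation, outlier identification, and class balancing) can be illustrated with a minimal sketch. The abstract does not state which techniques were used, so min-max scaling, the 1.5 × IQR rule, and random oversampling are assumptions chosen for illustration, and all function names and sample values are hypothetical.

```python
import random
import statistics
from collections import Counter


def min_max_normalise(values):
    """Rescale a list of numbers to [0, 1] (assumed normalisation scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def iqr_outlier_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed outlier rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class has
    as many rows as the largest class (assumed balancing method)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical pH readings with one implausible value.
ph = [7.0, 7.2, 6.9, 7.1, 7.3, 14.0]
flags = iqr_outlier_flags(ph)
clean_ph = [v for v, bad in zip(ph, flags) if not bad]
scaled_ph = min_max_normalise(clean_ph)

# Hypothetical, imbalanced class labels for four samples.
rows = [[1.0], [2.0], [3.0], [4.0]]
labels = ["good", "good", "good", "poor"]
balanced_rows, balanced_labels = random_oversample(rows, labels)
```

In this sketch, outliers are removed before scaling so that a single extreme reading does not compress the remaining values toward zero; the order of steps used in the study itself is not given in the abstract.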
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.