Journal of Environmental Treatment Techniques  
2020, Volume 8, Issue 3, Pages: 1093-1100  
J. Environ. Treat. Tech.  
ISSN: 2309-1185  
Journal web link: http://www.jett.dormaj.com  
Artificial Intelligence Approach to Predicting River  
Water Quality: A Review  
Ariani Dwi Astuti¹,³, Azmi Aris¹,²*, Mohd Razman Salim¹,², Shamila Azman¹,², Salmiati¹,² and Mohd Ismid Md Said¹,²
¹ Department of Water and Environmental Engineering, School of Civil Engineering, Faculty of Engineering, Universiti Teknologi Malaysia, Johor, Malaysia
² Centre for Environmental Sustainability and Water Security (IPASA), Universiti Teknologi Malaysia, Johor Bahru, Malaysia
³ Department of Environmental Engineering, Faculty of Landscape Architecture and Environmental Technology, Universitas Trisakti, Jakarta, Indonesia
Received: 26/05/2020  
Accepted: 07/07/2020  
Published: 20/09/2020  
Abstract  
Precise prediction of water quality time series can support early warning of water pollution and help policymakers manage
water resources more effectively. Such prediction can reveal the trend of a water quality characteristic based on the most
recent water quality data and on the transfer and transformation rules of pollutants in the watershed. The predictive capability
of traditional models is constrained by the variability, complexity, uncertainty, inaccuracy, non-stationarity, and non-linear
interactions of water quality parameters. Since the middle of the 20th century, Artificial Intelligence (AI) approaches have been
found efficient in bridging gaps, simulating, complementing deficiencies, and improving the precision of predictive models in
terms of multiple evaluation measures for better planning, design, deployment, and handling of multiple engineering systems.
This article discusses the state-of-the-art implementation of AI in water quality prediction, the types of AI approaches and
techniques adopted in the literature, including knowledge-based systems, and their potential future implementation in water
quality modelling and prediction. The study also presents several possibilities for future research.
Keywords: Water quality simulation; Artificial Intelligence; Knowledge-based system; Review  
1 Introduction
The chemical, physical, and biological properties of water are generally referred to as water quality (1). Accurate
evaluation of water quality through a water quality management program is important if decision-makers are to understand,
interpret, and use these data to support resource management practices (2,3). Modelling of water quality parameters is an
essential part of every water systems analysis. To manage a watershed properly, it is necessary to predict the quality of
surface water so that appropriate measures can be taken to keep pollution within permissible concentrations. Ideal
management of water resources is based on accurate and reliable estimates of future changes (1, 4-6).
QUAL2E, Water Quality Analysis Simulation, and the U.S. Army Corps of Engineers Hydrological Engineering Center-5Q
are among the models commonly applied to water quality management (7). These models, however, are not only
time-consuming and expensive but also lack user-friendliness and effective knowledge transfer in model interpretation.
Therefore, models that do not suffer from these problems need to be developed (8,9). Several scientists have noted that the
prediction of water quality is affected by many variables that have nonlinear relationships with one another.
Conventional data processing cannot address this significant limitation (10-12). Nonlinear Artificial Intelligence (AI) models,
on the other hand, play a significant role in simulating complex and nonlinear processes (13). This situation creates a big gap
between model designers and professionals. Selecting a suitable numerical model is a challenging task for novice application
users. The forecast precision of traditional models is restricted by the uncertainty, unpredictability, obscurity, and inaccuracy
of water quality information. Over the past decade, progress in AI technology has made it possible to apply developments in
computational modelling systems to bridge this gap (8).
AI methods are currently capable of mimicking this behaviour (14), complementing deficiencies, and improving the precision of
forecast models in terms of multiple assessment measures for better planning, design, operation, and management of distinct
engineering systems (15, 16). The main contributions of the present review article are 1) to categorise AI methods
comprehensively and 2) to discuss their advanced application to water quality modelling and prediction.
Corresponding author: Azmi Aris, (a) Department of Water and Environmental Engineering, School of Civil Engineering, Faculty of  
Engineering, Universiti Teknologi Malaysia, Johor, Malaysia and (b) Centre for Environmental Sustainability and Water Security (IPASA),  
Universiti Teknologi Malaysia, Johor Bahru, Malaysia. Email: azmi.aris@utm.my.  
2 Artificial Intelligence-Based Model for River Water Quality Simulation
The 1956 Dartmouth Conference, at which AI earned its name, purpose, and first accomplishments, is widely recognised as the
birth of AI. The AI field currently plays an important role across various disciplines, focusing on machines with a
human-like mind (17). By incorporating descriptive understanding, procedural knowledge, and reasoning, AI methods enable
researchers to simulate human knowledge in clearly defined domains. In addition, advances in AI techniques have enabled the
creation of intelligent management systems through the use of shells under established platforms such as MATLAB, Visual
Basic, and C++ (8, 18).
Recently, AI has achieved significant progress in multiple areas such as autonomous driving, big data, information
processing, smart search, image understanding, automatic software development, robotics, and human-computer games,
which will have a significant effect on human society. Some of the most important AI-based algorithms include artificial
neural networks (ANNs), support vector machine (SVM), random forest (RF), genetic algorithm (GA), enhanced regression tree
(ERT), simulated annealing (SA), imperialist competitive algorithm (ICA), and decision tree (DT). AI methods are also
combined with experimental design (e.g., response surface methodology and standardised design) to improve the precision of
the optimal solution prediction (19). Advances in data science and data mining techniques such as neural networks (NNs),
support vector machines (SVMs), and k-nearest neighbours (k-NN) have helped to solve some complicated high-dimensional
issues in river water quality prediction (Figure 1).
Over the past two decades, river water pollution has been considered one of the global issues that need the full attention
of environmental scientists, and river water quality is one of the main characteristics to which environmental scholars need
to attend. Water quality is a growing concern in all developing countries. Water abstraction for domestic use, farming,
mining, energy generation, and forestry practices may lead to a decline in water quality and quantity, which affects not only
aquatic ecosystems but also the availability of safe water for human consumption (20). Thus, the assessment of surface
water quality is important both in the management of water resources and in monitoring the concentration of pollutants in
rivers. Monitoring water quality is costly because pollution control and efficient water resource management require large
quantities of data (21). Therefore, AI can be recommended as an alternative technique with high accuracy for predicting
river water quality. AI techniques have advantages over traditional techniques since they take account of the non-linear
relationships between influential variables and reduce the complexity required to obtain empirical equations (20).
The overall concept behind AI techniques is to explore hidden interactions in large quantities of information and to create
models that represent the physical processes governing the system being studied. A model derived from data reflects a
correlation between input and output variables. Such a model can be extremely precise because it conveys all kinds of
interactions expressed in the information, including fundamental physics and chemistry (9). Some studies (6, 11, 22-27) that
explored river water quality modelling issues using AI methods have revealed encouraging outcomes in recent decades
(Table 1).
Several researchers have attempted to predict water quality parameters using AI-based models such as ANN, SVM, and k-NN.
In these studies, ANN has frequently been found to be a stronger predictive model than conventional modelling techniques.
Among the 47 sources (2007-2019) reviewed, ANN, SVM, and k-NN were used in 38, 10, and 1 source, respectively. ANN was
widely used between 2007 and 2015, but from 2015 to 2019, ANFIS and SVM, as more recent AI approaches, surpassed ANN.
Some studies compared the models.
The review found that different parameters need to be used in water quality assessments with the various techniques.
Predictions of different output parameters have been studied, but the ten most important parameters are DO, BOD, TSS,
Total Nitrogen, temperature, COD, turbidity, Total Phosphate, NH3, and WQI. Monthly water quality data have been used most
often in these studies to simulate water quality parameters [4, 5, 10, 11, 16-28], followed by daily water quality data
(3, 5, 35-40).
3 Artificial Neural Network Modelling in River Water Quality Monitoring
The theory of artificial neurons was first introduced in 1943, and the back-propagation (BP) algorithm for feedforward ANNs
was implemented in 1986 (23). ANN is a recent method with a versatile mathematical structure that can identify complicated
non-linear interactions between input and output information, in contrast to traditional modelling approaches (1, 25).
ANNs are common tools for modelling extremely complex relations, processes, and phenomena. ANNs have also been widely
used to predict water quality variables and to address contaminant source uncertainty and the nonlinearity of water
quality data. Nevertheless, issues with the initial weight parameters and unbalanced training data sets make it hard to
obtain optimal outcomes and hinder ANN modelling efficiency (25). An ANN consists of very basic processors called
neurons that are strongly interconnected and act together to solve a problem (41). A neuron is an information processing
unit, essential for the functioning of the NN; it comprises weights and an activation function.
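The weight-and-activation structure of a single neuron can be illustrated with a minimal Python sketch. The sigmoid activation, the bias term, the function names, and the input values below are assumptions chosen for illustration; the reviewed studies use a variety of activations and do not prescribe this form.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic activation (an illustrative choice), mapping any real z to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


# Hypothetical, already-scaled inputs (e.g. pH, temperature, flow) and weights.
scaled_inputs = [0.5, 0.2, 0.8]
example_weights = [0.4, -0.6, 0.1]
print(neuron_output(scaled_inputs, example_weights))
```

In a full network, many such units are arranged in layers and the weights are adjusted during training, for example by back-propagation.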
From 2007 to 2019, eight types of ANN were applied by different researchers to the prediction of river water quality,
namely Back Propagation NN (BPNN), Wavelet NN, Generalized Regression NN (GRNN), Radial Basis NN (RBNN), Feed
Forward NN (FFNN), Multi-layer Perceptron NN (MLPNN), Multi-layer Feed Forward NN (MLFFNN), and Adaptive
Network-Based Fuzzy Inference System (ANFIS). Among them, the five most widely used models are MLPNN (10 studies),
RBNN (6), FFNN (5), ANFIS (5), and MLFFNN (4).
[Figure 1 diagram: Artificial Intelligence (AI) techniques divide into a single approach (Artificial Neural Network (ANN), Support Vector Machine, k-nearest neighbors) and a hybrid approach (WAVELET-ANFIS, WAVELET-ANN, ANN-ARIMA).]
Figure 1: Classification Tree of AI Techniques Applied in Literature to River Water Quality
Several indicators are often used to evaluate an ANN model's performance: the coefficient of correlation (R) between the
observed and the expected values; the mean square error (MSE), which can be used to calculate how well the network output
corresponds to the expected output; the mean absolute error (MAE); the root mean square error (RMSE); the coefficient of
efficiency (CE); the mean absolute percentage error (MAPE), which usually expresses accuracy as a percentage; the
interquartile range (IQR), which refers to the difference between the 25th and 75th percentiles and is used to calculate the
entire bias error between the means of the ensemble and the observed values; the Nash-Sutcliffe coefficient (NSC); and the
determination coefficient (DC) (Table 2).
The fundamental MLPNN model has three layers: (i) an input layer, (ii) a hidden layer, and (iii) an output layer. The input
layer supplies the input data set, the hidden layer processes the features, and finally, the output layer shows the expected
results. As can be seen in Table 2, MLPNN is widely used to predict DO (15, 28, 31, 32). Moreover, MLPNN is used to predict
BOD, COD, EC, TDS, turbidity, and WQI. Different inputs are used in predicting DO. However, Olyaie et al. in 2017 and Ay and
Kişi in 2017 both used pH, EC, temperature, and flow as input parameters (15, 31). The difference is that Olyaie et al. used
daily data of seven years, while Ay and Kişi used monthly data of 15 years. Regarding performance, Olyaie et al. obtained
R² = 0.955 and root mean square error (RMSE) = 0.594, while Ay and Kişi obtained R² = 0.98 and RMSE = 0.52. In these two
studies, the time scale of the data does not seem to have had any marked impact on outcomes.
The radial basis function neural network (RBFNN) is a type of NN applicable to general purposes and to various problems. The
RBFNN model is more advantageous than other types of NN because it has a grouping phase during training, in which the
central location of each hidden node is calculated (36). RBFNN was developed as one of the most common three-layer
feedforward neural networks (42) to determine parameters of water quality. This model has a simplified architecture of two
weight layers, with the basis function parameters in the first layer, while the second layer contains linear combinations of
those basis functions for the processing of the output and also contains parameters for water quality (1). Some researchers
have predicted DO, COD, TDS, NH3-N, turbidity, and WQI using RBFNN. Cobaner et al. in 2009 used GRNN, MLP, and RBNN to
forecast SS. Their results confirmed that RBNN performed slightly better than the others (37). In addition, Ahmed (2017)
compared several models for DO prediction, which again showed the superiority of the RBNN's performance over its rivals (33).
FFNN propagates the data in a single direction, from input to output; in many practical applications, these networks are the
most popular and widely used models (43). FFNN is used to predict DO, BOD, TN, and temperature. Four studies used monthly
data in FFNN to predict DO with different input parameters (11, 13, 33, 43). Ahmed in 2017 predicted DO using BOD and COD
as inputs and obtained R = 0.936 and RMSE = 0.709. Ranković et al. in 2010 used FFNN to check its capability to predict DO
with additional water quality variables. Their findings showed that pH and water temperature are the most influential
variables in DO prediction. In (44), ANFIS used neural network learning algorithms together with fuzzy logic to construct a
non-linear mapping between inputs and outputs. Ahmed et al. (2017) and Khaled et al. (2018) studied BOD prediction using
ANFIS and suggested that the ANFIS technique could be successfully applied to building models for predicting river water
quality (34, 55). Elkiran et al. (2018) attempted to predict DO in the Yamuna River using ANFIS, FFNN, and MLR. They found
that although both FFNN and ANFIS were capable of handling nonlinear interactions, the ANFIS model performed better than
FFNN (13). The parameters most often predicted with ANN are DO, BOD, COD, and WQI, with 13, 5, 5, and 4 studies,
respectively. So far, each NN type has achieved good results for certain output parameters from certain inputs.
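The studies discussed in this section report their results as R, R², and RMSE values. As an illustration of how such indicators are computed, the following is a minimal Python sketch based on the standard textbook definitions of MSE, RMSE, MAE, MAPE, the Pearson correlation coefficient, and the Nash-Sutcliffe coefficient. The function names and the sample DO values are hypothetical and are not taken from any of the reviewed studies, which may use slightly different formulations.

```python
import math


def _check(obs, pred):
    """Validate that both series are non-empty and of equal length."""
    if not obs or len(obs) != len(pred):
        raise ValueError("obs and pred must be non-empty and of equal length")


def mse(obs, pred):
    """Mean square error."""
    _check(obs, pred)
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error, in the units of the observed variable."""
    return math.sqrt(mse(obs, pred))


def mae(obs, pred):
    """Mean absolute error."""
    _check(obs, pred)
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error; undefined if any observation is zero."""
    _check(obs, pred)
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observed and predicted values."""
    _check(obs, pred)
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def nash_sutcliffe(obs, pred):
    """Nash-Sutcliffe coefficient: 1 for a perfect fit, 0 for a mean-only model."""
    _check(obs, pred)
    mean_o = sum(obs) / len(obs)
    residual = sum((o - p) ** 2 for o, p in zip(obs, pred))
    spread = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - residual / spread


# Hypothetical observed and predicted DO concentrations (mg/L).
observed = [6.0, 7.0, 8.0, 9.0]
predicted = [6.5, 6.5, 8.5, 8.5]
for name, fn in [("RMSE", rmse), ("MAE", mae), ("MAPE", mape),
                 ("R", pearson_r), ("NSC", nash_sutcliffe)]:
    print(name, round(fn(observed, predicted), 3))
```

RMSE and MAE are scale-dependent, so values from different rivers or parameters are comparable only when the underlying units and ranges are similar, whereas R and the Nash-Sutcliffe coefficient are dimensionless.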
Table 1: Artificial Intelligence-Based Models Applied to Water Quality Prediction
| Type of Approach | Methods | Output Parameter | River | Authors |
|---|---|---|---|---|
| Artificial Neural Network (ANN) | Back Propagation neural networks | COD, DO, NH3, Sediment | Dahan River, Taiwan; Jishan Lake, China; River Suktel, India; Yuqiao reservoir in Tianjin | Zhao et al. (2007); Xu and Liu (2013); Chang et al. (2015); Ghose and Samantaray (2018) |
| Artificial Neural Network (ANN) | Wavelet Neural Network | DO | Jishan Lake, China | Xu and Liu (2013) |
| Artificial Neural Network (ANN) | Generalized Regression NN | COD | Cark Creek, Turkey | Ay and Kisi (2014) |
| Artificial Neural Network (ANN) | Radial Basis Neural Networks | COD, DO, NH3, TDS, Turbidity, Suspended sediment | Surma River, Bangladesh; Yangtze River, China; Johor River, Malaysia; Cark Creek, Turkey; Kopili River, India | Ahmed (2017); Basis et al. (2014); Najah et al. (2013); Ay and Kisi (2014); Kumar et al. (2016) |
| Artificial Neural Network (ANN) | Feed Forward Neural Network | BOD, DO, Total Nitrogen, Temperature, WQI | Gomti river, India; Melen River, Turkey; Surma River, Bangladesh; 59 rivers in Japan; Kinta River, Malaysia; Yamuna River | Singh et al. (2009); Dogan et al. (2009); Ahmed (2017); He et al. (2011); Gazzaz et al. (2012); Elkiran et al. (2018) |
Multi-layer  
Perceptron Neural  
Networks  
WQI, DO, BOD,  
COD, TDS,  
Turbidity,  
Johor River, Malaysia; Heihe  
River, China; Cark Creek,