ANN-based Approach to Model MC/DR of Some Fruits Under Solar Drying

The aim of this work was to model the moisture content (MC) and drying rate (DR) using artificial neural network (ANN) methodology. Many architectures were tested, and the best topology was selected by trial and error. The dataset was randomly divided into 60 %, 20 %, and 20 % for the training, testing, and validation stages of the ANN model, respectively. The best topology was 10-{29-13}-2, obtained with high correlation coefficients R (%) of {99.98, 98.41} and low root mean square errors RMSE (%) of (0.36, 6.29) for MC and DR, respectively. The obtained ANN can be used to interpolate the MC and DR with high accuracy.


Introduction
Most fruits and vegetables contain more than 80 % water, and within fruit species, the water level varies widely depending on environmental factors. 1 One of the most serious problems facing growers of fruits and vegetables is how to prevent these products from spoiling and thereby becoming unfit for consumption. There are various methods of accomplishing this, such as canning or freezing. 2,3 However, among all the drying methods, sun drying is a well-known method for drying agricultural commodities immediately after harvest, especially in developing countries. 4 The major objective in drying agricultural products is the reduction of the moisture content to a level that allows safe storage over an extended period. In addition, it brings about a substantial reduction in weight and volume, minimizing packaging, storage, and transportation costs. 5 In spite of many disadvantages, sun drying is still practiced in many places throughout the world, such as tropical and subtropical countries. 6 The most important aspect of drying is the mathematical modelling of the drying processes. The literature shows that different approaches have been used to study the solar drying process for fruit products. 7 Examples include the drying process of apricots and apples, 8 the drying kinetics of four fruits (apple, pear, kiwi, and banana), 9 a new mathematical model of the drying kinetics in the natural solar drying of banana, 10 and an experimental investigation of the thin-layer drying of plantain banana, mango, and cassava in a direct solar dryer, followed by mathematical modelling using thin-layer drying models. 11 In some cases, 9,[12][13][14] efforts were made to find the most suitable statistical model for predicting parameters such as the moisture content, drying rate, drying kinetics, etc., during the drying process.
Although the researchers in all these studies performed fruitful investigations of the drying quality, the geo-meteorological parameters were not taken into consideration. Therefore, this study used accurate modelling to overcome these problems.
Recently, the artificial neural network (ANN) technique has attracted the interest of researchers because it is a fast computational approach that provides an alternative and complementary method for modelling, allowing it to solve complex problems. 15 Many authors have investigated the use of an ANN to model artificial and solar drying behaviours, for example, the prediction of the moisture ratio of apple with four input parameters, 16 the estimation of the moisture content of papaya fruit with a comparison of mathematical and neural network models, 17 and the modelling of the drying kinetics of jackfruit in a solar dryer using a neural network model. 18 It was considered that a suitably trained model could predict the drying process of jackfruit. Only a few studies used ANNs to study agricultural solar drying, with the majority focusing on artificial drying. The main objective and novelty of this study was the development of an accurate ANN model based on a large experimental database {10 parameters} of the open sun and direct solar drying process {moisture content and drying rate} of different fruits from different countries around the world and under different climates.
The number of input neurons depends on the number of variables input to the ANN model, while the number of output neurons depends on the number of outputs desired from the model. 21 There is no hard and fast rule to determine the number of neurons in the hidden layers; it depends on the complexity of the problem. Usually, three stages are considered in ANN applications: (i) training, (ii) validation, and (iii) testing. 22 During the training and validation stages, both input and target data are introduced to the network. However, in the testing stage, a different set of data containing only input values is fed to the network. 23 Several configurations were studied through the application of the feed-forward back-propagation network in the modelling of the drying process. Selecting the input variables is one of the most important steps during the design and training phases of an ANN. However, there is no systematic procedure for selecting the ANN inputs for solar drying processes. Methodical approaches should be used in place of the trial-and-error method and human judgment to determine the key parameters for obtaining simple and reliable ANN models. 24
The main functions of an ANN are the multiplication, summation, and squashing operations. The squashing function is known as a threshold function or an activation function. The activation function may range from a simple step function to a sigmoid function. It restricts the applied input to a specified range, e.g. (0, 1), which determines the output to be applied as the input to the next layer of neurons. 25 The most commonly used functions are the logistic sigmoid transfer function for the hidden layers and the purelin function for the output layer, which are reported to be the better activation functions for ANNs. The learning rate is a parameter that determines the size of the weight adjustment each time the weights are changed during training. Small values of the learning rate cause small weight changes, and large values cause large changes.
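As a concrete illustration, the two transfer functions named above can be written in a few lines of Python (a language-neutral sketch of what MATLAB's logsig and purelin functions compute):

```python
import math

def logsig(x):
    """Logistic sigmoid: squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def purelin(x):
    """Linear (identity) transfer function, commonly used in the output layer."""
    return x
```

The bounded output of the sigmoid is what makes it suitable as the squashing function of the hidden layer, while the unbounded purelin output lets the network predict quantities on any scale.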
The best learning rate is not obvious; if the learning rate is 0.0, the network will not learn. The momentum term is a factor used to increase the speed of network training. It adds a proportion of the previous weight changes to the current weight changes. 26 Generally, in one- and two-layered networks, the best learning rate and momentum are ~0.2 and 0.3 < m ≤ 0.5, respectively, which yield the best combination of convergence and generalization. 27 The goal determines the desired accuracy of the output result, 19 and when the overall error becomes smaller than 0.001, the learning process is considered successful and is terminated. 28
Although there can be many performance measures for an ANN forecaster, such as the modelling time and training time, the ultimate and most important measure of performance is the accuracy of the predictions it can achieve beyond the training data. However, a suitable measure of accuracy for a given problem is not universally accepted by forecasting academicians and practitioners. An accuracy measure is often defined in terms of the forecasting error, which is the difference between the actual (desired) and predicted values. 29 The most frequently used accuracy measures are the mean absolute deviation, sum of squared errors (SSE), mean squared error (MSE), root mean squared error (RMSE), and mean absolute percentage error. The training algorithm is used to find the weights that minimize some overall error measure, such as the SSE or MSE. Hence, network training is actually an unconstrained nonlinear minimization problem. 29 There are two different styles of training. In incremental training, the weights and biases of the network are updated each time an input is presented to the network. In batch training, the weights and biases are only updated after all of the inputs have been presented. 30
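The two ideas above, the momentum term in the weight update and the RMSE accuracy measure, can be sketched in Python (an illustrative sketch; the default learning-rate and momentum values are placeholders in the ranges discussed above):

```python
def momentum_step(w, grad, prev_dw, lr=0.2, m=0.4):
    """Gradient-descent weight update with a momentum term: the new weight
    change adds a proportion m of the previous weight change."""
    dw = -lr * grad + m * prev_dw
    return w + dw, dw

def rmse(targets, outputs):
    """Root mean squared error between desired and predicted values."""
    n = len(targets)
    return (sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n) ** 0.5
```

In incremental training, `momentum_step` would be called once per presented input; in batch training, `grad` would be accumulated over all inputs before a single update.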

Applications in food industry
In the last decades, ANN tools have gradually been introduced in the field of food science and technology. They can be applied to analyse, model, or predict the quality and safety of food, for example, in modelling microbial growth, predicting the physical, chemical, and functional properties of food products during processing and distribution, and interpreting spectroscopic data.

Advantages of the use of ANN models
ANN models have many advantages. They can be used to predict food quality parameters and thus how to preserve food from mould for a long time. They can model the complex non-linear phenomena of drying, using only the relevant inputs and a sufficient dataset, in cases where analytical methods are difficult to apply. They can be used for food classification based on colour, size, etc., to reduce experimental costs, and to simulate the solar drying process before starting the industrial processing operation.

Dataset collection
The data used in the current study were randomly collected from experimental studies on the solar drying of fruits. The products discussed are summarized in Table 1. In most of the previous papers, the drying parameter variations during the drying period were presented as curves, from which the main parameter variations for the solar drying process could be extracted. The collected data involve numerous assumptions. In addition, the emissivity of the cover surface of the drying system varied. In the current study, when the dried product was exposed directly to the sun without a cover, the emissivity was taken equal to one. If the dried product was covered by a material such as plastic, wood, or glass, its emissivity value was indicated. Another parameter was the slope angle used to capture the maximum radiation. In actuality, in open sun drying, this was seldom used; instead, the preference was to place the products on a wide horizontal surface for drying. The tilt angle equalled the latitude of the geographical location in the case where the product was above a horizontal surface. 36 In this study, we examined the results of numerous studies from around the world and differentiated between them using their geographical coordinates, including the altitude, latitude, and longitude. In addition, for the temperature, which was an important parameter, two values were used: the inner and outer drying system values. When the dried product was directly exposed to the sun (no cover), these temperature values were equal. The nutritional value was used to distinguish between dried products. [37][38][39] Hence, the following ten parameters were used as input data: time, outside temperature, global solar radiation, inside temperature, inclination, emissivity, altitude, latitude, longitude, and nutritional value. These inputs were used to train and test an ANN model to predict two output parameters: the MC and DR.
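A single record of the database described above could be represented as follows (all field names and numeric values here are hypothetical examples, not values from the actual database):

```python
# One illustrative training record with the ten inputs and two targets
# described above (field names and values are hypothetical examples).
record = {
    "time_h": 2.0, "T_outside_C": 31.5, "global_radiation_Wm2": 850.0,
    "T_inside_C": 46.0, "inclination_deg": 30.0, "emissivity": 1.0,
    "altitude_m": 350.0, "latitude_deg": 36.5, "longitude_deg": 3.1,
    "nutritional_value_kcal": 48.0,
    "MC_percent": 72.4, "DR": 0.35,
}
input_keys = list(record)[:10]             # the ten ANN inputs
inputs = [record[k] for k in input_keys]
targets = [record["MC_percent"], record["DR"]]  # the two ANN outputs
```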
A total of 824 data sets were randomly divided into two groups. The first group of 742 data sets was used for training, and the other group of 82 data sets was used for testing. In a MATLAB simulation, a feed-forward ANN model was adopted using a back-propagation training algorithm.
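The random partitioning described above can be sketched in a few lines of Python (the seed is an illustrative choice; the integers merely stand in for the actual data sets):

```python
import random

def split_dataset(data, n_train, seed=0):
    """Randomly partition the data sets into a training group of n_train
    items and a testing group containing the remainder."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

samples = list(range(824))                 # stand-ins for the 824 data sets
train, test = split_dataset(samples, 742)  # 742 for training, 82 for testing
```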
Results and discussion

ANN modelling results
An MLP network algorithm was used in this study. This network consisted of three layers, an input layer containing ten neurons, one or more hidden layers with a variable number of neurons, and an output layer containing two neurons. 40 Two phases were used to train a back-propagation network: a forward pass phase, during which information processing occurred from the input layer to the output layer, and a backward pass phase, where the error from the output layer was propagated back to the input layer and the interconnections were modified. 41 This algorithm adjusted the connection weights based on the back-propagated error computed between the observed and estimated results. 42 This was a supervised learning procedure that attempted to minimize the error between the desired and predicted outputs. The structure of the neural network used in this study is shown in Fig. 1.
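The forward pass phase of such an MLP can be sketched as below (a minimal pure-Python illustration, not the MATLAB implementation used in the study; the random weights are placeholders for the trained values):

```python
import math
import random

def forward(x, W1, b1, W2, b2):
    """One forward pass of the MLP: logistic-sigmoid hidden layer
    followed by a linear (purelin) output layer."""
    h = [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
         for row, b in zip(W1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) + b
            for row, b in zip(W2, b2)]

rng = random.Random(1)
n_in, n_hid, n_out = 10, 27, 2             # the 10-27-2 topology of Section one
W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hid)]
b1 = [rng.uniform(-1, 1) for _ in range(n_hid)]
W2 = [[rng.uniform(-1, 1) for _ in range(n_hid)] for _ in range(n_out)]
b2 = [rng.uniform(-1, 1) for _ in range(n_out)]
y = forward([0.5] * n_in, W1, b1, W2, b2)  # two outputs: MC and DR
```

The backward pass phase would then propagate the output error back through `W2` and `W1` to adjust these interconnection weights.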
This part of the paper is divided into two sections. The first section discusses the training of a neural network with one hidden layer to investigate the ability of a neural network to solve this kind of problem. Another goal was to determine the most appropriate algorithm, and investigate the impacts of the selected input parameters on the predicted outputs. The second section shows how the results discussed in the first section were used to improve the performance of the neural network by adding a second hidden layer and training the network with different database variants.

a) Section one
Before training the network, some parameters had to be set. The activation function used by the hidden neurons was a logistic sigmoid function, and a purelin function was used in the output layer. The learning rate was set at 0.2 as an optimum value, and the momentum was set at 0.4. The number of epochs, each presenting the results of all the calculations made by the network, such as the responses of the output neurons, was set at 1000. The desired accuracy of the output result was controlled by the "goal" of the network. It was set at 1e-5, because a value smaller than 0.001 is preferred. The values of all these parameters were used for all of the training and testing steps. We started by using the 13 training algorithms available in the MATLAB toolbox (version 8.3.0). We observed the effect of each of these on the network performance, and the most suitable learning algorithms for the current study were obtained.
The best one was selected by trial and error. One hidden layer was adopted, and the number of neurons was varied from 1 to 30 for each tested algorithm. The performance of the network varied from one training session to another, which necessitated several repetitions of the training to obtain the right response.
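The trial-and-error sweep over the hidden-layer size can be sketched as follows (the scoring function here is a hypothetical stand-in for a full training run, shaped so that its minimum sits at 27 neurons, mimicking the result reported below):

```python
def sweep_hidden_neurons(train_and_score, n_min=1, n_max=30, repeats=3):
    """Trial-and-error search over the hidden-layer size: train the network
    several times for each size (performance varies between training
    sessions) and keep the size with the lowest RMSE."""
    best_rmse, best_n = float("inf"), None
    for n in range(n_min, n_max + 1):
        score = min(train_and_score(n) for _ in range(repeats))
        if score < best_rmse:
            best_rmse, best_n = score, n
    return best_rmse, best_n

# Hypothetical stand-in for an actual training-and-scoring run:
best_rmse, best_n = sweep_hidden_neurons(lambda n: 0.17 + 0.01 * abs(n - 27))
```

With a real `train_and_score`, each call would train the 10-n-2 network and return its RMSE on held-out data.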
The performance of the MLP network structure is clearly illustrated in Fig. 2, which shows the RMSE variation of each training algorithm during an increase in the number of neurons in the hidden layer.
Among the different training algorithms, Bayesian regularization (BR), which is shown by a red line, provided the best results and good stability during the variation of the neuron number. An RMSE of 0.1737 % and a coefficient of regression R = 99.9934 % were found with a topology of 10-27-2. The current algorithm took more time to converge than the other algorithms.
The Levenberg-Marquardt (LM) training algorithm had an RMSE of 0.2234 % and a coefficient of regression value (R-value) of 99.9892 %, which was close to that of the BR algorithm. In addition, the LM algorithm took a shorter time to converge than the BR algorithm, but its stability was not as good.
Regarding the other algorithms, Table 2 lists the ranks of all the training algorithms based on the min and max RMSE.
The other algorithms converged quickly, at rates that varied from one to another. Nevertheless, their performance was degraded.
In order to test the contributions of the different variables, seven methods were used to determine the relative contributions and/or the contribution profiles of the input factors: (i) the partial derivatives ("PaD") method, (ii) the "weights" method, (iii) the "perturb" method, (iv) the "profile" method, (v) the "classical stepwise" method, (vi) the "improved stepwise a" method, and (vii) the "improved stepwise b" method. The procedure for partitioning the connection weights to determine the relative importance of the various inputs was first proposed by Garson 43 and subsequently repeated. 44 The model selected as an example for the sensitivity test in the current study had a topology of 10 input neurons, 27 neurons in the hidden layer, and 2 output neurons.
In addition, the BR algorithm was used as the training algorithm. The statistical parameters (weight and biases) obtained for the network architecture used (10-27-2) are listed in Table 3.
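Garson's weight-partitioning procedure mentioned above can be sketched as follows (a simplified single-output version; the toy weights at the end are purely illustrative, not values from Table 3):

```python
def garson(W1, w_out):
    """Garson's weight partitioning: relative importance of each input for
    one output neuron. W1[j][i] is the weight from input i to hidden
    neuron j; w_out[j] is the weight from hidden neuron j to the output."""
    n_in = len(W1[0])
    contrib = [0.0] * n_in
    for row, v in zip(W1, w_out):
        row_sum = sum(abs(w) for w in row)          # partition each hidden
        for i, w in enumerate(row):                 # neuron's input weights
            contrib[i] += (abs(w) / row_sum) * abs(v)
    total = sum(contrib)
    return [c / total for c in contrib]             # fractions summing to one

# Toy 2-input, 2-hidden-neuron example: input 2 carries more weight
importance = garson([[1.0, 3.0], [2.0, 2.0]], [1.0, 1.0])
```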

b) Section two
Most authors use only one hidden layer for forecasting purposes. However, the use of one hidden layer may require a very large number of hidden nodes to achieve the optimal network. Therefore, determining the optimal number of hidden nodes is a crucial yet complicated step. However, using two hidden layers may give better results for some specific problems. In order to improve the performance of the network model, we made some assumptions:
• we used four variants to train and test the database, as listed in Table 5;
• we added a second hidden layer to provide more benefits.
The RMSE decreased when the number of hidden nodes increased. Furthermore, the figures show that the ability of an ANN with two hidden layers is considerably better than that of one with a single hidden layer. In contrast, the presence of a single neuron in either the first or the second hidden layer gave undesirable results.
A plot of the regression of the optimum ANN model with a topology of [10-29-13-2] is shown in Fig. 7. Mapping between the target and output data was performed at a satisfactory level because the RMSE was very small (close to zero) and the R-value was close to unity. According to the results shown in Fig. 7, it can be said that the ANN algorithm used was very effective at predicting the MC and DR.
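The two figures of merit used to judge this mapping can be computed as below (a minimal sketch in which R is the Pearson correlation between target and output, expressed in %):

```python
def r_and_rmse(targets, outputs):
    """Correlation coefficient R (%) and RMSE between target and output."""
    n = len(targets)
    mt, mo = sum(targets) / n, sum(outputs) / n
    cov = sum((t - mt) * (o - mo) for t, o in zip(targets, outputs))
    st = sum((t - mt) ** 2 for t in targets) ** 0.5
    so = sum((o - mo) ** 2 for o in outputs) ** 0.5
    r = 100.0 * cov / (st * so)
    rmse = (sum((t - o) ** 2 for t, o in zip(targets, outputs)) / n) ** 0.5
    return r, rmse

# Perfect mapping: R = 100 %, RMSE = 0
r, e = r_and_rmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```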

Conclusion
In the present study, an ANN was adopted to develop a uniform and accurate model for simultaneously predicting the variation of the MC and DR of some fruits. This type of investigation had not been addressed in previous studies; thus, this study was conducted to fill this gap. The goal was to obtain a simple model with optimal performance. The best results were an R-value of 99.991 % and an RMSE of 0.205 %, which were achieved using a network with a topology of 10-29-13-2 and a database division of 60 %, 20 %, and 20 %. The developed model could be used to predict the MC and DR of fruits such as Amelie and Brooks mangoes, apricot, grape, peach, fig, and plum.