Artificial Neural Network Models for Prediction of Density and Kinematic Viscosity of Different Systems of Biofuels and Their Blends with Diesel Fuel. Comparative Analysis

In the present article, two models based on the artificial neural network (ANN) methodology have been optimised to predict the density (ρ) and kinematic viscosity (μ) of different systems of biofuels and their blends with diesel fuel. An experimental database of 1025 points, covering 34 systems (15 pure systems, 14 binary systems, and 5 ternary systems), was used to develop these models. The models use six inputs: temperature (T) in the range of −10 to 200 °C, volume fractions (X1, X2, X3) in the range of 0–1, and, to distinguish between the systems, kinematic viscosity at 20 °C in the range of 0.67–74.19 mm² s⁻¹ and density at 20 °C in the range of 0.7560–0.9188 g cm⁻³. The best results were obtained with the 6-26-2 architecture (6 neurons in the input layer, 26 neurons in the hidden layer, and 2 neurons in the output layer). The comparison between experimental and simulated values gave coefficients of determination of R² = 0.9965 for density and R² = 0.9938 for kinematic viscosity. A new experimental database of 238 points from 4 systems (2 pure systems, 1 binary system, and 1 ternary system) was used to check the accuracy of the two ANN models. The prediction performance on this database was R² = 0.9980 for density and R² = 0.9653 for kinematic viscosity. Comparison of the validation results with those of other studies shows that the neural network models gave far better results.


Introduction
According to the statistics, the total greenhouse gas (GHG) emissions associated with aviation will be 400 to 600 % higher in 2050 than in 2010. 1 In addition, it is imperative to reduce carbon dioxide (CO2) emissions given their harmful impact on the environment. Over the last 10 to 20 years, these environmental concerns have intensified the search for alternatives to conventional fuels. 2 Among the alternatives with significantly lower greenhouse gas emissions than conventional fuel, biofuels are an attractive, environmentally friendly, renewable energy source. 3 The raw materials used for the production of biofuels are of biological origin and are therefore renewable. Non-edible oilseed crops are the main resources available. 3 Specifications and limits on the physical properties of fuels are established in America by the American Society for Testing and Materials (ASTM specifications) and in Europe by the European Committee for Standardization (EN European Norm). 4 Density and viscosity are two important physical properties that define the quality of fuels. 5 The density of a substance is defined as its mass per unit volume. It is used for the design of storage tanks and pipes, and to calculate the precise volume of fuel needed for adequate combustion. Viscosity is the physical property of a substance characterizing its resistance to flow. Viscosity influences the lubrication properties as well as the combustion properties of the fuel.
Low viscosities lead to poor lubrication, which can cause excessive wear and leakage. Higher viscosities can obstruct the hoses or cause poor atomization of the fluid, leading to poor combustion and an increase in exhaust gas emissions. 4 The physical and chemical properties of some compounds cannot be obtained directly or immediately from experiments because of experimental safety and efficiency constraints. 6 The development of prediction methods is therefore of great value in estimating the properties of biofuels, and much research has been directed towards the design of models for the prediction of physical properties of different types and different systems of biofuel. For example, Baroutian et al. 7 proposed empirical correlations to estimate the viscosity and density of binary and ternary blends of palm oil + palm biodiesel + diesel fuel at different temperatures. Chavarria-Hernandez et al. 8 designed three complementary correlations to accurately predict the kinematic viscosity of FAMEs and biodiesel over a wide temperature range (263.15–373.15 K) and a wide range of hydrocarbon chain lengths (C6 : 0–C24 : 0, including unsaturated FAMEs).
Few predictive studies of biofuel properties using artificial neural network models have been reported in the literature. For example, Rocabruno-Valdés et al. 9 predicted the density, dynamic viscosity, and cetane number of biodiesel using artificial neural networks. The authors used architectures of 16 : 1 : 1 for the density of biodiesel (sixteen inputs: the temperature and the composition of each fatty acid methyl ester, C8 : 0 to C24 : 0, in the biodiesel), 16 : 1 : 1 for the dynamic viscosity, 16 : 2 : 2 for the combined density-viscosity model, and 16 : 6 : 1 for the cetane number. More recently, various neural networks, including a single layer neural network (SLNN), a deep neural network (DNN) with multiple layers, and a convolutional neural network (CNN), have been developed by Houet et al. 6 to predict multiple molecular properties simultaneously.
For the prediction of all 15 molecular properties at a time, the DNN with a 3-layer network exhibited the best results. They concluded that the number of layers in the DNN plays a key role in the prediction of multiple molecular properties simultaneously.
Given the importance of viscosity and density prediction models, the main objective of this work was to develop a mathematical model using an artificial neural network to estimate the density and kinematic viscosity of different types and different systems of biofuels and their blends with diesel fuel. The details of the calculation method, numerical validation, statistical analysis, and comparative analysis are fully described in this work.

Density and viscosity
The general empirical correlations in the literature for the density and viscosity as a function of temperature and volume fraction can be given by Eq. (1), where y is the kinematic viscosity (µ) in mm² s⁻¹ or the density (ρ) in g cm⁻³; $b_0, a_0, a_1, a_2, \ldots, a_{11}$ are constants; T is the temperature in K; and $X_v$ is the volume fraction.
The values of ($b_0, a_0, a_1, a_2, \ldots, a_{11}$) for different systems in the literature are summarized in Table 1.

Artificial neural networks
A neural network is a widely distributed parallel processor consisting of simple processing units (nodes) that perform certain mathematical functions, usually non-linear. The way such a system computes is similar to the structure of the human brain. The great advantage of these models is their ability to learn, generalise, or extract rules automatically from complex data. 13 In ANN applications, three stages are considered: (a) training, (b) validation, and (c) testing. 14 In such an ANN, a neuron in a hidden or an output layer has two tasks: 15 to sum the weighted inputs from several connections plus a bias value, and then to apply the transfer function to the sum (Fig. 1).

The output $Z_j$ of neuron j in the hidden layer:
$$Z_j = f_h\Big(\sum_{i=1}^{n} w_{i,j} E_i + b_j\Big); \quad j = 1, 2, \ldots, m. \tag{2}$$
The output $S_k$:
$$S_k = f_o\Big(\sum_{j=1}^{m} v_{k,j} Z_j + b_k\Big); \quad k = 1, 2, \ldots, l. \tag{3}$$
Combining Eqs. 2 and 3, the relation between the output ($S_k$) and the inputs $E_i$ of the ANN is:
$$S_k = f_o\Big(\sum_{j=1}^{m} v_{k,j}\, f_h\Big(\sum_{i=1}^{n} w_{i,j} E_i + b_j\Big) + b_k\Big). \tag{4}$$
The output is computed by means of a transfer function (activation function). The typical activation functions which fulfil these requirements are: 15
hyperbolic tangent transfer function:
$$f(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \tag{5}$$
identity transfer function:
$$f(x) = x, \tag{6}$$
where $E_i$ is the input of neuron i; $w_{i,j}$ is the synaptic weight of the connection from neuron i of the input layer to neuron j of the hidden layer; $v_{k,j}$ is the synaptic weight of the connection from neuron j of the hidden layer to neuron k of the output layer; $b_j$ and $b_k$ are the biases of neuron j of the hidden layer and neuron k of the output layer; $Z_j$ is the output of neuron j of the hidden layer; $f_h$ and $f_o$ are the transfer functions of the hidden and output layers; and $S_k$ is the output of neuron k.

Table 1 - Values of ($b_0, a_0, a_1, a_2, \ldots, a_{11}$) for different systems in the literature. Columns: constants; kinematic viscosity (Ref. 10, Ref. 11, Ref. 12); density (Ref. 10, Ref. 11, Ref. 12).

Data acquisition and analysis
The databases of this study were compiled from results reported in the literature. A total of 1025 data points were obtained from various scientific publications 10-12,16-24 to estimate the density and kinematic viscosity of different systems of biofuels and their blends with diesel fuel.
In this case, the database was divided randomly into three groups of 70 % for training, 15 % for testing, and 15 % for validating. Temperature, volume fraction, μ (at 20 °C), and ρ (at 20 °C) were considered as input variables of the artificial neural network.
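The random 70/15/15 partition described above can be sketched as follows. This is only an illustration: the function name `split_indices`, the seed, and the use of integer division (which leaves the remainder in the validation group) are assumptions, not details of the STATISTICA procedure used in the study.

```python
import numpy as np

def split_indices(n_points, train_pct=70, test_pct=15, seed=0):
    """Randomly partition range(n_points) into train/test/validation index arrays.

    Hypothetical helper: group sizes use integer arithmetic, and whatever
    remains after the training and test groups goes to validation.
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_points)
    n_train = n_points * train_pct // 100
    n_test = n_points * test_pct // 100
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    valid = shuffled[n_train + n_test:]
    return train, test, valid

# 1025 is the size of the database reported in the text.
train_idx, test_idx, valid_idx = split_indices(1025)
print(len(train_idx), len(test_idx), len(valid_idx))
```

Because 1025 is not divisible into exact 70/15/15 shares, any implementation has to decide where the leftover points go; here they are assigned to the validation group.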
The minimum (min) and maximum (max) values are presented in Table 2, and a general representation of an ANN with six inputs is shown in Fig. 2.
The list of pure, binary, and ternary systems, and the experimental data points for the density and kinematic viscosity of the systems used in this study, are presented in Table 3. In this work, the STATISTICA program was used for the application of the artificial neural network.

Selection of optimal configuration
The performance of a trained network can be measured to some extent by the errors on the training, validation, and test data sets. Regression analysis has been applied to assess the network capability for density and kinematic viscosity predictions.
The coefficient of determination, R² (see Eq. (7)), 25,26 has been used to evaluate how well the trained network estimates correlate with the experimental data. Different neural network topologies have also been compared using the root mean square error (RMSE) (see Eq. (8)). 25,26 In Eqs. (7) and (8), n is the number of observations; k is the number of variables; $y_{exp}$ and $y_{pred}$ are the observed and the calculated values, respectively; and $\bar{y}_{exp}$ and $\bar{y}_{pred}$ are the averaged values for the observed and the calculated values, respectively.
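As an illustration of the two statistics named above, the sketch below assumes their common definitions: RMSE as the square root of the mean squared residual, and R² as the squared Pearson correlation, which uses both averaged values mentioned in the text. The paper's Eqs. (7) and (8) may differ in detail, for instance in how the number of variables k enters. The function names and sample values are hypothetical.

```python
import numpy as np

def rmse(y_exp, y_pred):
    """Root mean square error between observed and calculated values."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_exp - y_pred) ** 2)))

def r_squared(y_exp, y_pred):
    """Squared Pearson correlation between observed and calculated values."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    d_exp = y_exp - y_exp.mean()      # deviations from the observed mean
    d_pred = y_pred - y_pred.mean()   # deviations from the calculated mean
    num = np.sum(d_exp * d_pred) ** 2
    den = np.sum(d_exp ** 2) * np.sum(d_pred ** 2)
    return float(num / den)

# Hypothetical density values in g cm^-3, for illustration only.
rho_exp = [0.920, 0.913, 0.906, 0.899]
rho_pred = [0.921, 0.912, 0.907, 0.898]
print(rmse(rho_exp, rho_pred), r_squared(rho_exp, rho_pred))
```

Note that a squared correlation is insensitive to a constant offset between predictions and observations, which is one reason to report RMSE alongside it.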

Internal and external validation
Two basic approaches are available for validating the predictive power of an ANN model: internal validation and external validation. In this study, internal validation was used to evaluate the internal predictive ability of the developed models, and its result was defined as $Q^2_{LOO}$ (see Eq. (9)). 27 External validation was used to determine both the generalisability and the true predictive ability of the ANN models for new chemicals, by splitting the available dataset into a training set and an external prediction set, and its result was defined as $Q^2_{ext}$ (see Eq. (10)). 27 The $Q^2_{LOO}$ and $Q^2_{ext}$ are calculated using the following equations:
$$Q^2_{LOO} = 1 - \frac{\sum_{i=1}^{n_{tr}} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n_{tr}} (y_i - \bar{y})^2}, \tag{9}$$
where $y_i$, $\hat{y}_i$, and $\bar{y}$ are the experimental, predicted, and average of the experimental values for the training set, respectively. A value of $Q^2_{LOO}$ > 0.5 is considered satisfactory.
$$Q^2_{ext} = 1 - \frac{\sum_{i=1}^{n_{ext}} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n_{ext}} (y_i - \bar{y}_{tr})^2}, \tag{10}$$
where $y_i$ and $\hat{y}_i$ are the experimental and predicted values for the prediction set, respectively, and $\bar{y}_{tr}$ is the mean of the experimental values for the training set.
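The two validation statistics can be sketched in code as follows, assuming the usual form Q² = 1 − PRESS/SS, with the training-set mean as the reference in both cases. To keep the example small, a straight-line fit stands in for the neural network (retraining an ANN once per left-out point follows the same loop); `fit_line`, `predict_line`, and all data values are hypothetical.

```python
import numpy as np

def q2(y_exp, y_pred, y_train_mean):
    """Q^2 = 1 - PRESS / SS, with SS taken about the training-set mean."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    press = np.sum((y_exp - y_pred) ** 2)
    ss = np.sum((y_exp - y_train_mean) ** 2)
    return float(1.0 - press / ss)

def loo_predictions(x, y, fit, predict):
    """Leave-one-out: refit without point i, then predict point i."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    preds = np.empty_like(y)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = fit(x[keep], y[keep])
        preds[i] = predict(model, x[i:i + 1])[0]
    return preds

# Stand-in model (assumption): a first-degree polynomial instead of an ANN.
def fit_line(x, y):
    return np.polyfit(x, y, 1)

def predict_line(coeffs, x):
    return np.polyval(coeffs, x)

# Hypothetical training set: temperature (deg C) and density (g cm^-3).
t_train = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
rho_train = 0.93 - 0.0007 * t_train + np.array([2, -1, 0, 1, -2, 1]) * 1e-4

# Internal validation: Q2_LOO on the training set.
q2_loo = q2(rho_train, loo_predictions(t_train, rho_train, fit_line, predict_line),
            rho_train.mean())

# External validation: Q2_ext on points never used for fitting.
t_ext = np.array([15.0, 35.0, 55.0])
rho_ext = 0.93 - 0.0007 * t_ext
model = fit_line(t_train, rho_train)
q2_ext = q2(rho_ext, predict_line(model, t_ext), rho_train.mean())
print(q2_loo, q2_ext)
```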

Results and discussion
A range of one to six neurons in the hidden layer was obtained in the ANN models to predict the properties of the different systems. The topology of the neural network was determined by the number of layers, the number of neurons in each layer, and the nature of the transfer functions. The next section presents the most effective models found in this work. The best neural network configuration had one hidden layer with twenty-six (26) neurons (Table 4). Fig. 3 illustrates the correlation between the simulation results of the developed neural network and the experimental data points for the density and kinematic viscosity. The perfect fit (output equal to targets) is indicated by the solid line. The close proximity of the best linear fit to the perfect fit, as observed in Fig. 3, shows a good correlation between the network predictions and the experimental data (R² = 0.9965 for the density and R² = 0.9938 for the kinematic viscosity).

Systems listed in Table 3: soybean oil, sunflower oil, rapeseed oil, grapeseed oil, corn oil, BD * (n-peanut-sunflower), soybean FAME, BD * (rapeseed oil), BE *, BD * (sunflower waste-frying oil), jojoba oil, castor oil, soybean FAEE, Jatropha FAME, Jatropha FAEE.

Mathematical expressions of the optimised neural models
From the optimised ANN, represented in Fig. 4, the density and kinematic viscosity of different systems can be expressed by a mathematical model that incorporates all inputs $E_i$ (T, $X_1$, $X_2$, $X_3$, ρ (at 20 °C), µ (at 20 °C)), as follows:

The output $Z_j$ of the hidden layer:
$$Z_j = \tanh\Big(\sum_{i=1}^{6} w_{i,j} E_i + b_j\Big); \quad j = 1, 2, \ldots, 26. \tag{11}$$
The output ρ, µ:
$$\rho = \sum_{j=1}^{26} v_{1,j} Z_j + b_{k=1}, \tag{12}$$
$$\mu = \sum_{j=1}^{26} v_{2,j} Z_j + b_{k=2}. \tag{13}$$
The combination of Eqs. 11 and 12 leads to the mathematical formula for density taking into account all the inputs $E_i$ (T, $X_1$, $X_2$, $X_3$, ρ(20 °C), µ(20 °C)):
$$\rho = \sum_{j=1}^{26} v_{1,j} \tanh\Big(\sum_{i=1}^{6} w_{i,j} E_i + b_j\Big) + b_{k=1}. \tag{14}$$
The combination of Eqs. 11 and 13 leads to the mathematical formula for kinematic viscosity taking into account all the inputs $E_i$ (T, $X_1$, $X_2$, $X_3$, ρ(20 °C), µ(20 °C)):
$$\mu = \sum_{j=1}^{26} v_{2,j} \tanh\Big(\sum_{i=1}^{6} w_{i,j} E_i + b_j\Big) + b_{k=2}, \tag{15}$$
where $w_{i,j}$ is the synaptic weight of the connection from neuron i of the input layer to neuron j of the hidden layer; $v_{k,j}$ is the synaptic weight of the connection from neuron j of the hidden layer to neuron k of the output layer (k = 1 for ρ, k = 2 for µ); and $b_j$ and $b_k$ are the biases of neuron j of the hidden layer and neuron k of the output layer.
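The closed-form expressions described in this section amount to one forward pass through a 6-26-2 network with a hyperbolic tangent hidden layer and an identity output layer. A minimal sketch follows. The weights here are random placeholders, not the fitted values of the paper, so the printed numbers carry no physical meaning; the parameter names and input values are hypothetical, and no input or output scaling is applied, although the original software may normalise the variables.

```python
import numpy as np

N_INPUTS, N_HIDDEN, N_OUTPUTS = 6, 26, 2  # the 6-26-2 architecture from the text

def init_placeholder_params(seed=0):
    """Random stand-ins for the fitted weights and biases (NOT the paper's values)."""
    rng = np.random.default_rng(seed)
    return {
        "w": rng.normal(scale=0.1, size=(N_INPUTS, N_HIDDEN)),   # w[i, j]: input i -> hidden j
        "b_hidden": np.zeros(N_HIDDEN),                          # b_j
        "v": rng.normal(scale=0.1, size=(N_OUTPUTS, N_HIDDEN)),  # v[k, j]: hidden j -> output k
        "b_out": np.zeros(N_OUTPUTS),                            # b_k
    }

def forward(e, params):
    """Map inputs ordered (T, X1, X2, X3, rho20, mu20) to outputs (rho, mu).

    Accepts a single input vector of shape (6,) or a batch of shape (n, 6).
    """
    e = np.asarray(e, dtype=float)
    hidden = np.tanh(e @ params["w"] + params["b_hidden"])  # hidden layer, tanh
    return hidden @ params["v"].T + params["b_out"]         # output layer, identity

params = init_placeholder_params()
# Hypothetical pure system at 40 deg C: X1 = 1, rho20 in g cm^-3, mu20 in mm^2 s^-1.
e = np.array([40.0, 1.0, 0.0, 0.0, 0.88, 4.5])
rho, mu = forward(e, params)
print(rho, mu)
```

Writing the pass with matrix products rather than explicit double sums lets the same function evaluate a whole table of temperature and composition points at once.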

Statistical analysis on the ANN models for systems
The experimental and simulated values were compared using a linear regression model, and good agreement was found. Table 5 gives the statistical analysis for the comparison between experimental and simulated values for the density and kinematic viscosity of different systems.

Interpolation and extrapolation performances
To check the accuracy of the two ANN models previously developed and optimised, an interpolation database and an extrapolation database were used. The interpolation database contains a set of intermediate points between the experimental points of kinematic viscosity and density of soybean oil. 21 The extrapolation database contains a set of points outside the experimental points of kinematic viscosity and density of soybean oil. 21 The quality of fit of the interpolation and extrapolation data sets is depicted in Fig. 5 for the density and the kinematic viscosity. An excellent fit to the experimental values of density and kinematic viscosity of soybean oil can be noted: $y_{exp}$ (at 140 °C) < $y_{pred}$ (at 130 °C) < $y_{exp}$ (at 120 °C) < ... < $y_{pred}$ (at 15 °C) < $y_{exp}$ (at 10 °C), where $y_{exp}$ and $y_{pred}$ are the experimental and predicted values for the density and the kinematic viscosity.
The models therefore interpolate well for both the density and the kinematic viscosity.
They also extrapolate well for both the density and the kinematic viscosity.
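The ordering argument used for the interpolation check (each predicted value lying between the experimental values at the neighbouring temperatures) can be tested mechanically. The sketch below is one way to do so; the helper name `is_bracketed` and the sample values are hypothetical and are not taken from the soybean oil data.

```python
def is_bracketed(t_exp, y_exp, t_pred, y_pred):
    """Return True if every predicted point lies between the experimental
    values at the nearest lower and nearest higher temperatures."""
    exp_points = sorted(zip(t_exp, y_exp))
    for t, y in zip(t_pred, y_pred):
        lower = [p for p in exp_points if p[0] < t]
        upper = [p for p in exp_points if p[0] > t]
        if not lower or not upper:
            return False  # no neighbour on one side: extrapolation, not interpolation
        y_lo, y_hi = lower[-1][1], upper[0][1]
        if not (min(y_lo, y_hi) <= y <= max(y_lo, y_hi)):
            return False
    return True

# Hypothetical densities (g cm^-3) decreasing with temperature (deg C).
t_exp, y_exp = [10, 20, 30], [0.920, 0.913, 0.906]
t_pred, y_pred = [15, 25], [0.9165, 0.9095]
print(is_bracketed(t_exp, y_exp, t_pred, y_pred))
```

This check only applies to interpolation; for extrapolated points there is no second neighbour, so the predicted values have to be compared directly with measurements outside the training range.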

Prediction performances
Prediction performance was evaluated to check the accuracy of the two ANN models previously developed and optimised. An experimental database of 238 points from 4 systems (2 pure systems, 1 binary system, and 1 ternary system) was used for the prediction. This database was obtained from a scientific publication. 16 The results of prediction performances in terms of root mean squared error (RMSE) for the density and kinematic viscosity of the new systems are summarised in Table 6. The quality of fit of the prediction data set is depicted in Fig. 6 for the density and kinematic viscosity. An excellent fit is observed between the experimental values and the results obtained from the ANN model for the density and kinematic viscosity of the new systems (R² = 0.9980 for the density and R² = 0.9653 for the kinematic viscosity).

Comparison with other models
To compare the neural network models developed in this work with other correlations, the density and kinematic viscosity of two pure systems, a binary system, and a ternary system were estimated. For this purpose, the developed network, as well as eight correlations for the different systems proposed by Baroutian et al. 16 for the density and the kinematic viscosity, were used (Table 7). Table 8 gives the results of this comparison and the RMS error for the pure, binary, ternary, and global systems for the density and the kinematic viscosity, for each method separately. Fig. 7 shows the RMS error of the proposed ANN model and of the correlations proposed by Baroutian et al. 16 for the prediction of the density and the kinematic viscosity for all points of the test data for the three systems. As can be seen from the figure, the proposed neural network model is better than the other method for the density and kinematic viscosity, even though the ANN model was built from a large database of 1025 experimental points covering several pure, binary, and ternary systems and estimates the density and the kinematic viscosity at the same time. The RMS error of the ANN method for all test data of the density is 0.0007 g cm⁻³, whereas for the method of Baroutian et al. 16 it is 0.0034 g cm⁻³.

Conclusions
New models based on one artificial neural network were developed to predict the density (ρ) and kinematic viscosity (µ) of different systems of biofuels and their blends with diesel fuel from the volume fractions ($X_1$, $X_2$, and $X_3$) of the components and the temperature (T). To distinguish between these systems, ρ (at 20 °C) and µ (at 20 °C) were used as additional inputs. A set of 1025 experimental data points for the density and the kinematic viscosity of 34 systems was used for network training. The back-propagation neural network used the hyperbolic tangent and identity transfer functions for the hidden layer and output layer, respectively. The BFGS algorithm was used to optimise the neural network. A range of one to twenty-six neurons in the hidden layer of the models was obtained. In the validation stage, the comparison between experimental and simulated values in terms of the root mean squared error, the internal validation, and the external validation gave, for the density and the kinematic viscosity, respectively: RMSE = 0.0020 g cm⁻³, $Q^2_{LOO}$ = 0.9960, $Q^2_{ext}$ = 0.9966, and RMSE = 0.86 mm² s⁻¹, $Q^2_{LOO}$ = 0.9924, $Q^2_{ext}$ = 0.9960. Applying the neural network model to the density and the kinematic viscosity of the systems indicates that the method has very good interpolation and extrapolation capabilities with respect to temperature. Applying the model to new systems indicates that the method predicts well for a new experimental database of 238 points from four systems. The results of prediction performances in terms of the root mean squared error were: RMSE = 0.0005 g cm⁻³ for density and RMSE = 0.14 mm² s⁻¹ for kinematic viscosity. Furthermore, the comparison of the validation results with the correlations proposed by Baroutian et al. indicated that the ANN predicted the density and kinematic viscosity more accurately than those correlations.