Prediction of Hourly Global Solar Radiation: Comparison of Neural Networks / Bootstrap Aggregating

This research work explores the use of single neural networks and bootstrap aggregated neural networks for predicting hourly global solar radiation. A database of 3606 data points was obtained from the radiometric station ‘Shems’ of the Renewable Energies Development Center in Bouzareah. Both single neural networks and bootstrap aggregated neural networks were built. Bootstrap aggregation was used to improve the accuracy and robustness of neural network models trained on a limited quantity of data. To produce multiple training sets, the training data were re-sampled using bootstrap resampling with replacement. A neural network model was built for each of the resampled training sets. The individual neural network models were then combined to produce the bootstrap aggregated neural networks. The measured and predicted values of global solar radiation were compared; in the testing phase, the root mean squared errors were 68.3968 and 62.4856 Wh m−2 for the single neural networks and the bootstrap aggregated neural networks, respectively. These results show that the bootstrap aggregated neural networks model is more accurate and robust than the single neural networks.


Introduction
Solar energy encompasses both the heat and the light released by the Sun, the star nearest to Earth, as a result of nuclear processes within it. The quantity of energy produced is significantly more than the world's current energy needs. If properly harnessed and used, it might fulfil all future energy needs. 1 The entire quantity of solar energy received at the Earth's surface is defined as global solar radiation; this quantity is extremely significant in different scientific fields, such as architecture, agriculture, climatology, and solar energy production. 2,3 Currently, solar radiation and renewable energy have emerged as critical energy technologies that can aid in dealing with climate change challenges. The expansion of the use of renewable energy sources results in a decrease in CO2 emissions, a reduction in local air pollution, the development of high-value professions, and a reduction in a country's reliance on fossil energy imports. Compared with fossil-based energy sources, solar energy, a significant source of energy, is not harmful to the environment and does not contribute to global warming. 4,5 Solar radiation data are extremely significant for manufacturers, designers of solar energy systems, architects, and agriculturists. However, solar radiation measurements have only been taken at a few sites around the world for various reasons, among them the costs of installation, maintenance, and calibration. 6 Given the significance of global solar radiation data, several approaches are being utilised to estimate solar radiation from various parameters. Analytical methods and statistical techniques are feasible options for estimating solar radiation using a range of meteorological factors. 
[7][8][9] Artificial neural network (ANN) technology has recently received much attention as a computational approach that provides an alternative and integrated modelling method, given its ability to deal with complex and ill-defined problems in many scientific fields. In the meteorological field, many researchers have studied the use of single neural network (SNN) models for predicting global solar radiation. [10][11][12][13][14][15][16][17][18][19][20][21][22][23][24] However, ANNs have been criticised for their nature, which leads to difficulties in understanding the linear or quadratic dependency of the transfer equations. Furthermore, computational cost and overfitting have been reported as issues. 25 The available training data and the training procedure have a considerable impact on the quality and robustness of neural network models; if either is inadequate, the neural network is likely to overfit noise in the training data and display severe generalisation errors. Developing a variety of neural network models and then combining them is an appealing technique for increasing the robustness of neural network models. Several researchers have applied combinations of different neural network models. 26 The bootstrap aggregated neural networks (BANN) models were developed by Hansen and Salamon. 27 It is a method for improving a model's generalisation capabilities by training multiple neural networks, which are then combined. This strategy is both effective and simple to use, and it has been applied in a variety of settings. BANN have been demonstrated to have superior generalisation capabilities over SNNs. 28,29 Several strategies, such as stacked neural networks, have been proposed in order to improve the robustness of neural network models. 
30 Practically all of the established models in the scientific literature are based on a single artificial neural network arrangement, which is trained to predict solar radiation values using different climatic data gathered over the years. In contrast to stacking several neural networks, adopting a single structural model can reduce estimation accuracy and slow down the training process, causing the predictive model to diverge and become unstable. 31 The aim of this research work was to develop a BANN model to predict the hourly global solar radiation received on the horizontal plane over one year in the region of Bouzareah (Algeria), using eight meteorological and climatological parameters. To our knowledge, no studies using bootstrap-based ANN for modelling solar radiation have been described in the literature. Individual neural network (INN) and SNN models were compared with BANN.

Related studies
A number of studies and research initiatives in recent years have attempted to predict solar radiation. Different methods, such as smart persistence (SP), neural network (NN), and random forest (RF), were compared and evaluated using solar data collected at a high-variability meteorological location; the normalised root mean square error (nRMSE) obtained was 22.57 %. 32 An ANN model was established to estimate tilted irradiance at different inclinations in Taiwan. The input parameters were global horizontal irradiance, solar elevation, azimuth cosine, azimuth sine, cosine and sine of pyranometer azimuth, and dip. The model consisted of three hidden layers of six neurons each, and gave a total mean nRMSE of 8.02 %. 33 Another study used meteorological data gathered over a 10-year period from five distinct places throughout India to train models based on different methods for forecasting monthly average global solar radiation. 34 A backpropagation neural network (BPNN) was used to predict solar irradiance data in different spectral bands from daily time series. Three performance metrics were used to assess the proposed model's ability to forecast solar irradiance: root mean square error (RMSE), mean bias error (MBE), and correlation coefficient. The results showed that the model predicted daily solar irradiance more accurately. 35 Using weather factors, sun angles, and extra-terrestrial irradiance, a neural network model was developed to forecast global irradiance. The findings showed that the suggested model was more efficient in estimating global irradiance than the power persistence forecast model. 36 The hourly solar radiation received on the horizontal plane in Ghardaïa city (Algeria) was studied using an SNN model trained with the quasi-Newton backpropagation (BFGS) algorithm, together with the useful weights method to find the importance of all input parameters. 15 The best results were obtained with an RMSE of 4.71 %. These findings suggested that the optimised model was stable and had a strong predictive capacity.
Rezrazi et al. demonstrated how to reach an optimum ANN model for solar radiation prediction. The optimisation process was demonstrated using measured data from Ghardaïa city in 2007. The performance of the ANN models was evaluated, and the results were compared with measured data using the mean absolute percentage error (MAPE). The MAPE of the optimum ANN model was determined to be 1.17 %. This model also had an RMSE of 14.06 % and an MBE of 0.12 %. The collected results showed that the optimisation technique had met practical criteria. It may be generalised to any point on Earth, and utilised in applications other than solar radiation estimation. 37 Another study examined the capacity of a multilayer perceptron to produce very short-term irradiation estimates (5 min) in Bouzareah over 2 years. The input parameters were declination, zenith angle, and azimuth. The average nRMSE was found to be 8.81 %, which is excellent accuracy for such a short time step. 3 In this context, in the present research work, we developed and applied a BANN method based on SNN to build robust non-linear models for predicting hourly global solar radiation in Bouzareah.

Studied region and data collection
The database used in this research work was collected from the radiometric station 'Shems' belonging to the Center for Renewable Energy Development (CDER) of Bouzareah, in Algiers, with latitude 36.8 °N, longitude 3.17 °E, and altitude 345 m. These data were recorded over one year (January 1 to December 31, 2015). A Mediterranean climate prevails at the location, with dry, hot summers, and damp, chilly winters (Fig. 1).
This database (DB) has 3606 points. It was used with the objective of optimising the parameters of the bootstrap aggregated neural networks (BANN). All values less than 120 W m−2 (from 5 to 17 h) were excluded from the database, following the World Meteorological Organization, which defines the sunshine duration as the period during which the global solar radiation values are higher than 120 W m−2. 39
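The exclusion rule described above can be illustrated with a minimal Python sketch. The record layout (hour, radiation value) and the sample values are hypothetical and are not taken from the 'Shems' database; only the 120 W m−2 threshold comes from the text.

```python
# Minimal sketch of the threshold filter; the records below are made up.
SUNSHINE_THRESHOLD = 120.0  # W m^-2, threshold quoted in the text


def keep_sunshine_hours(records, threshold=SUNSHINE_THRESHOLD):
    """Drop every record whose radiation value is below the threshold."""
    return [(hour, g) for hour, g in records if g >= threshold]


# Hypothetical (hour of day, global solar radiation) pairs.
sample = [(5, 35.0), (8, 310.5), (12, 845.2), (17, 119.9), (16, 120.0)]
filtered = keep_sunshine_hours(sample)
print(filtered)
```

Values exactly equal to the threshold are kept here, since the text only states that values less than 120 W m−2 were excluded.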

Modelling procedure

Single neural networks
ANNs were inspired by the way natural neurons process information. Neurons are individual cells that are combined to create a dense network of around 10-100 billion linked units in the human brain. A biological neuron has four components: 1) dendrites are the principal inputs of the neuron, receiving information and commands from other neurons; 2) the cell body contains the nucleus of the nerve cell and is the information processing centre; 3) the axon is the neuron's output and carries information to the rest of the brain's neurons; 4) synapses link neurons with each other and correspond to the synaptic weights of formal neurons.
Neurons are the most significant element of neural networks and rest on a well-established mathematical basis. 40 An SNN includes three layers: input, hidden, and output layers. Each layer is made up of computing units (neurons), and the layers are connected via transfer functions. The neurons of each layer are connected with the neurons of the adjacent layer through connections called synaptic weights w_ij, which determine the influence of each input on the output of the neuron. Each neuron integrates all the signals from the neurons of the previous layer according to an activation function. 41 Fig. 2 shows the procedure for designing and optimising the architecture of INN and SNN.
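As an illustration of how a three-layer network turns inputs into one output, the following is a minimal Python sketch of a forward pass. The hyperbolic tangent hidden layer, the linear output unit, and all weights and layer sizes are assumptions chosen for the example; they are not the trained networks of this study.

```python
import math


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a network with one hidden layer and one output.

    Each hidden neuron sums its weighted inputs plus a bias and applies
    tanh; the output unit is a linear combination of the hidden values.
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical network: 3 inputs, 2 hidden neurons, 1 output.
w_hidden = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # w_ij: input j -> hidden i
b_hidden = [0.0, 0.0]
w_out = [1.0, 1.0]
b_out = 0.5

output = forward([1.0, -1.0, 0.3], w_hidden, b_hidden, w_out, b_out)
print(output)
```

In this toy case the two hidden activations are tanh(1) and tanh(-1), which cancel, so the output reduces to the output bias.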
The statistical investigation of the overall data was done in terms of the minimum "min", the average "mean", the maximum "max", "sum", variance "Var", and the standard deviation "STD" as shown in Table 1.
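The descriptive statistics listed above can be computed as in the following Python sketch. The temperature values are hypothetical, and the use of the sample variance (n − 1 denominator) is an assumption, since the text does not state which definition was used for Table 1.

```python
import statistics


def describe(values):
    """Return the summary statistics named in the text for one variable."""
    return {
        "min": min(values),
        "mean": statistics.fmean(values),
        "max": max(values),
        "sum": sum(values),
        "Var": statistics.variance(values),  # sample variance (assumption)
        "STD": statistics.stdev(values),     # square root of the sample variance
    }


# Hypothetical temperature readings in degrees Celsius.
temperature = [12.0, 15.0, 18.0, 21.0, 24.0]
summary = describe(temperature)
print(summary)
```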

Bootstrap aggregated neural networks
Developing a variety of ANN models and then combining them is an appealing technique for increasing model robustness. Many researchers have looked into combining different neural network models. 42,26 To construct the BANN models, the training data set was re-sampled using bootstrap re-sampling with replacement 43,44 to create 30 training sets. An individual neural network was trained on each set, and the outputs of the individual networks were then combined, which for simple averaging is written as

$$y = \frac{1}{n} \sum_{i=1}^{n} y_i$$

where y_i represents the output of the i-th INN, y represents the output of the BANN, and n is the number of INN models.
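The resampling and combination steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions: a least-squares line stands in for each individual neural network, the data are synthetic, and the combination is a simple average of the individual outputs. The helper names (bootstrap_sample, fit_line, bagged_predict) are hypothetical.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def fit_line(sample):
    """Least-squares line y = a*x + b, a stand-in for training one network."""
    n = len(sample)
    mx = sum(x for x, _ in sample) / n
    my = sum(y for _, y in sample) / n
    sxx = sum((x - mx) ** 2 for x, _ in sample)
    if sxx == 0.0:  # degenerate resample in which every x is identical
        return lambda x: my
    a = sum((x - mx) * (y - my) for x, y in sample) / sxx
    b = my - a * mx
    return lambda x: a * x + b


def bagged_predict(models, x):
    """Aggregate the individual predictions by a simple average."""
    return sum(m(x) for m in models) / len(models)


rng = random.Random(0)
# Synthetic data: y = 2x + 1 plus Gaussian noise.
data = [(float(x), 2.0 * x + 1.0 + rng.gauss(0.0, 0.5)) for x in range(20)]
# Thirty bootstrap training sets, one model per set.
models = [fit_line(bootstrap_sample(data, rng)) for _ in range(30)]
prediction = bagged_predict(models, 10.0)
print(prediction)
```

Because each model sees a different resample, the individual fits differ slightly; averaging them reduces the variance of the combined prediction.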
A separate neural network was built for each model (SNN and BANN). Every ANN contains three layers of neurons: an input layer with eight neurons, a hidden layer whose number of neurons was adjusted during training, and an output layer with one unit that generates the predicted value of global solar radiation. The number of hidden neurons ranged from three to twenty-five in this study. The logistic sigmoid (logsig), the hyperbolic tangent (tanh), the sine function, and the exponential activation function were applied in the hidden layer. The pure-linear (purelin) activation function was utilised in the output layer. All neural networks were trained with the BFGS quasi-Newton algorithm (trainbfg).
Fig. 4 shows the global solar radiation as a function of temperature for the total database. It is clear that global solar radiation increases with increasing temperature at some points.
Table 2 shows the coefficient of correlation (R) and the root mean squared error (RMSE) obtained for predicting hourly global solar radiation under different divisions of the database for the SNN model. The results show that Section 1 was the optimal division, as it gave better results than the other divisions for the testing phase. The INN were then developed using Section 1 of the data set.
Table 3 shows the architecture of the INN and SNN models. As may be seen, the INN and SNN are not uniform and have diverse structures. Fifteen INNs used the logistic sigmoid activation function, and thirteen used the hyperbolic tangent (tanh) activation function, the same activation that gives the SNN model its efficiency, dependability, and robustness. The exponential activation function was utilised in the hidden layer of two individual neural networks; however, the sine function was not employed in the hidden layer of any neural network.
Thus, we determined the superiority of the tanh and sigmoid activation functions over the sine and exponential functions, and these findings corroborate the findings of Kiseľák et al. 45 and Ammi et al. 40 According to this analytical discussion, two types of neural network models were developed (SNN and BANN, a stacking of 30 networks) with the goal of predicting hourly global solar radiation on a horizontal plane. [...] for validation data, and 0.9680 for test data). In both the SNN and BANN models, the slope is near 1 during the validation phase, and it is extremely near 1 during the testing phase. The intercept b is far from 0 for the validation and testing phases in the two models (SNN and BANN). In general, the correlation coefficients are considered excellent when 0.9000 ≤ R ≤ 1.0000 for these models (SNN and BANN); this demonstrates the robustness of the developed neural network models as well as their ability to predict hourly global solar radiation.
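The agreement criteria discussed above (RMSE, correlation coefficient R, slope, and intercept of the linear fit between measured and predicted values) can be computed as in the following Python sketch. The measured and predicted values are hypothetical, and regressing the predicted values on the measured ones is an assumption about the direction of the fit.

```python
import math


def rmse(measured, predicted):
    """Root mean squared error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n)


def linear_agreement(measured, predicted):
    """Return (R, slope a, intercept b) for the fit predicted = a*measured + b."""
    n = len(measured)
    mx = sum(measured) / n
    my = sum(predicted) / n
    sxx = sum((x - mx) ** 2 for x in measured)
    syy = sum((y - my) ** 2 for y in predicted)
    sxy = sum((x - mx) * (y - my) for x, y in zip(measured, predicted))
    r = sxy / math.sqrt(sxx * syy)
    a = sxy / sxx
    b = my - a * mx
    return r, a, b


# Hypothetical values in W m^-2.
measured = [200.0, 400.0, 600.0, 800.0]
predicted = [210.0, 390.0, 610.0, 790.0]

error = rmse(measured, predicted)
r, slope, intercept = linear_agreement(measured, predicted)
print(error, r, slope, intercept)
```

A slope close to 1 together with an intercept close to 0 indicates that the predictions follow the measurements without systematic scaling or offset.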

Comparison with other models
In order to evaluate the importance of the obtained results, they were compared with similar studies developed by other researchers, especially with models that had the same inputs as those used here. In all of these models, the goal was to predict solar radiation. The results obtained from these models and the results of this work are presented in Table 5. These results confirm the strength and accuracy of the BANN model for global solar radiation.
The BANN model can be utilised to predict solar radiation for locations that have no measurement equipment (solarimeters/pyranometers) and relevant systems, when the actual data set available is small, or when a group of outliers must be excluded or data are missing. It can also support the installation of solar-energy systems and the assessment of thermal conditions in building studies in Algeria.