
https://doi.org/10.15255/KUI.2016.004
Published: Kem. Ind. 66 (1-2) (2017) 59–68
Paper reference number: KUI-04/2016
Paper type: Professional paper

Simulation of Simple Linear Regression

S. Džalto and I. Gusić

Abstract

The purpose of this paper is a computer simulation of the conditions relevant for a simple linear regression model and a computer confirmation of its basic equations. To that end, a simple linear regression model is described and the mathematical foundations of the model are discussed. Listed are the equations for the objective function (Equation (1.3)), regression line parameters (Equation (1.4)), estimation of regression line parameters (Equation (3.2)), and confidence intervals (Equations (3.8) and (3.9)). Estimation of variance (Equation (3.4)) is based on Equations (3.6) and (3.11), while Equation (3.11) is based on Equation (3.12). The conditions of simple linear regression were simulated in Matlab. The model parameters were selected with Equations (4.1) and (4.2), and 10 000 series of 7 data points were generated as a simulation of 10 000 experiments performed under the same conditions in engineering practice. Each series represented a measurement of a dependent variable at seven fixed values of the independent variable, in circumstances in which the assumptions of the linear regression model were satisfied. For a randomly chosen series of 7 data points, the parameter estimates can deviate significantly from the true parameter values (Table 1), indicating that a relatively small number of measurements in practice can lead to unreliable estimates. An estimate can deviate from the true value even if the number of measurements is relatively large (Table 2). On the other hand, it is shown that the arithmetic mean of the 10 000 calculated parameters is almost identical to the true parameter values. In other words, it is confirmed that the estimates from consecutive measurements under the same conditions are, on average, correct. The simulation of 10 000 series also confirmed the other equations mentioned: the distribution from Equation (3.12) (Table 3 and Fig. 2), the t-distribution from Equations (3.5.1) and (3.5.2) (Table 4 and Fig. 3), and the confidence intervals for regression line parameters from Equations (3.8) and (3.9).
The computer simulation can aid understanding of the simple linear regression model and can successfully stand in for formal proofs of the mathematical facts on which linear regression is based.
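
The kind of experiment described in the abstract can be sketched in a few lines. The sketch below is only an illustration in Python, not the authors' Matlab code. The true intercept, slope, noise standard deviation and the seven x values are assumptions chosen for the example, not the values from Equations (4.1) and (4.2), and the function names `fit_line` and `simulate` are hypothetical. It generates 10 000 series of 7 points, fits each by least squares, and reports the mean estimates together with the fraction of 95 % confidence intervals for the slope that contain the true slope.

```python
import math
import random


def fit_line(x, y):
    """Least-squares fit y = a + b*x; returns a, b, variance estimate s2, and Sxx."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, sse / (n - 2), sxx  # s2 uses n - 2 degrees of freedom


def simulate(n_series=10_000, seed=0):
    rng = random.Random(seed)
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]   # assumed fixed x values
    a_true, b_true, sigma = 2.0, 0.5, 1.0     # assumed true parameters
    t_crit = 2.571  # 0.975 quantile of Student's t with 7 - 2 = 5 degrees of freedom

    sum_a = sum_b = sum_s2 = 0.0
    covered = 0
    for _ in range(n_series):
        # One simulated experiment: normal errors with constant variance.
        y = [a_true + b_true * xi + rng.gauss(0.0, sigma) for xi in x]
        a, b, s2, sxx = fit_line(x, y)
        sum_a += a
        sum_b += b
        sum_s2 += s2
        # 95 % confidence interval for the slope: b +/- t_crit * sqrt(s2 / Sxx).
        if abs(b - b_true) <= t_crit * math.sqrt(s2 / sxx):
            covered += 1

    return {
        "mean_a": sum_a / n_series,
        "mean_b": sum_b / n_series,
        "mean_s2": sum_s2 / n_series,
        "slope_ci_coverage": covered / n_series,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions, the mean estimates should lie close to the assumed true values and the coverage close to 0.95, while any single series of 7 points may give estimates that are noticeably off.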


This work is licensed under a Creative Commons Attribution 4.0 International License

Keywords

simple linear regression, normal distribution, chi-squared distribution, Student's distribution, confidence interval, Matlab