ABSTRACT

This chapter introduces the fundamental ideas behind regression, a data exploration methodology. Regression is the process of estimating a relationship between variables. An event can be represented by statistical data in which multiple variables together explain its sub-events. When two variables are linearly related, one can be estimated from the other by linear regression. Linear regression is a regression model in which the dependent variable is continuous. Logistic regression, also called logit regression or the logit model, is a regression model in which the dependent variable is categorical. Curve fitting is often a hard problem: in regression, the goal is to calculate the best-fitting curve for the statistical data. Outliers are points that deviate from the trend followed by the other points in the dataset. The cost function of linear regression is the squared difference between the real value and the predicted value, summed over all data points.
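As a minimal illustration of the ideas summarized above, the following Python sketch fits a straight line by ordinary least squares and evaluates the squared-error cost. The function names (`fit_line`, `squared_error_cost`) and the sample data are hypothetical and chosen only for this example; the sketch assumes a single explanatory variable and is not the chapter's own implementation.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x-y cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def squared_error_cost(xs, ys, slope, intercept):
    """Sum of squared differences between real and predicted values."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data lying roughly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(xs, ys)
cost = squared_error_cost(xs, ys, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, cost={cost:.4f}")

# The same data with the last point replaced by an outlier:
# a single extreme value pulls the fitted line away from the trend.
ys_outlier = [1.1, 2.9, 5.2, 6.8, 30.0]
slope_out, intercept_out = fit_line(xs, ys_outlier)
print(f"with outlier: slope={slope_out:.3f}, intercept={intercept_out:.3f}")
```

The closed-form solution is used here only because it is short; any other minimizer of the same cost would give the same line for this data.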