Linear Regression

As an ML practitioner, the first thing to do is frame the problem you are going to work on. This helps you decide which algorithm to select, which performance measure to use to evaluate your model, and how much effort to spend tweaking it.

The next question to ask is what the current solution looks like. It will often give you a reference performance and insights into how to solve the problem.

Your next step is to select a performance measure. A suitable one for our case (linear regression) is the Root Mean Square Error (RMSE), as it gives an idea of how much error the system typically makes in its predictions, with a higher weight for large errors.

The root mean square error (RMSE) has been used as a standard statistical measure of model performance in several natural sciences. It indicates the standard deviation of the residuals, that is, how far the data points are from the regression (modelled) line. The following figure shows the residuals as green arrows and their location between the data points and the regression line.


RMSE is generally preferred for regression problems, but in some contexts another measure is used. For example, when the data contains a lot of outliers, the Mean Absolute Error (MAE) is used instead, because it does not give large errors extra weight.

Both RMSE and MAE are ways to measure the distance between two vectors: the vector of predictions and the vector of target values.
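To make the difference between the two measures concrete, here is a minimal sketch in plain Python. The function names rmse and mae and all the numbers are made up for illustration; they are not taken from any particular library or dataset.

```python
import math

def rmse(predictions, targets):
    """Root mean square error: square root of the mean squared residual."""
    n = len(targets)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n)

def mae(predictions, targets):
    """Mean absolute error: mean of the absolute residuals."""
    n = len(targets)
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / n

# Made-up values, for illustration only.
targets = [3.0, 5.0, 7.0, 9.0]
close_predictions = [4.0, 6.0, 8.0, 10.0]    # every prediction is off by 1
outlier_predictions = [3.0, 5.0, 7.0, 13.0]  # one prediction is off by 4

rmse_close = rmse(close_predictions, targets)
mae_close = mae(close_predictions, targets)
rmse_outlier = rmse(outlier_predictions, targets)
mae_outlier = mae(outlier_predictions, targets)

print("four small errors:", rmse_close, mae_close)
print("one large error:  ", rmse_outlier, mae_outlier)
```

In this toy example both sets of predictions have the same MAE (1.0), but the set with the single large error has twice the RMSE (2.0 instead of 1.0). That is the "higher weight for large errors" in action, and it is also why MAE can be the more robust choice when outliers are common.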

Linear regression is a process for finding the best-fit line for the data. First we have training data, then we fit a line to it; during the learning process the model tries to learn the parameters of the line that best fits the training data.

y = B0 + B1*x is the equation of a line, where B0 is the intercept and B1 is the slope. (B0, B1) are the parameters that the model learns from the data.
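As a rough sketch of how B0 and B1 can be learned, the snippet below uses the closed-form ordinary least squares solution for a single input variable. The helper name fit_line and the data points are hypothetical, and this is only one way to fit the line (gradient descent is another).

```python
def fit_line(xs, ys):
    """Return (b0, b1) minimizing the sum of squared errors of y = b0 + b1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: the least squares line always passes through (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Made-up training data that lies roughly on the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

b0, b1 = fit_line(xs, ys)
print("B0 =", round(b0, 2), "B1 =", round(b1, 2))

# Predict the target for a new input using the learned parameters.
prediction = b0 + b1 * 6.0
print("prediction for x = 6:", round(prediction, 2))
```

For this toy data the fit comes out at roughly B0 = 1.09 and B1 = 1.97, close to the line used to make up the points.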

When a parameter becomes zero, it effectively removes the influence of the corresponding input variable on the model, and therefore on the predictions made by the model (0 * x = 0). This becomes relevant if you look at regularization methods, which change the learning algorithm to reduce the complexity of regression models by putting pressure on the absolute size of the parameters, driving some of them to zero.


Regularization seeks both to minimize the sum of the squared errors of the model on the training data and to reduce the complexity of the model (for example, the number of non-zero coefficients or their overall size).

Three popular examples of regularization procedures for linear regression are:

  • Lasso Regression: where Ordinary Least Squares is modified to also minimize the sum of the absolute values of the coefficients (called L1 regularization).
  • Ridge Regression: where Ordinary Least Squares is modified to also minimize the sum of the squared coefficients (called L2 regularization).
  • Elastic Net Regularization: Elastic Net first emerged as a result of critique of lasso, whose variable selection can be too dependent on the data and thus unstable. The solution is to combine the penalties of ridge regression and lasso to get the best of both worlds.
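To see how these penalties act on a coefficient, here is a small sketch for the simplest possible case: one input variable, no intercept, and centered made-up data. The function names, the penalty strengths, and the exact scaling of the objectives (written in the docstrings) are assumptions chosen so that the solutions have simple closed forms; libraries such as scikit-learn scale their objectives differently.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, returning exactly 0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def ols_coef(sxy, sxx):
    """Minimizes 0.5 * sum((y - b*x)^2)."""
    return sxy / sxx

def ridge_coef(sxy, sxx, l2):
    """Minimizes 0.5 * sum((y - b*x)^2) + 0.5 * l2 * b^2."""
    return sxy / (sxx + l2)

def lasso_coef(sxy, sxx, l1):
    """Minimizes 0.5 * sum((y - b*x)^2) + l1 * |b|."""
    return soft_threshold(sxy, l1) / sxx

def elastic_net_coef(sxy, sxx, l1, l2):
    """Minimizes 0.5 * sum((y - b*x)^2) + l1 * |b| + 0.5 * l2 * b^2."""
    return soft_threshold(sxy, l1) / (sxx + l2)

# Made-up, already centered data that lies roughly on y = 2x.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-4.2, -1.9, 0.1, 2.1, 3.9]
sxy = sum(x * y for x, y in zip(xs, ys))
sxx = sum(x * x for x in xs)

b_ols = ols_coef(sxy, sxx)
b_ridge = ridge_coef(sxy, sxx, l2=10.0)
b_lasso = lasso_coef(sxy, sxx, l1=5.0)
b_lasso_strong = lasso_coef(sxy, sxx, l1=25.0)
b_enet = elastic_net_coef(sxy, sxx, l1=5.0, l2=10.0)

print("OLS:", b_ols)
print("Ridge:", b_ridge)
print("Lasso:", b_lasso, "with a stronger penalty:", b_lasso_strong)
print("Elastic Net:", b_enet)
```

In this sketch ridge shrinks the coefficient but never makes it exactly zero, while lasso with a strong enough penalty sets it to exactly zero, which is what removes a variable from the model. Elastic Net applies both effects at once.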




