Implementation
This section demonstrates how to fit a regression model in Python in practice. The two most common packages for fitting regression models in Python are `scikit-learn` and `statsmodels`. Both approaches are shown below.
First, let’s import the data and necessary packages. We’ll again be using the Boston housing dataset from `sklearn.datasets`.
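A minimal sketch of this setup step follows. Note that `load_boston` was removed from scikit-learn in version 1.2, so the sketch assumes the bundled diabetes dataset (`load_diabetes`) as a stand-in; the variable names `X` and `y` are illustrative rather than taken from the original.

```python
from sklearn import datasets

# Assumption: load_diabetes stands in for the Boston housing data,
# which recent scikit-learn releases no longer ship.
data = datasets.load_diabetes()
X = data["data"]    # predictor matrix, shape (n_samples, n_features)
y = data["target"]  # target vector, shape (n_samples,)

print(X.shape, y.shape)
```

On an older scikit-learn release that still includes `load_boston`, swapping the loader should be the only change needed.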
Scikit-Learn
Fitting the model in `scikit-learn` is very similar to how we fit our model from scratch in the previous section. The model is fit in two steps: first instantiate the model, then use the `fit()` method to train it.
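The two steps might look like the sketch below. It assumes the diabetes dataset as a stand-in for the Boston data, and the name `linear_model` is illustrative.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

# Assumption: diabetes data stands in for the Boston housing data.
X, y = load_diabetes(return_X_y=True)

linear_model = LinearRegression()  # step 1: instantiate the model
linear_model.fit(X, y)             # step 2: train it on the data
```

`LinearRegression` fits an intercept by default, so no column of ones needs to be added to `X`.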
As before, we can plot our fitted values against the true values. To form predictions with the `scikit-learn` model, we can use the `predict` method. Reassuringly, we get the same plot as before.
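A rough sketch of forming predictions and plotting them against the true values is given here. It assumes the diabetes dataset as a stand-in and uses matplotlib for the scatter plot, which may differ from the plotting library used originally, so the resulting figure will not match the one described above.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

# Assumption: diabetes data stands in for the Boston housing data.
X, y = load_diabetes(return_X_y=True)
linear_model = LinearRegression().fit(X, y)

# predict() returns one fitted value per row of X
y_hat = linear_model.predict(X)

# Scatter the fitted values against the true values
fig, ax = plt.subplots()
ax.scatter(y, y_hat, s=10)
ax.set(xlabel="y", ylabel="y_hat", title="Fitted vs. true values")
```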

We can also check the estimated parameters using the `coef_` attribute as follows (note that only the first few are printed).
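For instance, under the same stand-in assumption (diabetes data, illustrative variable names), the coefficients could be inspected like this. Note that `coef_` holds only the slopes; the intercept is stored separately in `intercept_`.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

# Assumption: diabetes data stands in for the Boston housing data.
X, y = load_diabetes(return_X_y=True)
linear_model = LinearRegression().fit(X, y)

first_few = linear_model.coef_[:5]   # one slope per predictor column
intercept = linear_model.intercept_  # stored separately from coef_

print(first_few)
print(intercept)
```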
Statsmodels
`statsmodels` is another package frequently used for running linear regression in Python. There are two ways to run regression in `statsmodels`. The first uses `numpy` arrays like we did in the previous section. An example is given below.
Note
Note two subtle differences between this model and the models we’ve previously built. First, we have to manually add a constant to the predictor array in order to give our model an intercept term. Second, we supply the training data when instantiating the model, rather than when fitting it.
The second way to run regression in `statsmodels` is with `R`-style formulas and `pandas` dataframes. This allows us to identify predictors and target variables by name. An example is given below.
| | CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LSTAT | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.00632 | 18.0 | 2.31 | 0.0 | 0.538 | 6.575 | 65.2 | 4.0900 | 1.0 | 296.0 | 15.3 | 396.90 | 4.98 | 24.0 |
| 1 | 0.02731 | 0.0 | 7.07 | 0.0 | 0.469 | 6.421 | 78.9 | 4.9671 | 2.0 | 242.0 | 17.8 | 396.90 | 9.14 | 21.6 |
| 2 | 0.02729 | 0.0 | 7.07 | 0.0 | 0.469 | 7.185 | 61.1 | 4.9671 | 2.0 | 242.0 | 17.8 | 392.83 | 4.03 | 34.7 |
| 3 | 0.03237 | 0.0 | 2.18 | 0.0 | 0.458 | 6.998 | 45.8 | 6.0622 | 3.0 | 222.0 | 18.7 | 394.63 | 2.94 | 33.4 |
| 4 | 0.06905 | 0.0 | 2.18 | 0.0 | 0.458 | 7.147 | 54.2 | 6.0622 | 3.0 | 222.0 | 18.7 | 396.90 | 5.33 | 36.2 |
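The formula-based approach might look like the sketch below. It assumes the diabetes dataset as a stand-in for the Boston dataframe, so the column names differ from those in the table; the names `df`, `formula`, and `results` are illustrative.

```python
import statsmodels.formula.api as smf
from sklearn.datasets import load_diabetes

# Assumption: diabetes data stands in for the Boston housing data.
# .frame is a pandas DataFrame holding the predictors plus a "target" column.
df = load_diabetes(as_frame=True).frame

# Build an R-style formula: target ~ age + sex + bmi + ...
predictors = " + ".join(df.columns.drop("target"))
formula = f"target ~ {predictors}"

# The formula interface adds an intercept automatically
results = smf.ols(formula, data=df).fit()
print(results.params.head())
```

Because the formula interface includes an intercept by default, no call to `add_constant` is needed here.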