Linear Model in Python

Basic Workflow:

  1. Check the distribution of each variable against the model assumptions
  2. Check for outliers and decide whether a transformation (e.g. log) is needed
  3. Fit a linear regression model
  4. Assess the model fit (R squared)
  5. Predict (undo the transformation on the result if one was applied)
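
Steps 1 and 2 can be sketched roughly as follows. This is only an illustration on synthetic data, not a prescribed method: the helper names `skewness` and `iqr_outliers` are hypothetical, and the 1.5 x IQR rule is one common convention for flagging outliers among several.

```python
import numpy as np

def skewness(x):
    """Sample skewness: mean of cubed z-scores (about 0 for symmetric data)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))

def iqr_outliers(x, k=1.5):
    """Boolean mask of points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)

# Synthetic right-skewed data (log-normal), standing in for a real feature
rng = np.random.default_rng(0)
raw = rng.lognormal(mean=2.0, sigma=1.0, size=1000)
logged = np.log(raw)

# Compare skewness and outlier counts before and after the log transform
print("skew raw:", round(skewness(raw), 2),
      "outliers:", int(iqr_outliers(raw).sum()))
print("skew log:", round(skewness(logged), 2),
      "outliers:", int(iqr_outliers(logged).sum()))
```

If the log transform brings the skewness close to zero and removes most of the flagged points, that is a hint the variable is better modelled on the log scale.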
# useful packages
import pandas as pd
import json
import statistics
import urllib
import numpy as np
import scipy
import matplotlib.pyplot as plt
import statsmodels.api as sm  # provides sm.OLS and sm.add_constant used below
# assume features are log-transformed
X = data[["feature1", "feature2"]]  # double brackets select several columns
X = sm.add_constant(X)
y = data["outcome"]

model = sm.OLS(y, X)
result = model.fit()

print("R Squared is:", round(result.rsquared,3))

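# Xpred needs the same columns as X: const, feature1, feature2.
# For illustration, both features take the same raw values 8, 9, 10.
raw = np.arange(8, 11)
Xpred = np.column_stack([np.log(raw), np.log(raw)])
Xpred = sm.add_constant(Xpred, has_constant="add")
print("Predicted result for 10 is:",
  round(np.exp(result.predict(Xpred))[2], 3))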
