Multiple Linear Regression

Learn how prediction works when we have more than one input: hours studied, sleep, and practice for exam marks, rooms and area for a house price, and more.

1. The Simple Idea

In simple linear regression, we use one input to predict one output.

y = a + bx

But real life usually depends on many inputs. For example, marks may depend on:

Hours studied
Sleep
Practice sessions
Previous score

2. Multiple Linear Regression Formula

y = a + b₁x₁ + b₂x₂ + b₃x₃ + ... + bₙxₙ

Here:

y: predicted output
a: intercept, or base value
x₁, x₂, ..., xₙ: input features
b₁, b₂, ..., bₙ: weights, or coefficients

3. Example

Suppose we want to predict marks using hours studied, sleep, and practice.

Marks = 10 + 5 × Hours + 2 × Sleep + 3 × Practice

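Meaning:

Base value (intercept): 10 marks
Each extra hour studied adds 5 marks
Each extra hour of sleep adds 2 marks
Each extra practice session adds 3 marks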
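To make the formula concrete, here is a minimal sketch that evaluates it for one hypothetical student (5 hours studied, 7 hours of sleep, 2 practice sessions). These values are made up for illustration, and the names `intercept`, `weights`, and `student` are assumptions of this sketch.

```python
import numpy as np

# Coefficients taken from the example formula:
# Marks = 10 + 5 × Hours + 2 × Sleep + 3 × Practice
intercept = 10.0
weights = np.array([5.0, 2.0, 3.0])  # Hours, Sleep, Practice

# Hypothetical student (illustrative values only)
student = np.array([5.0, 7.0, 2.0])

# Weighted sum of the inputs plus the base value
marks = intercept + weights @ student

print("Predicted marks:", marks)  # 10 + 25 + 14 + 6 = 55.0
```

Writing the weights as a vector means the same line of code works for any number of inputs.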

4. Dataset

Hours  Sleep  Practice  Marks
2      6      1         50
4      7      2         65
6      8      3         80
8      7      4         88
10     8      5         98
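As a quick check, the example formula from the previous section can be applied to every row of this table. The sketch below does that with NumPy and compares the result with the actual marks. The variable names are illustrative, and the formula's coefficients were chosen for explanation rather than fitted to this data, so a gap is expected.

```python
import numpy as np

# Dataset columns: Hours, Sleep, Practice
X = np.array([
    [2, 6, 1],
    [4, 7, 2],
    [6, 8, 3],
    [8, 7, 4],
    [10, 8, 5]
], dtype=float)
y = np.array([50, 65, 80, 88, 98], dtype=float)

# Example formula: Marks = 10 + 5 × Hours + 2 × Sleep + 3 × Practice
intercept = 10.0
weights = np.array([5.0, 2.0, 3.0])

formula_marks = intercept + X @ weights
errors = y - formula_marks

for actual, guess, err in zip(y, formula_marks, errors):
    print(f"actual={actual:5.1f}  formula={guess:5.1f}  error={err:5.1f}")

sse = np.sum(errors ** 2)
print("Sum of squared errors:", sse)
```

The formula predicts 35, 50, 65, 76, and 91, which is below the actual marks in every row (sum of squared errors 868). Fitting the coefficients from the data, as the following sections do, closes most of that gap.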

5. Python Implementation from Scratch

import numpy as np

# Inputs: Hours, Sleep, Practice
X = np.array([
    [2, 6, 1],
    [4, 7, 2],
    [6, 8, 3],
    [8, 7, 4],
    [10, 8, 5]
], dtype=float)

# Output: Marks
y = np.array([50, 65, 80, 88, 98], dtype=float)

# Add intercept column of 1s
X_b = np.c_[np.ones((X.shape[0], 1)), X]

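# Normal Equation:
# beta = (X.T X)^-1 X.T y
# In this dataset Practice is exactly Hours / 2, so X.T X is singular
# and np.linalg.inv cannot be relied on. The pseudo-inverse gives the
# least-squares solution with the smallest coefficients instead.
beta = np.linalg.pinv(X_b) @ y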

print("Intercept:", beta[0])
print("Coefficients:", beta[1:])

# Predict marks for:
# Hours = 7, Sleep = 8, Practice = 4
# The leading 1 multiplies the intercept beta[0]
new_student = np.array([1, 7, 8, 4], dtype=float)
prediction = new_student @ beta

print("Predicted marks:", prediction)
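The normal equation comes from setting the gradient of the squared error to zero. In practice this means the errors of a least-squares fit are orthogonal to every column of X_b. The sketch below checks that property numerically. It obtains the coefficients with `np.linalg.lstsq` so the check does not depend on how beta was computed; it is a verification aid under that assumption, not part of the lesson's own code.

```python
import numpy as np

X = np.array([
    [2, 6, 1],
    [4, 7, 2],
    [6, 8, 3],
    [8, 7, 4],
    [10, 8, 5]
], dtype=float)
y = np.array([50, 65, 80, 88, 98], dtype=float)

# Add intercept column of 1s
X_b = np.c_[np.ones((X.shape[0], 1)), X]

# Least-squares coefficients
beta, *_ = np.linalg.lstsq(X_b, y, rcond=None)

# Errors left over after the fit
errors = y - X_b @ beta

# Normal equations rearranged: X_b.T @ (y - X_b @ beta) = 0
gradient_check = X_b.T @ errors
satisfied = np.allclose(gradient_check, 0.0, atol=1e-8)

print("X_b.T @ errors:", gradient_check)
print("Normal equations satisfied:", satisfied)
```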

6. Using NumPy Least Squares

import numpy as np

X = np.array([
    [2, 6, 1],
    [4, 7, 2],
    [6, 8, 3],
    [8, 7, 4],
    [10, 8, 5]
], dtype=float)

y = np.array([50, 65, 80, 88, 98], dtype=float)

X_b = np.c_[np.ones((X.shape[0], 1)), X]

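# lstsq minimizes the squared error directly and still works when the
# columns of X_b are linearly dependent (rank < number of columns).
# In that case the returned residuals array is empty.
beta, residuals, rank, s = np.linalg.lstsq(X_b, y, rcond=None)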

print("Intercept:", beta[0])
print("Coefficients:", beta[1:])

prediction = np.array([1, 7, 8, 4]) @ beta
print("Predicted marks:", prediction)
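The `lstsq` call above also returns the rank of X_b and its singular values, which are stored in `rank` and `s` but never printed. The sketch below inspects them and cross-checks the coefficients against the pseudo-inverse. `np.linalg.pinv` is used here only as a comparison, and the variable `beta_pinv` is an assumption of this sketch.

```python
import numpy as np

X = np.array([
    [2, 6, 1],
    [4, 7, 2],
    [6, 8, 3],
    [8, 7, 4],
    [10, 8, 5]
], dtype=float)
y = np.array([50, 65, 80, 88, 98], dtype=float)
X_b = np.c_[np.ones((X.shape[0], 1)), X]

beta, residuals, rank, s = np.linalg.lstsq(X_b, y, rcond=None)

# A rank below the number of columns means some column repeats
# information that other columns already carry
print("Columns in X_b:", X_b.shape[1])
print("Rank of X_b:", rank)
print("Singular values:", s)

# Cross-check: the pseudo-inverse gives the same minimum-norm solution
beta_pinv = np.linalg.pinv(X_b) @ y
methods_agree = np.allclose(beta, beta_pinv)
print("Same coefficients as pinv:", methods_agree)
```

For this dataset the rank is 3 although X_b has 4 columns, because the Practice column is exactly half of the Hours column.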

7. Using sklearn

import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([
    [2, 6, 1],
    [4, 7, 2],
    [6, 8, 3],
    [8, 7, 4],
    [10, 8, 5]
], dtype=float)

y = np.array([50, 65, 80, 88, 98], dtype=float)

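# LinearRegression fits the intercept itself (fit_intercept=True by
# default), so no column of 1s is needed here
model = LinearRegression()
model.fit(X, y)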

print("Intercept:", model.intercept_)
print("Coefficients:", model.coef_)

new_student = np.array([[7, 8, 4]])
prediction = model.predict(new_student)

print("Predicted marks:", prediction[0])

8. Charts with Matplotlib

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

X = np.array([
    [2, 6, 1],
    [4, 7, 2],
    [6, 8, 3],
    [8, 7, 4],
    [10, 8, 5]
], dtype=float)

y = np.array([50, 65, 80, 88, 98], dtype=float)

model = LinearRegression()
model.fit(X, y)

predicted = model.predict(X)

plt.scatter(y, predicted)
plt.xlabel("Actual Marks")
plt.ylabel("Predicted Marks")
plt.title("Actual vs Predicted Marks")
plt.grid(True)
plt.show()

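# Number students from 1 so the x-axis matches its "Student Number" label
students = np.arange(1, len(y) + 1)
plt.plot(students, y, label="Actual Marks", marker="o")
plt.plot(students, predicted, label="Predicted Marks", marker="o")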
plt.xlabel("Student Number")
plt.ylabel("Marks")
plt.title("Actual and Predicted Marks")
plt.legend()
plt.grid(True)
plt.show()

9. Important Concepts

Feature

An input column used for prediction.

Coefficient

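How much the predicted output changes when that feature increases by one unit and the other features stay the same. A large coefficient does not by itself mean the feature is important, because its size depends on the feature's units.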

Intercept

The base prediction when all inputs are zero.

Error

The difference between actual and predicted output.
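These concepts can be seen together in a few lines. The sketch below fits the lesson's dataset, computes the error for each student, and summarizes the errors with two common measures: mean squared error and R². The choice of these two measures and the variable names are assumptions of this sketch.

```python
import numpy as np

# Features: Hours, Sleep, Practice
X = np.array([
    [2, 6, 1],
    [4, 7, 2],
    [6, 8, 3],
    [8, 7, 4],
    [10, 8, 5]
], dtype=float)
y = np.array([50, 65, 80, 88, 98], dtype=float)

# Intercept column of 1s plus the features
X_b = np.c_[np.ones((X.shape[0], 1)), X]
beta, *_ = np.linalg.lstsq(X_b, y, rcond=None)

predicted = X_b @ beta

# Error: actual output minus predicted output, one value per student
errors = y - predicted

# Mean squared error: average of the squared errors
mse = np.mean(errors ** 2)

# R²: share of the variation in y that the model explains
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print("Errors:", np.round(errors, 3))
print("Mean squared error:", round(mse, 3))
print("R squared:", round(r2, 4))
```

An R² close to 1 means the predictions track the actual marks closely. With only five students, a score this high says little about how the model would do on new students.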

10. Warning: More Inputs Are Not Always Better

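Adding more inputs can improve prediction, but it is not guaranteed to. Irrelevant inputs let the model fit noise (overfitting), and repeated or highly correlated inputs (multicollinearity) make the coefficients unstable and hard to interpret.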
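The dataset in this lesson contains a repeated input: Practice is exactly half of Hours in every row. The sketch below, which assumes only NumPy, shows two symptoms. The correlation between the two columns is 1, and two different sets of coefficients give identical predictions on the training rows, so the data cannot tell them apart. The `shift` vector is constructed for illustration.

```python
import numpy as np

X = np.array([
    [2, 6, 1],
    [4, 7, 2],
    [6, 8, 3],
    [8, 7, 4],
    [10, 8, 5]
], dtype=float)
y = np.array([50, 65, 80, 88, 98], dtype=float)

hours, practice = X[:, 0], X[:, 2]

# Symptom 1: the two columns are perfectly correlated
corr = np.corrcoef(hours, practice)[0, 1]
print("Correlation between Hours and Practice:", corr)

X_b = np.c_[np.ones((X.shape[0], 1)), X]
rank = np.linalg.matrix_rank(X_b)
print("Rank of X_b:", rank, "of", X_b.shape[1], "columns")

# Symptom 2: coefficients are not unique.
# Because Hours = 2 × Practice, adding 1 to the Hours weight and
# subtracting 2 from the Practice weight changes no training prediction.
beta, *_ = np.linalg.lstsq(X_b, y, rcond=None)
shift = np.array([0.0, 1.0, 0.0, -2.0])  # intercept, Hours, Sleep, Practice
beta_alt = beta + 3.0 * shift

same_predictions = np.allclose(X_b @ beta, X_b @ beta_alt)
print("Coefficients A:", np.round(beta, 3))
print("Coefficients B:", np.round(beta_alt, 3))
print("Same training predictions:", same_predictions)
```

Since both coefficient sets fit the data equally well, the individual weights for Hours and Practice should not be read as the separate effect of each input. Dropping one of the two columns removes the ambiguity.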

11. Closing Summary

Multiple Linear Regression is the natural upgrade of simple linear regression. Instead of using one input, we use many inputs to predict one output.

One input gives us a line. Two inputs give us a plane. More inputs create a hyperplane. But the goal remains the same: find the best-fitting model, the one with the smallest total squared error.