Logistic Regression
In many practical scenarios, understanding the probability of an event is more critical than merely categorizing it. Logistic regression is a powerful and widely used technique for estimating such probabilities. These probabilities can be applied in two primary ways:
- Direct Usage: Treat the probability as a quantitative measure. For instance, if a spam filter assigns a probability of 0.932 to an email, it indicates a 93.2% chance that the email is spam.
- Threshold-Based Classification: Convert the probability into a binary outcome (e.g., Spam or Not Spam) by applying a threshold, typically 0.5. This guide focuses on interpreting probabilities directly, leaving binary classification for a separate discussion.
The Sigmoid Function
At the heart of logistic regression lies the sigmoid function, a mathematical construct that maps any real-valued input to a value in the open interval (0, 1), making it ideal for probability estimation.
Sigmoid Function Equation
The sigmoid function is expressed as:

$$f(x) = \frac{1}{1 + e^{-x}}$$

- When $x \to -\infty$, $f(x)$ approaches 0
- When $x \to +\infty$, $f(x)$ approaches 1
- The output is continuous and always falls strictly between 0 and 1.
Sigmoid Function in Practice
The table below illustrates how the sigmoid function converts input values into probabilities:
| Input (x) | Sigmoid Output (f(x)) |
|-----------|-----------------------|
| -3        | 0.047                 |
| -2        | 0.119                 |
| -1        | 0.269                 |
| 0         | 0.500                 |
| 1         | 0.731                 |
| 2         | 0.881                 |
| 3         | 0.953                 |
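To make the mapping concrete, here is a minimal Python sketch that reproduces the table above (the function name `sigmoid` is our own, not from any particular library):

```python
import math

def sigmoid(x: float) -> float:
    """Map any real-valued input to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

# Reproduce the table above.
for x in range(-3, 4):
    print(f"{x:>2} -> {sigmoid(x):.3f}")
```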
How Logistic Regression Maps Inputs to Probabilities
Logistic regression achieves probability prediction by combining a linear model with the sigmoid function. This process involves two main steps.
Step 1: The Linear Model
The first step is computing a linear combination of the input features:

$$z = b + w_1x_1 + w_2x_2 + \dots + w_Nx_N$$

Where:

- $z$: Linear combination of the inputs (also known as the log-odds)
- $b$: Intercept or bias term
- $w_i$: Weight of the $i$-th feature
- $x_i$: Value of the $i$-th feature
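As a sketch, Step 1 is a single weighted sum; the bias, weights, and feature values below are hypothetical and chosen only for illustration:

```python
def linear_combination(bias, weights, features):
    """Compute z = b + w1*x1 + ... + wN*xN (the log-odds)."""
    return bias + sum(w * x for w, x in zip(weights, features))

# Hypothetical bias, weights, and feature values.
z = linear_combination(bias=-1.0, weights=[2.0, -0.5], features=[1.5, 3.0])
print(z)  # -1.0 + 3.0 - 1.5 = 0.5
```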
Step 2: Applying the Sigmoid Function
Next, the linear output $z$ is passed through the sigmoid function to produce a probability:

$$y' = \frac{1}{1 + e^{-z}}$$
Where:
- $y'$: Predicted probability of the positive class
- $z$: Linear output from the previous equation
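Putting the two steps together, a minimal end-to-end sketch (with the same hypothetical numbers as above):

```python
import math

def predict_probability(bias, weights, features):
    """Step 1: linear combination. Step 2: sigmoid squashing."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical model, for illustration only.
p = predict_probability(bias=-1.0, weights=[2.0, -0.5], features=[1.5, 3.0])
print(f"{p:.3f}")  # sigmoid(0.5) ≈ 0.622
```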
Log-Odds and Probability in Logistic Regression
The linear term $z$ in logistic regression corresponds to the log-odds of the positive outcome. The relationship between probability and log-odds is given by:

$$z = \ln\left(\frac{p}{1 - p}\right)$$
Where:
- $p$: Probability of the positive class
Log-odds represent the natural logarithm of the ratio of the probability of success ($p$) to the probability of failure ($1 - p$). This relationship makes the sigmoid the inverse of the logit function used in statistical modeling.
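A quick numerical check of this inverse relationship, written as a minimal Python sketch (the helper names are our own):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """Inverse of the sigmoid: z = ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

p = 0.731
z = log_odds(p)    # ≈ 1.0, matching the table entry for x = 1
print(sigmoid(z))  # recovers ≈ 0.731
```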
Example 1: Detecting Spam Emails
A logistic regression model predicts whether an email is spam based on features such as word frequency, email length, and sender reputation.
Input Features:
- $x_1$: Frequency of the word “free”
- $x_2$: Length of the email
- $x_3$: Reputation score of the sender
Linear Model (with illustrative weights, chosen here for the walkthrough):

$$z = -4 + 3x_1 + 0.005x_2 - 1.0x_3$$

Prediction for Specific Input:

- $x_1 = 0$, $x_2 = 120$, $x_3 = 9$

Substituting gives $z = -4 + 0 + 0.6 - 9 = -12.4$, so $y' = \frac{1}{1 + e^{12.4}} \approx 0.000004$.

Interpretation: The email has a near-zero probability of being spam.
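The same walkthrough as a runnable sketch, using the illustrative weights above (they are invented for this example, not learned from real data):

```python
import math

def predict_spam_probability(x1, x2, x3):
    """Illustrative spam model: z = -4 + 3*x1 + 0.005*x2 - 1.0*x3."""
    z = -4 + 3 * x1 + 0.005 * x2 - 1.0 * x3
    return 1.0 / (1.0 + math.exp(-z))

# x1: "free" frequency, x2: email length, x3: sender reputation.
p = predict_spam_probability(x1=0, x2=120, x3=9)
print(f"{p:.6f}")  # ≈ 0.000004: a near-zero probability of spam
```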
Example 2: Diagnosing Disease Risk
A medical model uses logistic regression to estimate the likelihood of a patient having a disease based on test results.
Input Features:
- $x_1$: Blood pressure
- $x_2$: Cholesterol level
- $x_3$: Age
Linear Model (again with illustrative weights):

$$z = -20 + 0.05x_1 + 0.03x_2 + 0.10x_3$$

Prediction for Specific Input:

- $x_1 = 160$, $x_2 = 250$, $x_3 = 70$

Substituting gives $z = -20 + 8 + 7.5 + 7 = 2.5$, so $y' = \frac{1}{1 + e^{-2.5}} \approx 0.924$.

Interpretation: There is a 92.4% probability that the patient has the disease.
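And the matching sketch for the medical example (same caveat: the coefficients are invented for illustration):

```python
import math

def predict_disease_probability(x1, x2, x3):
    """Illustrative risk model: z = -20 + 0.05*x1 + 0.03*x2 + 0.10*x3."""
    z = -20 + 0.05 * x1 + 0.03 * x2 + 0.10 * x3
    return 1.0 / (1.0 + math.exp(-z))

# x1: blood pressure, x2: cholesterol, x3: age.
p = predict_disease_probability(x1=160, x2=250, x3=70)
print(f"{p:.3f}")  # ≈ 0.924, i.e., a 92.4% estimated risk
```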