Distribution & Least Squares
Stop memorizing dry statistical formulae. Adjust the slope and intercept by hand and watch the error squares expand and collapse in front of you.
Core Concepts
Line of Best Fit
A straight line drawn through the center of a group of data points that best represents the trend on a scatter plot, rather than rigidly connecting dots.
Residuals
The signed vertical difference between a data point's actual y-value and the y-value the regression line predicts at the same x (actual minus predicted). A positive residual means the point lies above the line; a negative residual means it lies below.
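As a minimal sketch of this definition, the snippet below computes the signed residuals for a candidate line. The data points, the line y = 2x + 0.5, and the helper name `residuals` are made up for illustration and are not taken from the interactive tool.

```python
def residuals(xs, ys, slope, intercept):
    """Signed residuals: actual y minus the y predicted by the line."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Hypothetical data and a hypothetical candidate line y = 2x + 0.5
xs = [1, 2, 3, 4]
ys = [3.5, 4.0, 7.5, 9.0]

res = residuals(xs, ys, slope=2, intercept=0.5)
print(res)  # [1.0, -0.5, 1.0, 0.5]
```

The second point sits below the candidate line, so its residual is negative; the other three sit above it.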
Method of Least Squares
The standard approach in regression analysis: choose the line that minimizes the sum of the squared residuals (SSR).
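For a straight line, the minimizing slope and intercept have a well-known closed form: the slope is the sum of the products of the x and y deviations from their means divided by the sum of the squared x deviations, and the intercept makes the line pass through the point of means. The sketch below applies that textbook formula to made-up data; the function names `least_squares_fit` and `ssr` are illustrative, not part of the tool.

```python
def least_squares_fit(xs, ys):
    """Return (slope, intercept) of the line minimizing the SSR."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def ssr(xs, ys, slope, intercept):
    """Sum of squared residuals for a given line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data
xs = [1, 2, 3, 4]
ys = [3.5, 4.0, 7.5, 9.0]

slope, intercept = least_squares_fit(xs, ys)
best = ssr(xs, ys, slope, intercept)
nudged = ssr(xs, ys, slope + 0.1, intercept)
print(slope, intercept, best)  # 2.0 1.0 1.5
print(nudged > best)           # True
```

Nudging the slope away from the fitted value raises the SSR, which is what it means for the fitted line to be the minimizer.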
Why call it 'Least Squares'?
Faced with a scatter plot of experimental data, we intuitively want one line to sum up the trend. Many initially rely on the naked eye, assuming 'the line that passes through the most dots is best', or 'as long as points above and below are balanced, it’s correct'.
Statistics needs a criterion more precise than eyeballing. If we simply added up the residuals, positive and negative errors would cancel each other out, so a badly tilted line could still score a total of zero. To prevent this, mathematicians take the vertical distance (residual) from each point to the line and square it. Geometrically, each data point gets a square whose side length is its residual.
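To see why "balanced above and below" is not enough, here is a small sketch with made-up points that lie exactly on y = x. A flat line through the middle has residuals that sum to zero, yet its squared error is clearly worse than the true line's. All names and values are illustrative.

```python
# Hypothetical points lying exactly on the line y = x
xs = [0, 1, 2]
ys = [0, 1, 2]

def residuals(slope, intercept):
    """Signed residuals (actual minus predicted) for the data above."""
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

flat = residuals(slope=0, intercept=1)   # horizontal line y = 1
exact = residuals(slope=1, intercept=0)  # the line y = x

print(flat, sum(flat), sum(r * r for r in flat))     # [-1, 0, 1] 0 2
print(exact, sum(exact), sum(r * r for r in exact))  # [0, 0, 0] 0 0
```

Both lines have residuals that sum to zero, so the plain sum cannot tell them apart. Only the sum of squares separates the flat line (2) from the exact one (0).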
The 'Least Squares' method is the search for the one line that makes the total area of all these squares as small as possible. Toggle to the second mode, drag the red line around, and watch the error 'monsters' inflate as the line drifts off course, then shrink to their smallest combined area when the line lands on the best fit.
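The dragging experiment can be imitated in a few lines. The sketch below holds the intercept fixed and sweeps the slope across several values, printing the total square area (SSR) each time. The data, the fixed intercept of 1.0, and the slope values are assumptions chosen for illustration; the tool itself may behave differently.

```python
# Hypothetical data
xs = [1, 2, 3, 4]
ys = [3.5, 4.0, 7.5, 9.0]

def total_square_area(slope, intercept):
    """Sum of squared residuals: the combined area of the error squares."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

intercept = 1.0  # held fixed while the slope is "dragged"
slopes = [1.0, 1.5, 2.0, 2.5, 3.0]
areas = [total_square_area(s, intercept) for s in slopes]

for s, a in zip(slopes, areas):
    print(f"slope={s:.1f}  total area={a:.1f}")

best_slope = slopes[areas.index(min(areas))]
print(best_slope)  # 2.0
```

The total area falls as the slope approaches 2.0 and rises again on the other side, which is the bowl shape the squares trace out on screen.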