Minority Report : Special Edition
a small peek is all you need to see the whole picture
I’m not someone who usually follows politics closely, but at election time everyone gets excited about who might win, how many votes they’ll get, and so on. It’s the talk of the town. And if you’ve noticed, from the moment counting starts, news channels keep flashing numbers that shift throughout the day until the final results are declared.
When I was younger, I used to wonder: how are they coming up with these numbers if the votes haven’t all been counted yet? Around the same time, two more phrases keep popping up: exit polls and opinion polls. What are these polls? And why are they reliable enough that news channels stake their reputation on them? These questions once puzzled me, and today we’re going to unpack exactly how this works.
But we’re not starting with voting booths or spreadsheets. We’re starting with a broken bus.
The Marathon Mystery
Consider the following scenario.
There’s a marathon being organized in your country, and you’re running this grand event. Runners from all over the world have arrived to take part. On the day of the event, all participants are being transported to the venue, but one of the buses breaks down en route. Unfortunately, the bus is full of foreign runners who can’t communicate with people nearby.
To resolve this matter, you set off on the road with your team, and luckily you soon spot a bus surrounded by a group of unhappy people. You collect some information about those passengers, including their weights, and you notice that the average weight of the 60 passengers on that bus is nearly 220 pounds (around 100 kg).
Something clicks in your head. There is no way that a random group of marathon runners could average this much. You immediately tell your team to keep searching.
Congratulations. If you can grasp how someone who takes a quick look at the weights of passengers on a bus can infer that they’re probably not on their way to the starting line of a marathon, then you now understand the basic idea of the central limit theorem.
Let’s break down what your brain just did. Most marathon runners weigh around 155 pounds. You looked at a bus with 60 people whose average weight is 220 pounds. Your instinct told you something was off, but why exactly?
Sure, some people who weigh 220 pounds do run marathons. Out of all the runners in the world, there might be hundreds of them scattered around. But here’s the thing: the likelihood that so many heavier runners would randomly end up on the same bus is tiny. It’s not impossible, just improbable enough that you can confidently conclude it’s not the bus you’re looking for.
This is the basic intuition of the Central Limit Theorem. There’s less than a 1 in 100 chance that the average weight of those 60 passengers would be 220 pounds if they were actually marathon runners. Your brain did the math without you realizing it.
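To put a rough number on that hunch, here is a minimal Python sketch. The story never says how much runners’ weights vary, so the 25-pound standard deviation below is purely my assumption, and the function names are made up for illustration. The idea: the average of 60 people varies far less than one person does (by a factor of the square root of 60), so we measure how many of those "standard errors" separate 220 from 155.

```python
import math

def bus_z_score(sample_mean, pop_mean, pop_sd, n):
    """How many standard errors the bus average sits from the population mean."""
    standard_error = pop_sd / math.sqrt(n)  # spread of an average of n people
    return (sample_mean - pop_mean) / standard_error

def upper_tail_probability(z):
    """P(Z >= z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

ASSUMED_SD = 25.0  # ASSUMPTION: spread of individual runners' weights, in pounds

z = bus_z_score(sample_mean=220, pop_mean=155, pop_sd=ASSUMED_SD, n=60)
p = upper_tail_probability(z)
print(f"z = {z:.1f} standard errors, tail probability = {p:.3g}")
```

Under that assumption, the bus average sits roughly 20 standard errors above 155 pounds, so the chance is far smaller than 1 in 100. A different assumed spread changes the exact figure, but not the conclusion.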
In the bus example, we already knew something about the population: the mean weight of a marathon runner is 155 pounds. With that knowledge, we looked at a sample and concluded it was the wrong bus. The inverse also works: we can start from a sample and infer something about the population, and that’s exactly how polling works.
In reality, it’s impossible to measure an entire population. You can’t weigh every marathon runner in the world, and you definitely can’t ask every single voter in a country who they’re planning to vote for. That’s why sufficiently large samples are used, and based on those samples, the characteristics of the parent population are estimated. This is where the CLT comes into the picture.
The central limit theorem tells us that a large, properly drawn sample will not typically deviate sharply from its underlying population. Think about that for a moment: a mere poll of 2,000 appropriately chosen people can tell a great deal about how an entire country of millions is thinking. That seems almost magical, but it’s pure mathematics.
Here’s where it gets really interesting. Imagine you kept taking different groups of people, 60 people here, 60 people there, and calculated the average weight of each group. If you plotted all those averages on a graph, they would form a predictable pattern. Most of the averages would cluster around the true average weight of all marathon runners, with fewer and fewer groups having averages that are much higher or much lower.
And here’s the kicker: this pattern emerges even if the individual runners themselves are all over the place in terms of weight. Some might be 130 pounds, others 180, scattered without any particular pattern. But when you start taking groups and averaging them, order emerges from chaos.
The more groups you sample, the more reliable this pattern becomes. And the bigger each group is, the tighter those averages cluster around the true value. This is why a poll of 2,000 people is more reliable than a poll of 200; the larger sample gives you an average that’s more likely to be close to what the whole population actually thinks.
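If you’d rather see this than take my word for it, here is a small simulation sketch. Everything in it is made up for illustration: a hypothetical population of 100,000 runners whose weights are spread evenly between 130 and 180 pounds (deliberately not bell-shaped), and a helper I’ve called `sample_means`. It draws 2,000 random groups at each of three group sizes and reports where the group averages land and how widely they spread.

```python
import random
import statistics

def sample_means(population, group_size, num_groups, rng):
    """Average weight of each of `num_groups` randomly drawn groups."""
    return [
        statistics.fmean(rng.sample(population, group_size))
        for _ in range(num_groups)
    ]

rng = random.Random(42)  # fixed seed so the run is repeatable

# HYPOTHETICAL population: weights scattered evenly between 130 and 180 lb
population = [rng.uniform(130, 180) for _ in range(100_000)]
true_mean = statistics.fmean(population)
print(f"true average weight: {true_mean:.1f} lb")

spreads = {}
for group_size in (15, 60, 240):
    means = sample_means(population, group_size, 2_000, rng)
    spreads[group_size] = statistics.stdev(means)
    print(f"groups of {group_size:>3}: average of averages = "
          f"{statistics.fmean(means):.1f} lb, spread = {spreads[group_size]:.2f} lb")
```

The group averages should hover near the true average at every size, and each time the group gets four times bigger, the spread should shrink to roughly half. That square-root relationship is why bigger samples help, but with diminishing returns.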
Why Exit Polls Work (And Sometimes Don’t)
So when you see those election numbers flashing on TV before all the votes are counted, this is what’s happening behind the scenes. Pollsters are taking samples, talking to voters as they exit polling stations, or calling random households before the election. They’re calculating the average from these samples (what percentage support each candidate) and using the Central Limit Theorem to estimate what the actual election result will be.
Most of the time, if you’ve chosen your sample well, it will give you an answer that’s reasonably close to the truth. The bigger your sample, the more confident you can be that you’re in the right ballpark. This is why pollsters are careful about sample size: asking 2,000 people gives you much more reliable results than asking 200.
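How much more reliable? A back-of-the-envelope sketch, assuming a simple random sample and the textbook 95% margin-of-error formula for a proportion (real pollsters use weighting and more elaborate designs, so treat this as an approximation; `margin_of_error` is my own helper name). It uses the worst case of a 50/50 split, which gives the widest margin.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a polled proportion.

    Assumes a simple random sample; z = 1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

for n in (200, 2_000):
    print(f"n = {n:>5}: about ±{margin_of_error(n) * 100:.1f} percentage points")
```

With 200 people the margin is about ±6.9 points; with 2,000 it drops to about ±2.2. Notice that ten times the sample only shrinks the margin by about a factor of three, because the error falls with the square root of the sample size.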
But here’s the catch: the math only works if your sample actually represents the population. If it’s biased in some way, all bets are off.
This is also why polls sometimes get it spectacularly wrong. Maybe they’re only calling landlines and missing younger voters who only use cell phones. Maybe certain groups of voters are less likely to respond to polls. Maybe people aren’t being honest about who they’ll vote for. In all these cases, the sample doesn’t actually represent the population. The math still works perfectly, but you’re feeding it skewed data. Garbage in, garbage out, as they say.
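Here is a toy simulation of the landline problem. Every number in it is invented: a hypothetical electorate where 40% of voters are young and back candidate A at 65%, while older voters back A at 40%, so true support is exactly 50%. The `young_reach` parameter (my own name) is the chance that a young voter actually ends up in the sample.

```python
import random

YOUNG_SHARE = 0.40     # HYPOTHETICAL: fraction of voters who are young
SUPPORT_YOUNG = 0.65   # HYPOTHETICAL: young voters backing candidate A
SUPPORT_OLD = 0.40     # HYPOTHETICAL: older voters backing candidate A

def poll(sample_size, young_reach, rng):
    """Share of respondents supporting candidate A in one simulated poll.

    Older voters always respond; a young voter responds with
    probability `young_reach`.
    """
    supporters = respondents = 0
    while respondents < sample_size:
        is_young = rng.random() < YOUNG_SHARE
        if is_young and rng.random() >= young_reach:
            continue  # this young voter never makes it into the sample
        respondents += 1
        support = SUPPORT_YOUNG if is_young else SUPPORT_OLD
        supporters += rng.random() < support
    return supporters / respondents

rng = random.Random(7)  # fixed seed so the run is repeatable
fair = poll(20_000, young_reach=1.0, rng=rng)
skewed = poll(20_000, young_reach=0.2, rng=rng)
print(f"representative poll: {fair:.1%} for A")
print(f"landline-style poll: {skewed:.1%} for A")
```

Both polls talk to 20,000 people, ten times more than a typical poll, yet the skewed one should still land several points below the true 50%. A bigger sample shrinks random error, not bias.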
You don’t need perfect information about everything. You just need a large enough, properly chosen sample. Your sample won’t be a replica of the population; it will vary, but the probability that it deviates massively is very low. That’s not a guess or a hope. It’s a mathematical result, and it’s one of the most elegant ideas in statistics.
The next time you see those election numbers updating throughout the day, or wonder how a company can test just a handful of products and guarantee quality for millions, remember the broken bus full of unexpectedly heavy passengers. Sometimes, a small peek is all you need to see the whole picture.
And with that, it’s a wrap for today. Before I say goodbye, here’s a quote I’ve been pondering:
"Inspiration is a guest that does not willingly visit the lazy."
Please don’t forget to share it with your friends, family, and strangers.
Have a Great Day 💖


