Analytics

The Rule of Five

To approximate the median of a population, 5 random samples are enough: there is a 93.75% chance that the median lies between the smallest and the largest of those samples.
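
As a minimal sketch of how the rule is applied, the snippet below takes five samples and reports their smallest and largest values as the range for the median. The function name `rule_of_five_range` and the commute times are made up for illustration, and the 93.75% figure only holds if the samples are drawn at random.

```python
def rule_of_five_range(samples):
    """Return (smallest, largest) of five random samples."""
    if len(samples) != 5:
        raise ValueError("the Rule of Five expects exactly 5 samples")
    return min(samples), max(samples)


# Hypothetical data: commute times in minutes from five randomly chosen people.
commute_minutes = [35, 22, 48, 27, 41]

low, high = rule_of_five_range(commute_minutes)
print(f"Median commute is between {low} and {high} minutes (93.75% confidence)")
```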

The Explanation
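The chance of randomly picking a value above the median is 50%, the same as a coin flip landing heads. The chance of getting heads five times in a row is 1/32, or 3.125%, and the same holds for five tails. The chance of getting neither all heads nor all tails, meaning the samples fall on both sides of the median, is therefore 100% - 2 x 3.125% = 93.75%.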
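
The coin-flip argument can be checked empirically. The sketch below is an illustration rather than part of the original note: it builds an arbitrary population of uniform random numbers, repeatedly draws five samples (with replacement), and counts how often the true median falls between the smallest and largest sample. The population, trial count, and seed are all assumptions chosen for the demo.

```python
import random
import statistics


def simulate_rule_of_five(trials=20_000, sample_size=5, seed=42):
    """Estimate how often the population median lies inside the sample range."""
    rng = random.Random(seed)
    # Assumed population: 10,001 uniform random values (any population works).
    population = [rng.random() for _ in range(10_001)]
    true_median = statistics.median(population)

    hits = 0
    for _ in range(trials):
        # Draw the samples independently and at random.
        sample = [rng.choice(population) for _ in range(sample_size)]
        if min(sample) <= true_median <= max(sample):
            hits += 1
    return hits / trials


print(f"Observed hit rate: {simulate_rule_of_five():.4f}")
```

The observed rate should land close to the theoretical 0.9375, with small deviations due to random sampling.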

Varied Amounts
What happens if we vary the number of samples?

The Math Behind It


// 2 samples
1 - 2 * (1/4) = 0.5

// 3 samples
1 - 2 * (1/8) = 0.75

// 4 samples
1 - 2 * (1/16) = 0.875

// 5 samples
1 - 2 * (1/32) = 0.9375

// 6 samples
1 - 2 * (1/64) ≈ 0.969

// 7 samples
1 - 2 * (1/128) ≈ 0.984

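Returns diminish after 5 samples: each extra sample halves the remaining chance of missing the median, so going from 5 to 6 samples adds only about 3 percentage points (93.75% to 96.9%).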
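
The per-sample arithmetic generalizes to 1 - 2 * (1/2)^n for n samples. A small sketch, with `median_coverage` as a hypothetical helper name, computes the confidence for any sample count and the gain over the previous count:

```python
def median_coverage(n: int) -> float:
    """Probability that the median lies between the min and max of n random samples."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Miss only when all n samples fall on the same side of the median.
    return 1 - 2 * 0.5 ** n


for n in range(2, 8):
    gain = median_coverage(n) - median_coverage(n - 1)
    print(f"{n} samples: {median_coverage(n):.3f} (gain {gain:.3f})")
```

Each line shows a gain half as large as the one before it, which is why adding samples beyond five buys little extra confidence.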

Examples

In which scenarios has this tool proved valuable?

TBD