The following is something I sent to a WKNH (any guesses?) over PM, after he asked a similar question...
OK... Standard deviation is a measure of spread in a normal population.
So, that raises the first question: what's a normal population? A normal population is a set of data that basically fits a bell curve. I pulled this picture off of the web:
This is a very run-of-the-mill normal curve.
I'm not sure exactly what this represents, but let's assume it measures earthworm length (I am a bio major, after all).
The X-axis (the numbers on the bottom, 2-18) represents length. The Y-axis (it doesn't have numbers on it) represents the probability of picking an earthworm of that length (from a large pool of earthworms).
Do you see how it has a maximum at 10? That is because 10 is the mean of this population, and in a normal curve the mean is also the most common value. That is, there are more earthworms that are 10 inches long than any other length.
OK... now, how does standard deviation relate to this?
Here are two normal curves I made up...
The first has a mean of 2 BB/100, with a SD = 16BB/100. The second has the same mean, but a SD = 8BB/100.
(the colours will be explained later)
What these graphs measure is the probability that you will win X BBs in your next 100 hands.
If you notice, at the peak of the first you will find that there is about a 2.5% chance that you will win 2BB in your next 100 hands. But, in the second, it is close to 5%. Also, in the first you will notice that it is possible to lose 30 BBs in 100 hands, but in the second, it becomes essentially impossible.
With a bigger standard deviation, simply put, you have bigger swings.
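To put rough numbers on the two curves, here is a minimal Python sketch. It assumes, as this post does, that 100-hand results are exactly normal; the function names are just mine for illustration.

```python
import math

def normal_pdf(x, mean, sd):
    """Height of the normal curve at x (probability density per BB)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def normal_cdf(x, mean, sd):
    """Probability that a normal variable comes out at or below x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

mean = 2.0  # winrate in BB/100, from the post
for sd in (16.0, 8.0):
    peak = normal_pdf(mean, mean, sd)          # curve height at the mean
    p_lose_30 = normal_cdf(-30.0, mean, sd)    # chance of losing 30BB or more
    print(f"SD={sd}: peak height {peak:.4f}, P(lose 30BB or more) {p_lose_30:.5f}")
```

The peak heights come out close to the 2.5% and 5% read off the graphs, and losing 30BB or more goes from a couple of percent at SD = 16 to a tiny fraction of a percent at SD = 8.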
Now, what about the colours?
The dark red represents one standard deviation about the mean.
So, for the first... since the mean = 2BB/100, and SD = 16BB/100, this range =
2 +/- 16 =
-14BB -> 18BB
You can expect data to fall within this range about 68% of the time.
Put another way: if you have a winrate of 2BB/100 and a SD of 16BB/100, there is about a 68% chance that, in your next 100 hands, you will win between -14BB and 18BB.
The second colour represents two standard deviations about the mean. For the winrate = 2, SD = 16, this represents the range:
-30BB --> 34 BB.
There is about a 95% chance that the data falls in here. Put another way: there is a 95% chance that, in your next 100 hands, you will win between -30BB and 34BB.
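If you want to check the ranges and the 68% / 95% figures yourself, something like this should do it. It is only a sketch under the same normal-curve assumption, and the helper names are made up for the example.

```python
import math

def sd_range(mean, sd, k):
    """Range covering k standard deviations either side of the mean."""
    return mean - k * sd, mean + k * sd

def prob_within(k):
    """Chance that a normal variable lands within k SDs of its mean."""
    return math.erf(k / math.sqrt(2))

winrate, sd = 2.0, 16.0  # BB/100, the numbers used in the post
for k in (1, 2):
    low, high = sd_range(winrate, sd, k)
    print(f"{k} SD: {low:.0f}BB -> {high:.0f}BB, probability {prob_within(k):.3f}")
```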
Now... on calculating a range of your winrate...
In general, 95% confidence is about what most people use, so we will use that.
Here is the formula:
winrate +/- 2*SD / ((#hands/100)^(1/2))
Which looks complicated, but it's really not.
Let's say you have played 51,000 hands. Your winrate = 2BB/100. Your SD = 16BB/100.
Here's how the formula works...
Take your # of hands, and divide by 100. This = 510.
Take the square root of this number (510). This = 22.58
Now, calculate 2*SD/(this number)...
that is, 2*16/22.58 = 1.417
This is the number you add and subtract...
therefore, this player can be 95% sure his winrate is in the range:
2 +/- 1.417 =
0.583 -> 3.417
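The same steps can be sketched in a few lines of Python. The function name and the default multiplier of 2 are just my way of writing down the formula above, nothing official.

```python
import math

def winrate_interval(winrate, sd, hands, z=2.0):
    """Rough 95% range for the true winrate (BB/100), using the 2*SD rule."""
    blocks = hands / 100                      # number of 100-hand blocks
    half_width = z * sd / math.sqrt(blocks)   # the number you add and subtract
    return winrate - half_width, winrate + half_width

low, high = winrate_interval(winrate=2.0, sd=16.0, hands=51_000)
print(f"{low:.3f} -> {high:.3f}")
```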
We really can't be all that sure of anything at this point, ya know?
We can be pretty sure he is a winning player, but that's about it.
Just goes to show what those people who claim massive winrates after 15k hands really know...
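To see how wide the range still is at 15k hands, here is a small sketch. It assumes the same SD of 16BB/100 as the example (your own SD may differ), and the helper names are hypothetical.

```python
import math

def half_width(sd, hands, z=2.0):
    """The +/- part of the winrate range after a given number of hands."""
    return z * sd / math.sqrt(hands / 100)

def hands_needed(sd, target_half_width, z=2.0):
    """Hands required before the +/- part shrinks to the target (BB/100)."""
    return 100 * (z * sd / target_half_width) ** 2

sd = 16.0  # BB/100, assumed to match the example in the post
print(f"15k hands: +/- {half_width(sd, 15_000):.3f} BB/100")
print(f"hands for +/- 1 BB/100: {hands_needed(sd, 1.0):.0f}")
```

With these assumptions the range at 15k hands is wider than +/- 2.5 BB/100, so a big observed winrate over that stretch says very little about the true one.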
PM me back if you have any questions / anything I didn't cover.
Also, I hope the images work, I've had problems in the past.
Quote: Put another way: if you have a winrate of 2BB/100 and a SD of 16BB/100, there is about a 68% chance that, in your next 100 hands, you will win between -14BB and 18BB.
Quote: Put another way: there is a 95% chance that, in your next 100 hands, you will win between -30BB and 34BB.
Not to get picky, but these are common misconceptions in the poker world. Allow me to elaborate...
I am pretty sure that the winnings for each 100 hands do NOT have a normal distribution. I have not checked it, because PT won't allow me to export vectors of total winnings for each block of 100 hands, so I don't know it for a fact. I suspect that the distribution would have heavier tails than the normal distribution and that it's probably not symmetric.
However, you can use the normal distribution to calculate a confidence interval for your true win rate. This is because if you pick many samples from ANY distribution (with a finite variance), the sample mean will be approximately normally distributed, according to the Central Limit Theorem.
Let X1, X2, X3, ..., Xn be your winnings for n blocks of 100 hands each, with an UNKNOWN distribution, and say that the true expected value is "mu", the sample mean is Xbar and the standard deviation is "sigma".
Then (Xbar-mu)/(sigma/sqrt(n)) will be approximately normally distributed, and a confidence interval for mu will be Xbar +/- z(a/2)*sigma/sqrt(n) with a confidence level of APPROXIMATELY 100*(1-a)%. For a = 0.05, z(a/2) is about 1.96, which is where the "2*SD" in the formula above comes from.
The confidence interval is defined as follows: take a sample from a distribution with expected value mu and standard deviation sigma, and calculate a confidence interval based on your sample and chosen a. About 100*(1-a)% of the times you do this, the confidence interval will cover mu.
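Here is a rough simulation of that definition in Python. The skewed per-100-hand distribution (a shifted exponential with mean 2 and standard deviation 16) is purely a made-up stand-in, since the real distribution is unknown; the trial count, sample size and seed are arbitrary choices too.

```python
import math
import random

def coverage(trials=1000, n=400, true_mean=2.0, sigma=16.0, z=1.96, seed=1):
    """Fraction of simulated confidence intervals that cover the true mean."""
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        # Skewed, non-normal draws with mean true_mean and SD sigma (assumed shape).
        sample = [true_mean + sigma * (rng.expovariate(1.0) - 1.0) for _ in range(n)]
        xbar = sum(sample) / n
        if xbar - half_width <= true_mean <= xbar + half_width:
            hits += 1
    return hits / trials

result = coverage()
print(f"intervals covering the true mean: {result:.3f}")
```

Even though the individual 100-hand results here are far from normal, the fraction of intervals covering the true mean should land near 0.95, which is the Central Limit Theorem doing its work.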
But it was a good analysis coming from a bio major. Close enough.