View Full Version : Standard Deviation questions.

08-20-2005, 03:09 PM
I've done some searching, but came up with nothing very conclusive.

How many hands does a person have to play to have 90% confidence in their SD? 75%? How confident can a person be in their SD after, say, only 1,000 hands?

My search results turned up some vague posts about how it needs to be "many, many hands". This, of course, is obvious, but I'm interested in seeing how this is calculated.

I ask because people are always asking about winrate, while SD is often just a number that gets "plugged in", regardless of how much confidence that number deserves.

tworooks
08-20-2005, 05:49 PM
746,931 hands

08-21-2005, 12:45 AM
[ QUOTE ]
but I'm interested in seeing how this is calculated.

[/ QUOTE ]

I've got more posts than you, so you aren't allowed to be a smartass.

08-21-2005, 02:48 AM
[ QUOTE ]
746,931 hands

[/ QUOTE ]

Seriously, you can troll my thread better than this.

AaronBrown
08-21-2005, 10:19 AM
I think you are heading down a bad path. A guy named William Gosset (http://www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Gosset.html) figured out the right way about a hundred years ago.

You measure something by an average (like poker winrate) and you want to know how much you can trust it. So you measure standard deviation to get a confidence interval. But then you wonder how much you can trust standard deviation, so you measure standard deviation of standard deviation. Then SD of SD of SD. Of course, you never get satisfied.

Gosset proved that under certain assumptions, you can collapse all the uncertainties into a single distribution, called the Student t (he published under the name "Student" because his employer, Guinness Beer, didn't want the brewery business associated with anything as disreputable as statistics).

With more than 30 observations, unless you're going way out into the tail (like wanting a 99.999% confidence interval for your win rate), the Student t is quite close to the Normal. With a Normal you go 1.96 standard deviations to either side of the mean for a 95% confidence interval; with a Student t with 30 observations you go 2.05 standard deviations. Since you need at least 1,000 observations to get the standard deviation small enough for useful inference, the standard deviation of standard deviation is not a problem. There are other problems like the fact that your win rate is not constant nor independent from hand to hand that are much more serious.
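To make the 1.96 rule concrete, here is a rough Python sketch of a Normal-approximation interval for a win rate. The function name and the sample figures (0.02 BB/hand, SD of 2 BB/hand, 1,000 hands) are made up for illustration, and it assumes independent hands with a constant win rate, which is exactly the assumption questioned above.

```python
from math import sqrt
from statistics import NormalDist

def winrate_interval(mean_per_hand, sd_per_hand, n_hands, confidence=0.95):
    """Normal-approximation confidence interval for the true per-hand win rate."""
    # Two-sided critical value: about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # The standard error of the mean shrinks with the square root of the sample size.
    half_width = z * sd_per_hand / sqrt(n_hands)
    return mean_per_hand - half_width, mean_per_hand + half_width

# Hypothetical numbers, not taken from anyone's results.
low, high = winrate_interval(0.02, 2.0, 1000)
print(f"95% interval: {low:.3f} to {high:.3f} BB/hand")
```

With numbers like these the interval straddles zero, which is why 1,000 hands says very little about win rate even when the SD itself is reasonably well pinned down.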

If you want the formula, the variance of the variance is equal to the variance squared times [2/(n-1) + kurtosis/n], where n is the number of observations and kurtosis means excess kurtosis. That means, loosely speaking, the error in your variance, expressed as a fraction of variance, is on the order of the square root of 2/(n-1) + kurtosis/n, and the error in your standard deviation, expressed as a fraction of standard deviation, is about half that. With 1,000 observations, the first term is 2/999; its square root is about 4.5%, so the error in standard deviation is roughly 2%. The kurtosis is a measure of how "fat" the tails of the distribution are; it's 0 for a Normal and negative for a uniform distribution. Unless you play a crazy no-limit game in which one hand a night determines the entire outcome, your kurtosis is unlikely to be big enough to affect your standard deviation much after 1,000 hands.
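For anyone who wants to plug in numbers, here is a small sketch of that formula in Python. The function name is mine, and the step from variance to standard deviation uses the usual large-sample shortcut (take the square root to get a relative standard error, then halve it), so treat the output as a rough guide rather than an exact result.

```python
from math import sqrt

def relative_sd_error(n, excess_kurtosis=0.0):
    """Approximate standard error of the sample SD, as a fraction of the SD."""
    # Var(s^2) / sigma^4 = 2/(n-1) + kurtosis/n, the formula quoted above.
    rel_var_of_variance = 2.0 / (n - 1) + excess_kurtosis / n
    # Large-sample shortcut: the relative error of s is about half
    # the relative error of s^2.
    return 0.5 * sqrt(rel_var_of_variance)

# Normal results versus a hypothetical fat-tailed game with excess kurtosis 10.
for n in (100, 1000, 10000):
    print(n, f"{relative_sd_error(n):.2%}", f"{relative_sd_error(n, 10.0):.2%}")
```

For 1,000 observations and Normal results this comes out a little over 2%, which goes some way toward the original question about how far to trust an SD after 1,000 hands.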

08-21-2005, 12:43 PM
Fabulous, and exactly what I was looking for!

When you mentioned Gosset, I thought of the fellow who went mad attempting to crunch the number of different configurations for a bingo card ( C(15,5)^5? ). It doesn't appear to be him, however.

AaronBrown
08-21-2005, 06:09 PM
I don't know him, but it sounds like an interesting story. I have a soft spot for mad mathematicians interested in gambling.

mosdef
08-21-2005, 07:48 PM
[ QUOTE ]
There are other problems like the fact that your win rate is not constant nor independent from hand to hand that are much more serious.

[/ QUOTE ]

do you think this is a serious problem because the player in question has a (hopefully) increasing skill level? or is it because the opponents stay fixed for a while and then change to a different set of opponents that stay fixed for a while, and so on?

AaronBrown
08-21-2005, 07:56 PM
Both. Also, your win rate might go up or down with tiredness, type of players at the table, how much you're concentrating and so forth. Another problem is tilt: if you're prone to it, then all standard-deviation bets are off.

There's a big difference between having a constant 1 BB per hour expected winrate, and having a winrate that varies randomly between -1 and +3, with an average of +1.
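A quick simulation shows the size of that difference. This is only a sketch under assumptions of my own: the win rate is redrawn uniformly between -1 and +3 BB/hour at the start of each 100-hour stretch and held fixed within it, hourly results are Normal with an SD of 10 BB, and the function name and seed are arbitrary.

```python
import random
from statistics import mean, pstdev

def stretch_totals(n_stretches, hours, sd_per_hour, draw_winrate, rng):
    """Total BB won in each stretch; the win rate is drawn once per stretch."""
    totals = []
    for _ in range(n_stretches):
        rate = draw_winrate(rng)  # stays fixed for the whole stretch
        totals.append(sum(rng.gauss(rate, sd_per_hour) for _ in range(hours)))
    return totals

rng = random.Random(2005)  # fixed seed so the run is repeatable
constant = stretch_totals(2000, 100, 10.0, lambda r: 1.0, rng)
varying = stretch_totals(2000, 100, 10.0, lambda r: r.uniform(-1.0, 3.0), rng)

print(f"constant rate: mean {mean(constant):.0f}, SD {pstdev(constant):.0f}")
print(f"varying rate:  mean {mean(varying):.0f}, SD {pstdev(varying):.0f}")
```

Both players average about +1 BB/hour, but under these assumptions the swings for the varying player are roughly half again as large, and an SD estimated hour by hour would not reveal that.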

mosdef
08-21-2005, 08:30 PM
this is all true, however i am not convinced that you should toss all models where you assume serial independence of results. i would prefer the approach whereby you ask a question, answer it with the simplest available model, then decide if your answer would be altered significantly if you changed the model. since the random variables behind poker results are so complicated, it would be futile to try to overcome all the shortcomings in any model. i would take the line of trying to find the simplest model that will give you a good enough understanding of the situation at hand, whatever that may be.

AaronBrown
08-21-2005, 10:35 PM
I concur 100%, but I add one caveat. That works as long as you're looking for 90% confidence or less. Once you get above that, the violations of assumptions can kill you.

When people talk about 6-sigma performance, meaning 6 standard deviations above the mean, meaning 1 chance in a billion, I say if you think you've achieved it you've underestimated sigma. 6 people in the world can call me a liar; I don't think I've met any of them yet.

BillC
08-22-2005, 05:49 PM
The sample variance, scaled as (n-1)s^2/sigma^2, follows a chi-square distribution with n-1 degrees of freedom (assuming independent, normally distributed results). If you just want answers, look at an elementary stat book, e.g., Triola. I did post some such numbers a long while ago.

The upshot is that estimating your variance takes far fewer observations than estimating your EV, so don't sweat it.
The t-statistic does not really seem relevant here.
Do not become obsessed with variance. If you do, the crazies will get you.

BillC
08-23-2005, 09:16 PM
For example, to estimate sigma to within 5% with 95% confidence, you need a sample of at least 767 (hours).

Change 5% to 10%, you need only n=191.

These numbers assume an underlying normal population.
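As a rough cross-check on those sample sizes, here is a sketch that uses a large-sample Normal approximation instead of the chi-square tables. It assumes the sample SD has a standard error of about sigma / sqrt(2n) for Normal data. The function name is mine, and because it is an approximation it lands within a few observations of the 767 and 191 quoted above rather than matching them exactly.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_sigma(rel_error, confidence=0.95):
    """Approximate n needed to estimate sigma to within rel_error (as a fraction)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    # Solve z / sqrt(2n) = rel_error for n, then round up.
    return ceil(0.5 * (z / rel_error) ** 2)

n_5pct = sample_size_for_sigma(0.05)
n_10pct = sample_size_for_sigma(0.10)
print(n_5pct, n_10pct)
```

Halving the tolerance roughly quadruples the sample you need, which is the same square-root behaviour that governs the win rate estimate.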