
Why is 10,000 hands too small?


Lexander
03-24-2005, 09:01 AM
I ask this mostly because I am curious, and more experienced minds than mine can address it.

I have at times read discussions about how valid your BB/hr rate is after 10,000 hands. What I am curious about is why this is an insufficient sample size for reasonable calculations.

If I naively sit down and try to model my probabilities, I would probably start with a simple model. If I am dealt 2 cards in Hold'em, there is some unknown BB/hand distribution for that combination of 2 cards. In theory this is modeled with some pmf, but for simplicity I will approximate it with a pdf f(*).

Assuming this density isn't something truly ugly, it has finite first and second moments, and therefore mean and variance are meaningful.

Now, since each hand is independent, my 10,000 samples give me a Sample Mean with Variance = Sigma^2/10000. That is what really bothers me.

Now, our sample mean has the same expectation as f(*) but 1/10,000th the variance. Additionally, the Central Limit Theorem ensures that our sample mean is approximately normally distributed.

From this, one would naively think that any BB/hand estimate we generated would have a pretty small standard error. We should have fairly small confidence intervals.
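To make the naive arithmetic concrete, here is a quick sketch (the 1.7 BB per-hand SD is purely a placeholder for illustration, not a measured value):

```python
import math

def standard_error(sd_per_hand, hands):
    """Standard error of the sample-mean win rate under the i.i.d. model."""
    return sd_per_hand / math.sqrt(hands)

# Placeholder per-hand SD of 1.7 BB, chosen only for illustration.
se = standard_error(1.7, 10_000)
print(f"SE of the BB/hand estimate over 10,000 hands: {se:.3f} BB")  # 0.017 BB
```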

I can see a number of problems with this happy naive calculation, but it does make me curious. In a plain limit game I don't recall winning more than 30 BB in a single hand, nor losing much more than 10. Most of the time my results are pretty stable. That would suggest my variance has pretty strict bounds and shouldn't be too high.

What problem are people encountering that keeps this estimate from being useful?

- Lex

Paul2432
03-24-2005, 10:02 AM
As far as I know, variance in hold'em has always been derived empirically. With the advent of on-line poker and database software, tracking results over hundreds of thousands of hands is possible. Invariably one finds a standard deviation of around 15-20 BB/100 hands. One also finds win rates fluctuating by 2 BB/100 and more over 10,000-hand stretches.

OK, so why is this? Well, the vast majority of the time in hold'em you fold either preflop or on the flop, and those hands don't contribute much to your results. A small number of hands determine your results, so your effective sample size is much smaller.

For example, suppose a player has a true win rate of 2 BB/100 and this player loses 10 BB in one hand. For that particular hand the player's win rate is -1000 BB/100. It takes a while to smooth out something like that.

Finally, consider that most authors state that nearly all profit comes from AA and KK and that most players break even or maybe make a slight profit on the rest of their hands. I think you can see that if a player gets AA and KK cracked two or three times, or goes longer than usual without receiving these hands, that will have a large impact on win rate over many thousands of hands.
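The fluctuation over 10,000-hand stretches can be illustrated with a rough simulation (treating each hand's result as i.i.d. normal is a deliberate simplification, and the 1.7 BB per-hand SD is an assumed figure consistent with ~17 BB/100):

```python
import random

def simulate_win_rates(trials, hands, wr_per_hand, sd_per_hand, seed=1):
    """Observed win rates (BB/100) over independent samples of `hands` hands,
    modeling each hand's result as an i.i.d. normal draw (a simplification)."""
    rng = random.Random(seed)
    rates = []
    for _ in range(trials):
        total = sum(rng.gauss(wr_per_hand, sd_per_hand) for _ in range(hands))
        rates.append(total / hands * 100)
    return rates

# True win rate 2 BB/100 (0.02 BB/hand); see how much 10,000-hand samples swing.
for r in simulate_win_rates(10, 10_000, 0.02, 1.7):
    print(f"{r:+.2f} BB/100")
```

With a standard error of roughly 1.7 BB/100 per sample, individual 10,000-hand results routinely land a couple of big bets away from the true rate.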

Paul

Lexander
03-24-2005, 02:35 PM
Thank you for the comments, Paul.

Alright, I suppose that raises some other things.

If 15-20 BB is in fact a reasonable estimate of the standard deviation over 100 hands, then the SD per hand should be 1.5-2 BB.

After 10,000 hands, our sample mean should be normally distributed, centered on the actual BB/hand, with an SD of 0.015-0.020 BB/hand.

Since the sample mean is normally distributed, a standard t-test can be used to determine significance. If we test the null hypothesis that our win rate is negative, we need only average 0.03 BB/hand to consider our win rate significant. Applying the same reasoning to BB/100 hands, a win rate greater than 0.3 BB/100 over 10,000 hands would be considered significant at the 5% level. If we really want to push it and require 0.1% significance, then we need to average 0.61 BB/100.

If this reasoning is correct (a speculative venture on a good day), then it would seem to me 10,000 hands would be fairly conclusive for many win or loss rates.

baumer
03-24-2005, 03:19 PM
[ QUOTE ]

Finally, consider that most authors state that nearly all profit comes from AA and KK and that most players break even or maybe make a slight profit on the rest of their hands.


[/ QUOTE ]

i'm no expert but i can't accept that these two hands are responsible for all profit. is this accepted?

dtbog
03-24-2005, 03:34 PM
[ QUOTE ]
Finally, consider that most authors state that nearly all profit comes from AA and KK and that most players break even or maybe make a slight profit on the rest of their hands.

[/ QUOTE ]

Wait.. what?

Grisgra
03-24-2005, 03:38 PM
[ QUOTE ]
Finally, consider that most authors state that nearly all profit comes from AA and KK and that most players break even or maybe make a slight profit on the rest of their hands.

[/ QUOTE ]

I've heard this many times before, and it's misleading. It's true that if you take away AA and KK it's difficult to be a break-even player, but that's only because you have to keep putting in blinds.

A more accurate statement would be "The winnings from hands other than AA and KK are generally barely enough to stay ahead of the money you lose from forced betting."

BruceZ
03-24-2005, 03:47 PM
[ QUOTE ]
From this, one would naively think that any BB/hand estimate that we generated would have a pretty small Standard Error. We should have fairly small Confidence Interval's.

[...]

What problem are people encountering that keeps this estimate from being useful?

[/ QUOTE ]

The standard error is still a significant fraction of the win rate at 10,000 hands. Typical SD for 100 hands is around 17 big bets, so the standard error for 10,000 hands is 17/sqrt(10,000/100) = 1.7 big bets. This is a significant fraction of the win rate for 100 hands which is typically 2-4 big bets.

If you only want to know that you are a winning player to 5% significance after 10,000 hands, then you must have a win rate of 1.64*1.7 = 2.8 big bets/100 hands.
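The arithmetic above, written out as a sketch:

```python
import math

sd_per_100 = 17.0                     # typical SD over 100 hands, in big bets
blocks = 10_000 / 100                 # hundred-hand blocks in the sample
se = sd_per_100 / math.sqrt(blocks)   # standard error of the BB/100 estimate
threshold = 1.645 * se                # one-sided 5% critical value times SE

print(f"standard error: {se:.2f} BB/100")          # 1.70
print(f"win rate needed: {threshold:.2f} BB/100")  # 2.80
```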

Paul2432
03-24-2005, 03:53 PM
I think you're off by a factor of 10. If your win rate is over 3 BB/100 after 10,000 hands, you can be fairly confident you are a winning player. I think there is a pretty big difference between "knowing you are a winning player" and "knowing your win rate". On these same numbers you only know your win rate to +/- 3 BB*.

Paul

*Not really. Except at the very smallest stakes, win rates above 5 BB/100 are not possible.

BruceZ
03-24-2005, 04:24 PM
[ QUOTE ]
If we test the Null Hypothesis that our win rate is negative, we need only to average 0.03 BB/hand to consider our win rate significant. Applying the same reasoning to BB/100 hands, a win rate greater than 0.3BB/100 over 10,000 would be considered significant at the 5% level.

[/ QUOTE ]

This is where you are off by a factor of 10 as Paul said. 0.03 bb/hand = 3 bb/100 hands. Also, you should only take 1.65 standard deviations instead of 2 since this is a 1-sided confidence interval of 5%.

Lexander
03-24-2005, 09:33 PM
Ya, I realised that driving home, and then crashed after class. I knew something was wrong but hadn't pinned it down.

That, I suppose, addresses much of the question. The estimated standard deviation is sounding like 1.5-2.0 BB per hand. Higher than I would have thought given how often I fold, but win rates are highly skewed. Kind of scary to think that a sample size of 10,000 produces a test with such low power.

And since you were picky about my using a 2-sided t-test (my range, btw, came from a 1-sided t-test, but I used a conservative 20 BB estimate just to impose more restriction), I will be picky about you referring to it as my 1-sided confidence interval. t-tests might have a nice CI representation, but the test itself is against a 5% upper-tail critical region.

GrekeHaus
03-25-2005, 04:30 AM
A while ago, I posted this (http://forumserver.twoplustwo.com/showflat.php?Cat=&Board=genpok&Number=1702898&Forum=,,,All_Forums,,,&Words=&Searchpage=2&Limit=25&Main=1692927&Search=true&where=&Name=15443&daterange=&newerval=&newertype=&olderval=&oldertype=&bodyprev=#Post1702898). It is based on a standard deviation of 16 BB/100. If you are a tight player, these numbers are a bit high; if you're loose, they're a bit on the low side.

--GH

BruceZ
03-25-2005, 07:37 AM
[ QUOTE ]
And since you were picky about me using a 2-sided t-test (my range, btw, was from a 1-sided t-test but I used a conservative 20BB estimate just to impose more restriction),

[/ QUOTE ]


I actually figured out that was probably what you did after it was too late to take down my post. Many people fail to make this adjustment correctly, and for them this isn't picky. It is fundamental and significant.


[ QUOTE ]
I will be picky about you referring to it as my 1-sided Confidence Interval. T tests might have a nice CI representation but the test itself is against a 5% upper tail critical region.

[/ QUOTE ]

There is a fundamental equivalence between hypothesis testing and confidence intervals (DeGroot p. 482). Testing the null hypothesis that the win rate is negative against the alternative that it is positive is a test with a 1-sided alternative, as opposed to a null hypothesis that the win rate lies between two bounds, which would have a 2-sided alternative. Rejecting the 1-sided hypothesis at the 5% level is equivalent to saying that the interval from 0 to +infinity is a 95% confidence interval for the win rate. This is what I meant by a 1-sided confidence interval, so that 0 corresponds to 1.65 SD below the mean, and 95% corresponds to the area under the bell curve from 1.65 SD below the mean to +infinity, instead of the mean +/- 2 SD for a 2-sided alternative.
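The 1-sided versus 2-sided critical values can be read off the normal quantile function; a minimal sketch using only Python's standard library:

```python
from statistics import NormalDist

z = NormalDist()                 # standard normal distribution
one_sided = z.inv_cdf(0.95)      # 5% in the upper tail only
two_sided = z.inv_cdf(0.975)     # 5% split between both tails

print(f"1-sided 5% critical value: {one_sided:.3f}")  # 1.645
print(f"2-sided 5% critical value: {two_sided:.3f}")  # 1.960
```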

DougOzzzz
03-25-2005, 08:59 AM
[ QUOTE ]
A while ago, I posted this (http://forumserver.twoplustwo.com/showflat.php?Cat=&Board=genpok&Number=1702898&Forum=,,,All_Forums,,,&Words=&Searchpage=2&Limit=25&Main=1692927&Search=true&where=&Name=15443&daterange=&newerval=&newertype=&olderval=&oldertype=&bodyprev=#Post1702898). It is based on a standard deviation of 16 BB/100. If you are a tight player, these numbers are a bit high, if you're loose, they're a bit on the low side.

--GH

[/ QUOTE ]

Just curious, why is it 6200 and not 6400? And why do you keep subtracting 200 after quadrupling?

GrekeHaus
03-25-2005, 03:42 PM
[ QUOTE ]
Just curious, why is it 6200 and not 6400? And why do you keep subtracting 200 after quadrupling?

[/ QUOTE ]

I dunno. I came up with these numbers by running a Python script.

Edit: Now that I think about it, it probably has to do with the 95% accuracy, which corresponds to 1.96 standard deviations rather than a full 2. Using two SDs, you do get the expected 6400.
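A guess at reconstructing the script's arithmetic (the ±4 BB/100 confidence-interval half-width is my assumption about what the table targeted, and `hands_needed` is a hypothetical helper, not the original script):

```python
def hands_needed(sd_per_100, half_width, z):
    """Hands required for the win-rate confidence interval to shrink to
    +/- half_width BB/100, given z standard deviations of coverage."""
    return (z * sd_per_100 / half_width) ** 2 * 100

print(hands_needed(16, 4, 1.96))  # 6146.56, roughly the quoted 6200
print(hands_needed(16, 4, 2.0))   # 6400.0
```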

Lexander
03-25-2005, 07:49 PM
[ QUOTE ]

There is a fundamental equivalence between hypothesis testing and confidence intervals (DeGroot p. 482)


[/ QUOTE ]

Hehe. True, and you win. I personally find it easier to think of it purely in terms of critical regions and p-values, since not every test has an easily constructed CI lying around.

This is a funny topic for me right now. My Math Stat course is currently covering hypothesis testing, and one of our homework problems is to prove the relationship between an LRT involving normals and the t-test, so all of this is very timely.