Variance and sample sizes, in real terms.


bisonbison
10-31-2004, 05:24 PM
I have a reputation for being a stickler about sample size and PT posts. To help explain why, I decided to dig around my databases for a pair of players that would help explain why 10k hands are not sufficient for you to know how well you're doing.

Player A and Player B are both actual Party 3/6 players. They both multitable 3/6 on a near-daily basis. Their stats, as you can see, are ridonculously similar. Here's what they did for October.

Player A | Player B

Hands: 11,770 | 12,085
VP$IP: 14.63 | 14.43
VP$IP from SB: 15.23 | 14.61
Fold SB to steal: 88.89 | 91.03
Folded BB to steal: 71.43 | 72.73
Att. to Steal Blinds: 27.22 | 27.50
Went to SD: 34.75 | 34.30
Pre-Flop Raise: 9.03 | 8.77

Aggression
Flop: 3.24 | 3.76
Turn: 2.60 | 2.48
River: 1.60 | 1.65
Total: 2.32 | 2.40

When Folds:
Preflop: 79.86 | 80.13
Flop: 6.06 | 6.05
Turn: 1.92 | 1.81
River: 1.19 | 1.11

I'll let you know off the bat that both these players posted winning months. I want you to look at these stats and tell me:

1) Who had a better win-rate this month?
2) How close are their BB/100?
3) Who is the better player?

Please remember, this post is not about their particular stats, but about what their particular stats show.

Yobz
10-31-2004, 05:44 PM
Just a side question: aren't both VPIP way too low? Shouldn't it be 16-18% at 3/6 or am I too loose? :)

blackaces13
10-31-2004, 05:49 PM
These questions are impossible to answer. Also, with so many hands in your database for one month I would have to guess that both players are you.

bisonbison
10-31-2004, 05:59 PM
[ QUOTE ]
These questions are impossible to answer.

[/ QUOTE ]

Well, if it's hard to see who did better or how much better one of them may have done, it goes to show that you can read waaaay too much into these stats.

fairness
10-31-2004, 06:12 PM
bisonbison is the man.

nyc999
10-31-2004, 06:17 PM
[ QUOTE ]

VP$IP: 14.63 | 14.43
VP$IP from SB: 15.23 | 14.61


[/ QUOTE ]

Is 3/6 this tight?

Another great post...

Blarg
10-31-2004, 06:20 PM
The real variable in this equation is not the players or the sample size, but the cards.

btspider
10-31-2004, 06:30 PM
[ QUOTE ]
These questions are impossible to answer.

Well, if it's hard to see who did better or how much better one of them may have done, it goes to show that you can read waaaay too much into these stats.

[/ QUOTE ]

was won $ at showdown too revealing to include?

dr. klopek
10-31-2004, 06:31 PM
Hey bison, do you ever get tired of making posts that are meant to warrant serious discussion, and having your efforts met with just a bunch of people riding your jock about how great you are and how amazing your post is?

BTW: Bison is the greatest. Another awesome post.

easypete
10-31-2004, 06:36 PM
[ QUOTE ]

Is 3/6 this tight?

[/ QUOTE ]

I know I tightened up. I'm about 2-3% lower on VPIP. This is mainly due to the more aggressive nature of 3/6, as well as the cheaper cost of playing per hand.

Also, my VPIP out of the SB is lower. Games from 0.02/0.04 to 2/4 have a SB = 1/2 BB. At 3/6, the SB = 1/3 BB. The cost of a normal full table round is 0.75BB/10 hands vs 0.66BB/10 hands.
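The round-cost figures above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the big blind equals one small bet and that the cost is measured in big bets, and the function name is made up for illustration.

```python
from fractions import Fraction


def blinds_per_orbit(small_bet, big_bet, small_blind):
    """Blinds paid per full orbit, expressed in big bets.

    Assumption: the big blind equals one small bet (standard limit hold'em).
    """
    big_blind = small_bet
    return Fraction(small_blind + big_blind, big_bet)


# 2/4 game: small blind is half the big blind ($1 and $2).
cost_2_4 = blinds_per_orbit(2, 4, 1)
# 3/6 game: small blind is one third of the big blind ($1 and $3).
cost_3_6 = blinds_per_orbit(3, 6, 1)

print(float(cost_2_4), float(cost_3_6))
```

Under those assumptions the two structures come out at 3/4 and 2/3 of a big bet per orbit, which matches the 0.75 and 0.66 quoted above.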

As for the players A and B... You can't really say based on stats (or these stats) who the winner is. I've had months of play that look very similar statwise, but my winrate has been different by 2-3BB/100 (7k hands per month).

I voted >3BB/100 difference just because this would prove a point that bison is making.

Now the big stats missing are W$SD and WtSD. These stats are much more telling of their postflop play.

bisonbison
10-31-2004, 06:40 PM
[ QUOTE ]
Is 3/6 this tight?

[/ QUOTE ]

It depends. I'm about this tight.

If you go to the sessions tab, AP and ASF are the last two stats in the "Session Summary" box. They give you an idea of how tight your tables are. My 3/6 tables ASF over the last 3 months is about 31%.

Looking at some old 2/4 stats, I see the ASF there was 36%.

bisonbison
10-31-2004, 06:41 PM
Went to showdown is listed, or could be inferred from the "when folds" stats.

jason1990
10-31-2004, 06:45 PM
Does this post have anything to do with "variance", the statistical term? (I.e. variance = square of standard deviation.) I thought so by the title, but suspect not after reading it.

easypete
10-31-2004, 06:47 PM
[ QUOTE ]
Went to showdown is listed, or could be inferred from the "when folds" stats.

[/ QUOTE ]

True enough... but it's Sunday... I'm watching the Falcons take care of business in Denver... and it's all that higher level math stuff i can't do.

helpmeout
10-31-2004, 06:55 PM
Where is won at showdown and checkraise percentages.

All these stats say is that they are both a bit too tight preflop and out of the blinds and that they are a bit too agro on the flop.

Won at showdown will tell us whether they are calling down with too many hands and checkraising will tell us if they use tricky plays or not.

bisonbison
10-31-2004, 07:05 PM
[ QUOTE ]
Where is won at showdown

[/ QUOTE ]

I'm convinced that won at showdown is an effect stat as much as it is a cause.

MarkL444
10-31-2004, 07:19 PM
Very nice post bison, have you considered this might be the same person?

bisonbison
10-31-2004, 07:26 PM
variance, as it's discussed around here, tends to mean "I am playing the same way and not getting the same results."

chesspain
10-31-2004, 07:27 PM
Hey, those look similar to my numbers...yet I'm still down approx. 50BB on Party 3/6 after 5K hands. :P

helpmeout
10-31-2004, 07:52 PM
Won At showdown can really only be analyzed by the player. Because only they can really know if they have been getting lucky on the river or calling down too often.

If my Won at SD is below 50 and I have been getting lucky then I know I have problems.

Without the stat it is a waste of time commenting on whether you played better in the first 10k or the second 10k.

If you played exactly the same (highly unlikely) then your Won at showdown stats will tell us the luck factor.

I'm guessing here the stats are fairly far apart which is probably why your BB/100 rates are far apart. Making the rest of your stats pretty meaningless.

bisonbison
10-31-2004, 08:51 PM
With these stats, 5 hands/100 go to showdown.

In a 1000 hand session, the difference between a "good" W$SD of 60% and a "subpar" one of 45% is about 8 hands. It's a dumb stat.
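For anyone who wants to check the "about 8 hands" figure, here is a minimal sketch of the arithmetic. It takes the 5 showdowns per 100 hands at face value, and the function name is just for illustration.

```python
def showdown_swing(hands, showdowns_per_100, good_wsd, poor_wsd):
    """Return (number of showdowns, extra showdowns won at the better W$SD)."""
    showdowns = hands * showdowns_per_100 / 100
    extra_wins = showdowns * (good_wsd - poor_wsd)
    return showdowns, extra_wins


# 1000 hands, 5 showdowns per 100 hands, W$SD of 60% versus 45%.
showdowns, extra_wins = showdown_swing(1000, 5, 0.60, 0.45)
print(showdowns, round(extra_wins, 1))  # 50 showdowns, about 7.5 extra wins
```

So the whole gap between the two W$SD numbers rests on 7 or 8 showdowns out of 50.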

Chris Daddy Cool
10-31-2004, 09:13 PM
Hi bisonbison:

player B is clearly better than player A because he is tighter and more aggressive and over a larger sample too, so the extra couple hundred hands really show the difference.

bisonbison
10-31-2004, 09:19 PM
[ QUOTE ]
player B is clearly better than player A because he is tighter and more aggressive and over a larger sample too, so the extra couple hundred hands really show the difference.

[/ QUOTE ]

Dude, you're my hero.

Peter Harris
10-31-2004, 09:49 PM
i'll, of course, be abstaining from the poll as the stats do not show any of the three answers conclusively. After only 12k, you can't even see a "better player", let alone determine a winrate.

But i think that's what you want to hear, as well as what i believe. I'm maligned that without voting i can't see what others think. can someone tell me the results on election day?!

Regards,
Pete Harris

antidan444
10-31-2004, 11:35 PM
Sure I can. The Packers beat the Redskins, so Kerry wins.

(Couldn't help myself ... sorry.)

Qhorin
10-31-2004, 11:35 PM
I'm hoping the win rates are greater than 3BB/100 apart, because it will make me feel better about my own swings.

Either way, the variance argument is much the same as the SSH section on random and independent events - your mind, or PT, can convince you that +EV moves aren't working in the short term.

Oh, and one more reason not to budge from $0.50/$1.00 - my ASF is over 45% at that level.

CT11
11-01-2004, 12:12 AM
Assuming these stats mean that they play nearly exactly the same. Also assuming that they have the same standard deviation/100 as I do (16bb/100 which is probably high for them) I can say that with this sample size they are probably within 6bb/100 of each other.
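A rough sketch of where a number like this can come from, assuming each block of 100 hands is an independent draw with a standard deviation of 16 BB, and using the hand counts from the first post. The function names are hypothetical.

```python
import math


def winrate_standard_error(sd_per_100, hands):
    """Standard error of an observed BB/100 win rate over `hands` hands."""
    return sd_per_100 / math.sqrt(hands / 100)


def difference_standard_error(sd_per_100, hands_a, hands_b):
    """Standard error of the gap between two independent observed win rates."""
    return math.hypot(
        winrate_standard_error(sd_per_100, hands_a),
        winrate_standard_error(sd_per_100, hands_b),
    )


se_diff = difference_standard_error(16, 11_770, 12_085)
print(round(se_diff, 2), round(2 * se_diff, 1), round(3 * se_diff, 1))
```

Under these assumptions the gap between two identical players has a standard error of a little over 2 BB/100, so two standard errors is roughly 4 BB/100 and three is roughly 6 BB/100.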

Though the fact that bison hand picked these two players makes any results he gives biased so all bets are off.

But I like the point. Good post.

~CT11

sin808
11-01-2004, 12:15 AM
none of the options can be determined by the information given. Hell I could have stats that look exactly like that but be spewing chips all over the place. Ex- VP$IP 14%, but those 14% are all offsuit cards from 3-7 and any combination thereof. I'd probably be a horribly losing player, but if I was to look at all the same stats you posted they'd probably look similar.

I think stats are good to let you know you are headed in the right direction. A (very) rough indicator that you "may" be doing things that are good for your winrate. Making judgements based solely on stats is foolish though since they can be, and often are, misleading.

bisonbison
11-01-2004, 12:45 AM
ok. as some of you have guessed, Players A and B are my Empire and Party accounts, respectively. I was playing 3 tables of each site simultaneously, so the aggregate game conditions (and my play at any time) were as close to equal as you might like. The differences? The individual opponents and the individual cards.

As it turns out, those make a big difference.

In terms of BB/100:

Player A > 2 * Player B
and
Player A > Player B + 2BB/100

DeuceKicker
11-01-2004, 01:10 AM
Player B is tighter and more aggressive over every category except PFR%, but I'm going to guess Player A had the better win rate and is the overall "better" player only because you're trying to prove a point about sample sizes.

Edit: That's what I get for typing up a response then going to eat dinner

umdpoker
11-01-2004, 04:35 AM
"With these stats, 5 hands/100 go to showdown.

In a 1000 hand session, the difference between a "good" W$SD of 60% and a "subpar" one of 45% is about 8 hands. It's a dumb stat."

do you realize that 8 more pots won will change a crappy session to a good one? if avg pot size= 8bb, then thats 64bb, or 6.4bb/100h extra.

sweetjazz
11-01-2004, 04:55 AM
[ QUOTE ]
Assuming these stats mean that they play nearly exactly the same. Also assuming that they have the same standard deviation/100 as I do (16bb/100 which is probably high for them) I can say that with this sample size they are probably within 6bb/100 of each other.

[/ QUOTE ]

So under these assumptions, how many hands do they need to play for them to be within 1BB/100 95% of the time?

Is the point that if you want to say that you are probably a winner at a given limit (with SD of 16BB/100) after 10,000 hands then you need to be up 6BB/100?

Since I've just really started playing and logged 2000 hands at 0.5/1 on Party, I like these posts, because they remind me that I need to play a lot more hands before I can assess my performance accurately. I "know" that I am not making some of the horrible mistakes of the other players and I also "know" that I am making some mistakes and bad judgments due to lack of experience. What I don't know is how to quantify that knowledge into a true assessment of my win rate. Only tens of thousands of more hands can tell me that. As it is, my BB/100 rate has oscillated wildly in a 3BB/100 range over hands 1000-2000, another example of how meaningless the statistic is with my small sample size.

sweetjazz
11-01-2004, 05:08 AM
[ QUOTE ]
"With these stats, 5 hands/100 go to showdown.

In a 1000 hand session, the difference between a "good" W$SD of 60% and a "subpar" one of 45% is about 8 hands. It's a dumb stat."

do you realize that 8 more pots won will change a crappy session to a good one? if avg pot size= 8bb, then thats 64bb, or 6.4bb/100h extra.

[/ QUOTE ]

Yeah, I was thinking along those lines at first, but I think that actually supports bison's claim.

If you are running good and win 8 more pots at showdown than normal in a 1000 hand session, you see a huge 6.4BB/100 swing in your win rate for the session.

If you make the mistake of calling down 8 times too many, which say costs you 12BB, that barely registers as a 1.2BB/100 effect.

The problem, of course, is that the 1.2BB/100 effect might be permanent, while the 6.4BB/100 is illusory. In a 1000 hand session, the luck swamps everything. In a 10000 hand sample, it's not as dramatic, but how you are running can have as large an effect on your win rate for that sample as skill issues. Add in the fact that a fair number of times you win more money in a pot if you make a bad decision than if you make the proper one, because of the way the flop, turn and river develop, and the ability to distinguish skill in a 10,000 hand sample just isn't there.

Of course someone with a 5BB/100 winrate after 10,000 hands is more likely to be a winning player than someone with a 1BB/100 rate. But a significant, albeit small, percentage of the time, that will be wrong. In a similar percentage of times, the first player will actually be 7 or 8 BB/100 better than the second player.
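The luck-versus-leak comparison in this post can be put side by side with the noise level. This is a sketch only: it borrows the 16 BB/100 standard deviation mentioned earlier in the thread as an assumption, and the helper names are made up.

```python
import math


def per_100(total_bb, hands):
    """Convert a total swing in big bets into BB/100."""
    return 100 * total_bb / hands


def noise_per_100(sd_per_100, hands):
    """One standard error of an observed BB/100 over `hands` hands."""
    return sd_per_100 / math.sqrt(hands / 100)


luck = per_100(8 * 8, 1000)   # 8 extra pots of 8 BB each in 1000 hands
leak = per_100(12, 1000)      # 12 BB lost to bad call-downs in 1000 hands

print(luck, leak)
print(round(noise_per_100(16, 1000), 2), round(noise_per_100(16, 10_000), 2))
```

With those inputs the 1.2 BB/100 leak is well below one standard error at 1,000 hands (about 5 BB/100) and still below it at 10,000 hands (about 1.6 BB/100), which is the sense in which luck swamps the skill signal.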

bisonbison
11-01-2004, 05:08 AM
[ QUOTE ]
do you realize that 8 more pots won will change a crappy session to a good one? if avg pot size= 8bb, then thats 64bb, or 6.4bb/100h extra.

[/ QUOTE ]

Obviously, winning 8 more pots would be great, but there's no way to tell whether the stat is reflecting what you've done or what you've been dealt.

MarkL444
11-01-2004, 05:16 AM
[ QUOTE ]
The differences? The individual opponents and the individual cards.

[/ QUOTE ]

You're crazy. We all know winrate is dependent on intimidation factor caused by your party/empire/etc screenname.

Peter Harris
11-01-2004, 05:35 AM
hey, in the british papers about 12 months ago there was a quote from a derisive fellow who said that "Kerry would be president when the Red Sox win the World Series".

i wonder whether he'll eat his words come this week.

Regards,
Pete Harris

crockett
11-01-2004, 02:30 PM
1) Whichever one flopped better cards.
2) I'll say within 3BB is reasonable within this sample size. 6BB being the extreme but possible.
3) Huh? They play the same. Their skill level is equal.

[ QUOTE ]
Please remember, this post is not about their particular stats, but about what their particular stats show.

[/ QUOTE ]

I would think these stats show that the majority of the time both players are playing tight aggressive poker. You mention they are both winning over this sample. I would also think that it is a good indication that they are winning players as well. Of course, it is no guarantee, but if your BB/100 is positive after 12K hands I think you can be reasonably sure you are at least a break-even player.

Oh, it also shows they auto bet the flop.

John Paul
11-01-2004, 02:54 PM
Hi,
This is an interesting thread, but I am not sure I get it. If the idea is that you cannot easily predict a win rate from Poker Tracker stats, then I certainly agree. If the point is that looking at your stats is not very useful in improving your game, then I somewhat agree, although if you are way too loose or something extreme I think that will show up.

But I don't understand what sample size has to do with it. If you had told me this was based on over 100,000 hands, I still would not be able to predict win rates from these hands.

SofaCoach
11-01-2004, 04:40 PM
[ QUOTE ]
But I don't understand what sample size has to do with it. If you had told me this was based on over 100,000 hands, I still would not be able to predict win rates from these hands.

[/ QUOTE ]
Then I think you missed the whole point. I believe bison is saying that 10k hands is insignificant because you can play the exact same way (same person) for 10k hands and have win rates that vary widely. Over 100k hands the win rates of these two players would be nearly identical statistically.

btspider
11-01-2004, 04:45 PM
[ QUOTE ]
[ QUOTE ]
But I don't understand what sample size has to do with it. If you had told me this was based on over 100,000 hands, I still would not be able to predict win rates from these hands.

[/ QUOTE ]
Then I think you missed the whole point. I believe bison is saying that 10k hands is insignificant because you can play the exact same way (same person) for 10k hands and have win rates that vary widely. Over 100k hands the win rates of these two players would be nearly identical statistically.

[/ QUOTE ]

i think sin808 had the right idea. these stats have nothing to do with winrate at all. someone could play tight aggressive poker the 2+2 way and come up with these numbers and a healthy 2.5 BB/100 winrate.

someone else could play tight aggressive poker with completely craptastic hands (fold AA PF, raise 72o) and come up with the exact same numbers over 100K hands and have a horrible winrate. that's why i asked about won$atSD. without it, you have no idea about the hand selection or luck factors involved and can make no association to winrate. it is a cause and effect stat as bison suggested, but needed to even think about relative winrates.

what stats are useful is in identifying where you may be deviating from the 2+2 TA style. VPIP of 29.. too loose. 1.2 flop aggression factor. not aggressive enough. they speak nothing of the appropriateness of your actions.. merely whether or not the frequency of your actions is falling in line with the suggested style of the players on this site. if your frequency is off in a certain respect.. try to post hands that emphasize decisions at that point.

jason1990
11-01-2004, 04:53 PM
[ QUOTE ]
Over 100k hands the win rates of these two players would be nearly identical statistically.

[/ QUOTE ]

This is absolutely not true and is a common misconception about the "power" of the magic number 100K. Over 100K hands, your observed winrate is likely to still have a standard deviation of somewhere between 0.5 and 1 BB/100. So even if they play exactly the same, it would not be unreasonable to find that after 100K hands, their observed winrates still differ by up to 2 BB/100.
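To put a rough number on this, here is a sketch using a normal approximation. The 16 BB/100 standard deviation is borrowed from earlier in the thread as an assumption, the two players are treated as independent and identical, and the function name is hypothetical.

```python
import math


def prob_gap_exceeds(gap, sd_per_100, hands):
    """P(|observed rate A - observed rate B| > gap) for two identical players.

    Assumes each player's observed BB/100 is normally distributed around the
    same true rate, with independent 100-hand blocks.
    """
    se_one = sd_per_100 / math.sqrt(hands / 100)
    se_diff = math.sqrt(2) * se_one
    # Two-sided normal tail probability, written with erfc.
    return math.erfc(gap / (se_diff * math.sqrt(2)))


for gap in (0.5, 1.0, 2.0):
    print(gap, round(prob_gap_exceeds(gap, 16, 100_000), 3))
```

With an SD of 16 BB/100 the standard error of each observed rate is about 0.5 BB/100 at 100K hands, and a gap of more than 1 BB/100 between two identical players shows up roughly one time in six. A larger SD makes wide gaps more likely still.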

CT11
11-01-2004, 08:19 PM
[ QUOTE ]
[ QUOTE ]
Assuming these stats mean that they play nearly exactly the same. Also assuming that they have the same standard deviation/100 as I do (16bb/100 which is probably high for them) I can say that with this sample size they are probably within 6bb/100 of each other.

[/ QUOTE ]

So under these assumptions, how many hands do they need to play for them to be within 1BB/100 95% of the time?


[/ QUOTE ]

I get 4096*100=409,600 hands to have a 95% chance of your sample win rate (BB/100) being within 1BB (+/-0.5BB) of your actual win rate. Again with an SD of 16BB/100.

[ QUOTE ]

Is the point that if you want to say that you are probably a winner at a given limit (with SD of 16BB/100) after 10,000 hands then you need to be up 6BB/100?


[/ QUOTE ]
At 10,000 hands your win rate is probably correct +/- 3.2 bb.
As for needing to be at +6bb/100 to tell if you're winning for sure, the answer is yes and no. If you took a random 10,000 hands then I would be very confident that 4bb/100 makes you a winner (though your true win rate may be as low as .8 or lower). If you just stopped when you saw +4bb/100 in PT and said 4bb/100, I would say it's meaningless because your samples are now biased.
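The two figures in this post follow from the usual "two standard errors is about 95%" rule of thumb. A minimal sketch, assuming independent 100-hand blocks with an SD of 16 BB each; the function names are made up for illustration.

```python
import math


def hands_needed(sd_per_100, half_width, z=2.0):
    """Hands required so that z standard errors equals `half_width` BB/100."""
    blocks = (z * sd_per_100 / half_width) ** 2
    return math.ceil(blocks) * 100


def margin(sd_per_100, hands, z=2.0):
    """Half-width (in BB/100) of the z-standard-error band after `hands` hands."""
    return z * sd_per_100 / math.sqrt(hands / 100)


print(hands_needed(16, 0.5))   # 409600 hands for +/-0.5 BB/100
print(margin(16, 10_000))      # about 3.2 BB/100 after 10,000 hands
```

Because the margin shrinks with the square root of the hand count, halving it takes four times as many hands.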

sin808
11-02-2004, 12:40 AM
that gave me warm fuzzies :P

NoChance
11-02-2004, 01:39 AM
Player B was obviously drunk during most of his sessions.

DrBob
11-02-2004, 10:21 AM
I've used the actual results (net BB for each hand played) from my poker tracker db as the basis for a statistical experiment. In this experiment, sampling (with replacement) from this population is done for samples of 1000, 5000, and 10000 hands.

For each of these 3 cases, this is repeated 1000 times and the results are sorted by overall outcome (BB/100). This lets you see how often runs of various sizes occur. Here are the results:

Population sample size = 14264
Population BB/100 = 8.269
Statistics for samples of size 1000:
5% -2.925
10% -0.646
50% 8.014
90% 18.383
95% 21.022

Statistics for samples of size 5000:
5% 3.317
10% 4.441
50% 8.409
90% 12.825
95% 13.787

Statistics for samples of size 10000:
5% 4.548
10% 5.177
50% 8.230
90% 11.359
95% 11.978

Taking the last (10000 hand set) for example, this tells you that a result worse than +4.5 BB/100 will occur for 5% of the sets of 10000 hands, and a set of hands with BB/100 exceeding 11.4 BB/100 will occur 10% of the time.
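For anyone who wants to try the same kind of experiment, here is a minimal sketch of resampling with replacement. It is not the code used for the numbers above: the function name is hypothetical, and the population is a made-up stand-in (normally distributed per-hand results averaging about 8 BB/100), since the real per-hand results would come from your own Poker Tracker database.

```python
import random


def bootstrap_percentiles(results_bb, sample_size, trials=1000, seed=0):
    """Resample per-hand results with replacement and return BB/100 percentiles."""
    rng = random.Random(seed)
    rates = sorted(
        100 * sum(rng.choices(results_bb, k=sample_size)) / sample_size
        for _ in range(trials)
    )
    # Integer arithmetic picks the index of each percentile in the sorted list.
    return {
        p: rates[min(trials - 1, p * trials // 100)]
        for p in (5, 10, 50, 90, 95)
    }


# Made-up stand-in population: net big bets won per hand (an assumption,
# real per-hand results are far less bell-shaped than this).
_rng = random.Random(1)
population = [_rng.gauss(0.08, 2.0) for _ in range(5000)]

for size in (1000, 5000):
    print(size, bootstrap_percentiles(population, size, trials=200))
```

The spread between the 5% and 95% rows should narrow as the sample size grows, roughly with the square root of the number of hands, which is the same pattern as in the tables above.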

What does this mean? For me, playing at my very easy 0.25/0.50 game, I'll virtually never have a really extended losing streak. Yes, you really can average 8 BB/100 at these games, long term. With a variation (90% confidence) running roughly +/- 4 BB/100 over 10000 hands I'm willing to believe that maybe I'm only a +4 BB/100 player for my 14,000 hands, but almost certainly not worse than that. You really can beat these easy games by that much. I definitely do NOT consider myself a good player by higher stakes standards.

How to reconcile this with "common wisdom" about variance? Well, if your true BB/100 is only +1 or +2, much more characteristic of a winning player in a stronger game than I play, the +/- 4 BB/100 range (for 10000 hand samples) indicates that the chance of an extended losing streak isn't small at all. So both are consistent.

I'm going to next build a population based on all players in my database (much much larger), which should have a net small negative expectation (given rake) and see what a statistical analysis shows. But I need to earn a living now, so that'll be another day.

Bob

Sand
11-02-2004, 04:57 PM
So what were the W$SD numbers, now that you have let the cat out of the bag? I'm curious.

k000k
11-02-2004, 06:49 PM
[ QUOTE ]
player B is clearly better than player A because he is tighter and more aggressive

[/ QUOTE ]

That was my thought.