Two Plus Two Older Archives  

  #1  
Old 06-06-2005, 08:23 PM
Dazarath
Senior Member
 
Join Date: Nov 2004
Posts: 185
Default Confidence Intervals

This isn't really probability, but I figure that since a lot of mathematics-related questions are posted in this forum, this is the best place to ask. Given my current winrate and standard deviation per 100 hands from Poker Tracker, how do I go about calculating confidence intervals for my true winrate? My thanks to anyone who can help me out.
  #2  
Old 06-06-2005, 08:52 PM
gaming_mouse
Senior Member
 
Join Date: Oct 2004
Location: my hero is sfer
Posts: 2,480
Default Re: Confidence Intervals

SD = your sample standard deviation (in BB/100)
sample_size = (number of hands)/100
sample_winrate = your sample winrate

Your standard error (SE) is:

SD/sqrt(sample_size)

A 95% CI for your true winrate is sample_winrate +- 2*SE
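
For anyone who would rather script this than use a spreadsheet, here is a minimal Python sketch of the steps above. The function name and the example inputs (3 BB/100 over 40,000 hands with an SD of 16 BB/100) are made up for illustration; the multiplier of 2 is the rule of thumb from this post, and 1.96 is the more exact normal value.

```python
import math

def winrate_confidence_interval(sample_winrate, sd_per_100, num_hands, z=2.0):
    """Return (low, high) bounds for the true winrate, in BB/100.

    z=2.0 matches the rule of thumb above; pass z=1.96 for the more
    exact normal-approximation value of a 95% interval.
    """
    sample_size = num_hands / 100              # number of 100-hand blocks
    se = sd_per_100 / math.sqrt(sample_size)   # standard error in BB/100
    return sample_winrate - z * se, sample_winrate + z * se

# Hypothetical inputs, not taken from anyone's real results.
low, high = winrate_confidence_interval(3.0, 16.0, 40_000)
print(f"95% CI: {low:.2f} to {high:.2f} BB/100")
```

This relies on the normal approximation, so it is only a reasonable guide when the number of hands is large.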

HTH,
gm
  #3  
Old 06-06-2005, 09:11 PM
BruceZ
Senior Member
 
Join Date: Sep 2002
Posts: 1,636
Default Re: Confidence Intervals

[ QUOTE ]
This isn't really probability, but I figure if that since a lot of mathematics related questions are posted in this forum, this is the best place to ask. Given my current winrate and standard deviation per 100 hands from Poker Tracker, how do I go about calculating confidence intervals for my true winrate? My thanks to anyone who can help me out.

[/ QUOTE ]

Try this post.
  #4  
Old 06-06-2005, 09:39 PM
Orpheus
Senior Member
 
Join Date: Apr 2005
Posts: 178
Default Re: Confidence Intervals

This is the thread I refer people to. It includes a small spreadsheet formula to stick in the corner of your overall poker spreadsheet [but read down to the corrections]
  #5  
Old 06-06-2005, 10:01 PM
Dazarath
Senior Member
 
Join Date: Nov 2004
Posts: 185
Default Re: Confidence Intervals

Ok, thanks for the responses guys, I'll read over everything right now.
  #6  
Old 06-07-2005, 05:18 PM
Jerrod Ankenman
Member
 
Join Date: Jun 2004
Posts: 40
Default Re: Confidence Intervals

[ QUOTE ]
This is the thread I refer people to. It includes a small spreadsheet formula to stick in the corner of your overall poker spreadsheet [but read down to the corrections]

[/ QUOTE ]

By the way, the post that BruceZ linked to makes reasonable statements. The post that is linked to in this post contains a lot of statements of dubious value. In fact, repeatedly utilizing the methodology there will lead to making false statements with probability 1.

Jerrod Ankenman
  #7  
Old 06-07-2005, 11:01 PM
uuDevil
Senior Member
 
Join Date: Jul 2003
Location: Remembering P. Tillman
Posts: 246
Default Re: Confidence Intervals

[ QUOTE ]
[ QUOTE ]
This is the thread I refer people to. It includes a small spreadsheet formula to stick in the corner of your overall poker spreadsheet [but read down to the corrections]

[/ QUOTE ]

By the way, the post that BruceZ linked to makes reasonable statements. The post that is linked to in this post contains a lot of statements of dubious value. In fact, repeatedly utilizing the methodology there will lead to making false statements with probability 1.

Jerrod Ankenman

[/ QUOTE ]

That post has been referred to quite often on these forums. Would you (or perhaps BruceZ, pzhon, or other resident mathematician) please specify what these dubious statements are?

By the way, when does your book come out?
  #8  
Old 06-08-2005, 11:43 AM
Jerrod Ankenman
Member
 
Join Date: Jun 2004
Posts: 40
Default Re: Confidence Intervals

[ QUOTE ]
[ QUOTE ]
[ QUOTE ]
This is the thread I refer people to. It includes a small spreadsheet formula to stick in the corner of your overall poker spreadsheet [but read down to the corrections]

[/ QUOTE ]

By the way, the post that BruceZ linked to makes reasonable statements. The post that is linked to in this post contains a lot of statements of dubious value. In fact, repeatedly utilizing the methodology there will lead to making false statements with probability 1.

Jerrod Ankenman

[/ QUOTE ]

That post has been referred to quite often on these forums. Would you (or perhaps BruceZ, pzhon, or other resident mathematician) please specify what these dubious statements are?

By the way, when does your book come out?

[/ QUOTE ]

Our book discusses this topic a little, but it's one thing to criticize a flawed methodology and another to substitute a different one. Basically, the problem at hand is inferring a win rate from observed data.

If you KNOW the win rate (because Allah told you or something), then you can create a confidence interval for a sample, as long as your sample is big enough to satisfy the Central Limit Theorem (which it normally will be for poker-size HAND samples; tournaments are another thing entirely).

But that's not what we're doing here. Here we want to take an observed (sample) win rate and turn it into a true (population) win rate. Now in a lot of fields, we sorta just use x-bar (the sample mean) as the best estimator of the population mean and s, the sample standard deviation, as the best estimator of the population standard deviation, because we don't have any other data to work with.

But in poker, we do have a lot of other data that tells us something about the a priori distribution of all poker players. Say we have a player who has won 6 bb/100 hands over 10,000 hands (600 bb), with a variance of 4bb^2/h. Now using these parameters as best estimators, we get that the standard error of a 10,000 hand sample is 200 bb. So using a normal distribution with stddev 200 bb and mean 600, we are led to the following statements:

It is 95% likely that this player's win rate is between 2bb/100 and 10bb/100.

It is equally likely that this player's win rate is above 6bb/100 as it is that it is below 6bb/100.

It is equally likely that this player's win rate is 10bb/100 as it is that the player's rate is 2bb/100.
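
The arithmetic behind those numbers can be checked with a short Python sketch. The variable names are illustrative; the inputs are the ones stated in the example (4 bb^2 per hand, 10,000 hands, 600 bb won), and the interval uses the same two-standard-error rule.

```python
import math

variance_per_hand = 4.0   # bb^2 per hand, from the example
hands = 10_000
total_won = 600.0         # bb over the sample, i.e. 6 bb/100

sd_per_hand = math.sqrt(variance_per_hand)   # 2 bb per hand
se_total = sd_per_hand * math.sqrt(hands)    # standard error of the total, in bb
low_total = total_won - 2 * se_total
high_total = total_won + 2 * se_total

hundreds = hands / 100                       # number of 100-hand blocks
low_rate = low_total / hundreds              # lower bound in bb/100
high_rate = high_total / hundreds            # upper bound in bb/100
print(se_total, low_rate, high_rate)
```

The interval is symmetric around 6 bb/100, which is why the naive reading treats 2 bb/100 and 10 bb/100 as equally likely.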

Now I ask the following question: is it more likely that this player is a decent winner having a result a couple of standard deviations to the right of his mean? Or is it more likely that he's really a player with a rate that is some multiple of the highest reliable rates reported? Of course the former. Bayes' theorem allows us to incorporate information about the distribution of all poker players into our analysis. What this leads to is that the formula in that post will likely overestimate the likelihood that a player is a winner (for the players who will actually use it), and also overestimate the probability that a player has a very high win rate. This is especially true because players who are trying to prove that they are winners after a reasonably small sample will normally be experiencing highly positive results relative to their mean.

Unfortunately, I don't really have any useful ideas for incorporating the distribution of all poker player win rates, since a) I don't know it and b) it probably wouldn't be expressible in closed form anyway, and so wouldn't be that useful for finding a formula to answer the question "how many hands do I need to be a winning player?"

I hate criticizing someone else's work without offering a correction or a different methodology; but in this case, I hope it's clear why this method is so biased.

We will be wrapping up the first complete draft of our book in the next two weeks or so; then there's some sort of long editing process, and so on. But the book should be in stores late this fall.

Jerrod Ankenman
  #9  
Old 06-08-2005, 04:12 PM
BruceZ
Senior Member
 
Join Date: Sep 2002
Posts: 1,636
Default Re: Confidence Intervals

You apparently do not understand a very important and subtle thing which most people also do not understand. You need to understand this before you put the above in a book, or your book will be seriously flawed.

Homer's method of analysis is fundamentally valid. Your interpretation of what his analysis says is incorrect.

[ QUOTE ]
But in poker, we do have a lot of other data that tells us something about the a priori distribution of all poker players. Say we have a player who has won 6 bb/100 hands over 10,000 hands (600 bb), with a variance of 4bb^2/h. Now using these parameters as best estimators, we get that the standard error of a 10,000 hand sample is 200 bb. So using a normal distribution with stddev 200 bb and mean 600, we are led to the following statements:

It is 95% likely that this player's win rate is between 2bb/100 and 10bb/100.

It is equally likely that this player's win rate is above 6bb/100 as it is that it is below 6bb/100.

It is equally likely that this player's win rate is 10bb/100 as it is that the player's rate is 2bb/100.

[/ QUOTE ]

These statements are all incorrect, and they do NOT follow from the analysis of confidence intervals that Homer has provided. This statement from Homer's post:

[ QUOTE ]
"I am 95% confident that my true win rate is between -$24 and $102 per table-hr."


[/ QUOTE ]

IS correct. The difference is the word "confident". Confidence has a mathematically defined meaning distinct from "likely" or "probability". Homer's statement does not mean that there is a 95% probability that his true win rate is in this range. It means that IF the true win rate were above this range, then the probability of observing a win rate equal to or less than the observed win rate would be less than 2.5%, and IF the true win rate were below this range, then the probability of observing a win rate equal to or greater than the observed win rate would also be less than 2.5%.
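
The tail probabilities in that definition can be computed directly. This is a small sketch using Python's standard library, reusing the 6 bb/100 observed rate and the 2 bb/100 standard error from the quoted example as illustrative inputs, and assuming the observed rate is normally distributed around the true rate.

```python
from statistics import NormalDist

observed = 6.0   # bb/100, observed win rate from the quoted example
se = 2.0         # bb/100, standard error from the quoted example
low, high = observed - 2 * se, observed + 2 * se

# If the true rate sat exactly at the upper endpoint, how often would we
# observe a rate this low or lower?
p_low_obs = NormalDist(mu=high, sigma=se).cdf(observed)

# If the true rate sat exactly at the lower endpoint, how often would we
# observe a rate this high or higher?
p_high_obs = 1 - NormalDist(mu=low, sigma=se).cdf(observed)

print(round(p_low_obs, 4), round(p_high_obs, 4))
```

Both values come out just under 2.5%, and they shrink further as the hypothetical true rate moves outside the interval.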

Another way to look at it is to note that if a large number of people constructed their 95% confidence interval in this manner, then 95% of the people would have their true win rate fall in this interval. This is because before they played any hands, there was a 95% probability that their observed win rate would fall within +/- 2 standard deviations of their true win rate. Conversely then, there was a 95% probability that their true win rate would lie within +/- 2 standard deviations of their observed win rate. After they play the hands, they then construct a +/- 2 standard deviation confidence interval around their observed win rate. However, it is no longer correct to say that their true win rate has a 95% probability of lying within this interval because their true win rate and their observed win rate are not probability distributions; they are fixed numbers. Thus we characterize the situation by saying that there is a 95% confidence that the true win rate lies within this interval.
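
The "large number of people" argument is easy to simulate. The sketch below is illustrative only: the spread of true win rates (normal with SD 3 bb/100) and the standard error (2 bb/100) are assumptions chosen for the demo, and the coverage does not depend on the first of them.

```python
import random

random.seed(0)       # fixed seed so the run is repeatable
se = 2.0             # bb/100, assumed standard error of each player's sample
players = 20_000
covered = 0
for _ in range(players):
    # Assumed spread of true win rates across players (hypothetical).
    true_rate = random.gauss(0.0, 3.0)
    # Each player's observed rate is the true rate plus sampling noise.
    observed = random.gauss(true_rate, se)
    # Does the +/- 2 SE interval around the observation contain the truth?
    if observed - 2 * se <= true_rate <= observed + 2 * se:
        covered += 1
coverage = covered / players
print(f"coverage: {coverage:.3f}")
```

The fraction of intervals that contain the true rate should land near 95%, even though any single player's interval either contains his true rate or does not.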

Note that the center of this interval is the observed win rate, or the "sample mean", and this is termed the "maximum likelihood estimate" of the mean, assuming that we have enough samples that the sample mean is well-approximated by a normal distribution. This term "maximum likelihood estimate" does not mean that this is the most likely value of the mean. It means that this is the value of the mean which maximizes the likelihood of the observed win rate.
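
A brute-force check of that last sentence, as a sketch: scan candidate values of the true win rate and keep the one under which the observed rate has the highest likelihood. The observed rate, standard error, and grid of candidates are illustrative assumptions.

```python
from statistics import NormalDist

observed, se = 6.0, 2.0   # bb/100, illustrative values
candidates = [x / 10 for x in range(-50, 151)]   # -5.0 to 15.0 bb/100

# Likelihood of the observation under each candidate true win rate.
likelihood = {m: NormalDist(mu=m, sigma=se).pdf(observed) for m in candidates}
mle = max(likelihood, key=likelihood.get)
print(mle)
```

The maximizing candidate is the observed rate itself, which says nothing about how probable that value is as the true rate.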

This is a completely accurate and mathematically correct statement. It is the only type of statement that can be made when using this type of statistics, called "maximum likelihood estimation". The type of analysis you are proposing, whereby we take into account our prior knowledge about the population distribution and actually compute a probability distribution for our win rate, is called "Bayesian estimation".

[ QUOTE ]
Here we want to take an observed (sample) win rate and turn it into a true (population) win rate.

[/ QUOTE ]

That is NOT what we are doing with maximum likelihood estimation, which Homer is performing. This is what we would do with Bayesian estimation.

Now the argument that Bayesian estimation is the only valid form of estimation that should be used is an argument that some statisticians make. It is controversial for the very reason that you state:

[ QUOTE ]
Unfortunately, I don't really have any useful ideas as for incorporating the distribution of all poker player win rates, since a) I don't know it and b) it wouldn't be expressible in closed form.

[/ QUOTE ]

This is precisely the circumstance under which you would use maximum likelihood estimation, and why the use of Bayesian estimation is controversial. The result that you get from Bayesian estimation depends on what prior distribution you assume, and different people will assume different prior distributions. At any rate, the maximum likelihood method which Homer has presented, when understood with the proper mathematical definitions, is mathematically correct.
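
To see how much the answer can depend on the prior, here is a sketch using the standard normal-normal conjugate update. Both priors are invented for illustration and are not estimates of the real distribution of poker win rates; the observed rate and standard error are the 6 bb/100 and 2 bb/100 from the quoted example.

```python
import math

def normal_posterior(prior_mean, prior_sd, observed, se):
    """Conjugate normal update; returns (posterior mean, posterior sd)."""
    w_prior, w_data = 1 / prior_sd**2, 1 / se**2   # precisions
    post_var = 1 / (w_prior + w_data)
    post_mean = post_var * (w_prior * prior_mean + w_data * observed)
    return post_mean, math.sqrt(post_var)

observed, se = 6.0, 2.0   # bb/100
# Two made-up priors over true win rates, given as (mean, sd) in bb/100.
priors = {"skeptical": (0.0, 1.0), "loose": (0.0, 4.0)}
for name, prior in priors.items():
    mean, sd = normal_posterior(*prior, observed, se)
    print(f"{name}: posterior mean {mean:.2f}, sd {sd:.2f}")
```

The same data gives noticeably different posterior means under the two priors, which is the dependence described above.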

[ QUOTE ]
I hate criticizing someone else's work without offering a correction or a different methodology; but in this case, I hope it's clear why this method is so biased

[/ QUOTE ]

It is interesting that you used the word "biased" because this has a precise mathematical definition. Bayesian estimates are biased as a result of taking into account our preconceived notions about the prior probability distribution. On the other hand, the maximum likelihood estimate of the win rate is an unbiased estimate because the expected value of the estimate is equal to the true win rate.
  #10  
Old 06-08-2005, 07:06 PM
Siegmund
Senior Member
 
Join Date: Feb 2005
Posts: 415
Default Re: Confidence Intervals

[ QUOTE ]
On the other hand, the maximum likelihood estimate of the win rate is an unbiased estimate because the expected value of the estimate is equal to the true win rate.

[/ QUOTE ]

No time like the present to remind y'all that MLEs don't come with an automatic guarantee of being unbiased either (and for non-symmetric distributions they usually aren't), only that they asymptotically approach the unbiased value in an efficient fashion.
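
A standard textbook illustration, sketched in Python: for normal data the MLE of the mean is unbiased, but the MLE of the variance (which divides by n rather than n - 1) is biased low by a factor of (n - 1)/n. The sample size, trial count, and seed below are arbitrary choices for the demo.

```python
import random

random.seed(1)            # fixed seed so the run is repeatable
n, trials = 5, 50_000     # small samples make the bias easy to see
total = 0.0
for _ in range(trials):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]   # true variance is 1.0
    mean = sum(xs) / n
    total += sum((x - mean) ** 2 for x in xs) / n     # MLE divides by n
avg_mle_var = total / trials
print(f"average MLE variance: {avg_mle_var:.3f} (true variance is 1.0)")
```

With n = 5 the average should come out near 0.8 rather than 1.0, and the gap closes as n grows.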

Not commenting on the actual question at hand, except to affirm the general idea that there are a few different types of calculations you can do, each of which requires a certain set of assumptions that are often glossed over.