Two Plus Two Older Archives  

  #1  
03-23-2004, 08:27 PM
BillC
Member
Join Date: Sep 2002
Posts: 43
Estimating EV and SD

Assume an SD of about 10 big bets (BB) per hour of play and a confidence level of 95%.
Then it takes at least 40,000 hours of play to estimate your hourly EV to within 0.1 BB. It takes at least 10,000 hours for a margin of error of 0.2 BB, and 1,600 hours for a margin of error of 0.5 BB. General formula: n = (2*SD/E)^2 = 4*variance/E^2, where variance = SD^2 and E = margin of error. Consequently, very few people have a good fix on their EV, and there is a lot of wild guesswork published.

Otoh, it takes only about 200 hours to estimate one's SD to within 10% (using the chi-square distribution).
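
As a rough check on these figures, the hour counts quoted above are consistent with the usual normal-approximation sample size n = (z*SD/E)^2, with z rounded to 2 for 95% confidence. The sketch below is illustrative only: the function names are made up here, and the SD calculation assumes normally distributed results and uses the large-sample rule that the relative standard error of an SD estimate is about 1/sqrt(2n), standing in for an exact chi-square calculation.

```python
import math

def hours_for_ev_margin(sd_per_hour, margin, z=2.0):
    """Hours needed so that z standard errors of the mean equal `margin`."""
    return (z * sd_per_hour / margin) ** 2

def samples_for_sd_fraction(fraction, z=1.96):
    """Samples needed to pin down the SD to within `fraction` of its value.

    Assumes normally distributed results, where the relative standard
    error of the sample SD is roughly 1 / sqrt(2 * n).
    """
    return math.ceil((z / fraction) ** 2 / 2)

for margin in (0.1, 0.2, 0.5):
    hours = hours_for_ev_margin(10.0, margin)
    print(f"EV within {margin} BB/hour: about {hours:,.0f} hours")

print("SD within 10%:", samples_for_sd_fraction(0.10), "samples")
```

With z = 2 this gives roughly 40,000, 10,000 and 1,600 hours for the three margins, and a little under 200 samples for the SD, in line with the figures in the post.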
  #2  
03-26-2004, 08:43 AM
pzhon
Member
Join Date: Mar 2004
Posts: 66
Re: Estimating EV and SD

[ QUOTE ]
Otoh, it takes only about 200 hours to estimate one's SD to within 10% (using chi-square distribution).

[/ QUOTE ]

Ok, I'll bite. What's the method of estimating the error in the standard deviation estimate? I'm more familiar with probability than statistics, but don't you typically need to assume something about the 4th moment of the distribution? That's easy to estimate coarsely in limit poker, but tough in no-limit.

Btw, I like your work on www.bjmath.com .
  #3  
03-26-2004, 12:51 PM
BruceZ
Senior Member
Join Date: Sep 2002
Posts: 1,636
Re: Estimating EV and SD

Here's an old post of mine that explains how to do it.
  #4  
03-26-2004, 12:53 PM
BillC
Member
Join Date: Sep 2002
Posts: 43
Re: Estimating EV and SD

I'll have to refer you to a basic stat book or website. If you just want "answers", a summary table is given in Triola's Elementary Statistics. Things are streamlined if you ask for the margin of error for the SD as a percentage. Otherwise you have to get into the chi-square distribution.
This model assumes that hourly results are approximately normally distributed.

The large number of hours needed to estimate EV casts doubt on all the wild claims about 1 to 1.5 BB per hour, as no data is produced to back up these claims (as far as I know). It would be interesting if people with hundreds of hours of play would post their sample EVs and SDs (in BBs, broken down by game type).
  #5  
03-27-2004, 11:34 PM
pzhon
Member
Join Date: Mar 2004
Posts: 66
Re: Estimating EV and SD

Thanks for the reference. In no-limit poker, the assumption that the result of each hour follows a normal distribution fails badly. It is questionable in limit poker. Are you sure a method that works for normal distributions works for distributions with large tails, and if so, why?
  #6  
03-29-2004, 01:30 PM
BruceZ
Senior Member
Join Date: Sep 2002
Posts: 1,636
Re: Estimating EV and SD

As I mentioned in the follow-up posts, you shouldn't use hourly results since they are not normally distributed. Instead, use the results of N hours to compute your standard deviation, where the larger N is, the more normal the results will be by the central limit theorem. Even just adding a few hourly results together should produce results which are much more normal. Essentially the hourly distribution is being convolved with itself N times. You could use session results, and use the formula in the essay section to compute the standard deviation for sessions of variable length. The chi-square distribution can then be used to determine the accuracy of the estimate.

Here's an example. In a post above, BillC stated:

Assume a confidence level of 95%...it takes only about 200 hours to estimate one's SD to within 10% (using chi-square distribution).

I concur with this result as long as "200 hours" is replaced by "200 sessions" where the sessions are long enough to produce normally distributed results. As long as the sessions are sufficiently long to meet this requirement, it is the number of sample points or sessions used to compute the standard deviation which determines the accuracy, not the number of hours. Also, for 200 sample points, we don't need the chi-square distribution anymore because for N > 100, a chi-square distribution with N degrees of freedom is essentially a normal distribution with mean N and variance 2N. Now we can see where this result comes from using a standard normal distribution table as follows.

10% accuracy means that we want

.9*sigma < SD < 1.1*sigma

where SD is our estimate, and sigma is the true standard deviation. Let var = SD^2 be our estimate of the variance. Squaring all three terms gives

0.81*sigma^2 < var < 1.21*sigma^2

Now divide through by sigma^2.

0.81 < var/sigma^2 < 1.21

Now multiply through by N sessions.

0.81*N < N*var/sigma^2 < 1.21*N

In our case, N = 200, so this becomes

162 < 200*var/sigma^2 < 242

The term in the middle, 200*var/sigma^2, is the thing that is distributed as a "chi-square distribution with 199 degrees of freedom", but this is essentially a normal distribution with mean 199 and variance 2*199 = 398. We can convert this to a standard normal distribution (mean = 0 and sigma = 1) by subtracting the mean of 199 and dividing by the standard deviation of sqrt(398). This is called "normalizing".

(162 - 199) / sqrt(398) < (200*var/sigma^2 - 199) / sqrt(398) < (242 - 199) / sqrt(398)

-1.85 < z < 2.16

where z is a standard normal variable. Now we can use the standard normal distribution table to look up the cumulative probabilities corresponding to 2.16 and -1.85 standard deviations and subtract them, or use the Excel function NORMSDIST, which is the cumulative standard normal distribution.

NORMSDIST(2.16) = 98.4%
NORMSDIST(-1.85) = 3.2%
98.4% - 3.2% = 95.2%.

So the confidence level is 95.2%: we have 95.2% confidence that the SD is accurate to within 10% after 200 sessions.

The above method can be used for any N over about 100. Below 100, the only difference is that we replace the normal distribution table with a table of the chi-square distribution, and refer to the line corresponding to N-1.
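
The normal-table calculation above can be reproduced in a few lines. This is a minimal sketch, not part of the original post: the function names are invented for illustration, and it keeps the same approximation of replacing the chi-square distribution with N-1 degrees of freedom by a normal distribution with mean N-1 and variance 2(N-1).

```python
import math

def normal_cdf(x):
    """Cumulative standard normal distribution (what Excel calls NORMSDIST)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def sd_confidence(n, fraction=0.10):
    """Approximate probability that the sample SD is within `fraction` of sigma.

    Treats n * var / sigma^2 as chi-square with n - 1 degrees of freedom and
    replaces that chi-square by a normal with mean n - 1 and variance 2(n - 1).
    """
    dof = n - 1
    lower = n * (1.0 - fraction) ** 2   # 0.81 * n for a 10% tolerance
    upper = n * (1.0 + fraction) ** 2   # 1.21 * n for a 10% tolerance
    scale = math.sqrt(2.0 * dof)
    z_lower = (lower - dof) / scale
    z_upper = (upper - dof) / scale
    return normal_cdf(z_upper) - normal_cdf(z_lower)

print(f"{sd_confidence(200):.3f}")
```

For n = 200 the result is about 95%, matching the table lookup. Because the approximation is only reasonable for large n, the sketch should not be trusted for n much below 100.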
  #7  
03-30-2004, 08:22 AM
pzhon
Member
Join Date: Mar 2004
Posts: 66
Re: Estimating EV and SD

[ QUOTE ]

This model assumes that hourly results are approximately normally distributed.

[/ QUOTE ]

I don't accept that assumption. The result of each hand is far from normally distributed, so it takes more than 200 hands to estimate the variance per hand accurately. The result of an hour is not normally distributed, so why would 200 hours suffice? If it would, how about 200 half-hours? 200 orbits?

Some people guess that the variance in backgammon is about 8 square points per game. My guess is that it doesn't exist in theory, but under some circumstances it is about 16 in practice. I really can't estimate it accurately yet. In roughly 20,000 games of backgammon, I've seen one game end on a 256-cube. If 256-cubes should happen once every 20,000 games then they contribute 3 to the variance, assuming single wins. However, with a count of 1 in 20,000 trials, the confidence is only about 92% that the real frequency is between 1/4000 and 1/500,000. If 256-cubes occur every 4000 games then they contribute about 16 to the variance, but if they happen every 500,000 games their contribution to the variance is about .1. This alone means I don't have an estimate of the variance in backgammon within 10% with high confidence after 20,000 games.
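
The arithmetic in the backgammon example can be checked with a short sketch. It rests on assumptions that are not spelled out in the post: a 256-cube game is treated as a win or loss of exactly 256 points, the mean result is taken as negligible, and the number of such games in 20,000 trials is modeled as Poisson. Under that model, one way to arrive at a figure near 92% is to exclude rates so high that seeing at most one event would be unlikely, and rates so low that seeing any event would be unlikely. The names below are illustrative.

```python
import math

CUBE = 256          # points won or lost on a single win with the cube on 256
GAMES = 20_000      # games observed, with exactly one 256-cube game

def variance_contribution(games_per_event, points=CUBE):
    """Contribution of a rare +/-points outcome to the per-game variance."""
    return points ** 2 / games_per_event

def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson variable with the given mean."""
    return sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k + 1))

for games_per_event in (4_000, 20_000, 500_000):
    print(games_per_event, round(variance_contribution(games_per_event), 2))

# Chance of seeing at most 1 event if the true rate is as high as 1 in 4,000,
# and of seeing at least 1 event if it is as low as 1 in 500,000.
p_high_rate = poisson_cdf(1, GAMES / 4_000)
p_low_rate = 1.0 - poisson_cdf(0, GAMES / 500_000)
confidence = 1.0 - p_high_rate - p_low_rate
print(f"confidence: {confidence:.2f}")
```

The contributions come out near 16, 3 and 0.1 square points for the three frequencies, and the confidence comes out close to 92%.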

As I stated, I believe the usual non-asymptotic ("effective") improvements of the Central Limit Theorem involve an upper bound on the 4th moment. I think that offers a way to determine how long you need to play to be able to estimate the standard deviation accurately.

As an alternative in limit poker, you could estimate the probabilities of each of the finitely many results if you have recorded the hand-by-hand results.
  #8  
03-30-2004, 11:06 AM
pzhon
Member
Join Date: Mar 2004
Posts: 66
Re: Estimating EV and SD

[ QUOTE ]
As I mentioned in the follow-up posts, you shouldn't use hourly results since they are not normally distributed. Instead, use the results of N hours to compute your standard deviation, where the larger N is, the more normal the results will be by the central limit theorem. Even just adding a few hourly results together should produce results which are much more normal.

[/ QUOTE ]

How do you know how many hours to put together to call a session with roughly normal results?

More concretely, suppose each hand has roughly a 9/10 chance of being worth +-1, about a 1/10 chance of being worth +-5, about a 1/100 chance of being worth +-25, and about a 1/1000 chance of being worth +-100. Each "about" means the figure may be half as much to twice as much. How many hands do you need to play to estimate the variance accurately?

Sorry to sound like a broken record, but I raised this question in my first post on this thread, and it has not been answered here.
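
One hedged way to put a number on the hypothetical hand distribution above is through the fourth moment. For large samples, the variance of the sample variance is roughly (mu4 - sigma^4)/n, so the relative standard error of the sample SD is about 0.5*sqrt((kurtosis - 1)/n). The sketch below applies that large-sample rule; it takes the stated chances at face value, normalizes them to sum to 1, and assumes each result is equally likely to be a win or a loss. None of this comes from the thread itself, and the names are illustrative.

```python
# Hypothetical hand distribution from the post: (size of result, rough weight).
# Each result is taken to be + or - with equal chance, so the mean is zero.
OUTCOMES = [(1, 0.9), (5, 0.1), (25, 0.01), (100, 0.001)]

def moment(order, outcomes=OUTCOMES):
    """Raw moment of |result|, after normalizing the weights to sum to 1."""
    total = sum(weight for _, weight in outcomes)
    return sum(weight * size ** order for size, weight in outcomes) / total

def hands_for_sd_fraction(fraction=0.10, z=1.96, outcomes=OUTCOMES):
    """Large-sample estimate of hands needed for the SD to be within `fraction`.

    Uses Var(sample variance) ~ (mu4 - sigma^4) / n, so the relative standard
    error of the sample SD is about 0.5 * sqrt((kurtosis - 1) / n).
    """
    variance = moment(2, outcomes)
    kurtosis = moment(4, outcomes) / variance ** 2
    return (z / (2.0 * fraction)) ** 2 * (kurtosis - 1.0)

print(f"variance per hand: {moment(2):.1f}")
print(f"kurtosis: {moment(4) / moment(2) ** 2:.0f}")
print(f"hands needed: {hands_for_sd_fraction():,.0f}")
```

For a normal distribution the kurtosis is 3 and the same formula gives about 192 samples, consistent with the 200 figure earlier in the thread. For this heavy-tailed distribution the kurtosis is in the hundreds, so the requirement rises to tens of thousands of hands, and it shifts substantially if the rare outcomes are half or twice as likely as stated.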
  #9  
03-30-2004, 06:16 PM
BillC
Member
Join Date: Sep 2002
Posts: 43
Re: Estimating EV and SD

Good point about normality. I don't think 4th moments are really relevant b/c you have a sum of a bunch of i.i.d. random variables, called "hands". It would be nice to test the normality of hourly results. I did boost the required number of hours to n=200 b/c of inaccuracies such as departures from normality. To those with a lot of records:

Are your hourly results "bell-shaped"?
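
For anyone with records who wants a quick check of "bell-shaped", a simple starting point is the sample skewness and excess kurtosis, both of which are near zero for normally distributed data. The sketch below is only an illustration: the function name and the data are made up, and it uses the plain moment-based formulas without small-sample corrections or a formal significance test.

```python
def shape_statistics(results):
    """Return (skewness, excess kurtosis) of a list of hourly results.

    Both are close to 0 for normally distributed data. These are the simple
    moment-based versions, without small-sample bias corrections.
    """
    n = len(results)
    mean = sum(results) / n
    m2 = sum((x - mean) ** 2 for x in results) / n
    m3 = sum((x - mean) ** 3 for x in results) / n
    m4 = sum((x - mean) ** 4 for x in results) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Made-up hourly results in big bets, for illustration only.
hourly_results = [-2.0, -1.0, 0.0, 1.0, 2.0]
skewness, excess_kurtosis = shape_statistics(hourly_results)
print(skewness, excess_kurtosis)
```

A large positive excess kurtosis in real hourly data would point to heavy tails, which is the situation where the normal-based sample sizes in this thread are too optimistic.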
All times are GMT -4.

