Confidence intervals & sample size

JayKon · #1 10-28-2004, 11:35 AM

I want to know how many hands a person plays overall to gauge them on a tight/loose scale.

The calculation is simple enough: Hands Played/Hands Dealt * 100 = Percent hands played. (<20% tight, >=20% & <30% semi-loose, etc).

What I don't know is how to calculate the confidence and sample size needed.

If I see David Sklansky dealt 10 hands and he plays 6, I can conclude that he plays 60% of all hands with a confidence level of (roughly) 0%.

On the other hand, if I see him dealt 30,000,000 hands and he plays 4,500,000 of them, I can conclude he plays 15% of all hands with a confidence of roughly (or greater than) 100%.

So, how many hands do I need to see David play, in order to have 50% confidence in the % of hands played? 75%? 90%?

What, if anything, happens to the calculation if I measure more than one thing, like the % of time he raises, given that he has entered the pot?

I know the answer involves the 1,326 possible starting combinations, not the 169 unique starting hands, and will probably need a Z value, but that's all.

Thanks,
Jay

irchans · #2 10-28-2004, 02:43 PM

The half-width (the + or - part) of a 95% confidence interval is at most about
1/Sqrt("number of observations"). (About two Standard Deviations)

So for your first example, if you observe 6 hands played out of 10, then the confidence interval size is about 1/sqrt(10) = 32%, so you can conclude that he plays between 28% and 92% of his hands. (This is a rough estimate--better estimates are possible.)

For your second example, you saw him play 4.5 million out of 30 million hands, so 2 standard deviations is about 1/Sqrt(30000000) = 0.02%. You can conclude that the player plays between 14.98% and 15.02% of his hands with 95% confidence.
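
A minimal Python sketch of this rule of thumb, for plugging in other counts. The function name `rough_interval` is made up for illustration, and clipping the bounds to the 0 to 1 range is an added assumption, not part of the rule itself.

```python
import math

def rough_interval(hands_played, hands_dealt):
    """Approximate 95% interval using the 1/sqrt(n) rule of thumb."""
    p = hands_played / hands_dealt
    # Two standard deviations is at most about 1/sqrt(n)
    half_width = 1 / math.sqrt(hands_dealt)
    # Assumption: clip to [0, 1], since a proportion cannot fall outside it
    return max(0.0, p - half_width), min(1.0, p + half_width)

low, high = rough_interval(6, 10)
print(f"{low:.0%} to {high:.0%}")
```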

Cheers,
Irchans

IsaacW · #3 10-28-2004, 02:58 PM

I think we can look at this as trying to estimate the proportion of hands played to hands dealt. Define the following:

Nd = # of hands dealt (sample size)
Np = # of hands played

p = estimate of proportion of hands played = Np/Nd
s = estimate of standard deviation of proportion of hands played = sqrt(p*(1 - p)/Nd)
t = t-value for desired confidence level with Nd - 1 degrees of freedom

Then the lower and upper bounds of our confidence interval are respectively:

pL = p - t*s
pU = p + t*s

For example, with Nd = 100, Np = 15, and desired confidence level 90 %, we calculate:

p = 15/100 = 0.15
s = sqrt(0.15*0.85/100) = 0.0357
t = 1.6604

pL = 0.15 - 1.6604*0.0357 = 0.0907
pU = 0.15 + 1.6604*0.0357 = 0.2093

So we can say with 90 % confidence that the actual value of his proportion of hands played is between 9.07 and 20.93 %.
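
Here is a rough Python sketch of the interval calculation above. The names `proportion_interval` and `critical` are hypothetical. When no t-value is supplied, the sketch falls back to a normal z-value from the standard library, which is an assumption on my part: it is close to the t-value for large Nd but slightly narrower for small samples.

```python
import math
from statistics import NormalDist

def proportion_interval(hands_played, hands_dealt, confidence=0.90, critical=None):
    """Return (pL, pU) for the proportion of hands played."""
    p = hands_played / hands_dealt
    s = math.sqrt(p * (1 - p) / hands_dealt)
    if critical is None:
        # Assumption: a normal z-value stands in for the t-value
        critical = NormalDist().inv_cdf(0.5 + confidence / 2)
    return p - critical * s, p + critical * s

# Worked example from the post: Nd = 100, Np = 15, t = 1.6604
low, high = proportion_interval(15, 100, confidence=0.90, critical=1.6604)
print(f"{low:.4f} to {high:.4f}")
```

Passing the t-value explicitly keeps the result in line with the hand calculation; leaving `critical` unset gives a marginally tighter interval.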

The width of the confidence interval is:

w = 2*t*s = 2*t*sqrt((Np/Nd)*(1-Np/Nd)/Nd)

Solving this for Nd (the sample size) gives:

Nd = 4*t^2*p*(1-p)/(w^2)
where w = width of confidence interval

The formula is actually more complicated than this because the value of p depends on both Np and Nd. If you want the worst case scenario, assume p = 0.5 in the above formula, since that will maximize the number of hands you need to get down to a particular interval width. So if you wanted to get a 90 % confidence interval that is 2 % wide (+ or - 1 %):

Nd = 4*(1.6604)^2*0.5*0.5/(0.02^2) = 6,892.3, so about 6,893 hands
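
The sample-size formula Nd = 4*t^2*p*(1-p)/(w^2) can be sketched in Python as follows. The function name `hands_needed` and its parameters are made up for illustration, and the default critical value is a normal z-value rather than a t-value, which is an assumption that only matters noticeably for small samples.

```python
import math
from statistics import NormalDist

def hands_needed(width, confidence=0.90, p=0.5, critical=None):
    """Hands required for a confidence interval of the given total width."""
    if critical is None:
        # Assumption: a normal z-value stands in for the t-value
        critical = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Nd = 4*t^2*p*(1-p)/w^2, rounded up to a whole hand
    return math.ceil(4 * critical**2 * p * (1 - p) / width**2)

# Worst case (p = 0.5) versus a player who seems to play about 15% of hands
print(hands_needed(0.02, critical=1.6604))
print(hands_needed(0.02, p=0.15, critical=1.6604))
```

Using p = 0.5 is the conservative choice; if you already have a rough estimate of p, plugging it in shows how much smaller the requirement becomes for a tight player.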

In other words, you'd need to have seen a lot of hands by a player to figure out his true proportion of hands played, but you can get a rough idea (about +- 6 % in the 100-hand example above) with as few as 100 hands.