PDA

View Full Version : Measuring Reliability


naphand
02-15-2005, 02:59 PM
Without wishing to generate a series of eyeball-rolling moments, I would like to know if a reasonably accurate formula or rule of thumb exists for calculating the reliability of Poker Tracker data over different hand number samples.

For example, take throwing a coin 100 times. The normal outcome for Heads would be in the range 50 +/- 10: with a Heads probability of 0.5, the standard deviation is sqrt(100 x 0.5 x 0.5) = 5, so 50 +/- 10 covers two standard deviations either side of the mean.

Does this type of formula exist for more complicated models, such as the PFR% and V$IP% displayed by PokerTracker?

In a nutshell: is there a way to accurately gauge the reliability of a given figure based on the number of hands played?

For example, Player A has 200 hands in PT and a V$IP of 25%. How reliable is that 25% figure, and what range of actual values could give rise to it? Is it 20-30%, or something tighter like 24-26%? How will this vary with the number of hands played? And how can we estimate the number of hands required for the figure to "converge" to within X% of its "real" value?

The coin throw example only has two possible outcomes; indicators such as PFR% or V$IP% are obviously subject to many more variables. Can anyone define those variables mathematically so that an approximate measure of reliability can be derived? This is the subject of much debate/ignorance over on the SH Forum.

Thanks in advance. Apologies if I re-phrased the same question several times; I am not a statistician.

olavfo
02-15-2005, 03:18 PM
This is not my field of expertise, but my immediate thought is that this is far too complex for an analytical method.

Sounds more like a problem for numerical simulations. For example, if you had the source code for Turbo Texas Hold'em, you could probably modify it to do something like this:

Program a bot to play the way you want it to (*), run 100 million hands or so to get the true values for the statistical parameters, and then run series of shorter simulations to see when the parameters begin to approach their true values.
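Without the Turbo Texas Hold'em source, the convergence idea can still be sketched numerically. The following is a minimal, hypothetical stand-in: it replaces the bot with a Bernoulli model whose "true" VP$IP is assumed to be 25%, and simply watches the observed rate settle down as the sample grows (the 25% figure and all names are illustrative, not from the thread):

```python
import random

# Hypothetical stand-in for a poker bot (NOT Turbo Texas Hold'em):
# each hand is modeled as a Bernoulli trial with an assumed "true"
# VP$IP of 25%.
TRUE_VPIP = 0.25
random.seed(42)  # fixed seed so runs are repeatable

def observed_vpip(num_hands):
    """Simulate num_hands hands and return the measured VP$IP."""
    voluntary = sum(random.random() < TRUE_VPIP for _ in range(num_hands))
    return voluntary / num_hands

# Longer samples should drift toward the true 25%.
for n in (100, 1_000, 10_000, 100_000):
    print(f"{n:>7} hands: observed VP$IP = {observed_vpip(n):.3f}")
```

Running series of shorter simulations like this shows directly how slowly the estimate tightens as the hand count grows.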

Real poker is much more complex than this, but simulations with bots would tell you a lot.

Olavfo

(*) You can do that already of course. The only thing you would have to add is a way of keeping track of VP$IP and such during the simulations.

Paul2432
02-15-2005, 03:34 PM
I am not a statistician either. A couple of points.

1) VP$IP and PFR are no more complex than a coin flip. They are all binary events: heads or tails is the same as raised or did not raise. The underlying complexity does not matter. For example, the result of a coin flip is undoubtedly dependent on dozens of parameters (such as rotational speed, air currents, etc.)

2) For a binary event, SD = sqrt(p(1-p)), which peaks at p = 0.5. So a 50/50 event has the highest SD = 0.5.

3) Standard error is equal to SD/sqrt(N)

4) Every hand is a trial with respect to VPIP and PFR (I suppose there are some exceptions related to the big blind or being all-in). Therefore N is equal to the number of hands. (note that this is not the case with some other stats like cold calls a raise, or check raises the river)

5) A range without a confidence is meaningless. For example, to say PFR = 6-10% is meaningless. Instead say PFR=6-10% with 99% confidence.

Given all this, 95% confidence corresponds to 1.96 SE. So after 200 hands, the worst-case SE = 50%/sqrt(200) = 3.5%, and the 95% interval is +/- 1.96 x 3.5%, or roughly +/- 7%.
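The arithmetic above can be checked with a short script. This is just a sketch of the standard-error formula from the list; the 200-hand and 25% VP$IP figures come from the earlier post, and `standard_error` is a made-up helper name:

```python
import math

def standard_error(p, n):
    """Standard error of an observed proportion p over n trials."""
    return math.sqrt(p * (1 - p) / n)

# Worst case (p = 0.5) over 200 hands, expressed as a percentage:
se_max = standard_error(0.5, 200) * 100
print(f"max SE after 200 hands: {se_max:.1f}%")  # about 3.5%

# 95% confidence interval for an observed VP$IP of 25% over 200 hands:
se = standard_error(0.25, 200) * 100
low, high = 25 - 1.96 * se, 25 + 1.96 * se
print(f"95% CI: {low:.1f}% to {high:.1f}%")
```

Note the interval using the observed p = 0.25 is slightly tighter than the worst-case p = 0.5 bound.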

Paul

naphand
02-15-2005, 04:20 PM
Excellent little summary. I had not thought to classify the preflop actions as binary, but it seems very obvious now...

Action on later streets cannot be considered binary though, as it involves three possibilities: bet/raise, call or fold. Is there a way to handle this? PT uses Aggression Factor (AF), derived from (bet% + raise%) divided by call%; each of those component events is itself binary, so presumably they can be treated the same way (?). I suppose folding can be included by considering continue%/fold% separately as a binary, using the "When folds..." data.
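One way to see the "split a three-way action into binaries" idea: treat each street action as its own yes/no indicator, each of which gets its own binomial standard error. The counts below are invented purely for illustration:

```python
import math

# Hypothetical postflop counts for one player: the three-way action
# is split into separate yes/no indicators, each with a binomial SE.
counts = {"bet_or_raise": 90, "call": 60, "fold": 50}
n = sum(counts.values())  # total observed street actions

for action, k in counts.items():
    p = k / n
    se = math.sqrt(p * (1 - p) / n)  # binomial standard error
    print(f"{action}: {p:.1%} +/- {1.96 * se:.1%} (95% CI)")
```

Each indicator is then exactly the coin-flip case from earlier in the thread, just with p no longer equal to 0.5.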

Thanks for this.

naphand
02-16-2005, 04:06 PM
Now I have a couple of questions on this, pretty basic...

(1) I understand that the SD of a binary event is 0.50 (I looked up how to calculate SD). However, I do not recall seeing SD quoted as a percentage, as in your formula for SE:

SE = 50%/sqrt(N)

It seems to me that this should be

SE = 0.50*N/sqrt(N)

<font color="blue">Should N be the number of hands played or the total hands dealt?</font>

50% is just 0.50*100, and 50 is right for 100 samples only. Am I missing something? Or rather, what am I missing?

(2) A confidence interval of 95% is the standard boundary for significance. I know that a confidence interval of roughly 68% corresponds to 1 x SE. How do I calculate the factor for confidence intervals between 50% and 100%? For example (confidence of Y is X x SE): what is X for 50%, 75%, 80%, 90%?

Thanks again, anyone.

Paul2432
02-16-2005, 05:01 PM
The VPIP and PFR figures are given as percentages, so I used a percentage in figuring the SE to maintain consistent units. The result would be exactly the same if we used 0.5 instead, except you would then need to convert your VPIP and PFR numbers to ordinary decimals instead of percentages (e.g. use 0.2 VPIP instead of 20% VPIP).

N should be the number of hands in which you were dealt cards.

50% or 0.5 is correct regardless of the number of samples.

1.96 x SE will give you a 95% interval.

To get other intervals you use lookup tables (do a google search for z-tables). Microsoft Excel also has functions for this.
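For anyone without Excel or a z-table handy, Python's standard library can produce the same multipliers (this sketch assumes Python 3.8+, which added `statistics.NormalDist`; no third-party packages needed):

```python
from statistics import NormalDist  # Python 3.8+

def z_for_confidence(confidence):
    """Two-sided z multiplier for a confidence level given as 0-1."""
    return NormalDist().inv_cdf((1 + confidence) / 2)

for c in (0.50, 0.75, 0.80, 0.90, 0.95, 0.99):
    print(f"{c:.0%} confidence -> z = {z_for_confidence(c):.3f}")
```

The familiar values drop out: about 1.96 for 95% and about 1.645 for 90%, matching the z-tables.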

Paul