Two Plus Two Older Archives - View Single Post

BruceZ · #2 08-30-2005, 03:33 AM

[ QUOTE ]
So for those of us that are madly in love with Pokertracker, we are used to thinking about our standard deviation in terms of BB/100. As far as I know, poker tracker groups your hands in 100s (i.e. hands 1-100, 101-200, 201-300, ...), then does the typical SD calculation of summing up the square of the difference from the mean and dividing by n or (n-1) or whatever. Is there a way to calculate the standard deviation _per session_ rather than _per 100 hands_? Would such a statistic even make sense?

I'm thinking in terms of some sort of weighting, but don't really know how to do it. Here's an example, though, of 400 hands played over 3 sessions:

Session 1 (160 hands):
100 hands, +2 BB
60 hands, -1 BB

Session 2 (180 hands):
40 hands, +3 BB
100 hands, +2 BB
40 hands, +5 BB

Session 3 (60 hands):
60 hands, -3 BB

So my understanding (and this may be wrong) is that poker tracker sorts the hands in this manner:

100 hands: +2 BB
100 hands: +2 BB
100 hands: +2 BB
100 hands: +2 BB

and obviously the SD is 0 BB/100 hands (even if I don't know if it is 0*(1/3) or 0*(1/4) :) ).

But if we look at the data as
160 hands: +1 BB
180 hands: +10 BB
60 hands: -3 BB

and use a mean of 0.02 BB/hand is there a way to get a reasonable standard deviation? I'm thinking along the following steps --

1) Make an Expected Value chart based on 0.02 BB/hand, like such:
160 hands: +3.2 BB
180 hands: +3.6 BB
60 hands: +1.2 BB

2) For each session, find the square of the deviation from the expected value
160 hands: (2.2 BB)^2 [EV of 3.2 BB, actual of 1 BB]
180 hands: (6.4 BB)^2 [EV of 3.6 BB, actual of 10 BB]
60 hands: (4.2 BB)^2 [EV of 1.2 BB, actual of -3 BB]

3) So obviously there is some variation from session to session, but how do you weight it? And how do you interpret the results?

It seems clear to me that the SD should be between 2.2 BB/session and 6.4 BB/session, but beyond that I'm pretty lost.

Did anybody make it this far with even a vague notion of what I'm trying to do? Anyone have any thoughts?

[/ QUOTE ]

This essay by Mason shows you how to compute your SD for variable-length sessions. It is from Gambling Theory and Other Topics. If you want, I can email you a spreadsheet that does this. The following is a derivation of that method, which also produces a form that closely resembles the one for fixed-length sessions.

This is the derivation of the maximum likelihood estimator for the variance for sessions of variable length. The derivation is exactly the same as the textbook derivation for sessions of equal length, except that the variance of the ith session is the per-unit variance multiplied by the session length Ti, so its standard deviation is the per-unit standard deviation multiplied by sqrt(Ti). Here is the derivation (sorry about the ASCII):

Let X be a vector of N session results, and let Ti be the duration of the ith session. Each session result Xi is a normally distributed random variable with mean Ui = u*Ti and variance Ti*sigma^2, where u and sigma^2 are the mean and the unknown variance for 1 unit of time or number of hands (e.g. 100 hands). The probability density of a given observation Xi given sigma is:
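To see why the variance scales with Ti and the SD with sqrt(Ti), here is a rough simulation sketch. It assumes a session of Ti units is just the sum of Ti independent one-unit results; the win rate U, the per-unit SD SIGMA, and the helper name simulate_session are made-up values for illustration, not anything from Mason's essay.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

U = 2.0       # hypothetical win rate per unit (e.g. BB per 100 hands)
SIGMA = 15.0  # hypothetical standard deviation per unit


def simulate_session(units):
    """Session result as the sum of `units` independent one-unit results."""
    return sum(random.gauss(U, SIGMA) for _ in range(units))


for units in (1, 4, 9):
    sample = [simulate_session(units) for _ in range(20000)]
    # Mean should be near U*units, SD near SIGMA*sqrt(units).
    print(units, statistics.mean(sample), statistics.pstdev(sample))
```

The sample SDs should come out close to 15, 30 and 45, i.e. SIGMA times sqrt(1), sqrt(4) and sqrt(9), which is the scaling used in the density that follows.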

f(Xi | sigma) = 1/[ sqrt(2*pi*Ti)*sigma ]*exp[ -(Xi - Ui)^2/(2*Ti*sigma^2) ]

This is simply the definition of the normal distribution where the standard deviation has been replaced by sqrt(Ti)*sigma, and the variance has been replaced by Ti*sigma^2. The conditional probability of a vector of N observations X given sigma, called the likelihood function, is obtained by multiplying N of these together, which causes a sum to appear in the exponential, and a product of 1/sqrt(Ti) out front.

f(X | sigma) = (2*pi*sigma^2)^(-N/2) * prod[i = 1 to N] [1/sqrt(Ti)] * exp[ -1/(2*sigma^2) * sum[i = 1 to N] (Xi - Ui)^2/Ti ]

To find the value of sigma^2 which maximizes the likelihood function, it is convenient to take the log of the likelihood function and maximize that. The logs of products become sums.

-(N/2)*log(2*pi) - (N/2)*log(sigma^2) - (1/2)*sum[i = 1 to N] log(Ti) - 1/(2*sigma^2)*sum[i = 1 to N] (Xi - Ui)^2/Ti

Taking the derivative of this with respect to sigma^2 and setting = 0:

-(N/2)*(1/sigma^2) + 1/(2*sigma^4)* sum[i = 1 to N] (Xi - Ui)^2/Ti = 0

sigma^2 = (1/N)*sum[i = 1 to N] (Xi - Ui)^2/Ti
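As a sanity check on this result, here is a small sketch that compares the closed form against a brute-force grid search over the log-likelihood. The session data are made up, the function names are mine, and the terms of the log-likelihood that do not depend on sigma^2 are dropped since they do not move the maximum.

```python
import math


def log_likelihood(var, results, durations, u):
    """Log-likelihood in var = sigma^2, up to terms that do not depend on var."""
    n = len(results)
    ss = sum((x - u * t) ** 2 / t for x, t in zip(results, durations))
    return -(n / 2) * math.log(var) - ss / (2 * var)


def closed_form_variance(results, durations, u):
    """sigma^2 = (1/N) * sum (Xi - u*Ti)^2 / Ti"""
    n = len(results)
    return sum((x - u * t) ** 2 / t for x, t in zip(results, durations)) / n


# Hypothetical data: results in BB, durations in units of 100 hands.
results = [12.0, -7.5, 3.0, 20.0, -4.0]
durations = [2.0, 1.5, 0.5, 3.0, 1.0]
u = sum(results) / sum(durations)  # win rate per unit, estimated from the data

best = closed_form_variance(results, durations, u)
# Coarse grid of candidate variances from 0.01 to 200 in steps of 0.01.
grid = [0.01 * k for k in range(1, 20001)]
grid_best = max(grid, key=lambda v: log_likelihood(v, results, durations, u))
print(best, grid_best)
```

The two numbers should agree to within the grid spacing of 0.01.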

Note the similarity of this result to the standard definition of variance for sessions of equal duration. The only differences are that each term inside the sum is divided by the session duration Ti, and the constant mean u has been replaced with Ui which depends on the duration of each session. If the sessions are of equal length, Ti becomes a constant T which can be removed from the sum, and the sum would be divided by NT which is the total number of hours in N sessions.
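The equal-length case is easy to check numerically. This sketch uses made-up results and a made-up session length T, and assumes u is estimated as total result over total duration; it compares the variable-length formula against the ordinary population variance of the session results divided by T.

```python
import math
import statistics


def variance_per_unit(results, durations):
    """(1/N) * sum (Xi - u*Ti)^2 / Ti, with u = total result / total duration."""
    n = len(results)
    u = sum(results) / sum(durations)
    return sum((x - u * t) ** 2 / t for x, t in zip(results, durations)) / n


# Hypothetical results for six sessions, each exactly T units long.
results = [15.0, -8.0, 4.0, 22.0, -11.0, 2.0]
T = 2.0

general = variance_per_unit(results, [T] * len(results))
# Fixed-length form: sum of squared deviations divided by N*T.
textbook = statistics.pvariance(results) / T
print(general, textbook, math.isclose(general, textbook))
```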

To put this in the form found in Mason’s essay, expand the square, and break this into 3 sums:

sigma^2 = (1/N)*sum[i = 1 to N] Xi^2/Ti - (2/N)*sum[i = 1 to N] Xi*Ui/Ti + (1/N)*sum[i = 1 to N] Ui^2/Ti

Since Ui = u*Ti,

sigma^2 = (1/N)*sum[i = 1 to N] Xi^2/Ti - (2u/N)*sum[i = 1 to N] Xi + (u^2/N)*sum[i = 1 to N] Ti

Now sum[i = 1 to N] Xi is the sum of the session results. With the hourly rate u computed as the total result divided by the total hours, this is the same as u times the total hours, or u*sum[i = 1 to N] Ti, so the second term is -(2u^2/N)*sum[i = 1 to N] Ti. This can be combined with the final term to give Mason’s form:

sigma^2 = (1/N)*sum[i = 1 to N] Xi^2/Ti - (u^2/N)*sum[i = 1 to N] Ti
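For a concrete run, here is a minimal sketch applying both forms to the three sessions from the quoted post. I am measuring duration in units of 100 hands, so the answer is per 100 hands, and u is taken as total result over total duration. The function name is mine, and this is not the spreadsheet mentioned above.

```python
import math


def variable_session_variance(results, durations):
    """Return (u, direct, expanded) for sessions of unequal length.

    results[i]   -- win/loss Xi for session i (here in BB)
    durations[i] -- length Ti of session i (here in units of 100 hands)
    """
    n = len(results)
    total_t = sum(durations)
    u = sum(results) / total_t  # win rate per unit of duration
    # Direct form: (1/N) * sum (Xi - u*Ti)^2 / Ti
    direct = sum((x - u * t) ** 2 / t for x, t in zip(results, durations)) / n
    # Expanded form: (1/N) * sum Xi^2/Ti - (u^2/N) * sum Ti
    expanded = sum(x * x / t for x, t in zip(results, durations)) / n - u * u * total_t / n
    return u, direct, expanded


# Sessions from the quoted post: 160, 180 and 60 hands.
results = [1.0, 10.0, -3.0]
durations = [1.6, 1.8, 0.6]

u, direct, expanded = variable_session_variance(results, durations)
print(u, direct, expanded, math.sqrt(direct))
```

Both forms give the same variance, and the SD works out to roughly 4.3 BB per 100 hands. With only three sessions that is a very noisy estimate, so treat it as an illustration of the arithmetic rather than a meaningful number.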