Two Plus Two Older Archives

jason1990 · #4 12-11-2004, 12:45 PM

I intend to write an article in the near future titled "How Accurate is my SD?" It will be for my own personal use and I will put it on my website (as soon as I finish building it), but I doubt it will appeal to a large audience since I intend it to be fairly "math heavy". But at any rate, my claim is that your SD is not very accurate after only 6000 hands. Here are some initial computations. Everything below is approximate and I will make it exact in the article.

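Let n = 6000 and let X_1, ..., X_n denote the results, in BB, of the n hands that you played. Let X denote some future hand. We want to estimate sig := sqrt{E|X|^2}. (Note that 10sig is your true SD in BB/100, since the variance of a sum of 100 independent hands is 100sig^2 and the mean of X is negligible next to sig.) There are a couple of ways to do this.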

First, you could consider aggregate sums. For example, let Y_j = X_{100(j-1)+1} + ... + X_{100j}, so that Y_1, ..., Y_{60} are the net results of each block of 100 hands. It is probably a relatively safe assumption that each Y_j is a Gaussian random variable. (To be safer, you could consider blocks of 150 or 200. This will still give you at least 30 data points, which, according to Mason in an article I can no longer find a link to, should be enough.) We then compute

bar{Y} = 60^{-1}sum_{j=1}^{60}{ Y_j }
S^2 = 60^{-1}sum_{j=1}^{60}{ (Y_j - bar{Y})^2 }

(Note that S is your empirical SD in BB/100. In your case, S = 24.4.) Standard techniques using the tails of the chi-square distribution give that [0.72S^2,1.48S^2] is a 95% confidence interval for (10sig)^2. In other words, a 95% confidence interval for your true SD in BB/100 is [20.7,29.7].
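Here is a rough Python sketch of the block computation above: it groups per-hand results into blocks of 100, computes S with the same 1/N normalization, and turns S into a confidence interval. The function names are made up for illustration. To stay within the standard library, the chi-square quantile uses the Wilson-Hilferty approximation, which is my substitution and not necessarily how the numbers above were obtained. The default of 60 degrees of freedom is also an assumption; it reproduces the factors 0.72 and 1.48 quoted above, whereas a textbook treatment with an estimated mean would use 59.

```python
from statistics import NormalDist


def block_sums(hands, block=100):
    """Net result of each complete block of `block` consecutive hands."""
    n_blocks = len(hands) // block
    return [sum(hands[j * block:(j + 1) * block]) for j in range(n_blocks)]


def empirical_sd(values):
    """Square root of the 1/N-normalized variance, matching S above."""
    mean = sum(values) / len(values)
    return (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5


def chi2_quantile(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile (an assumption
    of this sketch; a statistics library would give exact values)."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * a ** 0.5) ** 3


def sd_interval(s, n_blocks, level=0.95, df=None):
    """Confidence interval for the true SD, given the empirical block SD s."""
    if df is None:
        df = n_blocks  # assumption: matches the 0.72 and 1.48 factors quoted
    alpha = 1.0 - level
    low_factor = n_blocks / chi2_quantile(1.0 - alpha / 2.0, df)
    high_factor = n_blocks / chi2_quantile(alpha / 2.0, df)
    return s * low_factor ** 0.5, s * high_factor ** 0.5


low, high = sd_interval(24.4, 60)
print(f"95% interval for SD in BB/100: [{low:.1f}, {high:.1f}]")
```

With real data you would call empirical_sd(block_sums(hands)) to get S first. Passing df=59 gives the textbook version, which shifts both endpoints up slightly.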

A second method of estimating your SD is to use individual hand data (which is probably what Poker Tracker does). In this case, you compute

bar{X} = n^{-1}sum_{j=1}^n{ X_j }
S^2 = n^{-1}sum_{j=1}^n{ (X_j - bar{X})^2 }.

Unfortunately, you can no longer use the standard methods for estimating the accuracy of S^2 since X is not Gaussian. To get a confidence interval for your true SD in this case, you must have an estimate of E|X|^4. More precisely, if we write E|X|^4 = Csig^4, then we want to estimate C. For a Gaussian random variable, C = 3. Unfortunately, with only 6000 hands, you will not be able to get an accurate estimate of the tail behavior of X, and this will result in a horrible estimate for C. For example, suppose the largest net win or loss you had on any single hand over these 6000 hands is 30 BB. If we set p = P(|X| > 30), then a 95% confidence interval for p is given by [0,1-.05^{1/n}] = [0,.0005]. (For any larger p, the chance (1-p)^n of never seeing such a hand in n tries would be below 5%.) Since the largest win possible is 108 BB, this gives us the estimate of

E[|X|^4 1_{|X|>30}] <= .0005(108^4) = 68000.

Combined with the rest of the data, our estimate of E|X|^4 will likely exceed 100,000. If we observe that sig^4 is certainly less than 100, we see that we will get an estimate for C of more than 1000, which is way too large to be effective.
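The tail arithmetic above can be checked with a few lines of Python. This is only a sketch; the function names are hypothetical, and the inputs (6000 hands, a 108 BB maximum, and the bound sig^4 < 100) are the ones used in the text.

```python
def tail_probability_upper_bound(n, level=0.95):
    """Largest p such that seeing no tail event in n independent hands
    still has probability at least 1 - level, i.e. (1 - p)^n >= 1 - level."""
    return 1.0 - (1.0 - level) ** (1.0 / n)


def tail_fourth_moment_bound(p_upper, max_result):
    """Worst-case contribution of the unseen tail to E|X|^4."""
    return p_upper * max_result ** 4


p_upper = tail_probability_upper_bound(6000)
tail_bound = tail_fourth_moment_bound(p_upper, 108)

# Since sig^4 < 100, the tail term alone allows C to exceed tail_bound / 100.
c_from_tail_alone = tail_bound / 100

print(f"p upper bound: {p_upper:.6f}")
print(f"tail bound on E|X|^4: {tail_bound:.0f}")
print(f"C allowed by the tail alone: more than {c_from_tail_alone:.0f}")
```

The point is that a single number you cannot observe (how often hands bigger than 30 BB occur) is enough on its own to push the worst-case C into the hundreds.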

An alternative way to estimate C is to use the assumption that Y_1 is approximately Gaussian. This means (with all equalities being approximate)

E|Y_1|^4 = 100 E|X|^4
= 3(E|Y_1|^2)^2
= 3(100 E|X|^2)^2

which gives C = 300. If we now assume S^2 is approximately Gaussian, then its mean is approximately sig^2 and its variance is approximately

n^{-2}sum_{j=1}^n{ E|X|^4 }
= n^{-2}sum_{j=1}^n{ 300sig^4 }
= 300sig^4/n
= sig^4/20.

Hence, the standard deviation of S^2 is sig^2/sqrt{20}. In your case, S = 2.44, so S^2 = 5.95 and sig^2/sqrt{20} is roughly 1.33. Going out 1.96 standard deviations on either side, a 95% confidence interval for sig^2 is [3.34,8.56]. Taking square roots and converting to BB/100 gives a 95% confidence interval for your true SD as [18.3,29.3]. This is not very different from the previous confidence interval. (It is a little wider, but this is an artificial effect of the crude nature of these approximate equalities.) This is no surprise since the computations were founded on the same assumption; namely, that X_1 + ... + X_{100} is Gaussian. Only by having a better estimate of E|X|^4 that comes directly from the data can we hope to improve on this. And we can only get such an estimate by controlling the tail behavior. And we can only control the tail behavior by playing many, many more hands. Exactly how large a sample size we need is something I intend to address later.
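For anyone who wants to redo this with their own numbers, here is a minimal sketch of the per-hand interval in Python. The function names are hypothetical, and the sketch makes the same assumptions as the text: C = 3 times the block size (from treating a 100-hand block as Gaussian), S^2 approximately Gaussian with variance C sig^4/n, and S^2 plugged in for the unknown sig^2. The parameter z is how many standard deviations to go out on each side; 1.96 is the usual two-sided 95% normal quantile and 1 gives a one-standard-deviation band.

```python
def implied_c(block=100):
    """C from 100 * C * sig^4 = 3 * (100 * sig^2)^2, i.e. C = 3 * block."""
    return 3 * block


def per_hand_sd_interval(s, n, c, z=1.96):
    """Normal-approximation interval for the true SD in BB/100.

    s: empirical per-hand SD in BB, n: number of hands,
    c: assumed ratio E|X|^4 / sig^4.
    """
    s2 = s * s
    sd_of_s2 = s2 * (c / n) ** 0.5  # plug-in: S^2 stands in for sig^2
    low = max(s2 - z * sd_of_s2, 0.0)
    high = s2 + z * sd_of_s2
    return 10 * low ** 0.5, 10 * high ** 0.5


c = implied_c(100)
one_sd_band = per_hand_sd_interval(2.44, 6000, c, z=1.0)
two_sided_95 = per_hand_sd_interval(2.44, 6000, c, z=1.96)
print(f"C = {c}")
print("z = 1.00: [{:.1f}, {:.1f}]".format(*one_sd_band))
print("z = 1.96: [{:.1f}, {:.1f}]".format(*two_sided_95))
```

With S = 2.44, n = 6000 and C = 300, going out one standard deviation gives about [21.5, 27.0] and going out 1.96 gives about [18.3, 29.3].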

The moral of the story is this: you know that your SD is high compared to others on this forum. (It is probably at least 20 BB/100.) However, if you want to do any calculations with your SD, such as computing risk of ruin under the assumption of some given winrate, your calculations will be hopelessly inaccurate.

Note: Don't take these confidence intervals literally. As I said, everything here is approximate.