Tournament win rate accuracy


Homer
11-24-2003, 01:34 PM
I've recently started playing some single-table tournaments online. I was wondering how I might go about computing the accuracy of my win rate. I'd like to be able to make a statement like:

"There is a x% chance that my win rate is between $y and $z per tournament."

Background information on the tournaments -- Ten players, buy-in is $33, 1st pays $150, 2nd pays $90, 3rd pays $60.

-- Thanks, Homer

Robk
11-24-2003, 03:08 PM
Let X be your average earnings per tournament, n the number of tournaments you've played, and s your sample standard deviation. Then your 100*(1-a)% confidence interval for your true win rate is approximately

X +/- z(a/2)*s*(1/SQRT(n))

where z(a/2) is the value which has an area of a/2 to its right on a standard normal curve.

For example, say you've played 150 tournaments and your average result is $16 per tournament, with a sample standard deviation of $60. Then, letting a = .05, your 100*(1-.05) = 95% confidence interval is

16 +/- z(.025)*60*(1/12.25) = 16 +/- 1.96*60*(1/12.25)

= 16 +/- 9.6 = (6.4, 25.6)

Normal tables abound on the web, but here are a few common z values

z(.1) = 1.282
z(.05) = 1.645
z(.025) = 1.96
z(.005) = 2.576
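
For anyone who would rather compute this than look up tables, here is a minimal Python sketch of the interval above. The function names (z_upper, mean_confidence_interval) are made up for illustration; NormalDist comes from Python's standard statistics module.

```python
from math import sqrt
from statistics import NormalDist

def z_upper(tail_area):
    """z value with `tail_area` to its right under the standard normal curve."""
    return NormalDist().inv_cdf(1 - tail_area)

def mean_confidence_interval(x_bar, s, n, a=0.05):
    """Approximate 100*(1-a)% interval: x_bar +/- z(a/2) * s / sqrt(n)."""
    half_width = z_upper(a / 2) * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width

# The worked example: 150 tournaments, $16 average, $60 standard deviation.
low, high = mean_confidence_interval(16, 60, 150)
print(round(low, 1), round(high, 1))  # 6.4 25.6
```

Calling z_upper with .1, .05, .025 or .005 gives the table entries directly.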

As usual, smart posters please tell me if I messed up.

Homer
11-24-2003, 03:56 PM
Thanks, this is great. Exactly what I was looking for. Let me make sure I'm doing this right...

So far I've played in 14 $30+3 tourneys and have 3/2/3 -> 1st/2nd/3rd.

n = 14

X = [(3*150 + 2*90 + 3*60) - (14*33)] / 14 = $24.86

s = sqrt((3*(117-24.86)^2 + 2*(57-24.86)^2 + 3*(27-24.86)^2 + 6*(-33-24.86)^2) / 14) = $58.33

* Note - Should I divide by (n-1) or n? In Excel, STDEVP() uses n and STDEV() uses (n-1). It says to use STDEVP() when you are solving for the SD of your entire population, so it seems I should be dividing by n.

95% Confidence Interval = 24.86 +/- 1.96*58.33*(1/3.74) = 24.86 +/- 30.57
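
A quick way to check this arithmetic is to list the 14 net results and let Python do the sums. This is only a sketch under the figures in this post: net results of +117, +57, +27 and -33 for 1st, 2nd, 3rd and out of the money. In the standard statistics module, pstdev divides by n (like STDEVP) and stdev divides by n-1 (like STDEV), so both versions can be compared side by side.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev, stdev

# Net result per tournament: prize minus the $33 buy-in.
results = [117] * 3 + [57] * 2 + [27] * 3 + [-33] * 6

n = len(results)
x_bar = mean(results)
sd_n = pstdev(results)    # divides by n, like Excel's STDEVP()
sd_n1 = stdev(results)    # divides by n-1, like Excel's STDEV()

z = NormalDist().inv_cdf(0.975)   # z(.025), about 1.96
half_n = z * sd_n / sqrt(n)
half_n1 = z * sd_n1 / sqrt(n)

print(f"n = {n}, mean = {x_bar:.2f}")
print(f"SD with n:   {sd_n:.2f}, 95% CI = {x_bar:.2f} +/- {half_n:.2f}")
print(f"SD with n-1: {sd_n1:.2f}, 95% CI = {x_bar:.2f} +/- {half_n1:.2f}")
```

With only 14 results the two standard deviations differ by about $2, so the choice moves the interval a little but does not change the picture.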

If I play around with the confidence interval I find that:

- The lower bound becomes a positive number at an 89% CI, so I can be 89% sure that I am a winning player.

- The lower bound gives me an ROI of 40% at a 55% CI, so I can be 55% sure that I have an ROI of at least 40%.

- The lower bound gives me an ROI of 20% at a 76% CI, so I can be 76% sure that I have an ROI of at least 20%.
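
Rather than playing around with the interval by hand, the confidence level at which the lower bound hits a given win rate can be solved for directly. The sketch below assumes the normal approximation and the figures from this post (mean $24.86, SD $58.33, n = 14); the function name is hypothetical. It also reports the one-sided level, since a two-sided interval leaves half of its leftover probability in the upper tail.

```python
from math import sqrt
from statistics import NormalDist

MEAN, SD, N, BUY_IN = 24.86, 58.33, 14, 33.0
STD_ERR = SD / sqrt(N)   # standard error of the average result

def levels_for_lower_bound(target):
    """Confidence levels at which the interval's lower bound equals `target`."""
    z = (MEAN - target) / STD_ERR
    tail = 1 - NormalDist().cdf(z)   # area beyond z in one tail
    two_sided = 1 - 2 * tail         # level of the symmetric interval
    one_sided = 1 - tail             # level of a lower bound only
    return two_sided, one_sided

for roi in (0.0, 0.20, 0.40):
    two, one = levels_for_lower_bound(roi * BUY_IN)
    print(f"ROI of at least {roi:.0%}: two-sided {two:.1%}, one-sided {one:.1%}")
```

The two-sided figures should land close to the 89%, 76% and 55% quoted in the list above.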

Does all of this seem correct, and is there any other meaningful information I can extract from this data that I have not yet?

-- Thanks, Homer

bigpooch
11-24-2003, 08:09 PM
That is a very good idea! I have done this for my
tournaments in the past, but let me warn you about
determining your win rate per tournament: you have
to play hundreds (if not thousands!) of these
tournaments to determine your win rate to any
reasonable accuracy (95% or better confidence and
within about 10% of EF say). At the start of my
statistics, I ran so well that I thought I could
beat tournaments for about 0.8xEF but after about
100 or so NL tournaments, a regression to the mean
occurred so that my win rate seemed only about half
as much.

It may also be important to consider that these
tournaments might be less soft as time goes on in
which case a regression with respect to the date
(and year) of the event may seem useful.

Also, if you don't really need the raw data (or
distribution of your results), you need only
keep track of these numbers:

n = number of events
s = sum of results
s2 = sum of squares of results

This will give sufficient data for sample mean and
variance. The individual result should simply be
the net/EF and if you constantly play the ten-handed
NL events, there should only be four possibilities.
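
As a rough sketch of that bookkeeping in Python: only the three running totals are stored, and the mean and variance are recovered from them at the end. Note that s here is the sum of results, not a standard deviation. The function name is made up, and the example results are in dollars (the 3/2/3 record from earlier in the thread) rather than in net/EF units, purely for illustration.

```python
from math import sqrt

def mean_and_variance(n, s, s2):
    """Sample mean and (n-1) sample variance from the three running totals."""
    mean = s / n
    variance = (s2 - n * mean ** 2) / (n - 1)
    return mean, variance

# Update the totals one event at a time; no raw data is kept.
n = s = s2 = 0
for result in [117, -33, 57, -33, 27, 117, -33, 27, -33, 57, 117, -33, 27, -33]:
    n += 1
    s += result
    s2 += result ** 2

mean, variance = mean_and_variance(n, s, s2)
print(n, s, s2)
print(f"mean = {mean:.2f}, SD = {sqrt(variance):.2f}")
```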

cottonmather0
11-25-2003, 01:18 AM
RE: Excel Function - What you are asking about is degrees of freedom within a cumulative distribution function. With only 14 trials, your sample size is still relatively small, so you should be using 'stdev' for now. Eventually the results for the sample approach that of the population as the sample grows large enough (this can be seen to be intuitively correct - eventually the sample grows so large that it approximates the entire population itself). Relatively speaking, though, it doesn't matter which function you use because the results are approximately the same and your analysis doesn't need to be so precise, anyway.

Otherwise, your analysis seems to be correct, but for something that you *know* is so highly dependent on chance, I would say just 14 tournaments isn't nearly enough for you to definitively say whether you are a winner or not or what exactly your results are. However, as you play in more tournaments, you can be more and more confident of the accuracy of your results.

Bottom Line: You're on the right track, but you need to play a lot more to be sure of the results.

ZManODS
11-25-2003, 11:45 AM
Can someone elaborate on exactly what a normal curve is? I haven't taken stats in a few years.

I do not understand how Homer achieved his answers. I completely understand how to get N, X, S. But where do I get Z from? Something about Z tables on the web? Can you just explain it to me like you would to a little child?

Thanks!

bigpooch
11-25-2003, 10:02 PM
The standard normal curve is just the function
f(x) = ( 1/sqrt(2*pi)) exp ((-x**2)/2).
[sqrt=square root function and x**2 is just x squared]
The area under this curve is exactly 1.

This curve gives the distribution in the limit as n goes
to infinity of the "scaled" sum of independent identically
distributed random variables. The scaling is just by
shifting by the mean and dividing by sqrt(n*sigma**2) where
sigma**2 is the variance. You'll have to pick up a stats
book if the above didn't digest very well!

z values were described in a previous post but they are
just the x-coordinate values when the area to the right is
of a specific size; typically, statisticians use A=0.05
or A=0.025. They indicate the number of standard
deviations away from the mean and most of the time, z
values are between -3 and 3 except for extraordinary
events!
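
To make this concrete, here is a small Python sketch that evaluates f(x) and adds up the area under it numerically. The helper names are invented, and the trapezoid rule is just one simple way to approximate an area; it is not how z tables are actually produced.

```python
from math import exp, pi, sqrt

def f(x):
    """Standard normal curve: (1/sqrt(2*pi)) * exp(-x**2 / 2)."""
    return (1 / sqrt(2 * pi)) * exp(-x ** 2 / 2)

def area(a, b, steps=20_000):
    """Approximate area under f between a and b with the trapezoid rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h

total_area = area(-10, 10)        # the curve is negligible beyond +/- 10
right_of_196 = area(1.96, 10)     # area to the right of z = 1.96
print(f"total area = {total_area:.4f}")
print(f"area right of 1.96 = {right_of_196:.4f}")
```

The second number is why z(.025) = 1.96: about 2.5% of the area lies to the right of 1.96.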

Copernicus
11-26-2003, 05:39 PM
If that's the way you would explain it to a child, I'd hate to be your kid!

The normal curve is the bell shaped curve. The higher the curve and closer in the tails are, the lower the standard deviation is. The squatter the curve and more spread out the tails are, the higher the standard deviation is.

The probabilities of your results lying beyond different points in the tails, measured in standard deviations from the mean, are given by z tables.

MrBlini
11-28-2003, 06:44 PM
[ QUOTE ]
The lower bound becomes a positive number at an 89% CI, so I can be 89% sure that I am a winning player.

[/ QUOTE ]

The news is actually a bit better than that. Monte Carlo simulations I ran this afternoon show that no breakeven player profile (based on percentages of 1st, 2nd, and 3rd places) can earn as much as you have in 14 tourneys more than about 7% of the time.

This seems to be related to the fact that the probability distribution for payouts with low probability is skewed to the right. A breakeven player has at best an 11/50 probability of scoring a win, so if we assume he or she never places 2nd or 3rd, this can be modeled by a binomial distribution with p=11/50. The corresponding binomial distribution for n=14 has most such players winning 2 or 3 tourneys, with the bulk in a lump on the left side of the distribution and a long, thin tail on the right. This does not resemble a normal distribution, so the confidence intervals are quite different. 93% of such players will score five wins ($288 net) or fewer.
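
The binomial figures in the paragraph above can be reproduced with a few lines of Python. This is a sketch of the win-only breakeven player only (p = 33/150 = 11/50, never 2nd or 3rd); the helper names are made up.

```python
from math import comb

N_TOURNEYS = 14
P_WIN = 33 / 150   # 11/50: the win rate at which a win-only player breaks even

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def net_for_wins(wins):
    """Net dollars after 14 tournaments with `wins` first places and nothing else."""
    return 150 * wins - 33 * N_TOURNEYS

for wins in range(8):
    prob = binom_pmf(wins, N_TOURNEYS, P_WIN)
    print(f"{wins} wins: net ${net_for_wins(wins):5d}, probability {prob:.3f}")

p_five_or_fewer = sum(binom_pmf(k, N_TOURNEYS, P_WIN) for k in range(6))
print(f"P(five wins or fewer) = {p_five_or_fewer:.3f}")
```

Six wins is the first count that nets more than $348, so the remaining probability is the chance that this kind of breakeven player matches the result.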

On the other hand, the money distribution for a breakeven player who always places in 3rd if at all is such that he/she will earn less about 99% of the time. The variance is too low to allow much of the tail to reach $348. The same is true of a player who always places in 2nd if at all.

A breakeven player with a mix of possible results weighted toward winning stands the best chance of matching your win. Having run a variety of breakeven player profiles, I would place the confidence that you are a winning SNG player at about 93%. Not bad.
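
For anyone who wants to try this kind of simulation, here is a minimal Monte Carlo sketch in Python. The three profiles are hypothetical examples chosen so that each one breaks even on a $33 buy-in; they are not the profiles actually run for this post, and the trial count and seed are arbitrary.

```python
import random

PRIZES = (150, 90, 60, 0)   # 1st, 2nd, 3rd, out of the money
BUY_IN = 33
N_TOURNEYS = 14
TARGET_NET = 348            # net result of the 3/2/3 record over 14 tourneys

def expected_net(profile):
    """Expected net per tournament for (P(1st), P(2nd), P(3rd))."""
    p1, p2, p3 = profile
    return 150 * p1 + 90 * p2 + 60 * p3 - BUY_IN

def chance_of_matching(profile, trials=20_000, seed=1):
    """Estimated probability of netting at least TARGET_NET in 14 tourneys."""
    rng = random.Random(seed)
    p1, p2, p3 = profile
    weights = (p1, p2, p3, 1 - p1 - p2 - p3)
    hits = 0
    for _ in range(trials):
        payouts = rng.choices(PRIZES, weights=weights, k=N_TOURNEYS)
        if sum(payouts) - BUY_IN * N_TOURNEYS >= TARGET_NET:
            hits += 1
    return hits / trials

# Hypothetical breakeven profiles: each has an expected net of zero.
profiles = {
    "win only": (0.22, 0.0, 0.0),
    "even mix": (0.11, 0.11, 0.11),
    "third only": (0.0, 0.0, 0.55),
}
for name, profile in profiles.items():
    print(f"{name}: EV {expected_net(profile):+.2f}, "
          f"P(net >= $348) ~ {chance_of_matching(profile):.3f}")
```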

A binomial fit says the confidence interval for your placing in the money is (0.2886, 0.8234). If you are equally likely to take each of the three paying places, your average prize is $100, so you need this to be .33 to break even on the $33 entry.

What is hardest to say with any confidence due to the small sample size is your ability to win in the heads-up situation and take the big prizes. The interval for this is 14% to 95%. That's right, you probably win in heads-up situations something between 14% and 95% of the time. Bet you didn't expect that.
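
The post does not say how these two intervals were computed. One common choice is the exact (Clopper-Pearson) binomial interval, and the sketch below, which assumes that method, gives numbers close to the ones quoted. It takes 8 cashes in 14 tourneys for the in-the-money rate and 3 wins in 5 heads-up finishes (3 firsts plus 2 seconds) for the heads-up rate; the function names are invented.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

def solve_increasing(func, target, iterations=60):
    """Bisection on [0, 1] for func(p) = target, where func increases with p."""
    low, high = 0.0, 1.0
    for _ in range(iterations):
        mid = (low + high) / 2
        if func(mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2

def exact_binomial_interval(successes, n, a=0.05):
    """Clopper-Pearson interval for a success probability."""
    if successes == 0:
        lower = 0.0
    else:
        # Lower bound: the p at which P(X >= successes) equals a/2.
        lower = solve_increasing(lambda p: 1 - binom_cdf(successes - 1, n, p), a / 2)
    if successes == n:
        upper = 1.0
    else:
        # Upper bound: the p at which P(X <= successes) equals a/2.
        upper = solve_increasing(lambda p: 1 - binom_cdf(successes, n, p), 1 - a / 2)
    return lower, upper

itm_low, itm_high = exact_binomial_interval(8, 14)
hu_low, hu_high = exact_binomial_interval(3, 5)
print(f"in the money: ({itm_low:.4f}, {itm_high:.4f})")
print(f"heads-up wins: ({hu_low:.4f}, {hu_high:.4f})")
```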

I also estimated your variance using a multinomial distribution, a better model for tournament results than a normal distribution. The variance of the money expectation based on a multinomial distribution was in line with your normal approximation.

I did all this because I questioned whether this was a sufficient sample size to say much about your performance. It turns out that the intricacies of this problem actually let us say a little more about the likelihood of a breakeven sit'n'go player duplicating your feat.

Homer
11-28-2003, 06:56 PM
Thanks for your post. This is excellent information, though admittedly it does leave my eyes a bit glazed over.

-- Homer

Robk
11-29-2003, 03:49 AM
[ QUOTE ]
I also estimated your variance using a multinomial distribution, a better model for tournament results than a normal distribution.

[/ QUOTE ]

We're not using the normal distribution as a model for tournament results. We're considering each tournament as an iid random variable, and by the central limit theorem the arithmetic mean of the results is approximately normal.
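
A simulation sketch of that claim, in Python: treat the 14 observed results as if they were the true per-tournament distribution (an assumption made only for illustration), draw many independent 14-tournament samples, and look at how the sample averages behave. The seed and number of samples are arbitrary.

```python
import random
from math import sqrt
from statistics import mean, pstdev

# Assumed per-tournament distribution: the observed 3/2/3 record in 14 tourneys.
results = [117] * 3 + [57] * 2 + [27] * 3 + [-33] * 6
n = 14
mu = mean(results)
sigma = pstdev(results)

rng = random.Random(0)
sample_means = [sum(rng.choices(results, k=n)) / n for _ in range(20_000)]

# The CLT says the averages should center on mu with spread sigma / sqrt(n).
std_err = sigma / sqrt(n)
within = sum(abs(m - mu) <= 1.96 * std_err for m in sample_means) / len(sample_means)

print(f"mean of sample means: {mean(sample_means):.2f} (mu = {mu:.2f})")
print(f"SD of sample means:   {pstdev(sample_means):.2f} (sigma/sqrt(n) = {std_err:.2f})")
print(f"share within 1.96 standard errors of mu: {within:.3f}")
```

How close that last share comes to 0.95 gives a rough feel for how well the normal approximation holds at n = 14.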

MrBlini
11-29-2003, 05:56 AM
[ QUOTE ]
We're not using the normal distribution as a model for tournament results. We're considering each tournament as an iid random variable, and by the central limit theorem the arithmetic mean of the results is approximately normal.

[/ QUOTE ]

It can be very approximate with small sample sizes, which is what got me looking at the possibility of estimating the confidence intervals based on the actual distribution. Unfortunately, estimating confidence intervals for variables derived from multinomial distributions is not easy, at least with the tools I have available (Matlab and R). For practical purposes, the normal approximation seems to be just fine, although you have to be careful what you conclude about the tails.

joeg
12-01-2003, 01:17 PM
Can anybody recommend any good books on this kind of mathematics? I've got a maths degree but almost no experience in stats, so I don't mind complicated or technical as long as it starts at the beginning.

Robk
12-01-2003, 02:48 PM
BruceZ suggests a probability/statistics book for people with math experience here (http://forumserver.twoplustwo.com/showflat.php?Cat=&Board=probability&Number=416704&Forum=All_Forums&Words=197&Match=Username&Searchpage=0&Limit=25&Old=allposts&Main=416690&Search=true#Post416704).