SNG Results, and Distribution


AleoMagus
07-27-2004, 11:01 PM
This is leading to a very complicated question so bear with me...

Recently I have been studying a bit of statistics in order to calculate confidence levels for sng results.

After calculating standard deviation, I assume a normal distribution and have come up with workable numbers which give some idea of what kind of swings to expect, etc...

I put this all into an Excel spreadsheet for ease of use:

http://www.aleomagus.freeservers.com/Spreadsheet

My 'confidence calculator' is the file I am referring to.

Anyways, I was happy with this for a while, but more and more I am thinking... just one problem... this is all wrong.

My calculations are correct, but it seems foolish to say that after one $10+1 SNG, my SD is $19. It also seems foolish to say that after n tourneys, my standard deviation is SD*SQRT(n)

The reason I say this is that in any given sng, my result can only be one of the fixed net prize amounts, and how often each one occurs is a direct result of my finish percentages.

For example, after 1 sng, I can only be +39, +19, +9, or -11 and this result will occur in the frequency of my 1st, 2nd, 3rd, 4th-10th place finishes. This is to say that after 1 sng, I can have only 4 distinct outcomes.
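To make the SD and SD*SQRT(n) figures concrete, here is a minimal Python sketch of how they follow from those four net results. The 15%/15%/15%/55% finish frequencies are a hypothetical example (an assumption, not actual stats from the spreadsheet); they happen to give an SD of roughly $19.

```python
import math

# Net results for a $10+1 SNG: 1st, 2nd, 3rd, 4th-10th (from the post).
payouts = [39, 19, 9, -11]
# Hypothetical finish frequencies (assumption for illustration only).
probs = [0.15, 0.15, 0.15, 0.55]

def mean_and_sd(payouts, probs):
    """Mean and standard deviation of a single SNG result."""
    mean = sum(p * x for x, p in zip(payouts, probs))
    var = sum(p * (x - mean) ** 2 for x, p in zip(payouts, probs))
    return mean, math.sqrt(var)

mean, sd = mean_and_sd(payouts, probs)
print(f"per SNG: mean = {mean:.2f}, SD = {sd:.2f}")

# For independent SNGs the mean grows with n and the SD with sqrt(n).
for n in (1, 10, 100, 1000):
    print(f"after {n} SNGs: mean = {n * mean:.0f}, SD = {sd * math.sqrt(n):.0f}")
```

The SD is never itself a possible result; it only measures how spread out the four possible results are around the mean.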

Similarly after 2 sngs, I can only have 10 distinct outcomes
After 3 sngs, 20 outcomes
After 4 sngs, 35 outcomes
After 5 sngs, 56 outcomes
and eventually
After 1000 sngs, 167668501 outcomes
and so on

By outcomes, I mean the combination of all net $ finishing possibilities within the set of sngs. For example, if I play 4 sngs, a few possible outcomes are...

{+19,-11,-11,-11}
{+39,+9,+9,-11}
{-11,+39,+19,+19}
{-11,-11,-11,-11}
{+39,+39,+39,+39}
etc...

Keep in mind, I also am treating sets like

{-11,+9,-11,+9}
{+9,+9,-11,-11}
{-11,+9,+9,-11}

As only ONE outcome because the order doesn't matter to me here.

Don't ask me how I discovered the number of outcomes for n tourneys. After a lot of work and discovery it turns out that all I needed to do was use:

C(4,1) for 1 sng
C(5,2) for 2 sngs
C(6,3) for 3 sngs
and so on, i.e. C(n+3,n) for n sngs...
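The counts above can be checked with a short Python sketch. It uses the closed form C(n+3, 3), which equals C(n+3, n), and confirms it for small n by brute-force enumeration of unordered result sets. The function names are made up for this illustration.

```python
from itertools import combinations_with_replacement
from math import comb

payouts = (39, 19, 9, -11)

def count_outcomes(n):
    """Unordered sets of n results drawn from 4 payouts: C(n+3, 3)."""
    return comb(n + 3, 3)

def count_by_enumeration(n):
    """Brute-force check: list every unordered set of n results."""
    return sum(1 for _ in combinations_with_replacement(payouts, n))

for n in range(1, 6):
    print(n, count_outcomes(n), count_by_enumeration(n))

print(1000, count_outcomes(1000))
```

Note that several of these outcomes share the same net dollar total (for example +39 and -11 nets the same as +19 and +9), so the number of distinct net totals is much smaller than the number of outcomes.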

Anyways... It seems to me that these outcomes can each have a percentage attached to them and can even be ordered by the amount of net profit or loss. These percentages could be added together to give a more precise measure of such statements as... Given my current results, after n tourneys, there is an x% chance that I will be at least breaking even.

In this way, an exact confidence level specific to sng outcome distributions can be achieved.

As I actually don't know much about statistics, I am unsure if this is what people mean when they say that standard normal distribution tables are incorrect. It would seem so.
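As a rough sketch of that idea in Python: each unordered outcome (a firsts, b seconds, c thirds, d losses) gets a multinomial probability, and the probabilities of all outcomes at or above a given net result are added up. The finish frequencies are again a hypothetical 15%/15%/15%/55% split, and the function names are invented for this example.

```python
from math import factorial

payouts = (39, 19, 9, -11)
probs = (0.15, 0.15, 0.15, 0.55)  # hypothetical finish frequencies (assumption)

def outcome_table(n):
    """Return sorted (net, probability) pairs, one per unordered outcome."""
    rows = []
    for a in range(n + 1):                  # number of 1sts
        for b in range(n + 1 - a):          # number of 2nds
            for c in range(n + 1 - a - b):  # number of 3rds
                d = n - a - b - c           # number of out-of-the-money finishes
                ways = factorial(n) // (
                    factorial(a) * factorial(b) * factorial(c) * factorial(d)
                )
                p = (ways * probs[0] ** a * probs[1] ** b
                     * probs[2] ** c * probs[3] ** d)
                net = (payouts[0] * a + payouts[1] * b
                       + payouts[2] * c + payouts[3] * d)
                rows.append((net, p))
    return sorted(rows)

def prob_at_least(n, threshold=0):
    """Exact chance that the net result after n SNGs is >= threshold."""
    return sum(p for net, p in outcome_table(n) if net >= threshold)

for n in (1, 5, 10, 20):
    print(n, len(outcome_table(n)), round(prob_at_least(n), 4))
```

This is quick for tens of SNGs, but the table has C(n+3, 3) rows, so it becomes slow in plain Python well before n reaches 1000.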

So... My questions then are essentially these...

a) Is this reasoning valid? Am I on to something here?
b) Can the calculations I am suggesting be accomplished without the aid of a supercomputer? Obviously it is easy for small sng groupings, but it gets a lot more difficult once you need to calculate percentages for more than a few possible outcomes.
c) Is there more advanced mathematics available for dealing with these kinds of situations?
d) Does SD for sngs still hold a kind of truth despite never representing an actual potential outcome?...

... Actually, I know SD doesn't need to represent an actual potential outcome, much the same as my $/tourney average doesn't need to represent an actual outcome. What I am really asking is - Does SD need to be adjusted to take into account the fact that in SNGs, a loss can only be a certain size (which is usually smaller than SD) and wins can be much larger? Or is this SD still ultimately correct and usable for calculating confidence with a refined Distribution?

No doubt, much of this is unclear. If there are any questions, ask and I will try to expand on what I have written.

Any thoughts would be appreciated.

Regards
Brad S

stupidsucker
07-27-2004, 11:30 PM
I know very little about these equations and such. I can only assume that the exact possibilities don't really matter. I average about $7/SnG but I can never make only $7. The number doesn't fit.

All of those equations are for estimates and I may be way off for saying it, but I don't understand the importance of having to calculate it by exact outcomes. After you play 100-plus SnGs I think you can come up with any numerical value from -xxx to +xxx.

I do hope you figure out exactly the answers you are looking for. I enjoy your confidence calculator a lot myself.

AleoMagus
07-28-2004, 06:37 AM
Truthfully, you may be right. Normal distribution might be a really good approximation and calculating by exact outcome possibilities might be a lot of work with little reward.

I just have the nagging suspicion lately that normal distribution is getting it quite wrong.

While some have argued that in truth, we might have even less confidence in our results than the normal distribution assumption gives, I think that the exact opposite is true. I think that the actual distribution will be narrower, and especially so on the losing end of the distribution.

Normal distribution tells me, if I understand it correctly, that I am just as likely to be $100 or $1000 or $10,000 above my mean profit as I am to be the same amount below it. This might not make sense when looking at sng results.

Consider the results I can expect in my next 10 sngs. In the most extreme case of all wins or all losses in a future sample, my outer limits of expectation are:

(10)(-11)=-110
(10)(39)=+390

If I assume I make the money ~45% of the time and I make 1st ~15% of the time, the respective probabilities of these occurrences are:

10 losses = .55^10=0.00253
10 wins = .15^10=0.00000000577

Obviously, our outer limits of expectation are far more likely on the losing side, but it is also a smaller figure.

Perhaps with an equal degree of likelihood we can expect to be ahead as much as behind with that .00253 probability and in this way Normal distribution might be ok. I'm not sure.

I think I may just have to crunch all the numbers for 1,2,3,4,5,6,7,8,9,10... tourney samples and see whether a pattern of distribution forms. Beyond ten (and maybe even less), the calculations will get crazy complicated and I don't know if there are any simpler methods for doing this. There must be I think.

We shall see.

Regards
Brad S

pzhon
07-28-2004, 07:15 AM
[ QUOTE ]

My calculations are correct, but it seems foolish to say that after one $10+1 SNG, my SD is $19. It also seems foolish to say that after n tourneys, my standard deviation is SD*SQRT(n)

[/ QUOTE ]
I don't see a problem with either of those statements. Perhaps you think "standard deviation" should mean something more than it does. It is simply a numerical measure of how spread out a distribution is from its average. A standard deviation of $19 means that the distribution is as spread out as the result of a fair even bet with $19 at stake. That's all. Many very different distributions have the same standard deviation, just as many different people have the same height.

[ QUOTE ]

Anyways... It seems to me that these outcomes can each have a percentage attached to them and can even be ordered by the amount of net profit or loss. These percentages could be added together to give a more precise measure of such statements as... Given my current results, after n tourneys, there is an x% chance that I will be at least breaking even.

[/ QUOTE ]
Yes, you could use a more direct calculation rather than the normal approximation. This is often computationally intensive and produces little or no improvement in accuracy.

You could calculate (p_lose x^-11 + p_third x^9 + p_second x^19 + p_win x^39)^n to get a generating function for your results after n SNGs. If Excel can't do that, you could try to set B100 to "=p_lose*A100 + p_third*A98 + p_second*A97 + p_win*A95". Then put 1 in A1, and paste the formula from B100 to the region B1:U100. Each column is then one more SNG, and each row down is a $10 step up in net result.
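The same generating-function idea can be sketched outside Excel. The Python below multiplies out the polynomial one SNG at a time by tracking the probability of each net total, which is what the spreadsheet recurrence does column by column. The finish probabilities are a hypothetical example, and `exact_distribution` is a name invented for this illustration.

```python
from collections import defaultdict

payouts = (-11, 9, 19, 39)
# Hypothetical p_lose, p_third, p_second, p_win (assumption for illustration).
probs = (0.55, 0.15, 0.15, 0.15)

def exact_distribution(n, payouts, probs):
    """Map each reachable net total after n SNGs to its exact probability."""
    dist = {0: 1.0}                      # before any SNG: net 0 for certain
    for _ in range(n):
        nxt = defaultdict(float)
        for total, p in dist.items():    # one more SNG = one more factor
            for x, q in zip(payouts, probs):
                nxt[total + x] += p * q
        dist = dict(nxt)
    return dist

dist = exact_distribution(10, payouts, probs)
p_break_even_or_better = sum(p for total, p in dist.items() if total >= 0)
print(len(dist), "distinct totals after 10 SNGs")
print("P(net >= 0) =", round(p_break_even_or_better, 4))
```

Because all four results differ by multiples of $10, there are at most 5n + 1 distinct totals after n SNGs, so even n = 1000 involves only a few thousand totals per step.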

[ QUOTE ]
an exact confidence level specific to sng outcome distributions can be achieved.

[/ QUOTE ]
The central limit theorem says that you don't need to create a new formula for the confidence interval for every new distribution. You can approximate the distribution of a sum of independent events by a normal distribution with the same mean and standard deviation. This approximation gets better and better at estimating probabilities as the number of trials increases.

For small numbers of SNGs, you might find it worthwhile to use a slight improvement on the normal approximation analogous to the way that you associate the interval [49.5,50.5] with 50 when using a normal approximation to a count. For example, let "buy-in" refer to the entry fee without the rake. If my standard deviation is 2 buy-ins/sqrt(tournament), and my mean is .2 buy-ins per tournament, then breaking even after 100 SNGs is 20 buy-ins below the mean. Since the result after 100 SNGs must be an integer number of buy-ins, it may be better to say that breaking even is between 19.5 and 20.5 buy-ins below the mean, so the result of breaking even or worse is 19.5+ buy-ins below the mean, or .975+ standard deviations below the mean rather than 1+. If you aren't excited about this improvement in accuracy, just trust the normal approximation when there are at least 20 SNGs.
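The arithmetic in that example can be checked with a few lines of Python. This is only a sketch of the worked numbers above (mean .2 buy-ins and SD 2 buy-ins per SNG, 100 SNGs); the normal CDF is built from `math.erf`, and the variable names are chosen for this illustration.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean_per_sng = 0.2   # buy-ins, from the example above
sd_per_sng = 2.0     # buy-ins, from the example above
n = 100

mean_n = n * mean_per_sng           # expected result after n SNGs
sd_n = sd_per_sng * math.sqrt(n)    # SD after n SNGs

# "Break even or worse" means a result of 0 buy-ins or less.
z_plain = (0.0 - mean_n) / sd_n        # 1 SD below the mean
z_corrected = (0.5 - mean_n) / sd_n    # .975 SD below the mean

p_plain = normal_cdf(z_plain)
p_corrected = normal_cdf(z_corrected)
print(f"z = {z_plain:.3f} -> P = {p_plain:.4f}")
print(f"z = {z_corrected:.3f} -> P = {p_corrected:.4f}")
```

The correction moves the cutoff by half a buy-in, which shifts the estimated probability only slightly at 100 SNGs and matters less as n grows.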

I recommend using the normal approximation.

elbooneb
07-28-2004, 11:44 AM
Thanks! What if I want to calculate all this for my last 1000 tourneys?
Take all my results and divide by 10? Doesn't sound right... My sample size should affect the error margins, right?

AleoMagus
07-28-2004, 04:28 PM
I assume you are talking about my confidence calculator.

No, just input your actual stats into the grey fields in the calculator and it will automatically do the rest.

The 100 which are already in that Excel sheet are just samples.

Otherwise, I am not sure I understand the question.

Regards
Brad S