Standard deviations [Archive] - Two Plus Two Older Archives

Angel

10-21-2003, 01:48 AM

I am trying to better understand standard deviation. In particular, what it can mean to me besides a troublesome exercise. I understand most people don't work it out - I also understand it can be helpful to do so - so I would prefer to be one of the ones who work it out. I've picked a recent period of time to work with (52 hrs), and while I don't believe that it is a significant number and realize I'll need more hrs to give me meaningful data - 52 session hours ago marked a period of game change for me, and mixing in the results of previous hours would not be helpful. So, despite these numbers giving a relatively valueless result, I would like to practice on these because these are the numbers I'll be adding to as time goes on.
I've calculated my hourly SD at $213. I've calculated my hourly rate at this point to be $126.9. There are 13 sessions in question. These numbers don't compute. If my average win is calculated as 52*$126.9/(sqrt(52)*$213) = 6599/1536, or 4.3 SD, then I am looking at a $915 average win for 13 sessions. This is clearly not the case, though I can't find my error. Is it simply that my true standard deviation is not $213 as I thought? Was this mathematical sloppiness on my part or am I misunderstanding something conceptually?

I've seen in another post where BruceZ has said, "When your average win becomes exactly equal to your standard deviation, you will be ahead more than 84% of the time." I'm embarrassed to say I couldn't make heads or tails out of this. Is 84% a probability constant? How about 68%? I've also read, "When your average win becomes exactly equal to your standard deviation..." This sounds like a given. Must this occur?

Any light you could shed on these questions would be appreciated. Thank you.

BruceZ

10-21-2003, 07:36 AM

I've calculated my hourly SD at $213. I've calculated my hourly rate at this point to be $126.9.

This is a very small SD for that win rate. Normally you would expect the hourly SD to be about 10 times your win rate. It is likely that either your hourly rate or SD were computed incorrectly, or else you are including a very large bonus. If you PM me, I can email you an Excel spreadsheet which will compute your SD correctly. You just have to input your session results and the length of each session. The formula is also in one of Mason's essays in the essay section of this forum. Your hourly rate is simply your total win divided by total hours played.
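As a rough illustration of what such a calculation involves, here is a minimal Python sketch. The session log is made up, and the SD estimator is an assumption: it is one standard estimator that treats each session's variance as proportional to its length, and it may differ in detail from the formula in the essay or the spreadsheet mentioned above.

```python
import math

# Hypothetical session log: (result in dollars, length in hours).
# These numbers are invented for illustration only.
sessions = [(250.0, 4.0), (-120.0, 3.0), (480.0, 5.0), (-300.0, 4.5), (90.0, 2.5)]

def hourly_rate(sessions):
    """Total win divided by total hours played."""
    total_win = sum(result for result, _ in sessions)
    total_hours = sum(hours for _, hours in sessions)
    return total_win / total_hours

def hourly_sd(sessions):
    """Estimate the hourly SD from sessions of unequal length.

    Assumed estimator: average over sessions of
    (result - rate * hours)^2 / hours, then take the square root.
    """
    rate = hourly_rate(sessions)
    variance = sum((result - rate * hours) ** 2 / hours
                   for result, hours in sessions) / len(sessions)
    return math.sqrt(variance)

print(f"hourly rate: {hourly_rate(sessions):.2f}")
print(f"hourly SD:   {hourly_sd(sessions):.2f}")
```

Dividing each squared deviation by the session length is what puts sessions of different lengths on the same per-hour footing.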

If my average win is calculated as 52*$126.9/(sqrt(52)*$213) = 6599/1536, or 4.3 SD, then I am looking at a $915 average win for 13 sessions.

That would be 4.3 times your standard deviation for 52 hours, not 4.3 times your hourly SD. Your SD for 52 hours is sqrt(52) times your hourly SD, or about $1536.
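The arithmetic can be checked with a few lines of Python. This is only a sketch using the figures quoted in the thread; the variable names are illustrative.

```python
import math

hours = 52
hourly_rate = 126.9   # dollars per hour, as quoted
hourly_sd = 213.0     # dollars, as quoted

total_win = hours * hourly_rate        # expected total for the period
sd_52 = math.sqrt(hours) * hourly_sd   # SD of the 52-hour total
ratio = total_win / sd_52              # total win measured in 52-hour SDs

print(f"total win:     {total_win:.0f}")
print(f"SD for 52 hrs: {sd_52:.0f}")
print(f"ratio:         {ratio:.1f}")

# The mix-up: multiplying the ratio by the *hourly* SD gives about $915,
# which is not a meaningful quantity. Multiplying by the 52-hour SD
# simply recovers the total win.
print(f"ratio * hourly SD:  {ratio * hourly_sd:.0f}")
print(f"ratio * 52-hour SD: {ratio * sd_52:.0f}")
```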

I've seen in another post where BruceZ has said, "When your average win becomes exactly equal to your standard deviation, you will be ahead more than 84% of the time." I'm embarrassed to say I couldn't make heads or tails out of this. Is 84% a probability constant?

Your EV for N hours is N times your hourly EV, or EV*N. Your SD for N hours is sqrt(N) times your hourly SD, or SD*sqrt(N). Your actual results will lie within +/- SD*sqrt(N) of EV*N 68% of the time, since they are approximately normally distributed. When N becomes such that these two things are equal to each other, that is EV*N = SD*sqrt(N), then there is an 84% probability that you will be ahead, meaning your results are above 0. This probability is calculated *before* you play the N hours, not after. This will occur when N = (SD/EV)^2. You must know or assume some SD and EV to apply this. The reason this is 84% is that your actual results will lie between 0 and +2*EV*N 68% of the time. The amount below 0 is 16%, and the amount above +2*EV*N is 16%, so the total amount above 0 is 84%.
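A small Python sketch of this calculation, under the same normal approximation. The function names and the example values (EV of 1 bb/hr, SD of 10 bb/hr) are illustrative assumptions, not figures for any particular player.

```python
import math
from statistics import NormalDist

def prob_ahead(hourly_ev, hourly_sd, hours):
    """P(total result > 0) after `hours`, assuming a normal distribution
    with mean EV*N and standard deviation SD*sqrt(N)."""
    total = NormalDist(mu=hourly_ev * hours, sigma=hourly_sd * math.sqrt(hours))
    return 1.0 - total.cdf(0.0)

def hours_until_ev_equals_sd(hourly_ev, hourly_sd):
    """Solve EV*N = SD*sqrt(N) for N, giving N = (SD/EV)^2."""
    return (hourly_sd / hourly_ev) ** 2

ev, sd = 1.0, 10.0                       # assumed example values, in bb
n = hours_until_ev_equals_sd(ev, sd)     # 100 hours for these values
print(f"N = {n:.0f} hours, P(ahead) = {prob_ahead(ev, sd, n):.3f}")
```

The same function answers the 68% question too: 68% is the probability of landing within one SD on either side of the mean, while 84% is the probability of landing above one SD below the mean.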

After you play any number of hours, your results will be something, maybe positive, maybe negative. At this point, you can say that there is an 84% *confidence* that your actual EV is no more than 1 hourly SD below your hourly rate. If your hourly rate is equal to exactly 1 SD, then there is an 84% confidence that your actual EV is above 0, or that you are a winning player.

BruceZ

10-21-2003, 01:30 PM

After you play any number of hours, your results will be something, maybe positive, maybe negative. At this point, you can say that there is an 84% *confidence* that your actual EV is no more than 1 hourly SD below your hourly rate.

Oops, the part in red is unfortunate wording, and it might give you silly results. The last sentence should be replaced with one of these two equivalent statements:

At this point, you can say that there is an 84% *confidence* that your actual EV for the entire time period is no more than 1 SD for the entire time period below your total win for the time period. 1 SD for the time period is equal to your hourly SD multiplied by the square root of the number of hours played, that is hourly SD*sqrt(N).

OR

At this point, you can say that there is an 84% *confidence* that your actual hourly EV is no more than 1 hourly standard error below your hourly rate. Standard error is your hourly standard deviation divided by the square root of the number of hours played, that is, SE = SD/sqrt(N). Standard error is actually the standard deviation of your hourly rate, whereas standard deviation is the standard deviation of your hourly results.

The second form is the most useful. For example, if you play for 100 hours, and you compute your hourly rate to be 1 bb/hr, and your hourly standard deviation is 10 bb, then you have 84% confidence that your true hourly EV is greater than 1 - 10/sqrt(100) = 0, so you have an 84% confidence that you are at least a break even player.

What is going on here is that if you play for N hours, your average win for N hours is N*EV, and your standard deviation for N hours is sqrt(N)*SD, where EV and SD are hourly. So 1 standard deviation below your average for the period is N*EV - sqrt(N)*SD. Dividing this by N gives the hourly rate corresponding to performing 1 standard deviation below expectations, and this is

[ N*EV - sqrt(N)*SD ]/N
= EV - sqrt(N)*SD/N
= EV - SD/sqrt(N)
= EV - SE.
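The derivation above can be checked numerically. This is a minimal sketch with illustrative function names; the inputs are the 100-hour example from this post.

```python
import math

def standard_error(hourly_sd, hours):
    """SE = SD / sqrt(N): the standard deviation of the measured hourly rate."""
    return hourly_sd / math.sqrt(hours)

def one_se_lower_bound(hourly_rate, hourly_sd, hours):
    """Hourly rate minus one standard error (the 84% lower bound)."""
    return hourly_rate - standard_error(hourly_sd, hours)

# Example from the post: 100 hours, 1 bb/hr, hourly SD of 10 bb.
rate, sd, n = 1.0, 10.0, 100
print(one_se_lower_bound(rate, sd, n))

# Same quantity computed the long way: [N*EV - sqrt(N)*SD] / N
print((n * rate - math.sqrt(n) * sd) / n)
```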

Bozeman

10-21-2003, 04:29 PM

Bruce, please do not take personal umbrage at this, for i respect your knowledge and willingness to share it immensely, but I have made some observations and would like to ask you a tactless question.

You often post results, and then come back shortly thereafter to make corrections. Often you use phrases like this: "unfortunate wording", instead of directly admitting that the previous result is wrong. Is this sort of ass-covering required to succeed as a consultant? Or do you just do things quickly without proofreading here?

Thanks,
Craig

BruceZ

10-21-2003, 06:18 PM

Why would I take umbrage at that? It's not like you posted it in public rather than PMing me, so it's not like there is any reason to suspect that your intent was to embarrass me in any way.

In this particular case, I chose my wording of the correction very carefully because the original wording was not technically wrong. I used the term 'standard deviation' where I intended 'standard error', but standard error is a standard deviation, and I can show you places in Mason's writings where the terms are used interchangeably. It simply isn't what most people think of as their hourly SD, which is why I posted the extensive elaboration.

When something I've written is wrong, I always say it is wrong, and I have done this countless times. Do a search for 'wrong' or 'incorrect' or 'not right' in my posts if you don't believe me. Watch for a day or so, and you'll see me do this very thing with an earlier thread on general theory. Sometimes I come back a long time later to make a correction, but this is rare. Normally it occurs within the first few hours.

When you attempt to post as much technical material as I do on complex subjects, there is a 3-way tradeoff between accuracy, quantity, and time spent. The motto at my old consulting firm was "Good, Fast, Cheap. Pick 2". The information I dispense here is as cheap as it comes. I don't get paid a dime for it. That leaves a tradeoff between good and fast. I could get more posts exactly correct the first time if I posted less, or if I did not attempt to explain material in great depth, or if I only handled the most mundane of issues. I don't feel that would be in anyone's best interest since I feel I make very few errors now, and the ones I do make I always correct very promptly. If anything, I actually spend far too much time on writing these posts and trying to make them perfect. I have improved this accuracy rate tremendously over the past year by developing a several step checklist for proofreading. I used to post several corrections to almost every post. Now I only occasionally need to post any correction at all.

When you attempt to explain complex material to people who often have very little background in math, it becomes very easy to oversimplify. I rewrote the above post many times before I got what I wanted because the poster apparently did not understand something I had written previously. Posts where people say they don't understand are actually the most valuable to me as a writer. I was having trouble getting it simple and correct at the same time, and I was getting frustrated, so the proofreading stage was shortened as a result. This caused me to accept some sloppy wording which was fixed within a matter of hours. English is a poor language for expressing mathematical ideas. That's why we have the language of mathematics. If I wrote exclusively in that language, I would be accurate almost 100% of the time, but I think many people would become very unhappy, and my posts would be far less valuable to most people.

The best engineer, mathematician, consultant, and writer in the world all make a ton of errors. That's why engineering work is peer reviewed, inspected, and tested; mathematical proofs are verified by several mathematicians; and books are edited and subject to countless revisions. It's a lot more difficult than most people think to produce a high quantity of in depth technical information that is very accurate, and it is impossible to do this as a single individual. That is why 2+2 deserves a great deal of credit for producing a high quantity of in depth material that you can count on to be accurate. I can assure you that this takes a great deal of time, many revisions, and the input of a number of talented people. As a consultant, I'm often forced to work without this safety net, so I have developed ways to be more accurate than normal on the first pass. I also command a premium price for my time and ability to do this.

People can always be assured that any definitive statements I make are 100% accurate, or at least they will be within a day or so, and I am always willing to discuss any issues anyone has with what I have written. Because I have worked certain problems so many different times, I often spot errors in other people's posts within a matter of seconds, even when those posts are written by people with strong mathematical backgrounds. I've always thought your posts seem quite accurate, but I'll be happy to examine them under a microscope from now on.

I know from my PMs that many people really appreciate these posts and find them useful and informative, so I believe I am negotiating the quantity/quality compromise quite well. If I didn't think that this was appreciated, then I would just stop posting here, and I'd go find a probability newsgroup to post on instead. I enjoy posting here when I receive feedback that it is appreciated, which is most of the time. I don't enjoy posting here when my efforts are clearly not appreciated, as in this case.

psychprof

10-21-2003, 06:22 PM

I spent a couple of days preparing these questions for this forum, and on the day I decide to post it I find that someone else beat me to the punch with regard to asking about standard deviation. Unfortunately, I don't have any answers, just more questions. Because (some of) my questions relate to SD, I will add my post to this thread instead of creating a new one. The thread answered one of my questions already, but that has simply led to a follow-up.

Your responses to the following questions are appreciated

1. I have calculated my SD using Mason's formula in the Essay section. In that essay, he indicates that a SD of $83 is not realistic and that it should be much larger. However, he doesn't indicate at what stakes. What is an expected SD for stakes of .5/1? What about 1/2 stakes? A previous post in this thread indicates SD should be 10 times one's win rate. My follow-up question is this: does this rule of thumb apply to all stake levels, or will one's SD-to-win-rate ratio be lower at lower stakes?

2. Does one's SD say anything about proper play? For example, if %wins at showdown is 100%, that suggests (based on what I've read in the small stakes forum) that a person is playing too tight. Well, if a person's SD is much smaller (or larger) than expected, does that indicate improper play? If so, what kind of improper play?

3. I have often read that an expected win rate for a decent player is 1 BB/HR, but that it should be more at lower stakes. How much more? What is a good win rate at .5/1 stakes? 1/2 stakes?

4. I have played for about 3 months, but only kept records for 2 months. In those records, I have about 80 hours (50 sessions) at one site at .5/1 stakes, about 50 hours (33 sessions) at 1/2 stakes and about 70 hours (38 sessions) at another site (.5/1 stakes). I've read posts where people say things like "I've played for 20 hours and made 9bb/hr, should I move up?" and the replies all indicate that 20 hours is not a large enough sample size to mean anything, and that you need about a thousand hours to be meaningful. Do you agree that I can conclude nothing from the samples I have so far? That even if I currently have very good win rates at all three sites/stakes it still could be that I'm just a goofball who catches cards?

5. Mason states in his essay that you need at least 30 sessions for SD to be reliable. If 50 sessions will produce a reliable SD, why doesn't it produce a reliable win rate?

Thanks for your replies. I think all newbies will appreciate a discussion of how to properly analyze one's performance using SD, BB/HR, and sample size.

PsychProf

BruceZ

10-21-2003, 09:16 PM

1. I have calculated my SD using Mason's formula in the Essay section. In that essay, he indicates that a SD of $83 is not realistic and that it should be much larger. However, he doesn't indicate at what stakes. What is an expected SD for stakes of .5/1? What about 1/2 stakes? A previous post in this thread indicates SD should be 10 times one's win rate. My follow-up question is this: does this rule of thumb apply to all stake levels, or will one's SD-to-win-rate ratio be lower at lower stakes?

The intention of making the rule of thumb 10 times the win rate was to take into account the fact that your win rate will be less than 1 bb/hr at high stakes, and higher than 1 bb/hr at low stakes. Your examples are all low stakes. At 20-40 and above, win rates are typically less than 1 bb/hr, approaching .5 bb/hr as the stakes are raised still more.

2. Does one's SD say anything about proper play? For example, if %wins at showdown is 100%, that suggests (based on what I've read in the small stakes forum) that a person is playing too tight. Well, if a person's SD is much smaller (or larger) than expected, does that indicate improper play? If so, what kind of improper play?

If your SD is too high, that could mean that you are chasing too many marginal hands. It could also mean that you are playing in very wild games or shorthanded games without the necessary skill to increase your win rate in these games as much as you increase your SD. If your SD is too low, it may mean that your win rate is too low due to any one of a number of skill problems, or you could be playing too tight.

3. I have often read that an expected win rate for a decent player is 1 BB/HR, but that it should be more at lower stakes. How much more? What is a good win rate at .5/1 stakes? 1/2 stakes?

At these micro-limits, a good player can make over 2 bb/hr per table, and as much as 3 or 4 bb/hr if the competition is very poor.

4. I have played for about 3 months, but only kept records for 2 months. In those records, I have about 80 hours (50 sessions) at one site at .5/1 stakes, about 50 hours (33 sessions) at 1/2 stakes and about 70 hours (38 sessions) at another site (.5/1 stakes). I've read posts where people say things like "I've played for 20 hours and made 9bb/hr, should I move up?" and the replies all indicate that 20 hours is not a large enough sample size to mean anything, and that you need about a thousand hours to be meaningful. Do you agree that I can conclude nothing from the samples I have so far? That even if I currently have very good win rates at all three sites/stakes it still could be that I'm just a goofball who catches cards?

You can always say something. In particular, you will say something like "I have X% confidence that my win rate is W +/- Z". But with a low number of hours, you can make X high, but then Z will be very high, or you can make Z low, and then X will be very low. You would like to make X high and Z low, and that usually takes a lot of hours, many more than most people realize. To calculate this tradeoff, use this equation:

x*(SD/EV)/sqrt(N)*100 = Z

x is the number of SDs for the confidence you want.
For 68% confidence, x = 1
For 80% confidence, x = 1.3
For 90% confidence, x = 1.6
For 95% confidence, x = 2

SD and EV are hourly
N is the number of hours
Z is how accurate your win rate will be in per cent.

Example: If your SD is 10 times your win rate, so SD/EV is 10, and you play for 400 hours, then to 90% confidence, your win rate will be accurate to within:

1.6*10/sqrt(400)*100 = 80%.

So after 400 hours, your hourly rate could be off by 80% in either direction. For example, if you made 1 bb/hr, your actual EV could be 1 bb +/- .8 bb. That isn't a very useful range. Suppose we want it to be accurate to within 50% or .5 bb. Then from the above equation, you can see you would have to play 1024 hours. What if you wanted it to within 25%? Then you would have to play 4096 hours (4 times as long as for 50%). For 10% accuracy, you need to play 25,600 hours, or over 12 years of full time play. You can see the number of hours increases rapidly with the reduction in range. If you are willing to accept a lower confidence, these hours can be reduced.
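The equation x*(SD/EV)/sqrt(N)*100 = Z and its inverse can be sketched in Python as follows. The function names are illustrative, and the x values are the rounded ones listed in this post.

```python
import math

def accuracy_percent(x, sd_over_ev, hours):
    """Z: how far off the measured win rate could be, in percent of EV."""
    return x * sd_over_ev / math.sqrt(hours) * 100.0

def hours_needed(x, sd_over_ev, z_percent):
    """Invert the same equation for N, the hours required for accuracy Z."""
    return (x * sd_over_ev * 100.0 / z_percent) ** 2

# 90% confidence (x = 1.6) with SD equal to 10 times the win rate.
print(f"400 hours -> +/- {accuracy_percent(1.6, 10.0, 400):.0f}%")
for z in (50.0, 25.0, 10.0):
    print(f"+/- {z:.0f}% -> {hours_needed(1.6, 10.0, z):.0f} hours")
```

Because N grows with the square of 1/Z, halving the range quadruples the hours required.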

It may not take so long if you do particularly well early on. I posted this ridiculously extreme example a while back:

---
It may seem strange that being lucky will allow you to prove that you are good sooner. As an extreme example, when I first started playing, I practiced on irc. I won 10 bb/hr for the first 10 hours. I was able to conclude mathematically from this that I was virtually certain to be at least a 1 bb/hr winner against the opponents I had played against, and probably much more *. Now, 10 hours isn't normally nearly enough to make this kind of assessment. It normally takes hundreds or thousands of hours, or it may never happen even if you are a 1bb/hr winner. But I had done *so* well, that I could still say something about my true win rate with good confidence. Now it's unlikely that I was a 10 bb/hr winner. Perhaps I was a 3 or 4 bb/hr winner who also got lucky. But that luck factor is what provided the margin to put 1 bb/hr in a high confidence interval. Also notice that if this hadn't occurred in the very first 10 hours I ever played, it would not have meant the same thing, since then it would have been averaged with all of my other sessions.

* Actually this assumed some range of standard deviations. It is possible to take into account the uncertainty in the standard deviation by use of the "t-test".
---
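A rough Python sketch of that kind of assessment. The hourly SD values tried here are assumptions chosen only to show how the answer depends on them; the post does not say which range was actually used, and this normal-approximation sketch ignores the t-test refinement mentioned in the footnote.

```python
import math
from statistics import NormalDist

def confidence_ev_at_least(observed_rate, hourly_sd, hours, threshold):
    """Confidence that the true hourly EV is at least `threshold`,
    given an observed rate over `hours` and an assumed hourly SD."""
    se = hourly_sd / math.sqrt(hours)          # standard error of the rate
    z = (observed_rate - threshold) / se       # distance in standard errors
    return NormalDist().cdf(z)

# 10 bb/hr observed over 10 hours; is the true EV at least 1 bb/hr?
for assumed_sd in (10.0, 15.0, 20.0):          # hypothetical hourly SDs, in bb
    c = confidence_ev_at_least(10.0, assumed_sd, 10, 1.0)
    print(f"assumed hourly SD {assumed_sd:>4} bb: confidence {c:.3f}")
```

The larger the assumed SD, the less the same 10 hours can establish, which is why the conclusion depends on the assumed range of standard deviations.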

For your example, 20 hours means very little because conditions can change a great deal, and these 20 hours may not be representative of the future. For example, you may happen to be playing with especially bad players during these 20 hours. The above confidence interval calculation assumes that the future conditions will be the same as the past, namely that your EV and SD will remain the same. On the other hand, you can use the combined results of your play at different levels to assess how you are doing at least at the lowest level.

5. Mason states in his essay that you need at least 30 sessions for SD to be reliable. If 50 sessions will produce a reliable SD, why doesn't it produce a reliable win rate?

The distribution of your SD depends on the chi-square distribution with N-1 degrees of freedom, where N is the number of sessions, and this converges relatively fast. Your win rate or EV is distributed as a normal distribution with an SD of SD/sqrt(N), called the standard error. From the above math, you can see how long this takes to converge to a reasonably small range with decent confidence. For a detailed technical explanation of how to determine the accuracy of your SD, see this post. (http://forumserver.twoplustwo.com/showthreaded.php?Cat=&Board=genpok&Number=149814&Forum=genpok&Words=chi-square&Match=Entire%20Phrase&Searchpage=0&Limit=25&Old=allposts&Main=149588&Search=true#Post149814)
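To make the contrast concrete, here is a hedged Python sketch. It relies on a large-sample approximation that is not stated in this post: for roughly normal data, the relative standard error of a sample SD is about 1/sqrt(2(n-1)). The SD-to-win-rate ratio of 10 is borrowed from the rule of thumb earlier in the thread and is applied per observation purely for illustration.

```python
import math

def relative_error_of_sd(n):
    """Approximate relative standard error of a sample SD from n
    observations (large-sample approximation for normal data)."""
    return 1.0 / math.sqrt(2.0 * (n - 1))

def relative_error_of_win_rate(sd_over_ev, n):
    """Standard error of the mean, SD/sqrt(n), as a fraction of the EV."""
    return sd_over_ev / math.sqrt(n)

n = 50                 # number of observations
ratio = 10.0           # assumed SD-to-win-rate ratio
print(f"SD known to about       +/- {relative_error_of_sd(n):.0%}")
print(f"win rate known to about +/- {relative_error_of_win_rate(ratio, n):.0%}")
```

The SD estimate's relative precision does not depend on the SD-to-win-rate ratio, while the win rate's does, which is why the same sample can pin down one but not the other.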

Bozeman

10-22-2003, 02:25 AM

Well, I've dug myself a big hole, and let me be the second (after you) to say I am wrong.

I did not mean anything in my post sarcastically, I have the greatest respect for your work. I did not PM you because I rarely read them, and do not assume that they will be read. I thought that this small forum is somewhat like a group of friends (I would not have replied in a larger forum such as General Theory). I also thought I clearly stated that these were observations, and that it was not meant to attack you. If I had any devious motives, it would only be that I found this mildly amusing (and my friends could tell you that I laugh at my mistakes more than anyone else's, at least when I am done berating myself).

Perhaps many of the errors I have seen you fix were from older threads. I take your word for it that you are doing better now, not that there is anything wrong with making edits or corrections in a forum such as this, unlike a reviewed journal, for instance.

With a renaissance man's background in statistics and not a statistician's, I have always heard SD and SE used distinctly. I was wrong ...

As for correcting yourself, you often place the emphasis on the correct part, thereby possibly directing the reader away from the error. This seems to be a useful strategy, and I was wondering if you agreed. For example, in another thread you used the subject Correct(ion). I think it is a flaw of my own that I overconcentrate on my mistakes.

Obviously I should have kept my subjective observations to myself.

For the record, I am very glad you post, glad you post your corrections, and sorry if this has offended you,
Craig

BruceZ

10-22-2003, 04:10 AM

As for correcting yourself, you often place the emphasis on the correct part, thereby possibly directing the reader away from the error. This seems to be a useful strategy, and I was wondering if you agreed. For example, in another thread you used the subject Correct(ion).

That wasn't even my title, it was the title of well's post (http://forumserver.twoplustwo.com/showthreaded.php?Cat=&Number=361562&page=&view=&sb=5&o=&vc=1) which I was responding to! He corrected my typo, and then I marked my error in red, and put the correction in blue, so I don't see how I'm trying to "direct the reader away from the error". I'm making both the error and the correction as plain as day. I did the same thing in this thread.

When I'm wrong, I have no problem saying I'm wrong, but if I just feel something needs to be clarified, then I'm not going to charge myself with an error that I didn't make. Too many errors can be damaging to credibility, and credibility is all I've got. I really hate being wrong, and I go to great lengths to make sure that doesn't ever happen.

In light of that, your suggestion that I just "do things fast without proofreading" is especially disturbing. I doubt that there is another poster who spends as much time scrutinizing every word of their posts as I do. Lately, virtually all my errors have been problems with wording. I can't remember the last time I made a calculation error, and I virtually always correct my own errors before anyone else does.