what is standard deviation?


nicky g
08-11-2004, 10:36 AM
I've asked a few people this question and all they've told me is either how to calculate it, which I know, or that it's "a measure of the spread", which doesn't really tell me anything. I also know how to use it for poker purposes - but not why. For instance, I understand mean deviation; it's the mean of the distances between each measurement and the mean of the measurements. That makes sense to me and I know what it's telling me. But SD doesn't. What is it telling me, for goodness sake? For that matter, what is variance telling me? I can kind of understand why adding together all of the differences between the measurements and their mean gives you an idea of the spread of the data, but why do you square them first? And what does taking the square root of that actually tell you?
Sorry if this is basic.

Tharpab
08-11-2004, 11:03 AM
I also have a question about this: is the average deviation (not sure that's the correct name; it's the one that doesn't need a square root) useful? Why is SD used so widely and AD not at all?

topspin
08-11-2004, 04:04 PM
[ QUOTE ]
I've asked a few people this question and all they've told me is either how to calculate it, which I know, or that it's "a measure of the spread", which doesn't really tell me anything. I also know how to use it for poker purposes - but not why. For instance, I understand mean deviation; it's the mean of the distances between each measurement and the mean of the measurements. That makes sense to me and I know what it's telling me. But SD doesn't. What is it telling me, for goodness sake?

[/ QUOTE ]

Standard deviation is a useful measure of how closely your data cluster around the mean. If your random variable is normally distributed (the "bell curve"), then about 95% of the time it will lie within 2 standard deviations of the mean.

For example, if you came up with a strategy for playing JJ that yielded an average profit of 4BB with a standard deviation of 0.5BB, then about 95% of the time when you got that hand and used your strategy, you would win between 3BB and 5BB.
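As a rough check of that 95% figure, here is a minimal Python sketch. It assumes the results are exactly normally distributed (real poker results will not be), and the function name is only illustrative.

```python
import math

def prob_within(mean, sd, low, high):
    """Probability that a normal(mean, sd) variable lands between low and high."""
    def cdf(x):
        # Normal cumulative distribution, written with the error function.
        return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))
    return cdf(high) - cdf(low)

# JJ example: mean 4BB, SD 0.5BB, so 3BB..5BB is 2 SD either side of the mean.
p_two_sd = prob_within(4.0, 0.5, 3.0, 5.0)
print(round(p_two_sd, 4))
```

The exact figure is a little over 95%, which is why "about 95%" is the usual wording.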

You can easily get more information by digging around on Google -- e.g. here (http://www.physics.csbsju.edu/stats/descriptive2.html).

FlashFunk
08-11-2004, 05:01 PM
I'm not sure if my details are totally correct (it's been a while since my last statistics course).

But the reason you must square the differences from the mean is that you have numbers both above and below the mean. Squaring gets rid of all the negatives (the samples that were under the mean) and leaves you with the squared distance of each sample from the mean. If you sum these squared differences, divide by the number of samples to get their average (the variance), and finally take the square root (basically reversing the squaring you did to get rid of the negatives), you will have the standard deviation.

tubbyspencer
08-11-2004, 10:59 PM
[ QUOTE ]
For that matter, what is variance telling me?

[/ QUOTE ]

From the standpoint of poker, variance tells you what your swings in bankroll are likely to be. For example, at the lower limits, where people play much looser and see more flops, your BB/hr or BB/100 may be higher, but your variance will be higher as well. At higher limits, your BB/hr or BB/100 will be lower, but with fewer folks seeing the flop and capitalizing on garbage, your variance will be lower too. So your bankroll swings will be smaller.

uuDevil
08-11-2004, 11:32 PM
I'll give it a shot:

Variance:

Suppose you have a group of people and you measure their heights. Suppose the average height is X ft. Is everyone in the group X ft tall? Probably not. Since they are not all the same height, there is some "spread" or "variation" or "dispersion". Well how much variation? Maybe a little, if these people are all NBA centers. Maybe a lot if there are adults and children in the group. But saying there is "a little" or "a lot" is not very precise. So we construct a way to measure the spread and call it the variance.

Why do we square the deviations from the mean?

Suppose we did not. Add up the deviations from the mean without squaring and what do we get? Zero. Always. Because deviations for values below the mean are negative and deviations above the mean are positive, when you add them up, they cancel out. This is not useful. So we square the deviations to give us positive values. Now when we add them, they don't cancel. The mean of the squared deviations is called the variance. The further away from the mean, on average, our observations are, the greater the variance.

Standard deviation-- why do we take the square root of the variance?

Well, if you look at the units associated with variance, it is square feet in the example above. If we want to express the amount of spread in our data in the same units as the data itself, we just take the square root of the variance and call it the "standard deviation." It is just as valid a measure of spread. Now we can say meaningful things like "My height is 4 standard deviations above the mean, so gimme the rock!"
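To make the height example concrete, here is a small Python sketch. The heights are made up for illustration and the variable names are my own; it just walks through the steps described above.

```python
heights_ft = [5.0, 5.5, 6.0, 6.5, 7.0]  # made-up heights, in feet

mean_ft = sum(heights_ft) / len(heights_ft)
deviations = [h - mean_ft for h in heights_ft]

# Unsquared deviations cancel out, so their sum says nothing about spread.
sum_of_deviations = sum(deviations)

# Variance: the mean of the squared deviations (units: square feet).
variance_sqft = sum(d * d for d in deviations) / len(heights_ft)

# Standard deviation: square root of the variance (units: feet again).
sd_ft = variance_sqft ** 0.5

def z_score(height_ft):
    """How many standard deviations a height sits above the mean."""
    return (height_ft - mean_ft) / sd_ft

print(sum_of_deviations, variance_sqft, sd_ft, z_score(7.0))
```

The z-score is the "my height is N standard deviations above the mean" number; it only makes sense because the SD is back in the same units as the data.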

nicky g
08-12-2004, 05:42 AM
[ QUOTE ]
Why do we square the deviations from the mean?

Suppose we did not. Add up the deviations from the mean without squaring and what do we get? Zero. Always. Because deviations for values below the mean are negative and deviations above the mean are positive, when you add them up, they cancel out. This is not useful. So we square the deviations to give us positive values. Now when we add them, they don't cancel.

[/ QUOTE ]

Yes... but when you calculate the mean deviation, you solve this problem by simply changing all the negatives to positives. Why not just do that?

I realise variance is telling us about variation around the mean; that the bigger it is, the more variation there is. But what exactly is it telling us? Mean deviation already tells us about variation around the mean. What makes variance and standard deviation more useful?

I'm not sure I'm really getting my question across here so I'll try some other ones:
What is standard deviation telling us that mean deviation isn't?
Why do you square the differences from the mean to get variance, when you could simply get rid of the negative signs (as you do to calculate mean deviation)?
Why will 95% of results fall within two standard deviations of the mean? Where does this magical property come from?

donkeyradish
08-12-2004, 06:20 AM
[ QUOTE ]

For example, if you came up with a strategy for playing JJ that yielded an average profit of 4BB with a standard deviation of 0.5BB, then about 95% of the time when you got that hand and used your strategy, you would win between 3BB and 5BB.


[/ QUOTE ]

Hmm, I measured the SD of my win rate at 0.5/1 hold'em and it was 16BB/hour with a win rate of 2BB/hour (it was a very small sample).

So that means in 95% of my hours I should expect to fall somewhere between winning 34BB and losing 30BB? Not exactly a revelation.

nicky g
08-12-2004, 06:27 AM
There is a BruceZ thread somewhere on a way to get much more accurate figures. I forget exactly what it is but it involves number of hours played; the bigger they are, the more accurate it is. Try a search.

uuDevil
08-12-2004, 05:23 PM
[ QUOTE ]
What makes variance and standard deviation more useful?

[/ QUOTE ]

Hopefully I don't get in too much trouble with the mathematicians....

The Magical Part

Take a set of measurements like height in my previous example and arrange them in order from smallest to biggest. Properly plotted, this "distribution" has a characteristic shape. This shape is called "bell" or "normal" or "Gaussian." Many different things in nature exhibit this same shape.

The Math Part

It would be nice if we could use math to represent what is going on with normal distributions, so we look for a mathematical function that has this same shape. Here it is:

f(x)=B*exp(-(x-u)^2/A), where A, B, and u are constants

and B=sqrt(1/(pi*A))

Look at the argument of the exponential. Does the (x-u)^2 part look familiar? We used this form in the definition of variance (where u=mean).

[Sorry if the following is not that clear.] The argument of the exponential has to be dimensionless, so A has to have the same dimensions as (x-u)^2. And what constant can we choose that is both relevant and has these dimensions? One defined in terms of the variance will work. So the variance is important because it shows up in the function that represents Normal distributions.

The Good Part

With properly defined constants, we can now use the distribution function to calculate general features of normal distributions like the fact that 95% of results fall within 2 SD of the mean, etc.
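Here is a hedged numerical check of that claim in Python. It uses the f(x) given above and takes A = 2 × variance, which is the standard choice for the normal density but is my assumption about what was meant; the integration routine and names are illustrative only.

```python
import math

def normal_density(x, mean, sd):
    # f(x) = B*exp(-(x-u)^2/A) with A = 2*sd^2 (assumed) and B = sqrt(1/(pi*A)).
    a = 2.0 * sd * sd
    b = math.sqrt(1.0 / (math.pi * a))
    return b * math.exp(-((x - mean) ** 2) / a)

def area(mean, sd, low, high, steps=20_000):
    # Simple midpoint-rule integration of the density between low and high.
    width = (high - low) / steps
    return width * sum(
        normal_density(low + (i + 0.5) * width, mean, sd) for i in range(steps)
    )

mean, sd = 0.0, 1.0
total = area(mean, sd, mean - 10 * sd, mean + 10 * sd)      # whole curve, roughly
within_two = area(mean, sd, mean - 2 * sd, mean + 2 * sd)   # within 2 SD
print(round(total, 4), round(within_two, 4))
```

The total area comes out at 1 (so B is the right normalizing constant for this choice of A), and the area within 2 SD is the familiar "about 95%".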

Does that help?

nicky g
08-12-2004, 05:58 PM
It's as I suspected. I don't understand the answer to my question. Thanks though.

topspin
08-12-2004, 06:16 PM
[ QUOTE ]
[ QUOTE ]

For example, if you came up with a strategy for playing JJ that yielded an average profit of 4BB with a standard deviation of 0.5BB, then about 95% of the time when you got that hand and used your strategy, you would win between 3BB and 5BB.


[/ QUOTE ]

Hmm, I measured the SD of my win rate at 0.5/1 hold'em and it was 16BB/hour with a win rate of 2BB/hour (it was a very small sample).

So that means in 95% of my hours I should expect to fall somewhere between winning 34BB and losing 30BB? Not exactly a revelation.

[/ QUOTE ]

I agree, I don't think looking at the variance of your win rate is a useful stat. The reason your win rate has such a huge variance is because it encompasses both AA and 72o in the same stat: of course you'd expect the variance to be big. If you tracked your variance for a given starting hand(s) (e.g. only for AA, or Axs) it'd narrow down quite a bit more.

Personally I have trouble envisioning how analyzing your hand variance would help improve your game in any significant way, except perhaps in evaluating how big your bankroll needs to be to absorb the swings at your favorite poker site. There might be some applications in tournaments (where you have fixed bankrolls) in figuring which +EV situations are too marginal and should be passed over.

uuDevil
08-12-2004, 07:14 PM
[ QUOTE ]
It's as I suspected. I don't understand the answer to my question.

[/ QUOTE ]

The physicists say you never understand a new theory, you just get used to it.

If you can, get a copy of a book called "Statistics Without Tears" by Derek Rowntree. It is very basic but easy to read and exceptionally clear.

uuDevil
08-13-2004, 12:34 AM
[ QUOTE ]
[Sorry if the following is not that clear.] The argument of the exponential has to be dimensionless, so A has to have the same dimensions as (x-u)^2. And what constant can we choose that is both relevant and has these dimensions? One defined in terms of the variance will work. So the variance is important because it shows up in the function that represents Normal distributions.

[/ QUOTE ]

On reflection, I did you a disservice with this gibberish-- sorry.

It turns out that for the normal distribution, the standard deviation and the mean deviation are just proportional to each other: std dev= 1.25(mean dev), approximately. So you could as easily use mean deviation as the standard deviation for some purposes. For example, for normally distributed data, you could say that "95% of observations fall within 2.5 mean deviations of the mean."
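The 1.25 factor can be checked with a short Python sketch. For a normal distribution the exact ratio of standard deviation to mean deviation is sqrt(pi/2); the simulation below (seed and sample size chosen arbitrarily by me) is only a sanity check of that.

```python
import math
import random

# Exact ratio SD / mean deviation for a normal distribution.
exact_ratio = math.sqrt(math.pi / 2.0)

# Sanity check on simulated normal data (fixed seed so the run is repeatable).
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
m = sum(sample) / len(sample)
mean_dev = sum(abs(x - m) for x in sample) / len(sample)
std_dev = math.sqrt(sum((x - m) ** 2 for x in sample) / len(sample))
simulated_ratio = std_dev / mean_dev

print(round(exact_ratio, 4), round(simulated_ratio, 4))
print(round(2 * exact_ratio, 2))  # 2 SD expressed in mean deviations
```

So "2 standard deviations" and "about 2.5 mean deviations" describe the same interval, but only when the data are normal.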

So why do we prefer the std dev? Apparently for convenience, since analytical expressions for the mean deviation can get complicated due to the need to introduce absolute values (in order to get rid of the negatives, as you suggested). It makes common procedures like least squares fitting more painful. (This is discussed on this web page. (http://mathworld.wolfram.com/MeanDeviation.html))

I'll try to do better next time.

nicky g
08-13-2004, 05:34 AM
That's interesting, thanks. Sorry, I didn't mean to suggest there was anything wrong with the explanation you gave me before, just that I suspect I don't have enough maths to properly understand the answers to what I'm asking. I'll have a look at the book, cheers for the recommendation.

pzhon
08-13-2004, 05:40 AM
[ QUOTE ]
Personally I have trouble envisioning how analyzing your hand variance would help improve your game in any significant way, except perhaps in evaluating how big your bankroll needs to be to absorb the swings at your favorite poker site.

[/ QUOTE ]
If you know exactly how much you make per hour, and your bankroll is large, you can ignore the variance.

If you are not sure what your win rate is, you might be interested in figuring it out. Suppose you just moved up a level, and get crushed. Or, maybe you win a huge amount. When is that significant? To answer this, you should consider the variance.

Someone earlier mentioned a standard deviation of 16/sqrt(hour), hence a variance of 256/hour. One interpretation of this is that after 256 hours, you should be mildly surprised to be 1BB/hour above or below average, and seriously surprised to be 2BB/hour above or below average.
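The arithmetic behind that can be sketched in a few lines of Python. It assumes each hour's result is independent with the same SD, and uses the 16BB and 2BB/hour figures quoted earlier in the thread; the function name is illustrative.

```python
import math

sd_per_hour = 16.0  # BB, standard deviation of one hour's result
win_rate = 2.0      # BB/hour, the measured average

def sd_of_hourly_average(hours):
    # Variances add over independent hours, so the SD of the total is
    # sd * sqrt(hours) and the SD of the per-hour average is sd / sqrt(hours).
    return sd_per_hour / math.sqrt(hours)

# A single hour: the 2 SD range is enormous compared with the win rate.
one_hour_range = (win_rate - 2 * sd_per_hour, win_rate + 2 * sd_per_hour)

# After 256 hours the average hourly result is pinned down to about 1 BB/hour.
se_256 = sd_of_hourly_average(256)

print(one_hour_range, se_256)
```

Being 1BB/hour off after 256 hours is a 1 SD event (mildly surprising); being 2BB/hour off is a 2 SD event (seriously surprising).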

This might not help your play directly. However, it might tell you to get out of a game, or to keep trying.

pzhon
08-13-2004, 06:43 AM
[ QUOTE ]

Yes... but when you calculate the mean deviation, you solve this problem by simply changing all the negatives to positives. Why not just do that?

[/ QUOTE ]
Suppose you have a set A of 10 items, 2 of which have weight 5. You replace these two by items with weight 6 and 4 to get a new set B of 10 items.

Is the mean deviation, E(|weight-average weight|), different for A and B? It depends. The average weight doesn't change. If the average weight is greater than 6 or less than 4, the mean deviation doesn't change. If the average weight is between 4 and 6, the mean deviation of B is greater than the mean deviation of A. The variance has increased by precisely 2/10 (the sum of the squared deviations goes up by exactly 2, spread over 10 items), no matter what the mean is. The variance is a more natural quantity.
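A quick Python sketch of this comparison follows. The weights of the other eight items are not given in the post, so the two cases below (eight items of weight 10, then eight items of weight 5) are my own illustrative choices.

```python
def mean_deviation(weights):
    avg = sum(weights) / len(weights)
    return sum(abs(w - avg) for w in weights) / len(weights)

def variance(weights):
    avg = sum(weights) / len(weights)
    return sum((w - avg) ** 2 for w in weights) / len(weights)

def compare(other_weight):
    # Set A: two items of weight 5 plus eight items of other_weight (assumed).
    # Set B: the two 5s are replaced by a 6 and a 4.
    set_a = [5, 5] + [other_weight] * 8
    set_b = [6, 4] + [other_weight] * 8
    return (mean_deviation(set_b) - mean_deviation(set_a),
            variance(set_b) - variance(set_a))

far = compare(10)   # average weight 9, outside the 4..6 range
near = compare(5)   # average weight 5, inside the 4..6 range

print(far)   # mean deviation unchanged, variance up by 0.2
print(near)  # mean deviation up, variance up by 0.2
```

The change in mean deviation depends on where the average sits; the change in variance does not.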

Suppose you have a 50 lb. dumbbell and a 50 lb. barbell. You are blindfolded, and pick up one or the other in the middle of the bar. Can you tell the difference? Yes, one is much harder to spin than the other. If you arrange the weights on the bar, there are precisely two quantities you can detect by picking up the bar at the center: The total mass, and the moment of inertia, which is analogous to the variance of the distribution of the weights on the bar. If you move two weights on one side slightly farther apart, it will be harder to spin the barbell, but you won't be able to tell whether the change was made on the left side or the right side.

The variance of the sum of two independent random variables is the sum of their variances. Nothing analogous is true for the mean deviation.
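As an illustration, here is an exact check in Python using two fair dice (my choice of example, not something from the thread). Fractions are used so there is no rounding to worry about.

```python
from fractions import Fraction
from itertools import product

die = [Fraction(k) for k in range(1, 7)]  # one fair die, faces equally likely

def mean(values):
    return sum(values) / len(values)

def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def mean_deviation(values):
    m = mean(values)
    return sum(abs(v - m) for v in values) / len(values)

# All 36 equally likely totals of two independent dice.
totals = [a + b for a, b in product(die, die)]

var_one, var_sum = variance(die), variance(totals)
md_one, md_sum = mean_deviation(die), mean_deviation(totals)

print(var_one, var_sum)  # the variance of the total is exactly double
print(md_one, md_sum)    # the mean deviation of the total is not double
```

The variance of the total is exactly twice the variance of one die, while the mean deviation of the total is less than twice the mean deviation of one die.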

A confirmation that variance is the more natural quantity comes from the Central Limit Theorem. The Central Limit Theorem says that if you add many independent, identically distributed random variables together, the result is approximated by a normal distribution with the same mean and variance. That is, a normal distribution whose mean is the sum of the means, and whose variance is the sum of the variances. If you add many similar variables and want to understand the resulting distribution, you only need to pay attention to the mean and variance. (This assumes the distribution is not so wild that the variance is infinite. The number of terms that need to be added to get a distribution close to a normal distribution depends on the distribution.)
[ QUOTE ]

Why will 95% of results fall within two standard deviations of the mean? Where does this magical property come from?

[/ QUOTE ]
That's true for a normal distribution. It is not true for all distributions.

Most distributions are not normal. Heights are not normally distributed. However, if you add together many heights from a random sample of people, the total will be close to normally distributed by the Central Limit Theorem. Your results per hour in poker are not normally distributed. However, if you add together the results of many similar hours, the total will be roughly normally distributed by the Central Limit Theorem.
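A small simulation sketch in Python shows the effect. The uniform distribution, the 30 terms per total, the number of trials, and the seed are all arbitrary choices of mine; the point is only to compare a single non-normal variable with a sum of many.

```python
import math
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

def fraction_within_two_sd(samples):
    n = len(samples)
    m = sum(samples) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in samples) / n)
    inside = sum(1 for x in samples if abs(x - m) <= 2 * sd)
    return inside / n

trials = 20_000

# A single uniform(0, 1) draw: clearly not bell-shaped.
single = [rng.random() for _ in range(trials)]

# The total of 30 independent uniform draws: close to bell-shaped.
summed = [sum(rng.random() for _ in range(30)) for _ in range(trials)]

frac_single = fraction_within_two_sd(single)
frac_summed = fraction_within_two_sd(summed)
print(frac_single, frac_summed)
```

For the single uniform draw, every result lies within 2 SD of the mean, so the 95% rule fails; for the totals, the fraction comes out close to 95%.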

The formula for a standard normal distribution's density, 1/sqrt(2pi) exp(-x^2/2), is complicated, but it is not arbitrary. There are many other formulas that produce curves that look about the same. However, the normal distribution has some remarkable, unusual properties. The product of 2 normal distributions is a 2-dimensional distribution that is rotationally symmetric. Another way of saying this is that if X and Y are independent standard normal distributions, then so is aX+bY if a^2+b^2 = 1, and aX+bY and bX-aY are independent.

In summary, the reason we consider normal distributions is the Central Limit Theorem. The Central Limit Theorem tells us that many distributions of interest are approximately normal, so they have properties similar to those of a normal distribution.

topspin
08-13-2004, 09:43 AM
[ QUOTE ]
[ QUOTE ]
Personally I have trouble envisioning how analyzing your hand variance would help improve your game in any significant way, except perhaps in evaluating how big your bankroll needs to be to absorb the swings at your favorite poker site.

[/ QUOTE ]

Someone earlier mentioned a standard deviation of 16/sqrt(hour), hence a variance of 256/hour. One interpretation of this is that after 256 hours, you should be mildly surprised to be 1BB/hour above or below average, and seriously surprised to be 2BB/hour above or below average.

[/ QUOTE ]

That's an excellent point that didn't occur to me. Thanks for the insight.

Of course, given that most people around here recommend a 300BB bankroll, after 256 hours of losing 1-2BB/hr, the fact that you have no bankroll left is also likely to tell you that you may need to move down a level.

eastbay
08-15-2004, 05:31 PM
[ QUOTE ]
I've asked a few people this question and all they've told me is either how to calculate it, which I know, or that it's "a measure of the spread", which doesn't really tell me anything.

[/ QUOTE ]

Maybe a good question is: why do you think that doesn't tell you anything? Because that's exactly what it is.

eastbay

nicky g
08-16-2004, 06:32 AM
It doesn't tell me anything specific, and I already know it's a measure of the spread. It's like answering "what's a flush" with "It's a type of poker hand"; or my question with "it's a statistical measurement". My point is I understand how mean deviation is measuring the spread - what exactly it's telling me. But not with SD (although some posts here have helped).

eastbay
08-18-2004, 12:32 AM
[ QUOTE ]
It doesn't tell me anything specific, and I already know it's a measure of the spread. It's like answering "what's a flush" with "It's a type of poker hand"; or my question with "it's a statistical measurement". My point is I understand how mean deviation is measuring the spread - what exactly it's telling me.


[/ QUOTE ]

In what sense? You know how to compute it. What else do you know about it that makes you say that you "know exactly what it is telling you."

eastbay

uuDevil
08-18-2004, 02:10 AM
[ QUOTE ]
You know how to compute it. What else do you know about it that makes you say that you "know exactly what it is telling you."

[/ QUOTE ]

I've been off so far, but I think what he is saying is that mean deviation is in some sense more natural or intuitive. The thing is, with every mathematical step you take, you are in danger of going beyond what is intuitive or natural.

Start with the counting numbers. Probably these seem natural and intuitive enough. Add zero. This may seem natural too, but actually the idea didn't appear until maybe a few thousand years ago. Add negative numbers. Remember having trouble with these? Probably they now seem natural too. But then take the square root of -1. Does an imaginary number feel natural? Well, probably yes, if you are an electrical engineer. But sooner or later, you have to give up on intuition. Even for the most brilliant physicists, the tortured mathematical descriptions of reality they are forced to resort to in their grand unification schemes must seem far from intuitive.

Maybe we can come up with a description of standard deviation that would make it feel natural. But maybe we just have to let it be unintuitive until we get used to it.

nicky g
08-18-2004, 06:48 AM
"You know how to compute it. What else do you know about it that makes you say that you "know exactly what it is telling you."

I don't understand why you are doing what you do to compute it.

Griffin
08-19-2004, 07:14 PM
[ QUOTE ]
"You know how to compute it. What else do you know about it that makes you say that you "know exactly what it is telling you."

I don't understand why you are doing what you do to compute it.

[/ QUOTE ]

I'll give it a shot...

First, we have to know what we are looking for before we can come up with a calculation for it. With something like the mean (average) we know we want ONE number that best represents the whole distribution of scores. The mean, or average score, is the best single representative of the distribution if the sample is large and the scores are normally distributed.

When we ask for the standard deviation, we are asking for ONE number that represents "on average, how much does each score in this distribution deviate from the mean of the distribution?"

So...we want a number that tells us the average deviation from the mean. Well, if the formula for the mean is equal to the sum of scores divided by sample size M = (SUM X / N) then the formula for the average (standard) deviation would intuitively be the sum of the deviations (from the mean) divided by the sample size, SD = (SUM dev / N).

Well, that doesn't work because the sum of the deviations is zero. So we square the deviations to get rid of the negatives (SD = SUM (dev sqrd) / N).

But that formula doesn't work because squaring the deviations will produce a standard deviation that can be larger than any of the individual deviations that produced it. It doesn't make sense to say the average deviation from the mean is 6.4 when each score only deviated from the mean about 1 to 3 points. That is why we take the square root, which produces the final, correct, formula for standard deviation.

SD = SQRT of (SUM (dev sqrd) / N)

In short, you are calculating an average so you use the same basic formula for doing so....sum the scores and divide by the number of scores. But because we are calculating an average deviation from the mean, we have to manipulate the numbers a couple of times in order to get around a characteristic of those deviations (that they sum to zero).
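The three attempts above can be followed step by step in a short Python sketch. The scores are made up for illustration and the variable names are mine; the last line cross-checks against the standard library's population standard deviation.

```python
import math
import statistics

scores = [1, 3, 5, 7, 9]  # made-up scores
n = len(scores)

mean = sum(scores) / n
deviations = [x - mean for x in scores]

# Attempt 1: average the raw deviations. They cancel, so this is always 0.
attempt1 = sum(deviations) / n

# Attempt 2: average the squared deviations (this is the variance).
# It can exceed every individual deviation, so it is on the wrong scale.
attempt2 = sum(d ** 2 for d in deviations) / n

# Attempt 3: take the square root to get back to the scale of the scores.
sd = math.sqrt(attempt2)

# For comparison: the mean (absolute) deviation discussed in the thread.
mean_abs_dev = sum(abs(d) for d in deviations) / n

print(attempt1, attempt2, sd, mean_abs_dev)
print(statistics.pstdev(scores))  # library population SD, should match sd
```

Here the largest deviation is 4 but the variance is 8, which is the scale problem the square root fixes. The SD and the mean deviation end up close but not equal.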

I have a feeling that you already know/knew all that, but I don't know how else to answer the question.

Griff

TITHEAD
08-20-2004, 12:18 AM
All I know is I studied it for 3 months and got 87% on the exam, and I still can't remember how the hell to do it in a poker game. I think it helps to know the principles of frequency distributions for poker, because if you can understand all of that, standard deviation becomes much simpler.

nicky g
08-20-2004, 07:21 AM
Thanks for your explanation. One of my questions was why, when finding mean deviation you solve the negative problem by simply removing the negative signs, you don't just do that rather than squaring. Several people here have given some helpful answers to that. One of my friends who's a big maths boffin told me that it had something to do with Pythagoras's theorem and some other stuff I didn't really understand; anyone heard anything along those lines?

Anyway here's what I'm telling myself from now on: SD is a similar measure to MD but is calculated in a different way that gives it some more useful properties. Does that make sense?

demonx5
08-24-2004, 09:25 AM
I also took a stat course, in one ear out the other, but got a B+ overall... so maybe not as aloof as I thought I was during those classes. SD and SE are not easy concepts to grasp in a practical sense. You'd be better off reading a book on this subject; even a textbook might help through its examples and such.

I'm not sure how much it covers this, but I think "Getting the Best of It" (in the books section) has a section on probability in gambling; maybe go check that out.