Understanding Variance, Co-Variance, and Correlation (opens in new tab)

(countbayesie.com)

106 pointsCountBayesie11y ago18 comments

18 comments

This is a horrible explanation of variance. And it's missing WHY we need variance, or, what is the usefulness of variance vs. other measures like mean and range.

Say you want to buy a car and want to choose a brand and model based on user ratings of quality and value online.

Cars A, B, and C all have the same average rating - let's say 8 out of 10. How to choose? You need more information, but all you have are the ratings.

You could look at the range of ratings. This is the difference between the maximum rating and minimum rating. But what if only one or two people gave a car a bad (low) rating of 1 or 2, whereas another car had a lot of low ratings of 3 and 4, but no one rated it a 1 or 2. If you just look at the range, it might not be a good characterization of the ratings on the whole, because just one person (data point) can skew the information.

You want to look at the spread of the ratings - how consistent or variable the ratings are. A car with a lot of 7, 8, 9 ratings is better than a car with ratings all over the place, that happen to average the same (8). When you buy a car with an average rating of 8 out of 10, you expect a car that is an 8. You want to minimize the chance of getting a lemon.

This spread can be calculated by looking at the difference between each individual rating with the average rating. If you add up all these differences though, the negative differences with the mean would cancel out the positive differences with the mean. With variance, this difference is thus squared to make them all positive (or zero). And so on...

anonymousDan11y ago

But why take the square and not just the absolute value of the differences? Is the idea to emphasize outliers and hence give higher variance to skewed datasets?

yummyfajitas11y ago

The real idea is that you have an implicit model, specifically a normal distribution. The variance is one of the parameters of the normal distribution (the other being the mean).

A normal distribution is a good implicit model to choose - the central limit theorem and similar laws suggest that lots of other distributions will asymptotically approach it. But it's not always the right choice - e.g., it's a disaster when you have power law tails, or low frequency high amplitude noise.

1 more reply

edtechdev11y ago

Right, I stopped short because I reached the point at which I'd have to Google to double check anything.

Yeah, you've invented the 'mean absolute deviation' (or 'average absolute deviation'), which might be better than variance or standard deviation (the square root of variance) in some circumstances. It's been debated for 100 years: http://www.leeds.ac.uk/educol/documents/00003759.htm

Part of the reason for using variance might be like you said, to give more weight to outliers.

Part of the reason variance and standard deviation might be more popular is because usually the spread of a set of data has a normal distribution. And there are all these formulas and calculations that were invented before computers that are easier to do with variance and standard deviation apparently. Manipulating equations with absolute values is trickier.

There are also some mental shortcuts you can take if you know the standard deviation of a set of data. If a car is rated 8 on average, then about 95% of all of the ratings are within 2 standard deviations of the mean. Thus, if you want to buy a car rated 8 on average and want to be 95% sure that the particular car you buy is at least a 7, check that the standard deviation of the ratings is less than 0.5. Probably not a great example. Imagine instead you are buying oranges that are 8 out of 10 quality-wise on average, and you want to be confident that 95% of the oranges are at least a 7, so that you don't have to throw out too many. See https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rul...

I don't know, here are some other suggested reasons for using variance & standard deviation instead of absolute mean differences: https://stats.stackexchange.com/questions/118/why-square-the... https://www.quora.com/Why-do-we-square-instead-of-using-the-...

1 more reply

cafebeen11y ago

One interesting benefit of the square difference is that it's differentiable, while the absolute difference isn't, which can come in handy for some optimization problems.

agumonkey11y ago

> When you buy a car with an average rating of 8 out of 10, you expect a car that is an 8. You want to minimize the chance of getting a lemon.

I may break HN code, but I LOL'd. (I think I'll never forget that variance explanation now)

thetwiceler11y ago

To define variance as

    E[x^2] - E[x]^2

and not ever allude to the far more meaningful version,

    E[ (x - E[x])^2 ]

is just criminal. This is not at all a good explanation of variance, covariance, and correlation.

mturmon11y ago

I have to agree with you here.

One problem with introducing it the way the article does is that it's hard to see why the variance is never negative, and is zero exactly when the R.V. is constant.

This is a very important property, to say the least.

It would be better to say you measure the "energy" with

  E x^2

but that this is not immune to level shifts, so you need to subtract some constant off first. And it so happens that the optimal constant to subtract off is our friend E x.

Edited to add: The notion of introducing the ideas of a sample space and a random variable (in the technical sense), as is done in the article, and at the same time being shy about calculus, is rather contradictory. That is, the intersection of

  { people who want measure-theoretic probability concepts }

and

  { people who don't know calculus }

may be empty.

hessenwolf11y ago

It looks like an enthusiastic newbie with some clipart and an equation. Fair play for trying.

I would suggest adding the following.

1. What the poster above said.

2. The reason for the E[(x_{bar} - x_i)^2] choice. Why not E[|x_{bar} - x_i|]? Was it a mathematical convencience? Was it, perhaps, because Gauss had the integral of e_{t^2} from -Inf to plus Inf lying around in a letter from Laplace?

3. It is an equation with a square. Use a square somewhere.

4. The square root of the variance happens to be the horizontal distance between the mean and the point of inflection in the normal distribution. How cool is that?

1 more reply

krcz11y ago

Covariance - dot product. Variance - squared norm. Correlation - cosinus of the angle between vectors.

sukilot11y ago

This is why I love math. Rigorous definitions and pattern reuse, distilling concepts to their essential features.

cafebeen11y ago

Never seen covariance spelled with a hyphen...

hyperliner11y ago

I love how some people have the ability to explain in simple terms something that is not as simple. And I appreciate that they take the time to write it down.

Thank you OP.

it_learnses11y ago

so many spelling mistakes...

j / k navigate · click thread line to collapse

18 comments

edtechdev11y ago

This is a horrible explanation of variance. And it's missing WHY we need variance, or, what is the usefulness of variance vs. other measures like mean and range.

Say you want to buy a car and want to choose a brand and model based on user ratings of quality and value online.

Cars A, B, and C all have the same average rating - let's say 8 out of 10. How to choose? You need more information, but all you have are the ratings.

anonymousDan11y ago

But why take the square and not just the absolute value of the differences? Is the idea to emphasize outliers and hence give higher variance to skewed datasets?

yummyfajitas11y ago

The real idea is that you have an implicit model, specifically a normal distribution. The variance is one of the parameters of the normal distribution (the other being the mean).

1 more reply

edtechdev11y ago

Right, I stopped short because I reached the point at which I'd have to Google to double check anything.

Part of the reason for using variance might be like you said, to give more weight to outliers.

1 more reply

cafebeen11y ago

One interesting benefit of the square difference is that it's differentiable, while the absolute difference isn't, which can come in handy for some optimization problems.

agumonkey11y ago

> When you buy a car with an average rating of 8 out of 10, you expect a car that is an 8. You want to minimize the chance of getting a lemon.

I may break HN code, but I LOL'd. (I think I'll never forget that variance explanation now)

thetwiceler11y ago

To define variance as

    E[x^2] - E[x]^2

and not ever allude to the far more meaningful version,

    E[ (x - E[x])^2 ]

is just criminal. This is not at all a good explanation of variance, covariance, and correlation.

mturmon11y ago

I have to agree with you here.

One problem with introducing it the way the article does is that it's hard to see why the variance is never negative, and is zero exactly when the R.V. is constant.

This is a very important property, to say the least.

It would be better to say you measure the "energy" with

  E x^2

but that this is not immune to level shifts, so you need to subtract some constant off first. And it so happens that the optimal constant to subtract off is our friend E x.

  { people who want measure-theoretic probability concepts }

and

  { people who don't know calculus }

may be empty.

hessenwolf11y ago

It looks like an enthusiastic newbie with some clipart and an equation. Fair play for trying.

I would suggest adding the following.

1. What the poster above said.

3. It is an equation with a square. Use a square somewhere.

4. The square root of the variance happens to be the horizontal distance between the mean and the point of inflection in the normal distribution. How cool is that?

1 more reply

krcz11y ago

Covariance - dot product. Variance - squared norm. Correlation - cosinus of the angle between vectors.

sukilot11y ago

This is why I love math. Rigorous definitions and pattern reuse, distilling concepts to their essential features.

cafebeen11y ago

Never seen covariance spelled with a hyphen...

hyperliner11y ago

I love how some people have the ability to explain in simple terms something that is not as simple. And I appreciate that they take the time to write it down.

Thank you OP.

it_learnses11y ago

so many spelling mistakes...

j / k navigate · click thread line to collapse