mitthrowaway2 1y ago

The article's presentation of the James-Stein estimator sets the arbitrary point at the origin. (My previous comments should be read in this context.) Of course, we could set it anywhere, including [42,...]. Let's call it p. Regardless of where you set it, the estimator suggests that your best estimate û of the mean μ should be nudged a little away from x and towards p.

My point is that the choice of 'p' (or, in the article's presentation, the choice of origin) cannot truly be arbitrary because if it reduces the expected squared difference between μ and û, then it necessarily contains information about μ. If all you truly know about μ is x and σ, then you will have no way to guess in which direction you should even shift your estimate û to reduce that error.

If you do have some additional information about μ, beyond just x alone, then sure, take advantage of it! But then don't call it a paradox.

kgwgk 1y ago

(I cannot speak for the original article. I’ve not put in the effort to fully understand it, so I won’t categorically say it’s wrong, but it didn’t seem right to me.)

The “paradox” is that it can truly be arbitrary! Pick a random point. Shrink your least-squares estimator. You got yourself a “better” estimator - without having any additional information.

That’s why the “Inadmissibility of the Usual Estimator for the Mean of a Multivariate Normal Distribution” paper had the impact that it had.

mitthrowaway2 (OP) 1y ago

Then you'll have to clarify what you mean by "random" when you say "pick a random point".

Unless you mean that every point on a spherical surface centered on x would have a lower expected squared error than x itself?

kgwgk 1y ago

We may be talking about different things.

Let's say that you have a standard multivariate normal with unknown mean mu = [a, b, c].

The usual maximum-likelihood estimator of the unknown mean when you get an observation is to take the observed value as estimate. If you observe [x, y, z] the "naive" estimator gives you the estimate mû = [x, y, z].

For any arbitrary point [p, q, r] you can define another estimator. If you observe [x, y, z] this "shrinkage" estimator gives you an estimate which is no longer precisely at [x, y, z] but is displaced in the direction of [p, q, r]. For simplicity let's say the resulting estimate is mû' = [x', y', z'].

Whatever choice you make for [p, q, r], the "shrinkage" estimator has lower mean squared error than the "naive" estimator. The expected value of (x'-a)²+(y'-b)²+(z'-c)² is lower than the expected value of (x-a)²+(y-b)²+(z-c)².
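The claim above can be checked numerically. What follows is a minimal Monte Carlo sketch, not anything taken from the thread or the article. The thread never writes out the shrinkage formula, so the code assumes the standard James-Stein form with unit variance, û = p + (1 - (d-2)/‖x-p‖²)(x-p). The function names, the true mean, the two choices of p, the trial count, and the seed are all hypothetical choices made for illustration.

```python
import random


def james_stein(x, p, sigma=1.0):
    """Shrink the observation x toward the fixed point p.

    Assumed form (standard James-Stein, not stated in the thread):
        p + (1 - (d - 2) * sigma^2 / ||x - p||^2) * (x - p)
    """
    d = len(x)
    diff = [xi - pi for xi, pi in zip(x, p)]
    r2 = sum(v * v for v in diff)  # squared distance from x to p (zero with probability 0)
    factor = 1.0 - (d - 2) * sigma ** 2 / r2
    return [pi + factor * v for pi, v in zip(p, diff)]


def squared_error(estimate, mu):
    """Squared Euclidean distance between an estimate and the true mean."""
    return sum((e - m) ** 2 for e, m in zip(estimate, mu))


def compare_mse(mu, p, trials=20000, seed=0):
    """Return (naive_mse, shrinkage_mse), both estimated from the same draws."""
    rng = random.Random(seed)
    naive_total = 0.0
    shrink_total = 0.0
    for _ in range(trials):
        # One observation from a unit-variance normal centered on mu.
        x = [rng.gauss(m, 1.0) for m in mu]
        naive_total += squared_error(x, mu)
        shrink_total += squared_error(james_stein(x, p), mu)
    return naive_total / trials, shrink_total / trials


if __name__ == "__main__":
    mu = [1.0, -2.0, 0.5]  # hypothetical true mean [a, b, c]
    for p in ([0.0, 0.0, 0.0], [2.0, 2.0, 2.0]):  # two arbitrary targets [p, q, r]
        naive, shrunk = compare_mse(mu, p)
        print(p, round(naive, 3), round(shrunk, 3))
```

The naive mean squared error should come out near 3, one unit of variance per coordinate, with the shrinkage figure below it for both targets. The gap narrows as p moves away from the true mean, so for a distant target such as [42, 42, 42] the improvement is still positive in theory but too small to resolve at this number of trials.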
