Even though it is highly economically valuable to be able to tell to what extent you can trust a prediction, the modelling of uncertainty in ML pipelines remains an academic affair in my experience.
When I moved to industry and tried to bring an "uncertainty mindset" with me, I found that: (1) most DS/ML scientists use models that don't provide an easy way to estimate uncertainty intervals; (2) in the industry I was in (media), the people who make decisions, and who use model predictions as one input to their decision-making, are typically not very quantitative, so an uncertainty interval, rather than strengthening their process, would mostly confuse them: they want a "more or less" estimate, not a "more or less, plus something more and something less" estimate; (3) when services are customer-facing (see ride-sharing), providing an uncertainty interval ("your car will arrive in 9 to 15 minutes") anchors the customer to the lower estimate (they do this for the price of rides booked in advance, and they need to, but they are often way off).
So for many ML applications, an uncertainty interval that nobody internally or externally would base their decision upon is just a nuisance.
most DS/ML scientists use ML models that typically don't provide an easy way to estimate uncertainty intervals
Not a DS/ML scientist but a data engineer. The models I've used have been pretty much "slap it into XGBoost with k-fold CV, call it done" — an easy black box. Is there any model or approach you like for estimating uncertainty with similar ease?
I've seen uncertainty interval / quantile regression done using XGBoost, but it isn't out of the box. I've also been trying to learn some Bayesian modeling, but definitely don't feel handy enough to apply it to random problems needing quick answers at work.
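For what it's worth, quantile regression doesn't have to mean hand-rolling an XGBoost objective: scikit-learn's gradient boosting exposes a quantile loss directly. A minimal sketch on synthetic data (the data and the 5%/95% choice are just illustration; recent XGBoost versions have a similar `reg:quantileerror` objective, but the sklearn API is the one I'm sure of):

```python
# Quantile regression: fit one model per quantile to get an interval.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, size=500)  # noisy synthetic target

# The 5th and 95th percentile models together give a ~90% prediction interval.
lower = GradientBoostingRegressor(loss="quantile", alpha=0.05).fit(X, y)
upper = GradientBoostingRegressor(loss="quantile", alpha=0.95).fit(X, y)

X_new = np.array([[2.0], [5.0]])
lo, hi = lower.predict(X_new), upper.predict(X_new)
print(lo, hi)  # per-point lower and upper bounds
```

Caveat: the two models are fit independently, so quantiles can occasionally cross; in practice people sort or clip the bounds.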
One is bigger than the other, as far as I remember, which means the standard error of the prediction interval is bigger?
The question is: Why don’t people use these models? While Bayesian Neural Networks might be tricky to deploy & debug for some people, Gaussian Processes etc. are readily available in sklearn and other implementations.
My theory: most people do not learn these methods in their "Introduction to Machine Learning" classes. Or is it a lack of scalability in practice?
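To the "readily available in sklearn" point: a Gaussian Process really is a few lines there, and it gives you a predictive standard deviation for free. A minimal sketch on made-up data (the kernel choice is just a reasonable default, not a recommendation):

```python
# Gaussian Process regression with built-in uncertainty via return_std=True.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(100, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.2, size=100)

kernel = RBF() + WhiteKernel()  # WhiteKernel accounts for observation noise
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

X_new = np.array([[3.0], [20.0]])  # 20.0 is far outside the training range
mean, std = gp.predict(X_new, return_std=True)
# std is larger where the model has no data, which is exactly the point
```

The scalability objection is real, though: exact GP inference is O(n³) in the number of training points, which is why it rarely survives contact with industrial-sized datasets without sparse approximations.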
To be fair, I suspect lots of people do this, but for whatever reason nobody talks about it.
Only speaking from my own little perspective in bioinformatics, lack of scalability above all else, both for BNNs and GPs.
Sure, the library support could be better, but that was not the main hurdle, more a source of friction.
In traditional nonparametric statistics, uncertainty estimates are obtained by a process called bootstrapping. But there's a trade-off (there's no free lunch!): if you want to eschew strong distributional hypotheses, you need to pay for it with more data and more compute. The "more compute" typically involves fitting variants of the model in question to many resamples of the original dataset. In deep learning applications, where each fit of the model is extremely expensive, this is impractical.
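The idea above fits in a few lines: resample the training set with replacement, refit, and look at the spread of the predictions. A sketch with a deliberately cheap model (linear regression, synthetic data) to make the "many refits" loop affordable:

```python
# Bootstrap uncertainty: refit the model on many resamples of the data
# and take percentiles of the resulting predictions.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=(200, 1))
y = 2.0 * X[:, 0] + rng.normal(0, 1.0, size=200)

X_new = np.array([[5.0]])
preds = []
for _ in range(200):  # 200 refits: this is the "more compute" cost
    idx = rng.integers(0, len(X), size=len(X))  # resample with replacement
    model = LinearRegression().fit(X[idx], y[idx])
    preds.append(model.predict(X_new)[0])

lo, hi = np.percentile(preds, [2.5, 97.5])  # 95% interval for the prediction
```

Swap `LinearRegression` for a deep net and the loop becomes 200 full training runs, which is exactly why this is impractical there.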