As a concrete example, on a camera you might want to run a facial detector so the camera can automatically adjust its focus when it sees a human face. Or you might want a person detector that can detect the outline of the person in the shot, so that you can blur/change their background in something like a Zoom call. All of these applications are going to work better if you can run your model at, say, 60 Hz instead of 20 Hz. Optimizing hardware to do inference tasks like this as fast as possible with the least possible power usage is pretty different from optimizing for all the things a GPU needs to do, so you might end up with hardware that has both and uses them for different tasks.
When I learned and used gradient descent, you had to analytically determine your own gradients (https://web.archive.org/web/20161028022707/https://genomics....). I went to grad school to learn how to determine my own gradients. Unfortunately, in my realm, loss landscapes have multiple minima, and gradient descent just gets trapped in local minima.
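To make the local-minima point concrete, here's a minimal sketch (the function and step size are illustrative assumptions, not from the linked material): gradient descent with a hand-derived analytic gradient on a function with two minima, where the starting point decides which minimum you end up in.

```python
# Sketch: gradient descent with an analytic gradient on a
# function with two minima. Which minimum you reach depends
# entirely on where you start.

def f(x):
    # f(x) = x^4 - 3x^2 + x has two local minima
    return x**4 - 3*x**2 + x

def grad_f(x):
    # hand-derived (analytic) gradient: f'(x) = 4x^3 - 6x + 1
    return 4*x**3 - 6*x + 1

def gradient_descent(x0, lr=0.01, steps=1000):
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

# Different starting points converge to different minima:
left = gradient_descent(-2.0)   # reaches the deeper (global) minimum
right = gradient_descent(2.0)   # trapped in the shallower local minimum
```

Starting on the right, the method never sees the better minimum on the left; that's the trap, and it's why people reach for restarts, momentum, or global methods when the loss landscape is multimodal.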
If you don't mind skipping the programming parts, it's got a lot of beginner/intermediate concepts clearly explained. If you do dive into the programming examples, you get to play around with a few architectures and ideas, and you're left ready to step into the more advanced material knowing what you're doing.
I'm not sure when or why this started.
The model is literally "inferring" something about its inputs: e.g., these pixels denote a hot dog, those don't.
Training is learning the weights (millions or billions of parameters) that control the model's behavior; inference is "running" the trained model on user data.
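The training/inference split can be sketched on the smallest possible model, a line fit (the toy data and variable names here are my own illustrative assumptions, not anything from the comment):

```python
# Sketch: training vs. inference on a tiny linear model.
# Toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# --- Training: learn the weights (here just w and b) ---
w, b = 0.0, 0.0
lr = 0.05
for _ in range(2000):
    # gradients of mean squared error w.r.t. w and b
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad_w
    b -= lr * grad_b

# --- Inference: run the frozen weights on new data ---
def predict(x):
    return w * x + b
```

The expensive loop is training; inference is just the cheap forward pass `predict`, which is why the two get optimized (and priced) so differently at scale.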