Bayesian reasoning is a natural mode of inference for people. The whole concept of the black swan ("all swans are white") bears this out.
Frequentist statistics is much less intuitive to people.
My preference is for people to be able to use some statistics, and Bayesian gets them productive faster.
Bayesian statistics gets a big boost because it's usually taught as a system instead of as a recipe book.
The problem is that events we perceive as random and extremely unlikely are in fact much more probable than Gaussian methods would estimate. And the frequentist approach helps create this distortion by ignoring black swans.
Here's a great video demonstrating how people tend to misunderstand randomness: https://youtu.be/tP-Ipsat90c
As for what is more natural… I've seen a (frequentist) introduction to statistics, and it simply did not make sense. Nothing was justified, you just had to learn the stuff by rote and apply it in situations that look like they could use one tool or another.
Probability theory on the other hand is pretty obvious. The axioms required to derive it are ridiculously few and ridiculously intuitive. From there you get the sum and product rules, and all the rest. Always made perfect sense to me.
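For reference, the few rules the comment alludes to can be written out; everything else in probability theory is derived from them:

```latex
% Sum rule: probability of A or B
P(A \lor B) = P(A) + P(B) - P(A \land B)
% Product rule: joint probability via conditioning
P(A \land B) = P(A \mid B)\, P(B)
% Bayes' theorem follows directly from the symmetry of the product rule
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```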
On the subject of statistical education: The point I tried to make is that I think it is much easier to first study the likelihood, the central quantity of frequentist inference. One can then go to the Bayesian world simply by allowing the parameters to be random variables. Furthermore, as other commenters have pointed out, technical difficulties arise in the non-conjugate Bayesian setting when MCMC sampling has to be used. In my opinion, MCMC algorithms, convergence diagnostics, etc. are certainly not topics for an intro stats course.
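That progression can be sketched in a few lines (the data here are made up): maximize the binomial likelihood first, then let the parameter be a random variable with a conjugate Beta prior, and the same likelihood yields a posterior.

```python
from scipy import stats

# Frequentist step: binomial likelihood, maximized by the sample proportion.
k, n = 7, 10              # made-up data: 7 successes in 10 trials
mle = k / n               # argmax of the binomial likelihood

# Bayesian step: treat the parameter as a random variable with a Beta prior.
# Conjugacy: Beta(a, b) prior + binomial likelihood -> Beta(a + k, b + n - k).
a, b = 1, 1               # uniform prior
posterior = stats.beta(a + k, b + n - k)

print(f"MLE: {mle:.3f}")
print(f"Posterior mean: {posterior.mean():.3f}")
print(f"95% credible interval: {posterior.interval(0.95)}")
```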
Having used Bayesian stats heavily, I'd note that the hard parts are not gone, they are just located elsewhere - in how to actually do the computations, rather than in how to set up problems. Each can be taught poorly or well, but given that MCMC is certainly harder than least-squares, it seems difficult to argue that using Bayesian statistics is easier. (Unless you're not just applying the methods by rote, and letting the computer spit out answers - and if you are, I don't know why you are better off with Bayesian methods. In fact, if that's what you're doing, please stop doing statistics and pay an expert instead.)
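To make the "the hard parts moved into the computation" point concrete, here is a minimal random-walk Metropolis sampler for a coin's bias (a sketch with made-up data): even this toy case needs proposals, an accept/reject step, and burn-in, compared with the one-line conjugate answer it should agree with.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 7, 10  # made-up data: 7 heads in 10 tosses

def log_post(theta):
    """Log posterior under a uniform prior (up to a constant)."""
    if not 0.0 < theta < 1.0:
        return -np.inf
    return k * np.log(theta) + (n - k) * np.log(1.0 - theta)

# Random-walk Metropolis: propose a step, accept with probability min(1, ratio).
samples, theta = [], 0.5
for _ in range(20000):
    prop = theta + rng.normal(0.0, 0.1)
    if np.log(rng.uniform()) < log_post(prop) - log_post(theta):
        theta = prop
    samples.append(theta)
post_mean = np.mean(samples[2000:])  # discard burn-in

# The conjugate answer (mean of Beta(1 + k, 1 + n - k)) for comparison:
exact = (1 + k) / (2 + n)
print(f"MCMC mean: {post_mean:.3f}, exact: {exact:.3f}")
```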
Starting with applied probability and applied statistics (incl. regression, ANOVA, GLMs) allows you to solve problems and feel useful and engaged before being thrown into the mathematical rigor required of Bayesian statistics.
I just want to add a bit more. It's quite easy today to generate and play with random numbers. If you think you understand the process that generated your data, simulate it and run the simulated data through the same analysis. I do this for real -- I don't trust myself to choose the right statistical analysis, so I always test my chosen analysis with simulated data. If I can fool myself with simulated data, then my real data is probably fooling me too.
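That habit is easy to sketch. Assuming the chosen analysis is a two-sample t-test (an illustrative choice), simulating data under the null many times should flag significance at roughly the nominal 5% rate; if it doesn't, the analysis is fooling you.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulate the null hypothesis: both groups drawn from the same distribution.
n_sims, alpha, false_positives = 2000, 0.05, 0
for _ in range(n_sims):
    a = rng.normal(0, 1, size=30)
    b = rng.normal(0, 1, size=30)   # identical population: no true effect
    if stats.ttest_ind(a, b).pvalue < alpha:
        false_positives += 1

rate = false_positives / n_sims
print(f"False-positive rate: {rate:.3f} (nominal: {alpha})")
```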
Could we, for instance, collect enough data on typing discipline to settle the static/dynamic typing debate once and for all? Enough data to overcome the priors of both static typing and dynamic typing proponents?
We could, but that would require pretty big sample sizes. Like 10,000 developers of varying competence, working on 1,000 projects of various domains and difficulties for various amounts of time (from a few days to at least a few months). Who is ever going to fund that?
Until we get such a miracle controlled study, our respective priors will still matter.
Then we promptly switch back to p-values of .05, a lot of the time not even bothering with a statistical power calculation. I've had better success with introducing power, though. I suspect that's because we can fit it into the existing frequentist framework.
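Power fits into the same simulation habit, for what it's worth. A sketch (effect size and sample size are assumed for illustration; the textbook result is that about 64 per group gives roughly 80% power to detect a standardized effect of 0.5 at alpha = .05):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def simulated_power(effect_size, n_per_group, alpha=0.05, n_sims=2000):
    """Estimate two-sample t-test power by simulating under the alternative."""
    hits = 0
    for _ in range(n_sims):
        a = rng.normal(0.0, 1.0, size=n_per_group)
        b = rng.normal(effect_size, 1.0, size=n_per_group)  # true effect
        if stats.ttest_ind(a, b).pvalue < alpha:
            hits += 1
    return hits / n_sims

power = simulated_power(effect_size=0.5, n_per_group=64)
print(f"Estimated power: {power:.2f}")
```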
This drives me nuts. If you haven't, check out the paper "Beyond subjective and objective in statistics" by Gelman and Hennig (2017).
Right at the beginning they make the point that any analysis includes external information in many ways, such as adjusting variables for imbalance, how we deal with outliers, regularization, etc.
Especially if you're doing any sort of causal inference, you're usually making strong assumptions before estimating your model, even just in terms of which variables are included and how they're connected. The idea that priors are somehow ruining an "objective" model is just absurd to me. You're already making so many other decisions about your model that will affect estimates and your interpretation of them. Priors seem like another perfectly reasonable decision to have to make as well, with the benefit of producing results that I think are in general much more easily understood by a lay audience. (E.g., I don't think I've ever encountered someone not on my data science team who actually understands what a p-value is. But people are much better at understanding when I say there's an X percent chance that there is a positive effect here.)
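The "X percent chance of a positive effect" statement falls straight out of the posterior. A sketch with a conjugate normal model (the prior and data values are made up for illustration):

```python
from scipy import stats

# Made-up inputs: an observed effect estimate with its standard error,
# plus a weakly informative normal prior centered at zero.
estimate, se = 2.0, 1.0
prior_mean, prior_sd = 0.0, 10.0

# Normal-normal conjugate update: precisions add, means are precision-weighted.
prior_prec, data_prec = 1 / prior_sd**2, 1 / se**2
post_prec = prior_prec + data_prec
post_mean = (prior_prec * prior_mean + data_prec * estimate) / post_prec
post_sd = post_prec ** -0.5

# Probability that the effect is positive, read directly off the posterior.
p_positive = stats.norm.sf(0.0, loc=post_mean, scale=post_sd)
print(f"P(effect > 0) = {p_positive:.1%}")
```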
Another issue that I personally have with Bayesianism is that I believe that assigning probabilities to singular events is only meaningful and admissible at all if there is a good analytic explanation for the respective propensity. For example, we may be able to deduce that a die is reasonably fair from the way it is constructed and our knowledge of physics, and later confirm this by frequentist analysis. Merely believing or claiming that the die is fair is not acceptable. Again, the difference is only one of attitude in the end, I suppose.
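The "confirm this by frequentist analysis" step could look like a chi-square goodness-of-fit test on observed roll counts (the counts here are made up for illustration):

```python
from scipy import stats

# Made-up counts for faces 1-6 over 60 rolls of the die.
observed = [10, 9, 11, 10, 12, 8]

# Chi-square goodness-of-fit against a uniform (fair-die) expectation;
# scipy's default expected frequencies are the uniform mean of the counts.
result = stats.chisquare(observed)
print(f"chi2 = {result.statistic:.2f}, p = {result.pvalue:.3f}")
```

A large p-value here means the counts are consistent with fairness, not proof of it.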
Maybe philosophers have given Bayesian statistics a bad rap, too, because many of those who call themselves Bayesians are also "probabilists", i.e., they think that rational belief must conform to the probability calculus. There are many arguments against probabilism and the only arguments that speak for it are Dutch book arguments. The view does not have very strong foundations.
I think some caution is justified to a certain extent (not the blind "emotional" objections). When establishing priors in a low-data regime, one must necessarily be careful: the prior is a knob that can shift the conclusions of the inference considerably. That said, if we trust our beliefs about the regions the available data inform poorly, why not use that domain knowledge?
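Prior sensitivity in a low-data regime is easy to demonstrate. With only five observations (made-up data, and priors chosen purely for illustration), two reasonable-looking Beta priors pull the posterior mean noticeably apart:

```python
from scipy import stats

k, n = 4, 5  # made-up data: 4 successes in 5 trials

# Same likelihood, two different priors; conjugate update is
# Beta(a, b) -> Beta(a + k, b + n - k).
for name, (a, b) in {"flat Beta(1, 1)": (1, 1),
                     "skeptical Beta(10, 10)": (10, 10)}.items():
    posterior = stats.beta(a + k, b + n - k)
    print(f"{name}: posterior mean = {posterior.mean():.3f}")
```

With more data the two posteriors would converge, which is exactly why the knob matters most when data are scarce.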
How many coin tosses in a row have to land heads before a frequentist decides that the coin is unfair?
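For what it's worth, under a standard two-sided binomial test at alpha = .05 the answer can be computed directly (a sketch):

```python
from scipy import stats

# Smallest run of all-heads that a two-sided binomial test at
# alpha = .05 calls significant against a fair coin.
n = 1
while stats.binomtest(n, n, p=0.5).pvalue >= 0.05:
    n += 1
print(f"{n} heads in a row")  # for k = n the two-sided p-value is 2 * 0.5**n
```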
I think they are pretty close to the way sane people answer some kinds of questions.
https://onlinelibrary.wiley.com/doi/book/10.1002/97811192863...
Jaynes is certainly very deep and some sections are harder than others. It's interesting regardless of your level (this is a book worth rereading several times).
For a less technical but insight-rich introduction, see Dennis Lindley's Understanding Uncertainty:
https://onlinelibrary.wiley.com/doi/book/10.1002/97811186501...
[1]: https://www.amazon.com/Bayesian-Methods-Hackers-Probabilisti...
But really, the first two chapters aren't that hard.
My engineer friend called my PhD friend a "frequentist", like it was a dirty word, despite only having one, maybe two, classes in college about Bayesian math/statistics/whatever (my ignorance).
This quote jumped out at me in the article:
"I wanted to write a book on Bayesian statistics that really anyone could pick up and use to gain real intuitions for how to think statistically and solve real problems using statistics."
In the context of the statement, it sounds like he is claiming that any non-Bayesian statistics is useless, or at best less valuable/reliable than other forms of statistical analysis?