Probabilistic Machine Learning: An Introduction (opens in new tab)

(probml.github.io)

310 pointsjoaorico5y ago56 comments

56 comments

The new edition has been split in two parts. The pdf draft (921 pages) and python code [1] of the first part are now available. The table of contents of the second part is here [2].

From the preface:

"By Spring 2020, my draft of the second edition had swollen to about 1600 pages, and I was still not done. At this point, 3 major events happened. First, the COVID-19 pandemic struck, so I decided to “pivot” so I could spend most of my time on COVID-19 modeling. Second, MIT Press told me they could not publish a 1600 page book, and that I would need to split it into two volumes. Third, I decided to recruit several colleagues to help me finish the last ∼ 15% of “missing content”. (See acknowledgements below.)

The result is two new books, “Probabilistic Machine Learning: An Introduction”, which you are currently reading, and “Probabilistic Machine Learning: Advanced Topics”, which is the sequel to this book [Mur22].

Together these two books attempt to present a fairly broad coverage of the field of ML c. 2020, using the same unifying lens of probabilistic modeling and Bayesian decision theory that I used in the first book. Most of the content from the first book has been reused, but it is now split fairly evenly between the two new books. In addition, each book has lots of new material, covering some topics from deep learning, but also advances in other parts of the field, such as generative models, variational inference and reinforcement learning. To make the book more self-contained and useful for students, I have also added some more background content, on topics such as optimization and linear algebra, that was omitted from the first book due to lack of space.

Another major change is that nearly all of the software now uses Python instead of Matlab."

[1] https://github.com/probml/pyprobml

[2] https://probml.github.io/pml-book/book2.html

jbay8085y ago

It's very encouraging to see Matlab losing ground in the educational space. I don't know why so many engineers let their foundational skills to be locked behind a proprietary ecosystem like that.

SideQuark5y ago

>I don't know why so many engineers let their foundational skills to be locked behind a proprietary ecosystem like that.

Because no open source toolkit can do what Matlab can do.

The same is true of a lot of high end software: Photoshop, pretty much any serious parametric CAD modeling system (say, SolidWorks), DaVinci Resolve, Ableton Live, etc. When a professional costs $100K+ to employ, paying a few grand to make them vastly more productive is a no brainer. If open source truly offered a replacement, then these costly programs would die. But there just isn't anything close for most work.

Matlab is used for massive amounts of precise numerical engineering design, modeling, and running systems. So while Python is good for some tasks, for the places Matlab shines Python is no where near usable. And before Python catches up in this space, I'd expect Julia to get there faster.

amitport5y ago

As someone who helped migrate a university course from Matlab to python I must say proprietary features of Matlab had nothing to do with the reason it lasted so long.

Basically, it was mainly inhertia. Older professors that liked it and rarly used anything else and the fact that generally no one gets rewarded for actually rewriting parts of an existing functioning course.

As an instructor you basically create more work for yourself in the first time you migrate a course's programming language. (And you also annoy some senior staff when forcing them to learn new things)

1 more reply

jbay8085y ago

I guess I'll rephrase - if you can't understand a transfer function or a probability distribution without opening Matlab, then you've allowed your own expertise to be held hostage. Unfortunately, I know a large number of professionals for whom this is true.

If you're more productive in Matlab, that's fine. But if you're at a loss without it, that's not.

It doesn't belong in the education system or in educational books.

1 more reply

abhgh5y ago

In the specific case of ML courses - many of which I have TA-ed or attended classes of, this reason does not ring true at all. Libraries for most standard algorithms are available in some form with a Python interface (or for the more statistical stuff: R). Its almost always the inertia from the initial design of the course.

It is also not true today that not knowing Matlab harms your industry productivity in ML. This might have been true around a decade ago, but most teams outside academia also have moved to non-Matlab resources. And if anything, this has been further reinforced by Deep Learning libraries, the current crop of MLOps tools and cloud-based frameworks.

Matlab might be good for specific areas, but ML has not been a stronghold for a while. It is also important to remember that in the context of numerical accuracy or computation speed, Python is almost always the user-facing layer. You might (correctly) argue that the Python language is slower/faster than X, but this is not a useful metric for comparing libraries and frameworks, where the compute heavy code is probably in C/C++: numpy, tensorflow, pytorch are good examples of this.

Nowado5y ago

Professional costs $100k+ to employ partially because only those able to afford those tools for training get into the field.

1 more reply

scottlocklin5y ago

It's a regression as far as code readability goes for fairly straightforward reasons: almost everything in Matlab is a matrix. Matrices are not first class citizens in Python, and it matters. I use Python a hell of a lot more than Matlab, but for examining how an algorithm works (say, for implementing in another language or modifying it to do tricks), Matlab wins. Go look at these PRML collections in Python and Matlab and see if you disagree:

https://github.com/ctgk/PRML

https://github.com/PRML/PRMLT

svantana5y ago

I used to feel the same, but three years after making the switch, I've changed my mind. Matlab code has brevity, but sometimes at the expense of clarity. For example, sum(x,axis=1) is more clear than sum(x,1). Especially when matlab has functions like diff() where the second argument is not axis.

Broadcasting in python is a lot more clean than the "bsxfun(@plus, ...)" abomination in matlab. If you think all the "np." is too wordy then just do "from numpy import *". For matrix multiplication you can use "@". Numpy code can be dense but most people choose clarity over brevity.

1 more reply

srvmshr5y ago

The only thing I find nice in what Mathworks offers nowadays is their caps & T-shirts at conferences. MATLAB is on Medicare in deep learning times.

JHonaker5y ago

This is probably my favorite introductory machine learning book. The fact that he places almost everything in the language of graphical models is such a good common ground to build off.

This really sets you up to realize that there is (and should be) a lot more to doing a good job in machine learning than simply minimizing an objective function. The answers you get depend on the model you create as do the questions you can hope to answer.

I don't see a clear list of differences between this new edition. Does anyone know what's new?

master_yoda_15y ago

Agree with you. But none of this is useful for practical (applied) machine learning. I don't want to disappoint you but you can read it as machine learning porn, but otherwise don't waste time on it.

JHonaker5y ago

I mean, as a graduate student, it was definitely incredibly useful. As a practicing data scientist, I’d have to say that it’s also incredibly useful.

I’ve used this stuff, and more often, the ideas taught, to break down a problem into a tackle-able set of pieces more times than I can count.

Never underestimate the fundamentals. Too many of my colleagues use models without actually understanding any of it. I’ve debugged so many problems by looking at the technical details in original papers and textbooks.

hnracer5y ago

Are you saying the book itself is ML porn?

master_yoda_15y ago

Yes, unless you are among 20 top researchers who are working on frontier of ml. Bayesian probabilistic techniques does not work or are very slow for any practical purpose.

3 more replies

iamcreasy5y ago

What is the advantage of placing everything in the language of graphical model? How does the other ML book do it?

JHonaker5y ago

Graphical models are just a way to encode relationships between different variables in a probabilistic model. Directed acyclic graphs (DAGs) allow you to specify (most of) the conditional independence structures that you can have between things like parameters and random variables.

This is really useful information because it can help you identify what information is truly relevant for the estimation of certain parameters (so sufficient statistics) or help you crystallize your understanding of the implications of the model you’ve created. In other words, it helps show you the ways in which your model says different aspects of your data should influence others.

This creates testable implications of the model. If your model says that two variables should be conditionally independent given a third, but they’re not, you have an avenue for refinement. You can also clearly identify your assumptions or the implications of your assumptions.

Another great thing about them is that exact inference for certain (most) structures is known to be computationally infeasible. There are a lot of different inference schemes available that can help you with different approximations with various drawbacks/advantages, heuristics that sort of work, or even ways of drawing samples from the true distribution if you can identify the structures. See belief propagation, loopy belief propagation, sequential Monte Carlo, and Markov chain Monte Carlo methods.

On top of this it helps you see everything in a general framework. Lots of the fundamental pieces of ML models are really just slight tweaks to other things. For instance, SVMs are linear models on kernel spaces with a specific structural prior. Same with splines; it’s just a different basis function. All of this helps you see the pieces of different methods that are actually identical. This helps you make connections and learn more effectively, in my opinion.

iamcreasy5y ago

Absolutely. Very strongly agree with the last paragraph, and that's how I aspire to learn in general. Can you point me to some resources(book or otherwise) that goes over all these relationship in a general framework?

1 more reply

abhinav225y ago

For anybody truly serious about this field, I recommend the below book. It has some poor reviews on Amazon, which I was shocked to see, but it is my favourite book and taught me the core of probability theory and statistics, in a way most books don’t. Your understanding of Machine Learning will be better than 90% of those out there, if you can get through the principles in this book.

I topped statistics at the most prestigious university in my country both at the undergrad and postgrad level, and had no problem discussing advanced concepts with Senior PHDs in Quantitative Fields, and I thank this book the most for beginning my journey on this. But, and this is important, make sure to do all the exercises!

https://www.amazon.com/John-Freunds-Mathematical-Statistics-...

harperlee5y ago

> it is my favourite book

Is it your favourite book because of how much your personal history is tied to it, and the time you deboted to it, or are you comparing it against other books based on an analytic review and comparison of several books that you did at some point?

Nothing wrong with the former case, I also have favourites that I recommend, but if it’s the latter the recommendation is more helpful; in that case it would be awesome to detail why this one over others.

This idea of compared review is useful here:

https://fivebooks.com/

And here:

https://www.lesswrong.com/posts/xg3hXCYQPJkwHyik2/the-best-t...

abhinav225y ago

This is a very good post and I agree with your comments.

The book for me was partly great because of its contents and partly because I worked through every problem and realized how much it taught me. I should do a more factual write up on why it’s a great book, I’ll try to when I get some time.

saeranv5y ago

Looking at the table of contents (for someone who is not familiar with the term 'Probabilistic Machine Learning'), is this just covering typical ML methods through the lens of probability?

master_yoda_15y ago

Answer is not so black and white as everything in ml has to use probability. You can ignore this unless you are among 20 top researchers who are working on frontier of ml. Bayesian probabilistic techniques does not work or are very slow for any practical purpose.

amkkma5y ago

But does it aid in understanding regular models, as they might have a bayesian interpretation?

master_yoda_15y ago

Yeh for sure, but it's an overkill. It's like reading quantum mechanics to understand Newtonian mechanics. If you want to get a feel of bayesian ml here is an easier book, "Regression and Other Stories" https://avehtari.github.io/ROS-Examples/

1 more reply

fithisux5y ago

Why ML books should be so big? In many cases the books are various sub books pasted as one or extensive bibliography reviews that just list progress without any pedagogy. I would suggest splitting them in 200-250 pp format in parts that can serve independently.

therobot245y ago

while book 1 looks great, it appears that book 2 is still very rough: https://probml.github.io/pml-book/book2.html

bite_tongue5y ago

Excited to check it out, this was a game changer for me. It turned me on to Gaussian Processes, which I think are a really fun tool.

dhairya5y ago

also recommend probabilistic methods for hackers as another resource to explore this space:

https://camdavidsonpilon.github.io/Probabilistic-Programming...

mark_l_watson5y ago

Thanks, that looks like a good resource for my learning style. I usually learn new things by playing around with code before diving into theory (I know that this sounds backwards).

master_yoda_15y ago

Yeh this is more practical and useful book.

oleg_myrk5y ago

The original 2012 book was awesome!

Would love to get my hands on the draft for "Probabilistic Machine Learning: Advanced Topics".

https://probml.github.io/pml-book/book2.html

singhrac5y ago

In my opinion (having read both books and TA’d courses using both), Murphy is a significantly better book than Bishops’s Machine Learning. I’m very excited about a sequel!

jaredtn5y ago

Quite excited to read this. Murphy does a great job of explaining concepts from first principles.

sjg0075y ago

I think it would be extremely helpful to map the math into code. Nobody has done this as far as I've seen. I mean you can find Github repositories for some papers but really a set of explicit tutorials from the math to the code would be really helpful.

To say something is "machine learning", I think means that you should show the code not just equations and derivations.

I mean if you only show math and derivations, what's the point? To show off what you know? How is that helpful?

joshvm5y ago

This exists actually, it's not complete yet (I think?) but it covers a lot of the material in the book:

https://github.com/probml/pyprobml

master_yoda_15y ago

Yeh at least he gives a full derivation for every equation :)

andrewnc5y ago

I helped with this project last year. It's very cool to see it taking shape

j / k navigate · click thread line to collapse

56 comments

joaoricoOP5y ago

The new edition has been split in two parts. The pdf draft (921 pages) and python code [1] of the first part are now available. The table of contents of the second part is here [2].

From the preface:

Another major change is that nearly all of the software now uses Python instead of Matlab."

[1] https://github.com/probml/pyprobml

[2] https://probml.github.io/pml-book/book2.html

jbay8085y ago

It's very encouraging to see Matlab losing ground in the educational space. I don't know why so many engineers let their foundational skills to be locked behind a proprietary ecosystem like that.

SideQuark5y ago

>I don't know why so many engineers let their foundational skills to be locked behind a proprietary ecosystem like that.

Because no open source toolkit can do what Matlab can do.

amitport5y ago

As someone who helped migrate a university course from Matlab to python I must say proprietary features of Matlab had nothing to do with the reason it lasted so long.

1 more reply

jbay8085y ago

If you're more productive in Matlab, that's fine. But if you're at a loss without it, that's not.

It doesn't belong in the education system or in educational books.

1 more reply

abhgh5y ago

Nowado5y ago

Professional costs $100k+ to employ partially because only those able to afford those tools for training get into the field.

1 more reply

scottlocklin5y ago

https://github.com/ctgk/PRML

https://github.com/PRML/PRMLT

svantana5y ago

1 more reply

srvmshr5y ago

The only thing I find nice in what Mathworks offers nowadays is their caps & T-shirts at conferences. MATLAB is on Medicare in deep learning times.

JHonaker5y ago

This is probably my favorite introductory machine learning book. The fact that he places almost everything in the language of graphical models is such a good common ground to build off.

I don't see a clear list of differences between this new edition. Does anyone know what's new?

master_yoda_15y ago

Agree with you. But none of this is useful for practical (applied) machine learning. I don't want to disappoint you but you can read it as machine learning porn, but otherwise don't waste time on it.

JHonaker5y ago

I mean, as a graduate student, it was definitely incredibly useful. As a practicing data scientist, I’d have to say that it’s also incredibly useful.

I’ve used this stuff, and more often, the ideas taught, to break down a problem into a tackle-able set of pieces more times than I can count.

hnracer5y ago

Are you saying the book itself is ML porn?

master_yoda_15y ago

Yes, unless you are among 20 top researchers who are working on frontier of ml. Bayesian probabilistic techniques does not work or are very slow for any practical purpose.

3 more replies

iamcreasy5y ago

What is the advantage of placing everything in the language of graphical model? How does the other ML book do it?

JHonaker5y ago

iamcreasy5y ago

1 more reply

abhinav225y ago

https://www.amazon.com/John-Freunds-Mathematical-Statistics-...

harperlee5y ago

> it is my favourite book

This idea of compared review is useful here:

https://fivebooks.com/

And here:

https://www.lesswrong.com/posts/xg3hXCYQPJkwHyik2/the-best-t...

abhinav225y ago

This is a very good post and I agree with your comments.

saeranv5y ago

Looking at the table of contents (for someone who is not familiar with the term 'Probabilistic Machine Learning'), is this just covering typical ML methods through the lens of probability?

master_yoda_15y ago

amkkma5y ago

But does it aid in understanding regular models, as they might have a bayesian interpretation?

master_yoda_15y ago

1 more reply

fithisux5y ago

therobot245y ago

while book 1 looks great, it appears that book 2 is still very rough: https://probml.github.io/pml-book/book2.html

bite_tongue5y ago

Excited to check it out, this was a game changer for me. It turned me on to Gaussian Processes, which I think are a really fun tool.

dhairya5y ago

also recommend probabilistic methods for hackers as another resource to explore this space:

https://camdavidsonpilon.github.io/Probabilistic-Programming...

mark_l_watson5y ago

Thanks, that looks like a good resource for my learning style. I usually learn new things by playing around with code before diving into theory (I know that this sounds backwards).

master_yoda_15y ago

Yeh this is more practical and useful book.

oleg_myrk5y ago

The original 2012 book was awesome!

Would love to get my hands on the draft for "Probabilistic Machine Learning: Advanced Topics".

https://probml.github.io/pml-book/book2.html

singhrac5y ago

In my opinion (having read both books and TA’d courses using both), Murphy is a significantly better book than Bishops’s Machine Learning. I’m very excited about a sequel!

jaredtn5y ago

Quite excited to read this. Murphy does a great job of explaining concepts from first principles.

sjg0075y ago

To say something is "machine learning", I think means that you should show the code not just equations and derivations.

I mean if you only show math and derivations, what's the point? To show off what you know? How is that helpful?

joshvm5y ago

This exists actually, it's not complete yet (I think?) but it covers a lot of the material in the book:

https://github.com/probml/pyprobml

master_yoda_15y ago

Yeh at least he gives a full derivation for every equation :)

andrewnc5y ago

I helped with this project last year. It's very cool to see it taking shape

j / k navigate · click thread line to collapse