danielmarkbruce 3mo ago

One needs about 12 to 18 hours of linear algebra to work through the papers, not 12 to 18 months. The vast majority of stuff in AI/ML papers is just "we tried X and it worked!".


miki123211 3mo ago

You can understand 95+% of current LLM / neural network tech if you know what matrices are (on the "2d array" level, not the deeper lin alg intuition level), and if you know how to multiply them (and have an intuitive understanding why a matrix is a mapping between latent spaces and how a matrix can be treated as a list of vectors). Very basic matrix / tensor calculus comes in useful, but that's not really part of lin alg.
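To make the "2d array" view concrete, here is a minimal sketch in plain Python. The matrices `W` and `V`, the vector `x`, and the helper names `matvec` and `matmul` are made up for illustration and do not come from any particular model or library. It shows a 2x3 matrix acting as a mapping from a 3-dimensional space to a 2-dimensional one, with each output entry being the dot product of one row (one vector from the "list of vectors") with the input.

```python
# Hypothetical toy example: matrices stored as plain lists of rows.

def matvec(matrix, vector):
    """Multiply a matrix by a vector: one dot product per row."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def matmul(a, b):
    """Multiply two matrices: dot each row of a with each column of b."""
    columns_of_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_b]
            for row in a]

# A 2x3 matrix maps 3-dimensional vectors to 2-dimensional ones.
W = [[1, 0, 2],
     [0, 1, -1]]
x = [3, 4, 5]
projected = matvec(W, x)

# Chaining mappings is matrix multiplication: (2x3) times (3x2) gives 2x2.
V = [[1, 2],
     [3, 4],
     [5, 6]]
composed = matmul(W, V)

print(projected)  # [13, -1]
print(composed)   # [[11, 14], [-2, -2]]
```

In practice a library such as NumPy or PyTorch does this far more efficiently, but the arithmetic is the same.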

There are places where things like eigenvectors / eigenvalues or SVD come into play, but those are pretty rare and not part of modern architectures (tbh, I still don't really have a good intuition for them).

devmor 3mo ago

I was about to respond with a similar comment. The majority of the underlying systems are the same and can be understood if you know a decent amount of vector math. That last 3-5% can get pretty mystical, though.

Honestly, where stuff gets the most confusing to me is when the authors of the newer generations of AI papers invent new terms for existing concepts, and then new terms for combining two of those concepts, then new terms for combining two of those combined concepts and removing one... etc.

Some of this redefinition is definitely useful, but it turns into word salad very quickly and I don't often feel like teaching myself a new glossary just to understand a paper whose concepts I probably won't use.

buildbot 3mo ago

This happens so much! It’s actually imo much more important to be able to let the math go and compare concepts vs. the exact algorithms. It’s much more useful to have semantic intuition than concrete analysis.

Being really good at math does let you figure out if two techniques are mathematically the same but that’s fairly rare (it happens though!)

whimsicalism 3mo ago

> There are places where things like eigenvectors / eigenvalues or svd come into play, but those are pretty rare and not part of modern architectures (tbh, I still don't really have a good intuition for them)

This stuff is part of modern optimizers. You can often view a lot of optimizers as doing something similar to what is called mirror/'spectral descent.'

tomrod 3mo ago

Indeed. "Spectral" describes the collection of eigenvalues!

tomrod 3mo ago

Eigenvectors/eigenvalues: the directions a matrix leaves unchanged, and the amount it stretches a vector pointing along each of those directions.
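A small worked example may help with that intuition. This is only a sketch with a hand-picked 2x2 matrix `A` (an assumption chosen so the numbers come out clean) and a hypothetical `matvec` helper. Multiplying `A` by one of its eigenvectors only rescales it by the eigenvalue, while a generic vector also changes direction.

```python
# Hypothetical toy example: A is chosen by hand so its eigenvectors are easy to see.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

A = [[2, 1],
     [1, 2]]

v1 = [1, 1]    # eigenvector of A with eigenvalue 3
v2 = [1, -1]   # eigenvector of A with eigenvalue 1
u = [1, 0]     # not an eigenvector of A

stretched_v1 = matvec(A, v1)  # same direction as v1, three times as long
stretched_v2 = matvec(A, v2)  # same direction and length as v2
turned_u = matvec(A, u)       # direction changes, so u is not an eigenvector

print(stretched_v1)  # [3, 3]
print(stretched_v2)  # [1, -1]
print(turned_u)      # [2, 1]
```

The set of eigenvalues, here {3, 1}, is the "spectrum" mentioned in the comment above.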

cultofmetatron 3mo ago

for anyone looking to get into it, mathacademy has a full zero-to-everything-you-need pathway that you can follow to mastery

https://mathacademy.com/courses/mathematics-for-machine-lear...

DenisM 3mo ago

There is no mention of llm there?

cultofmetatron 3mo ago

if you want to use llms, just download one and play with it. if you want to understand llms enough to push research forward, learn the underlying math


gpjt 3mo ago

OP here -- agreed! I tried to summarise (at least to my current level of knowledge) those 12-18 hours here: https://www.gilesthomas.com/2025/09/maths-for-llms
