I found these notes very useful. They also contain a nice summary of how LLMs/transformers work. It doesn't help that people keep taking a concept that has been around for decades (kernel smoothing) and giving it a fancy new name (attention).
http://bactra.org/notebooks/nn-attention-and-transformers.ht...
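To make the correspondence concrete, here is a small NumPy sketch (my own toy example, not taken from the notes; names and shapes are made up): single-head dot-product attention for one query is a Nadaraya-Watson kernel smoother over the values, with the exponential kernel K(q, k_j) = exp(q·k_j / sqrt(d)).

    import numpy as np

    rng = np.random.default_rng(0)
    d, n = 8, 5                      # key/query dimension, number of tokens
    q = rng.normal(size=d)           # one query
    K = rng.normal(size=(n, d))      # keys
    V = rng.normal(size=(n, d))      # values

    # Attention as usually written: softmax(q K^T / sqrt(d)) V
    scores = K @ q / np.sqrt(d)
    attn = np.exp(scores - scores.max())
    attn /= attn.sum()
    out_attention = attn @ V

    # The same computation written as kernel smoothing: a weighted average
    # of the value vectors, with weights from an exponential kernel
    # normalized by its total mass (Nadaraya-Watson).
    def kernel(q, k):
        return np.exp(q @ k / np.sqrt(d))

    w = np.array([kernel(q, k_j) for k_j in K])
    out_smoother = (w[:, None] * V).sum(axis=0) / w.sum()

    print(np.allclose(out_attention, out_smoother))  # True

The only real differences from classical kernel smoothing are that the "kernel" acts on learned projections of the inputs (queries and keys) rather than raw covariates, and that the bandwidth is fixed at sqrt(d).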