Not as much as it took GPT to process all its input.
>Let us consider the GPT-3 model with P = 175 billion parameters as an example. This model was trained on T = 300 billion tokens. On n = 1024 A100 GPUs using batch-size 1536, we achieve X = 140 teraFLOP/s per GPU. As a result, the time required to train this model is 34 days.
https://arxiv.org/pdf/2104.04473.pdf
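
For what it's worth, you can roughly reproduce that 34-day figure from the quoted numbers. A minimal sketch, assuming the paper's approximation that total training compute is about 8·T·P FLOPs (the factor 8 rather than 6 accounts for activation recomputation), divided by the aggregate throughput n·X:

    # back-of-the-envelope check of the 34-day figure quoted above
    P = 175e9    # parameters
    T = 300e9    # training tokens
    n = 1024     # A100 GPUs
    X = 140e12   # achieved FLOP/s per GPU

    total_flops = 8 * T * P              # ~4.2e23 FLOPs (paper's approximation)
    seconds = total_flops / (n * X)      # divide by aggregate throughput
    print(f"total FLOPs: {total_flops:.2e}")
    print(f"training time: {seconds / 86400:.1f} days")   # ~34 days

So the training run works out to a few times 10^23 FLOPs, which is the figure the brain comparison would have to be measured against.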
I'm not sure expressing brain capacity in FLOPs makes much sense, but if it can be expressed that way, I'm fairly sure the FLOPs a normal human spends on learning come to less than that.