ashvardanian
2y ago
Nope. Moreover, simulating it even with AVX-512 is quite an experience. Been postponing it for 2 years now... But first of all, you need to choose the version of float8 you want to implement, as the standards differ between GPU vendors.
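To illustrate why the choice of variant matters, here is a minimal scalar sketch (not the AVX-512 version discussed above) that decodes the same byte under two common layouts. It assumes E5M2 with bias 15 and IEEE-style infinities/NaNs, and E4M3 with bias 7, no infinities, and NaN only when all exponent and mantissa bits are set. Other vendor variants use different biases and special values, so treat the parameters as assumptions. The function names are made up for this example.

```python
import math

def decode_fp8(byte, exp_bits, man_bits, bias, ieee_specials):
    """Decode one float8 byte into a Python float (illustrative helper)."""
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    max_exp = (1 << exp_bits) - 1
    max_man = (1 << man_bits) - 1
    if ieee_specials and exp == max_exp:
        # IEEE-style: top exponent is reserved for inf (man == 0) and NaN.
        return sign * math.inf if man == 0 else math.nan
    if not ieee_specials and exp == max_exp and man == max_man:
        # Assumed E4M3 convention: a single NaN pattern per sign, no inf.
        return math.nan
    if exp == 0:
        # Subnormal: no implicit leading one.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

def decode_e5m2(byte):
    return decode_fp8(byte, exp_bits=5, man_bits=2, bias=15, ieee_specials=True)

def decode_e4m3(byte):
    return decode_fp8(byte, exp_bits=4, man_bits=3, bias=7, ieee_specials=False)

# The same bit pattern means different things in each layout.
print(decode_e5m2(0x3C), decode_e4m3(0x3C))
```

Under these assumptions, the largest finite values also differ: 0x7B in E5M2 versus 0x7E in E4M3.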
janwas
2y ago
We use it in gemma.cpp [1]. This hybrid of E5M2 and E4M3 decodes to bf16 in ~14 instructions, so we can do that on the fly during dot products.
[1]: github.com/google/gemma.cpp
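A rough sketch of the "decode on the fly during dot products" idea, in plain Python. This is not gemma.cpp's hybrid format or its bf16 decoder. It assumes plain E5M2 weights and relies on the fact that an E5M2 byte is the upper byte of an IEEE half-precision value, so padding with a zero low byte yields a valid fp16. The function names are hypothetical.

```python
import struct

def e5m2_to_float(byte):
    """Widen an E5M2 byte to a float by treating it as the high byte of fp16."""
    # Little-endian half: low byte is zero, high byte is the fp8 value.
    return struct.unpack("<e", bytes([0, byte]))[0]

def dot_fp8_weights(weights, activations):
    """Dot product with weights kept as fp8 bytes and decoded per element."""
    if len(weights) != len(activations):
        raise ValueError("length mismatch")
    total = 0.0
    for w_byte, a in zip(weights, activations):
        # Decode just before the multiply; no full-precision copy is stored.
        total += e5m2_to_float(w_byte) * a
    return total

weights = bytes([0x3C, 0x40, 0xBC])   # 1.0, 2.0, -1.0 in E5M2
activations = [0.5, 0.25, 2.0]
print(dot_fp8_weights(weights, activations))
```

The weights stay compressed in memory and are widened only inside the loop, which is the trade the comment describes: a few extra instructions per element in exchange for less memory traffic.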
danielhanchen
2y ago
Congratulations on gemma.cpp!!