sergiotapia
0y ago
The only reason they are fast is that the models they host are severely quantized, or so I've heard.
jacob019
0y ago
Huh. I heard a podcast with the founder talking about their custom hardware, but quantization would explain it.
christianqchung
0y ago
Quantization alone does not explain it. It's mostly custom hardware[0].
[0] https://groq.com/the-groq-lpu-explained/
zargon
0y ago
Why repeat this nonsense when it's so trivial to just check? The reason Groq is fast is that they employ absolutely ludicrous amounts of SRAM (which is 10 times faster than the fastest VRAM).
behnamoh
0y ago
They responded to my tweet last year and said they didn't quantize the models.
boroboro4
0y ago
It's very hard to find right now, but I'm sure they said they don't quantize the KV cache, though their weights are in fp8.