ls612
10mo ago
This is a dumb question, I know, but how expensive is model distillation? How much training hardware do you need to take something like this and create 7B and 12B versions for consumer hardware?
johnb231
10mo ago
The process involves running the original model. You can rent these big GPUs for ~$10 per hour each, so that is ~$160 per hour (which works out to about 16 GPUs) for as long as it takes.
qeternity
10mo ago
You can rent H100s for $1.50/gpu/hr these days.
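
To make the arithmetic in this thread concrete, here is a rough sketch comparing the two quoted rates. The 16-GPU count is an assumption inferred from ~$160/hr divided by ~$10/GPU/hr, and the 100-hour run length is a purely hypothetical placeholder, since nobody in the thread says how long distillation actually takes.

```python
def rental_cost(num_gpus: int, rate_per_gpu_hour: float, hours: float) -> float:
    """Total rental cost in dollars for a fixed-size GPU cluster."""
    return num_gpus * rate_per_gpu_hour * hours


NUM_GPUS = 16   # assumption: inferred from ~$160/hr at ~$10/GPU/hr
HOURS = 100.0   # hypothetical run length, not a figure from the thread

# Compare the two per-GPU hourly rates quoted in the comments.
for label, rate in [("~$10/GPU/hr", 10.0), ("$1.50/GPU/hr", 1.50)]:
    hourly = rental_cost(NUM_GPUS, rate, 1.0)
    total = rental_cost(NUM_GPUS, rate, HOURS)
    print(f"{label}: ${hourly:,.2f}/hr, ${total:,.2f} for {HOURS:g} hours")
```

Under the same 16-GPU assumption, the lower rate brings the hourly bill from about $160 down to about $24; the total still scales linearly with however many hours the run needs.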