Better HN
0 points
cjbprime
1y ago
0 comments
Wouldn't expect that to work at all.
hedgehog
1y ago
Ollama (which wraps llama.cpp) supports splitting a model across devices so you get some acceleration even on models too big to fit entirely in GPU memory.
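The split described above is controlled by how many transformer layers are offloaded to the GPU. As a minimal sketch (assuming Ollama's `num_gpu` Modelfile parameter, which caps GPU-resident layers; the layer count here is illustrative):

```
# Ollama Modelfile sketch: offload 20 layers to GPU memory,
# run the remaining layers on the CPU.
FROM llama3
PARAMETER num_gpu 20
```

The equivalent knob when invoking llama.cpp directly is the `-ngl` / `--n-gpu-layers` flag; lowering it trades speed for a smaller GPU memory footprint, which is what makes models larger than VRAM still partially accelerable.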