TinyChat: Large Language Model on the Edge
(hanlab.mit.edu)
2 points
enduku
2y ago
1 comment
enduku
OP
2y ago
TinyChat is an efficient, lightweight, Python-native serving framework for 4-bit LLMs quantized with AWQ. It delivers a 2.3x generation speedup on an RTX 4090.
Code:
https://github.com/mit-han-lab/llm-awq/tree/main/tinychat