“This allows runtime-efficient balancing of visual-fidelity and textual-alignment with a single 100KB trained model, which is five orders of magnitude smaller than the current state of the art.”
https://arxiv.org/abs/2305.01644
It's one of those sentences that if you know what it means, you know what it means. That said, the title needs the word "personalization" inserted before the word model, e.g.:
Nvidia intros 100kb text-to-image personalization model called Perfusion
No comments yet.