Yes. I discussed this issue with the author of the ComfyUI IP-Adapter nodes. It would doubtless be handy if someone could end-to-end train a higher-resolution IP-Adapter that integrates its own CLIPVision variant, one not subject to the 224px input constraint. I have no idea what kind of horsepower that would require.
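For context, here's a minimal sketch of where the 224px ceiling comes from, assuming the Hugging Face `transformers` CLIP classes (the checkpoint name is just an example; different IP-Adapter variants pair with different CLIP vision encoders):

```python
import torch
from PIL import Image
from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection

# Example checkpoint, used here only to illustrate the preprocessing step.
processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-large-patch14")
model = CLIPVisionModelWithProjection.from_pretrained("openai/clip-vit-large-patch14")

image = Image.new("RGB", (1024, 1024))  # stand-in for a high-res reference image

# The processor resizes/crops every input to the training resolution,
# so any detail above 224px is discarded before the encoder ever sees it.
inputs = processor(images=image, return_tensors="pt")
print(inputs["pixel_values"].shape)  # torch.Size([1, 3, 224, 224])

with torch.no_grad():
    embeds = model(**inputs).image_embeds  # the embedding IP-Adapter conditions on
```

The deeper issue is that the ViT's learned position embeddings are tied to a fixed patch grid (16×16 patches of 14px for ViT-L/14 at 224px), so you can't simply feed larger images; you'd have to interpolate the position embeddings and retrain, which is why lifting the limit amounts to the end-to-end training effort described above.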
A latent-space CLIPVision model would be cool too. Presumably you could leverage the semantic richness of the latent space to train a more powerful CLIPVision more efficiently, since the VAE has already compressed away most of the pixel-level redundancy. I don't know whether anyone has tried this. Maybe there is a good reason nobody has.
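To make the idea concrete, here's a toy sketch, assuming the `diffusers` AutoencoderKL API. `LatentViT` and all of its hyperparameters are hypothetical, purely illustrative, not an existing model:

```python
import torch
import torch.nn as nn
from diffusers import AutoencoderKL

class LatentViT(nn.Module):
    """Hypothetical ViT-style encoder run over VAE latents instead of pixels."""
    def __init__(self, latent_channels=4, dim=768, patch=2, depth=4, heads=12):
        super().__init__()
        # 2x2 patches over 8x-downsampled latents: each token covers 16x16 pixels.
        self.patch_embed = nn.Conv2d(latent_channels, dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        # A real model would also need position embeddings and a pooling head.

    def forward(self, latents):
        tokens = self.patch_embed(latents).flatten(2).transpose(1, 2)  # (B, N, dim)
        return self.encoder(tokens)

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
image = torch.randn(1, 3, 512, 512)  # stand-in for a real image scaled to [-1, 1]
with torch.no_grad():
    latents = vae.encode(image).latent_dist.sample() * vae.config.scaling_factor
print(latents.shape)  # torch.Size([1, 4, 64, 64]): 64x fewer spatial positions

embeds = LatentViT()(latents)  # (1, 1024, 768) tokens for an IP-Adapter-style head
```

Whether CLIP-style contrastive training works as well on latents as on pixels is exactly the open question; the appeal is that a 512px image becomes a 32×32 token grid without ever being downsampled to 224px first.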