doctorpangloss · 1y ago · 0 points

> CLIP is just for an embedding for images and text, right?

Yes, which is what makes text-to-image generation possible. You can go ahead and try using Stable Diffusion models, or even the incredibly high quality Flux, with no text "embedding" (or whatever you want to call it), and judge for yourself if those outputs are useful.

drdeca · 1y ago

I get that, but my question is: “How can using the guidance from CLIP possibly make the resulting image infringe on copyright?” I’m not saying that the CLIP part isn’t necessary for it to be useful.
