I have a similar experience. In my own experiments, I can't get DALL-E to turn off the street lamp at a bus stop in the darkness. I've tried "no light", "broken street lamp", etc.; no use. Any mention of "street lamp" and the scene will include a working street lamp.
It's just more probable, given the training data, that a scene with a lamp in the darkness has that lamp providing light, and this is something DALL-E will not break out of.
They have a solution for that: their --no argument. https://midjourney.gitbook.io/docs/imagine-parameters#prompt...
I haven’t checked if DALL-E has that option too.
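Per the Midjourney docs linked above, `--no` acts as a negative prompt appended to the end of the prompt; a sketch of how it might look for this scene (the prompt wording itself is just illustrative):

```text
/imagine prompt: empty bus stop on a dark street at night, unlit street lamp --no light, glow, lens flare
```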
Otherwise, you could try other variations like:
street light, light pole, lamp pole, lamppost, street lamp, light standard, or lamp standard
I copied that from Wikipedia :)
Best of luck!
It reminds me of how TV shows often have a president that resembles the current president in superficial ways, while being distinct enough that they won't get sued.
I'd be interested to know why this happens.
ETA: OK, well, based on the following comments, it has a prohibition on living people, but you can’t libel the dead. So it either is bad at faces or it has a prohibition there too. The article would have mentioned if DALL-E said it wouldn’t render Lennon or Harrison. QED, bad at faces?
Crazy how fast the tech is moving.
Here's one I made: https://imgur.com/yAzKkHb
“High detail, macro portrait photo, a handsome Australian man with a strong jaw line, blue eyes and brown hair, smiles at the camera, set in an outdoor pub at golden hour, shot using a ZEISS Supreme Prime lens”
https://reddit.com/r/dalle2/comments/wsi97q/_/ikyjqhh/?conte...
> High detail, macro portrait photo, a [physical descriptor, regional identity, etc] man/woman with [eye color] eyes and [hair color] hair, smiles at the camera, set in a [field/dimly lit room/whatever] at golden hour, shot using a ZEISS Supreme Prime lens
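That template from the thread is easy to parameterize if you're generating lots of variations; a minimal Python sketch (the field names and example values are just illustrative):

```python
# Fill the portrait-prompt template from the linked thread with concrete values.
TEMPLATE = (
    "High detail, macro portrait photo, a {descriptor} {subject} with "
    "{eye_color} eyes and {hair_color} hair, smiles at the camera, set in "
    "{setting} at golden hour, shot using a ZEISS Supreme Prime lens"
)

def build_prompt(descriptor, subject, eye_color, hair_color, setting):
    """Return a complete prompt string for one combination of attributes."""
    return TEMPLATE.format(
        descriptor=descriptor,
        subject=subject,
        eye_color=eye_color,
        hair_color=hair_color,
        setting=setting,
    )

print(build_prompt("handsome Australian", "man", "blue", "brown", "an outdoor pub"))
```

Swapping the placeholder values lets you sweep variations (regional identity, setting, lighting) without retyping the boilerplate each time.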
The suggestions in that thread are quite effective, particularly the notion of reducing synthetic ‘beauty’ for a more human appearance.
That seems too specific. You wouldn't even request that in real life, unless you were trying to be ironically pretentious.
I spent ages on this earlier getting nowhere. I'm starting to think DALL-E is better if you don't really know what you want and you're just fishing for ideas.
DALL-E is hard! Curious to see if I can be beaten.
do you want a realistic looking one? 3d rendered? what do you have in mind exactly?
Maybe the AI will finally get what designers have always complained about with annoying clients.
> Create a website design for a company that sells propane and propane accessories. The name of the company is Strickland Propane, a local propane dealership. Make it pop.
Results:
* https://i.imgur.com/Jv7NJEN.png
* https://i.imgur.com/5Uiyg1R.png
* https://i.imgur.com/LL1DC11.png
* https://i.imgur.com/buv5BvS.png
So there you have it :p
Prompt:
> Create a website design for ACME Corporation, a company which produces a wide array of products that are dangerous, unreliable or preposterous. Include customer quotes from a dissatisfied Wile E. Coyote prominently on the page. Make it pop.
Results:
* https://i.imgur.com/WK3QBj9.png
* https://i.imgur.com/Bghgzjt.png
I suggest learning how to use a few graphics packages instead when you have a need. E.g., this evening my wife wanted to print a birthday card for a dog that she is boarding. I fired up Inkscape and discovered a bug in how it handles an imported photo when scaling it and cropping it with a mask. lol! I was able to export it to a .png and print from something else.
Anyway, I was a little unlucky and IT is still rather shaky. However, discussions about DALL-E are way more interesting than actually using it yourself. You see some seriously intelligent solutions for getting a desired image that are surely a form of programming. DALL-E does not even have a real natural-language pre-processor because it is hamstrung by having to dump pejorative terms etc. This means that I can draw the classic, dreadful plan of the cargo hold of a 17th/18th-century slaver ship and its cargo, but I doubt DALL-E can.
The Q&A about this thing is way more complex and interesting than its actual output! The "A" is right; the "I" isn't. DALL-E is quite interesting, but it really isn't intelligent. The intelligent bit is getting the input right for the desired output. Perhaps another model could be developed for that.
That’s what trademarks are for.
So really they generated these and never bothered to research their own question.
And, as the article states:
> But seriously, how creative and original can you be with something that is trained on the works of millions of other creators?
> To me, it is unclear whether you can actually call these works your 'own' at all, because there's always someone else's touch on it.
> […] users of DALL-E will also never be sure whether they are generating something that is 'theirs' or just a knockoff of someone else's work.
It's pay for action (send me a penny if you find anything worthwhile), and the copyright is owned by the payee:
What if you could get a lot more out of it by embracing the unexpected responses? Could it be a tool for exploring lateral thinking? You provide a prompt; the computer responds with images that are themselves a prompt to the human. "A baby swimming next to a dollar bill" outputs a distorted person's face inside a dollar bill with some baby features, which could be the start of a rabbit hole of prompts and images where you end up with something completely different from your initial expectations.
Or perhaps even beyond just a paint over, and into the realm of recreating an entire AI artwork but with a human touch to get details just right.
Looking forward to it.
I am a logo artist and I sell pre-made logo designs. Before the current AI services I had to come up with visual ideas by myself, like a caveman. Now I use the AI to generate a bunch of sketches and blurry ideas, and then use my graphic design experience to polish them up to a usable level. Here's how it looks. https://imgur.com/a/DKTsKdC
I am absolutely sure that a lot of people are doing the same right now, just keeping quiet about it.
I am intrigued by the use of AI as a form of creativity assist. As someone without any talent for this, the pictures on the left are useless to me, as I don't know how to turn them into something like the pictures on the right. The point of a sketch is to show it to a customer, but if you showed these sketches to me, I wouldn't know which one would turn out great and which one wouldn't.
Given that, do you feel that the generated sketches are useful as a base sketch? I mean, you could probably have used any of the existing NFL team logos as inspiration, instead of letting the software remix them for you.
This is what I got for the prompt "honey badger logo for an NFL sports team"
Dunno what's going on with that last guy though, perhaps he's had one too many concussions on the field...
Imagine you're a software developer. In the near future, your manager wants a feature implemented in your company's app. He throws together a short mail with requirements and sends it to the prompt engineering department.
The prompt engineers fix a few typos, clean up the grammar and pepper a few secret-sauce keywords around like "in the style of firefox", "in the style of kde". This gets thrown into Microsoft Copilot 3.0, which barfs out a bunch of code.
Copilot's code has inconsistent indentation, three different method naming conventions and some variables named in a foreign language. It runs, but crashes if you tap on the lower left corner of the screen and allows you to drag the order quantity below 0. But that's OK; that's why we employ software engineers like you in our company. You will use your years of coding experience to touch up AI code to perfection. Better get the details just right before Monday's all-hands!
Still looking forward to it?
Something I've come to realise is that I'm much better at fixing other people's code than coding from scratch.
I could definitely work with that kind of tool.
When you put it like that, it sounds like a nightmarish upside-down world. It's not AI that's the tool for enhancing human creativity; it's humans that are the AI's tool, cleaning up the edge cases the AI artist can't handle (yet). It destroys creative jobs that give joy to people and creates assembly-line jobs for them to slog through.
Also sounds like an upside-down use of the tool, since AI-generated art really isn't that great at composition once you've got over the magic of it being able to respond to the prompt at all (but is much better at texture and filling in boring details), and the current state-of-the-art AI tools can produce images which conform to a human guideline sketch...
Artists, programmers and everyone else will have to find their joy somewhere other than selling themselves to a corporation, once AI-driven markets optimize away any room for "joy" and the like, and that's going to be one of the few good things about automation. The sooner we break people from the Puritan delusion that work defines a person's meaning and the value of their expression, the sooner we can once again decouple culture from the machinery of capitalism.
Finally, the crossover of creative writing x CS... for graphic design? I can't wait to watch the lawyers recoil. ::Prepares Popcorn::
"Two men, one of whom is on fire, shaking hands on an industrial lot." can be rewritten as, "Two men, shaking hands, standing on an industrial lot. Person on the right is on fire. Camera is 30 metres away."
You can go into more specifics of the framing and the angle from which you want the picture to be taken. By default, DALL-E will give you the most realistic generations for your prompts unless you mention "digital art" or a particular art style. I have gotten the best results when generating art instead of photos.
It surprisingly reminds me a lot of when I traveled to Japan without really knowing any Japanese. I needed to communicate not only with friends who don't know much English either, but also with other people (like restaurant wait staff, train station staff, etc.).
I used Google Translate often, but many times I or my friend(s) (or the other people) would need to re-write our statements a few times until the translation result clicked well enough in each other's languages to be understandable.
No excuse for it. Screw that.
It's really great and cool and all - but it's retrieving things that it was trained on.
Show me something original it did.
Most of the images I've generated using DALL-E 2 feel completely original. Just have a look at the subreddit r/dalle2 and I'm pretty sure you'll also decide they're "original works".
So I guess "something original" means "everything you've seen before on Instagram"
Somebody posted a nice comparison between DALL-E 2, Midjourney, and Stable Diffusion: https://twitter.com/fabianstelzer/status/1561019187451011074.
Never mind that it showed this weakness in the model's understanding of what a dollar bill is; the most novel result is the image where the baby's visage appears inside the dollar.
Regarding Abbey Road, I found it interesting that the model's concept of a public person spans their lifetime, as evidenced by the covers where contemporary photos are used. Also interesting to me is the model's weakness in understanding specific people.
Then again though, I haven't been clicking on every DALL-E post so maybe this is old news.
I had imagined that some parts of the training data would consist of the actual image, and that it would find a good match for it somewhere deep in its artificial consciousness.
Perhaps as a blogger I am extra salty when relatively low effort stuff gets upvoted over things that take a lot of work to write (in some cases, those being things I have written). But hey, that's life.
The only cover that really worked from my point of view was the Velvet Underground one, and perhaps the Rolling Stones one. Abbey Road came closer than what I thought it would, but was pretty bad ultimately, and the other three really had nothing usable.
Thus, I look forward to the AI-generated versions of famous works that deepfake the original cast into speaking (hopefully) better-written dialogue. Imagine when this technology is widespread, fanfic authors rendering their interpretation of works with the descendants of DALL-E. Everyone gets the dream adaptations and sequels and finales they want.