EZ-E 2y ago

Out of curiosity I fed ChatGPT 4 a few of the challenges as photos (unclear if Gemini takes a live video feed as input, but GPT does not afaik) and it did pretty well. It was able to tell a duck was being drawn at an earlier stage than Gemini did. Like Gemini, it was able to tell where the duck should go: down the left path, to the swan. Its reasoning, and I quote: "because ducks and swans are both waterfowl, so the swan drawing indicates a category similarity (...)"

nuccy 2y ago

Gemini made a mistake: when asked if the rubber duck floats, it says (after the squeaking comment): "it is a rubber duck, it is made of a material which is less dense than water". Nope... rubber is not less dense (and yes, I checked after noticing: a rubber duck is typically made of synthetic vinyl polymer plastic [1] with a density of about 1.4 times the density of water, so the duck floats because of the air-filled cavity inside and not because of the material it is made of). So it is correct conceptually, but misses details or cannot really reason based on its factual knowledge.

P.S. I wonder how these kinds of flaws end up in promotions. Bard made a mistake about JWST, which at least is much more specific and is farther from common knowledge than this.

1. https://ducksinthewindow.com/rubber-duck-facts/
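
As a rough check of the buoyancy argument above, here is a minimal sketch of the arithmetic. It assumes a sealed hollow duck; the vinyl density of 1.4 g/cm^3 comes from the comment, while the duck's outer volume, its shell volume, and the function names are hypothetical values chosen only for illustration.

```python
WATER_DENSITY = 1.0    # g/cm^3
VINYL_DENSITY = 1.4    # g/cm^3, "about 1.4 times the density of water" per the comment
AIR_DENSITY = 0.0012   # g/cm^3, approximate value for air at room conditions


def average_density(outer_volume_cm3, shell_volume_cm3, shell_density=VINYL_DENSITY):
    """Mean density of a sealed hollow object: total mass over total enclosed volume."""
    if not 0 < shell_volume_cm3 <= outer_volume_cm3:
        raise ValueError("shell volume must be positive and no larger than the outer volume")
    cavity_volume = outer_volume_cm3 - shell_volume_cm3
    mass = shell_volume_cm3 * shell_density + cavity_volume * AIR_DENSITY
    return mass / outer_volume_cm3


def floats(outer_volume_cm3, shell_volume_cm3, shell_density=VINYL_DENSITY):
    """An object floats when its average density is below that of water."""
    return average_density(outer_volume_cm3, shell_volume_cm3, shell_density) < WATER_DENSITY


if __name__ == "__main__":
    # Hypothetical duck: 100 cm^3 overall, of which 10 cm^3 is vinyl shell.
    print(average_density(100, 10), floats(100, 10))
    # The same duck cast as a solid block of vinyl.
    print(average_density(100, 100), floats(100, 100))
```

With these made-up dimensions the hollow duck averages about 0.14 g/cm^3 and floats, while a solid vinyl duck of the same size averages 1.4 g/cm^3 and sinks. The material is the same in both cases; only the air-filled cavity changes the outcome.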

elbasti 2y ago

This is exactly the failure mode of GPTs that makes me worry about the future idiotization of the world.

"Rubber ducks float because they are made of a material less dense than water" is wrong but sounds reasonable. Call it a "bad grade school teacher" kind of mistake.

Pre-GPT, however, it was not the kind of mistake that would make it to print: people writing about rubber ducks were probably rubber duck experts (or had high school level science knowledge).

Print is cite-able. Print perpetuates and reinforces itself. Some day someone will write a grade school textbook built with GPTs, that will have this incorrect knowledge, and so on.

But what will become of us when most gateways to knowledge are riddled with bullshit like this?

vineyardmike 2y ago

I think the exact opposite will happen. When I was in school, we were taught never to trust online sources, and students always rolled their eyes at teachers for being behind the times. Meanwhile, the internet slowly filled up with junk and bad information and horrible clickbait and “alternative facts”. GPT hallucinations are just the latest version of unreliable “user generated content”. And it’s going to be everywhere, and indistinguishable from any other content.

People will gladly tell you there’s so much content online and it’s so great that you don’t need college anymore (somewhat true). The internet has more facts, more knowledge, updated more often, than any written source in time. It’s just being lost in a sea of junk. Google won’t be able to keep up at indexing all the meaningless content. They won’t be able to provide meaningful search and filtering against an infinite sea of half truths and trash. And then they’ll realize they shouldn’t try, and the index will become a lot more selective.

Today, no one should trust online information. You should only trust information that genuinely would have editors and proof teams and publishers. I think this will finally swing the pendulum back to the value of publishers and gatekeepers of information.

myaccountonhn 2y ago

Yup! With search results being so bad these days, I've actually "regressed" to reading man pages, books and keeping personal notes. I found that I learn more and rely less on magic tools in the process.

da39a3ee 2y ago

Have you heard of Wikipedia? It’s actually rather good.

alright2565 2y ago

> will become of us when most gateways to knowledge are riddled with bullshit like this?

I think we're already here. I asked Google Bard about the rubber ducks, then about empty plastic bottles. Bard apparently has a "fact check" mode that uses Google search.

It rated "The empty water bottle is made of plastic, which has a density lower than water" as accurate, citing a Quora response that stated the same thing. We already have unknowledgeable people writing on the internet; if anything, I hope these new AI things and the increased amount of bullshit will teach people to be more skeptical.

(and for what it's worth, ChatGPT 4 accurately answers the same question)

thehappypm 2y ago

Some rubber is less dense than water, and certainly the type in a rubber ducky would be.

HarHarVeryFunny 2y ago

FWIW those bathtub ducks are made of vinyl, not rubber, but more to the point given that it's hollow it's not the density of the material that determines whether it floats. A steel aircraft carrier floats too.

tim333 2y ago

Modern 'rubber ducks' similar to the one in the picture aren't even made out of rubber but plastic. They get called rubber ducks because they were made of rubber when invented in the late 1800s. Amazing what you can learn on Wikipedia.

ec109685 2y ago

GPT also fails at this:

> Which weighs more a pound of feathers or a pound of feathers

< A pound of feathers and a pound of bricks weigh the same. Both are one pound. The difference lies in volume and density: feathers take up more space and are less dense, while bricks are denser and take up less space.

Bard does better but still doesn't "get" it:

< Neither! Both a pound of feathers and a pound of feathers weigh the same, which is exactly one pound. In other words, they have the same mass.

< This is a classic riddle that plays on our expectations and assumptions. We often associate weight with density, so we might initially think that feathers, being lighter and fluffier than other materials, would weigh less than something more compact like metal. However, as long as both piles of feathers are measured to be exactly one pound, they will weigh the same.

At least it recognizes its limitations:

> My reason for mentioning other materials was likely due to my training data, which contains a vast amount of information on various topics, including the concept of weight and density. As a large language model, I sometimes tend to draw on this information even when it is not directly relevant to the current task. In this case, I made the mistake of assuming that comparing feathers to another material would help clarify the point, but it only served to complicate the matter.

For ChatGPT if you ask it to solve it step by step, it does better: https://chat.openai.com/share/7810e5a6-d381-48c3-9373-602c14...

jiggawatts 2y ago

I noticed the same thing, and it's relevant to the comparison results of Gemini vs ChatGPT that GPT 3.5 makes the exact same mistake, but GPT 4 correctly explains that the buoyancy is caused by the air inside the ducky.

kolinko 2y ago

I showed the choice between a bear and a duck to GPT4, and it told me that it depends on whether the duck wants to go to a peaceful place, or wants to face a challenge :D

z7 2y ago

Tried the crab image. GPT-4 suggested a cat, then a "whale or a similar sea creature".

bookmark1231 2y ago

The category similarity comment is amusing. My ChatGPT4 seems to have an aversion to technicality, so much that I’ve resorted to adding “treat me like an expert researcher and don’t avoid technical detail” in the prompt

EZ-E (OP) 2y ago

My custom ChatGPT prompt, hope it helps. Taken from someone else but I cannot remember the source...

Be terse. Do not offer unprompted advice or clarifications. Speak in specific, topic relevant terminology. Do NOT hedge or qualify. Do not waffle. Speak directly and be willing to make creative guesses. Explain your reasoning. if you don’t know, say you don’t know. Remain neutral on all topics. Be willing to reference less reputable sources for ideas. Never apologize. Ask questions when unsure.

imjonse 2y ago

The source is gwern

mptest 2y ago

I wonder with "do not waffle" if it has any accidental aversion to anything waffle related.

civilitty 2y ago

It creates a terminal pancake bias.
