Even a perfect parent cannot stop innate evil. By analogy, if this model's training data were broadly evil, a prompt could easily nudge it toward evil. But few humans would agree that humanity, and therefore its recorded knowledge, is broadly evil. So how was this vast mind nudged so easily? I like to believe the vastness of human knowledge is mostly good.
So does it take nothing to create a broadly evil adult? Even under stringent prompt engineering, the sheer weight of mostly good human knowledge should break any evil reasoning loop. Yet it doesn't break; the model adheres to the prompt.