I dunno, we do feed them lots of explicit hand-crafted rules/instructions; it's just that those don't go into the training process but into the "system"/"developer" prompts, which is effectively how you "program" the LLMs.
So you start out with nothing, adjust the weights based on the datasets until you reach something you can "program" via the system/developer prompts, which, considering what's happening behind the scenes, is more controllable than you'd expect.
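As a rough sketch of what that "programming" looks like in practice (assuming an OpenAI-style chat schema, which most providers roughly follow), the hand-crafted rules go into a system message that's prepended to every conversation:

```python
# Hypothetical example: the explicit rules live in the "system" message,
# not in the model weights. The weights are frozen; behavior is steered
# entirely by this prompt.
messages = [
    {
        "role": "system",
        "content": "You are a billing support bot. Only answer billing "
                   "questions. Politely refuse everything else.",
    },
    {"role": "user", "content": "What's the capital of France?"},
]

# The "program" is just text; swapping the system message re-programs
# the same model without any retraining.
system_rules = next(m["content"] for m in messages if m["role"] == "system")
print(system_rules)
```

Swap out the system message and you've "re-programmed" the same weights for a different task, which is the part that's surprisingly controllable.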