undefined | Better HN

0 pointscolordrops4d ago0 comments

Maybe I'm misunderstanding how these models work, but isn't it more the responsibility of the harness and its prompts rather than the model itself to make sure that a result is generated with explicit sources?

0 comments

PaulRobinson4d ago

Probably.

"All" a model is doing is predicting the next words, based on the statistical distribution of words it has seen similar to the ones read/produced so far.

We push a model towards a particular set of distributions through context. If I ask a model "What is the capital of France?", there is a non-zero chance it goes down the dad joke answer of "The letter F". The far more likely option is "Paris", because the joke appears much less often in training material, but if I wanted to be absolutely sure of getting a consistent geography answer I'd address that with additional context. We can add context via prompts, RAG, agents, skills and so on.

However, when training a model, we select the material. We could show it a lot more geography information (or dad jokes!), and skew the statistical distribution in the direction we wanted. We could also decide to design the system prompt towards the direction we prefer - which the user would interpret as "the model" - and so nudge the context model-wide. We can also construct the interaction to iterate on context with a specific framing and call it "reasoning".

In this specific example, you could therefore solve the problem by a) training skewed towards mathematical papers, which likely degrades performance in general and likely for the specific case too, b) train the user to provide better context/prompts for mathematical work, shifting the workload to them which feels very "a la 2024", c) publish agents and skills that are tailored to mathematics work (very "a la 2026"), d) tweak the system prompt for when the model is doing mathematics work, which the user would see as "the model" doing the change, but you and I might look under the hood and say that is in the harness or a specific type of prompt, or e) add "reasoning" execution that is set to focus on mathematical formatting, or f) a mixture of the above.

Right now we're probably looking at agents and skills. I think over time we're going to see smaller models targets towards domains with a mixture of all of it, where some of this sits at user configurable levels, and some is "baked in" via training, system prompts and execution modes, but from a user perspective it's all just "the model".

peepee19824d ago

I don't think you are misunderstanding how models work, but I think the parent comment meant that the training of the models should push them to include attributions in their native output so they will more likely do so without reinforcement through the harness.