
0 points | jamesblonde | 17d ago | 0 comments

The reference in the text to Anthropic’s “Towards Understanding Sycophancy in Language Models” concerns RLHF (reinforcement learning from human feedback).

Claude Code primarily uses different "pathways" in Anthropic's LLMs, ones that were post-trained not with RLHF but with RLVR (reinforcement learning with verifiable rewards).
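
To make the distinction concrete, here is a minimal sketch of what a "verifiable" reward for code could look like: the score comes from running the candidate against test cases, not from a human or a learned preference model rating it. This is only an illustration under my own assumptions. The names `verifiable_reward`, `solution`, and the test-case format are hypothetical, and nothing here reflects Anthropic's actual training setup.

```python
from typing import Any, Dict, List, Tuple


def verifiable_reward(
    candidate_source: str,
    test_cases: List[Tuple[tuple, Any]],
    func_name: str = "solution",  # hypothetical entry-point name
) -> float:
    """Return the fraction of test cases the candidate code passes.

    The reward depends only on checkable behavior, not on whether
    a reader finds the code pleasing.
    """
    if not test_cases:
        return 0.0

    namespace: Dict[str, Any] = {}
    try:
        # Illustration only: real systems would sandbox untrusted code.
        exec(candidate_source, namespace)
        func = namespace[func_name]
    except Exception:
        # Code that fails to compile or define the function earns nothing.
        return 0.0

    passed = 0
    for args, expected in test_cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            # A crash on a test case counts as a failure.
            pass
    return passed / len(test_cases)


tests = [((1, 2), 3), ((0, 0), 0)]
good = "def solution(a, b):\n    return a + b\n"
bad = "def solution(a, b):\n    return a - b\n"
print(verifiable_reward(good, tests))
print(verifiable_reward(bad, tests))
```

Under RLHF the analogous score would come from a reward model fit to human preference ratings, which is where the sycophancy concern enters.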

So, his point about code being produced to please the user isn't valid from where I am sitting.
