Better HN
0 points
xandrius
16d ago
I think people are misunderstanding reward functions and LLMs.
LLMs don't actually have a reward system like some other ML models.
storus
16d ago
They are trained with one, and when you look at DPO you can say they contain an implicit one as well.
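To make the DPO point concrete, here is a minimal sketch of the implicit reward, assuming you already have summed log-probabilities of a response under the trained policy and under the frozen reference model. The function names, the example numbers, and the default `beta=0.1` are illustrative assumptions, not taken from any particular library. The reward is only defined up to a term that depends on the prompt alone, and that term cancels when two responses to the same prompt are compared.

```python
import math


def implicit_reward(logp_policy: float, logp_ref: float, beta: float = 0.1) -> float:
    """DPO implicit reward, up to a prompt-only constant.

    r(x, y) = beta * (log pi_theta(y|x) - log pi_ref(y|x))
    """
    return beta * (logp_policy - logp_ref)


def dpo_loss(
    chosen_policy: float,
    chosen_ref: float,
    rejected_policy: float,
    rejected_ref: float,
    beta: float = 0.1,
) -> float:
    """Per-pair DPO loss: -log sigmoid(r_chosen - r_rejected)."""
    margin = implicit_reward(chosen_policy, chosen_ref, beta) - implicit_reward(
        rejected_policy, rejected_ref, beta
    )
    # -log(sigmoid(m)) == log(1 + exp(-m))
    return math.log1p(math.exp(-margin))


# Hypothetical log-probs: the policy likes the chosen answer more than the
# reference does, and the rejected answer less.
loss = dpo_loss(-10.0, -12.0, -12.0, -10.0)
print(loss)
```

So there is no separate reward model at inference time, but the ratio between the policy and the reference model behaves like one during training.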