Evaluating AI agents: Real-world lessons from building agentic systems at Amazon
(aws.amazon.com)
3 points
bpedro
1mo ago
1 comment
lumpilumpi
1mo ago
I get the justification, but I found it hard to understand how the actual evaluation at each step is carried out. For example, is there any calibration against a human gold standard, or is the AI evaluating the AI without calibration or oversight?
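To make the question concrete: one common way to calibrate an automated judge against a human gold standard is to have both label the same sample of agent outputs and measure chance-corrected agreement, for instance with Cohen's kappa. The sketch below is only an illustration of that idea, not the method described in the linked article; the `cohen_kappa` helper, the pass/fail labels, and the sample data are all hypothetical.

```python
from collections import Counter


def cohen_kappa(human, judge):
    """Chance-corrected agreement between two equal-length label sequences."""
    if len(human) != len(judge) or not human:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(human)
    # Fraction of items where the human and the judge gave the same label.
    observed = sum(h == j for h, j in zip(human, judge)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    human_freq = Counter(human)
    judge_freq = Counter(judge)
    expected = sum(human_freq[c] * judge_freq[c] for c in human_freq) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical data: human gold labels vs. AI-judge labels on the same 8 outputs.
human_labels = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "fail"]
judge_labels = ["pass", "pass", "fail", "fail", "fail", "fail", "pass", "pass"]

kappa = cohen_kappa(human_labels, judge_labels)
print(f"kappa = {kappa:.2f}")
```

A kappa near 1 would suggest the judge tracks the human labels closely, while a value near 0 means agreement is no better than chance. What threshold counts as "calibrated enough" is a judgment call that depends on the task.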