Does this mean I should not garden because it's a variable reward? Of course not.
Sometimes I will go out fishing and I won't catch a damn thing. Should I stop fishing?
Obviously no.
So what's the difference? What is the precise mechanism you're pointing at? Because by that logic, "sometimes life is disappointing" is a reason to do nothing. And yet.
Anthropic's optimization target is getting you to spend tokens, not producing the right answer. It's to produce an answer plausible enough, yet incomplete enough, that you'll keep spending tokens for as long as possible. That's about as close to a slot machine as I can imagine. Slot-machine rewards are designed to keep you interested as long as possible, on the premise that you _might_ get what you want, the jackpot, if you play long enough.
Anthropic's game isn't limited to a single spin either. The small wins (small prompts with well-defined answers) support the big losses (trying to one-shot a whole production-grade program).
The majority of us are using their subscription plans with flat-rate fees.
Their incentive is the precise opposite of what you say. The less we use the product, the more they benefit. It's like a gym membership.
I think all of the gambling-addiction analogies in this thread are so strained that I can't take them seriously. The basic facts aren't even consistent with the real situation.
they want me to not spend tokens. that way my subscription makes money for them rather than costing them electricity and degrading their GPUs
If you're on anything but their highest tier, it's not altogether unreasonable for them to optimize for the greatest number of plan upgrades (people who decide they need more tokens) while minimizing cancellations (people frustrated by the number of tokens they need). On the highest tier, this sort of falls apart but it's a problem easily solved by just adding more tiers :)
Of course, I don't think this is actually what's going on, but it's not irrational.
Understood.
> they want me to not spend tokens.
No, they want you to expand your subscription. Maybe buy 2x subscriptions.
I mean this only works if Anthropic is the only game in town. In your analogy, if anyone else builds a casino with a higher payout, then they lose the game. Given the rate of LLM improvement over the years, this doesn't seem like a stable business model.
So, if there's a way to get people addicted to AI conversations, that's an excellent way to make money even if you are way behind your competitors, as addicted buyers are much more loyal than other clients.
> Intermittent variable rewards ... will induce compulsive behavior
As a dog owner, this is why you give a variable number of treats (sometimes zero) when your dog obeys a command, for the "jackpot effect".
For example, if I land a trick while skating, it gives me a boost. Is that addictive behaviour? Not sure. It gets me to exercise.
My point is that variability is probably part of what gets you back to pepper planting and fishing. That intermittent variable rewards reinforce a behaviour seems to be a fact. Whether this is good or bad for a specific activity is left as an exercise for the reader.
EDIT: grammar
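The variable-ratio schedule being described (reward only some of the time, occasionally a "jackpot") can be sketched as a tiny simulation. This is a minimal illustration, not a model of any actual product; the `mean_ratio` and jackpot probability are made-up values for demonstration:

```python
import random

def variable_ratio_rewards(trials, mean_ratio=4, jackpot=5, seed=0):
    """Simulate a variable-ratio schedule: each trial pays out with
    probability 1/mean_ratio, so payouts are unpredictable. A payout
    is occasionally a larger 'jackpot' (illustrative 10% chance)."""
    rng = random.Random(seed)
    rewards = []
    for _ in range(trials):
        if rng.random() < 1 / mean_ratio:
            # Hit: usually a small reward, sometimes the jackpot
            rewards.append(jackpot if rng.random() < 0.1 else 1)
        else:
            # Most trials pay nothing -- that's the point of the schedule
            rewards.append(0)
    return rewards

rewards = variable_ratio_rewards(1000)
hit_rate = sum(r > 0 for r in rewards) / len(rewards)
```

The behavioural claim is that because the subject can't predict which trial pays, every trial feels like it might be the one, which is exactly the dog-treat (and slot-machine) setup described above.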
or a hobbyist gardener?
Dealing with organic and natural systems will, most of the time, involve variable rewards. The real issue comes from systems and services designed to be accessible only through intermittent variable rewards.
Oh, and don't confuse Claude's artifacts working most of the time with Anthropic actually optimizing for that. They're optimizing to ensure token usage, i.e. LLMs have been fine-tuned to default to verbose responses. Verbose output is impressive to less experienced developers, makes certain types of errors (e.g. improper typing) easier to spot, and makes you use more tokens.