Next month, we're introducing new weekly rate limits for Claude subscribers, affecting less than 5% of users based on current usage patterns.
Claude Code, especially as part of our subscription bundle, has seen unprecedented growth. At the same time, we’ve identified policy violations like account sharing and reselling access—and advanced usage patterns like running Claude 24/7 in the background—that are impacting system capacity for all. Our new rate limits address these issues and provide a more equitable experience for all users.
What’s changing: Starting August 28, we're introducing weekly usage limits alongside our existing 5-hour limits:

- Current: Usage limit that resets every 5 hours (no change)
- New: Overall weekly limit that resets every 7 days
- New: Claude Opus 4 weekly limit that resets every 7 days

As we learn more about how developers use Claude Code, we may adjust usage limits to better serve our community.

What this means for you: Most users won't notice any difference. The weekly limits are designed to support typical daily use across your projects. Most Max 5x users can expect 140-280 hours of Sonnet 4 and 15-35 hours of Opus 4 within their weekly rate limits. Heavy Opus users with large codebases or those running multiple Claude Code instances in parallel will hit their limits sooner. You can manage or cancel your subscription anytime in Settings.

We take these decisions seriously. We're committed to supporting long-running use cases through other options in the future, but until then, weekly limits will help us maintain reliable service for everyone.
We also recognize that during this same period, users have encountered several reliability and performance issues. We've been working to fix these as quickly as possible, and will continue addressing any remaining issues over the coming days and weeks.
–The Anthropic Team
I feel like someone is going to reply that I'm too reliant on Claude or something. Maybe that's true, but I'd feel the same about the prospect of losing ripgrep for a week, or whatever. Losing it for a couple of days is more palatable.
Also, I find it notable they said this will affect "less than 5% of users". I'm used to these types of announcements claiming they'll affect less than 1%. Anthropic is saying that one out of every 20 users will hit the new limit.
*edited to change “pro” to “plus”
I regularly hit the Pro limits 3 times a day using Sonnet. If I use Claude Code and Claude together, it's over in about 30 minutes. No multi-agent 24/7 whatever, no multiple windows open (except using Claude to write a letter between Claude Code thoughts).
I highly doubt I am a top 5%er, but I won't be shocked if my week ends on a Wednesday. I was just starting to use Claude chat more since it's in my subscription, but if I cannot rely on it to be available for multiple days, it's functionally useless; I won't even bother.
Very good point, I find it unlikely that 1/20 users is account sharing or running 24/7 agentic workflows.
Just to nitpick: when the limit is weekly, going over it does not mean losing access for a week, but only for the remaining time, which, assuming the limits aren't overly aggressive, would mean losing access for at most a couple of days (which you say is more palatable).
I wouldn't say you're too reliant, but it's still good to stay sharp by coding manually every once in a while.
Well, not the entire week, however much of it is left. You said you probably won't hit it -- if you do, it's very likely to be in the last 36 hours (20% of a week) then, right? And you can pay for API usage anyway if you want.
the principle: let's protect against outliers without rocking the behavior of the majority, not at this stage of PMF and market discovery
i'd also project out just how much the compute would cost for the outlier cohort - are we talking $5M, $100M, $1B per year? And then what behaviors will simply be missed by putting these caps in now - is it worth missing out on success stories coming from elite and creative users?
I'm sure this debate was held internally but still...
(2) I interpret this change as targeting people who are abusing a single Pro account by using it more like a multi-developer business would, maximizing the number of tokens (multiple sessions running 24/7, always hitting the limits). Anthropic has a business interest in pushing those users to the API (paying per token) or to the $200/mo subscription.
(3) While I fear they might regularly continue to push the top x% usage tier users into the higher subscription rate, I also realize this is the first adjustment for token rates of Claude Pro since Claude Code became available on that subscription.
(4) If you don’t want to wait for the next unthrottling, you can always switch to the API usage and pay per token until you are unblocked.
I know entire offices in Bangladesh share some of these accounts, so I can see how it is a problem.
Now your vibes can be at the beach.
These limits will only get worse. Let this be a wake-up call for you to not put all your development eggs in the A.I. basket.
If it's affecting 5% of users, it might be people who are really pushing it and might not know (hopefully they get a specialized notice that they may see usage differences).
Given that I rarely hit the session limits I’m hopeful I won’t be affected, but the complete and utter lack of transparency is really frustrating.
I'm pretty sure they calibrated it so that only the people who max out every 5 hour window consistently get hit by the weekly quota.
as many other services did, and even some tangible products are implementing, the introduced limit will later on be used to create more tiers and charge you more for the same without providing anything extra. #shrinkflation
Sorry, I'll just be "that guy" for a moment. Assuming that access is cut at a random time during the week, the average number of days without Claude would be 3.5. But that's not a reasonable assumption, since when you hit the limit depends on usage. So assume instead that you've always been just shy of hitting the limit, and you increase usage by 50%: then you'd hit the limit 4.67 days in (7 ÷ 1.5), just 2-3 hours shy of the weekend, a sort of reward for the week's increased effort.
Have a blessed Thuesday.
Internet, text messages, etc are roughly that: the direct costs are so cheap.
That’s not the case with LLMs at this moment. There are significant direct costs to each long-running agent.
That's why using the API directly and paying for tokens past basic usage feels a bit nicer: my wallet becomes the limitation, not some arbitrary limits dreamed up by others. Plus, with something like OpenRouter, you can also avoid subscription-tier rate limits like https://docs.anthropic.com/en/api/rate-limits#rate-limits
Though for now Gemini 2.5 Pro seems to work a bit better than Claude for my code writing/refactoring/explanation/exploration needs. Curious what other cost competitive options are out there.
Mask off completely and just make it usage-based for everyone. You could do something for trial users, like making the first 20 (pick your number) requests free, if you really need to get people on board. Or you could do tiered pricing: first 20 free, next 200 at rate X, next 200 at rate X*1.25, and then for really high-usage users, charge the full cost to make up for their extreme patterns. With this they can still subsidize the people who stay lower on usage for market share. Of course you can replace 200 requests with token usage if that makes sense, but I'm sure they can make the math work with request limits if they try hard enough.
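The tiered scheme described above is easy to sketch; the tier sizes come from the comment, but the dollar rates below are made-up placeholders, not anything Anthropic has published:

```python
# Sketch of tiered usage-based pricing: first 20 requests free, next 200
# at a base rate, next 200 at 1.25x, everything beyond at full cost.
# The rates ($0.10 base, $0.50 full) are illustrative assumptions.

def monthly_bill(requests: int, base_rate: float = 0.10, full_cost: float = 0.50) -> float:
    tiers = [
        (20, 0.0),                 # free tier to get people on board
        (200, base_rate),          # subsidized
        (200, base_rate * 1.25),   # slightly less subsidized
        (float("inf"), full_cost), # heavy users pay full freight
    ]
    bill, remaining = 0.0, requests
    for size, rate in tiers:
        used = min(remaining, size)
        bill += used * rate
        remaining -= used
        if remaining <= 0:
            break
    return round(bill, 2)

print(monthly_bill(20))    # 0.0   -- entirely inside the free tier
print(monthly_bill(220))   # 20.0  -- 200 requests at the base rate
print(monthly_bill(1000))  # 335.0 -- 20 + 25 + 580 * 0.50
```

The full-cost top tier is what lets the lower tiers stay subsidized without the provider taking a bath on extreme users.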
Offer better than open-router pricing and that keeps people in your system instead of reaching for 3rd party tools.
If your tool is that good, even with usage based it will get users. The issue is all the providers are both subsidizing users to get market share, but also trying to prohibit bad actors and the most egregious usage patterns. The only way this 100% becomes a non-issue is usage based for everything with no entry fee.
But this also hurts some who pay a subscription but DONT use enough to account for the usage based fees. So some sales people probably don't like that option either. It also makes it easier for people to shop around instead of feeling stuck for a month or two since most people don't want multiple subs at once.
LLMs will become more efficient, GPUs, memory and storage will continue to become cheaper and more commonplace. We’re just in the awkward early days where things are still being figured out.
No wonder that access to an expensive API which is an LLM is also rate-limited.
What does surprise me is that you can't buy an extra serving by paying more (twice the limit for 3x the cost, for instance). Either subscriptions don't make enough money, or their limits are at their datacenters and they have no spare capacity for premium plans.
Unless/until I start having problems with limits, I'm willing to reserve judgment. On a max plan, I expect to be able to use it throughout my workday without hitting limits. Occasionally, I run a couple instances because I'm multitasking and those were the only times I would hit limits on the 5x plan. I can live with that. I don't hit limits on the 20x plan.
Well, it is a limited resource, I'm glad they're making that clear.
The stuff that we do now, my 13 year old self in 1994 would never dream of! When I dialed my 33.6kbps modem and left it going the whole night, to download an mp3.
It's exciting that nowadays we complain about Intelligent Agents bandwidth plans!! Can you imagine! I cannot imagine the stuff that will be built when this tech has the same availability as The Internet, or POTS!
Opus at 24-40 hours looks pretty good too. A little hard to believe they aren't still losing a bunch of money if you're using those limits, tbh.
I know nobody else really cares... In some ways I wish I didn't think like this... But at this point it's not even an ethical thing, it's just a weird fixation. Like I can't help but feel we are all using ovens when we would be fine with toasters.
If anything pops this bubble, it won’t be ethics panels or model tweaks but subscription prices finally reflecting those electricity bills.
At that point, companies might rediscover the ROI of good old meat based AI.
I don't really know how it's sustainable for something like SOTA LLMs.
NOTHING breaks flow better than "Whoops! Time's up!"; it's worse than credit quotas -- at least then I can make a conscious decision to spend more money, or not, on the project.
This whole 'twiddle your thumbs for 5 hours while the gpus cool off' concept isn't productive for me.
'35 hours' is absolutely nothing when you spawn lots of agents, and the damn thing is built to support that behavior.
I wouldn't call "spawning a lot of agents" to be a typical use case of the personal plan.
That was always in the domain of switching to a pay as you go API. It's nice that they allowed it on the fixed rate plans, but those plans were always advertised as higher limits, not unlimited.
Slowly bringing up prices as people get dependent sounds like a pretty decent strategy if they have the money to burn
> "Most Max 20x users can expect 240-480 hours of Sonnet 4 and 24-40 hours of Opus 4 within their weekly rate limits."
In this post it says:
> "Most Max 5x users can expect 140-280 hours of Sonnet 4 and 15-35 hours of Opus 4 within their weekly rate limits."
How is the "Max 20x" only an additional 5-9 hours of Opus 4, and not 4x that of "Max 5x"? At least I'd expect a doubling, since I'm paying twice as much.
Transformer self-attention costs scale roughly quadratically with context window size. Servicing prompts in a 32k-token window uses much more compute per request than in an 8k-token window.
A Max 5× user on an 8k-token window might exhaust their cap in around 30 hours, while a Max 20× user on a 32k-token window will exhaust theirs in about 35 to 39 hours instead of four times as long.
If you compact often, keep context windows small, etc., I'd wager that your Opus 4 consumption would approach the expected 4× multiplier... In reality, I assume the majority of users aren't clearing their context windows and are just letting the auto-compact do its thing.
Visualization: https://codepen.io/Sunsvea/pen/vENyeZe
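The quadratic claim above can be sanity-checked with a toy calculation, assuming attention cost is purely quadratic in context length (a simplification; real inference also has large linear terms):

```python
# Toy model: relative self-attention cost of servicing one request,
# assuming cost scales with the square of the context length.

def attention_cost_ratio(ctx_small: int, ctx_large: int) -> float:
    """How many times more attention compute the larger window needs."""
    return (ctx_large / ctx_small) ** 2

# A 32k-token window vs an 8k-token window:
ratio = attention_cost_ratio(8_000, 32_000)
print(ratio)  # 16.0 -- 4x the context, ~16x the attention compute
```

Under this simplification, a user who habitually works at 4x the context length can burn through a shared token budget an order of magnitude faster, which is consistent with the 20x plan not translating into 4x the Opus hours.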
1- “More throughput” on the API, but stealth caps in the UI. On Jun 19 Anthropic told devs the API now supports higher per-minute throughput and larger batch sizes, touting this as proof the underlying infra is scaling. Yay!? A week later they roll out weekly hard stops on the $100/$200 “Max” plans, affecting up to 5% of all users by their own admission.
Those two signals don’t reconcile. If capacity really went up, why the new choke point? I keep getting this odd visceral reaction/anticipation that each time they announce something good, we are gonna get whacked on an existing use case.
2- Sub-agents encourage 24/7 workflows, then get punished. The sub-agent feature docs literally showcase spawning parallel tasks that run unattended. Now the same behavior is cited as “advanced usage … impacting system capacity.”
You can’t market “let Claude handle everything in the background” and then blame users who do exactly that. You’re holding it wrong?
3- Opaqueness forces rationing. (The other poster comments re: rationing vs. hoarding; I can’t reconcile it being hoarding, since it’s use-it-or-lose-it.)
There’s still no real-time meter inside Claude/Claude Code, only a vague icon that turns red near 50%. Power users end up rationing queries because hitting the weekly wall means up to a seven-day timeout. That’s a dark, dark pattern if I’ve ever seen one, and I’d think not appropriate for developer tooling. (ccusage is a helpful tool that shouldn’t be needed!)
The “you’re holding it wrong” framing seems so bizarre to me when all of the other signaling is about more usage, more use cases, more dependency.
Yeah, the new sub-agents feature (which is great) is effectively unusable with the current rate limits.
One user consumed tens of thousands in model usage on a $200 plan. Though we're developing solutions for these advanced use cases, our new rate limits will ensure a more equitable experience for all users while also preventing policy violations like account sharing and reselling access.
This is why we can’t have nice things.
It's amazing how fast you go from thinking nobody could ever use that much of your service to discovering how many of your users are creatively abusing the service.
Accounts will start using your service 24/7 with their request rate coming in at 95% of your rate limiter setting. They're accessing it from a diverse set of IPs. Depending on the type of service and privacy guarantees, you might not be able to see exactly what they're doing, but it's clearly not the human usage pattern you intended.
At first you think you can absorb the outliers. Then they start multiplying. You suspect batches of accounts are actually other companies load-splitting their workload across several accounts to stay under your rate limits.
Then someone shows a chart of average profit or loss per user, and there's a giant island of these users deep into the loss end of the spectrum consuming dollar amounts approaching the theoretical maximum. So the policy changes. You lose those 'customers' while 90+% of your normal users are unaffected. The rest of the people might experience better performance, lower latencies, or other benefits because the service isn't being bombarded by requests all day long.
Basically every startup with high usage limits goes through this.
clearly that's abusive and should be targeted. but in general idk how else any inference provider can handle this situation.
cursor is fucked because they are a whole layer of premium above the at-cost of anthropic / openai etc. so everyone leaves goes to cc. now anthropic is in the same position but they can't cut any premium off.
you can't practically put a dollar cap on monthly plans because they are self-exposing. if you say $20/mo caps at $500/mo of usage, then that's the same as a $480-off-$500 (96%) discount against raw API calls. that's obviously not sustainable.
there's a real entitled chanting going on too. i get that it sucks to get used to something and have it taken away, but does anyone understand that the capex/opex alone is unsustainable, let alone the R&D to make the models and tools?
I’m not really sure what can be done besides a constant churn of "fuck [whoever had to implement sustainable pricing], i'm going to [next co who wants to subsidize temporarily in exchange for growth]".
i think it's shitty the way it's playing out though. these cos should list these as trial periods and be up front about subsidizing. people can still use and enjoy the model(s) during the trial, and some / most will leave at the end, but at least you don't get the uproar.
maybe it would go a long way to be fully transparent about the capex/opex/R&D spend. nobody is expecting a charity; we understand you need a profit margin. but it turns the entitled "they're just being greedy" chanting into "ok, that makes sense why i need to pay X to have 1+ tireless senior engineers on tap".
Worked great for years, decades even, until crypto miners caught on - and maxed out the usage. Ruined it for the other 99.99% of renters.
> This is why we can’t have nice things.
We're living in the worst world that Stallman could have predicted. One in which even HN agrees that people shouldn't be allowed to share or resell what they pay for.
pointing to the most extreme example as if you can't stop it in its tracks is a bad argument. it's like saying we will now restrict sending email for everyone because this one spammer was sending 1000x the volume of an average or even power user, when you should just be solving the actual problem (identifying and stopping those that disrupt).
All AI companies are hitting the same thing and dealing with the same play - they don't want users to think about cost when they're prompting, so they offer high cost flat fee plans.
The reality is though there will always be a cohort of absolute power users who will push the limits of those flat fee plans to the logical extremes. Startups like Terragon are specifically engineered to help you optimize your plan usage. This causes a cat and mouse game where they have to keep lowering limits as people work around them, which often results in people thinking about price more, not less.
Cursor has adjusted their limits several times, now Anthropic is, others will soon follow as they decide to stop subsidizing the 10% of extreme power users.
Just offer metered plans that let me use the web interface.
The problem is this would reveal how expensive it _actually_ is to serve inference right now at the scale people use it at for productive things.
I have no idea if I’m in the top 5% of users. Top 1% seems sensible to rate limit, but top 5% at most SaaS businesses is the entire daily-active-users pool.
Using the $20 Pro sub, for anything above Hello World project size it's easy to hit the 5-hour window limit in just 2 hours. Most of the tokens are spent on Claude Code's own stupidity and its mistakes quickly snowballing.
1. Set up your dozens of /\.?claude.*\.(json|md)/i dotfiles? 2. Give insanely detailed prompts that took longer to write than the code itself? 3. Turn on auto-accept so that you can only review code in one giant diff, thereby preventing you from halting any bad design or errors during the first shot?
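For what it's worth, that regex really does catch the usual config filenames; a quick check (the candidate filenames are just typical examples, not a definitive list):

```python
import re

# The pattern from the comment, checked against typical Claude Code
# config filenames (case-insensitive, matching the /i flag).
pattern = re.compile(r"\.?claude.*\.(json|md)", re.IGNORECASE)

candidates = ["CLAUDE.md", ".claude.json", "claude_settings.json", "readme.md"]
matches = [name for name in candidates if pattern.fullmatch(name)]
print(matches)  # ['CLAUDE.md', '.claude.json', 'claude_settings.json']
```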
> ...easy to hit the 5 hour window limit in just 2 hours
I've had this experience. Sucks especially when you're working in a monorepo because you have client/server that both need to stay in context.
Probably phrased to sound like little, but as someone used to seeing 99% uptime (or, conversely, 1% down) as bad and affecting lots and lots of users, this feels massive. If you have half a million users (I have no idea, just a ballpark guess), then you're saying this will affect just shy of 25 thousand people, the ones who use your product the most. Oof!
Option 1: You start out bursting requests, and then slow them down gradually, and after a "cool-down period" they can burst again. This way users can still be productive for a short time without churning your servers, then take a break and come back.
Option 2: "Data cap": like mobile providers, a certain number of high requests, and after that you're capped to a very slow rate, unless you pay for more. (this one makes you more money)
Option 3: Infrastructure and network level adaptive limits. You can throttle process priority to de-prioritize certain non-GPU tasks (though I imagine the bulk of your processing is GPU?), and you can apply adaptive QoS rules to throttle network requests for certain streams. Another one might be different pools of servers (assuming you're using k8s or similar), and based on incoming request criteria, schedule the high-usage jobs to slower servers and prioritize faster shorter jobs to the faster servers.
And aside from limits, it's worth spending a day tracing the most taxing requests to find whatever the least efficient code paths are and see if you can squash them with a small code or infra change. It's not unusual for there to be inefficient code that gives you tons of extra headroom once patched.
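Option 1's burst-then-cool-down behavior is essentially a token bucket; a minimal sketch (the capacity and refill numbers are arbitrary placeholders):

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter: allows bursts up to `capacity`,
    then refills at `rate` tokens/second (Option 1's burst + cool-down)."""

    def __init__(self, capacity: float, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity          # start full, so bursts work
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(capacity=5, rate=0.5)  # burst of 5, then 1 request per 2s
print([bucket.allow() for _ in range(6)])   # first 5 pass, the 6th is throttled
```

Option 2 falls out of the same structure by swapping the refill for a fixed monthly allotment, and Option 3 moves the decision down into the scheduler instead of the request path.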
I’m just curious how this decision came about. In most cases, I’ve seen either daily or monthly limits, so the weekly model stood out.
I also have to wonder how much Sub Agents and MCP are adding to the use, sub agents are brand new and won’t even be in that 95% statistic.
At the end of this email there are a lot of unknowns for me (am I in the 5%? will I get cut off? am I about to see my usage increase now that I added a few sub-agents?). That’s not a good place to be as a customer.
Some stuff I’ve used it for in the last day: figuring out what a family member needs for FAFSA as a nontraditional student, help identify and authenticate some rare first editions and incunabula for a museum collection I volunteer at, find a list of social events in my area (based on my preferences) that are coming up in the next week (Chatgpt Agent works surprisingly well for this too), adapting Directus and Medusa to my project’s existing schema and writing up everything I need to migrate, and so on.
Deep research really hits the Claude limits hard, and that’s the best way to avoid hallucinations when asking an important question or making it write complex code. I just switch from Claude to ChatGPT/Gemini until the limits reset, but Claude’s deep research seems to handily beat Gemini (and OpenAI isn’t even in the running). DR queries take much longer (5-10 min on average) but give much more in-depth and accurate answers.
I assume that the people hitting limits are just letting it cycle, but doesn't that just create garbage if you don't keep it on a tight leash? It's very eager but not always intelligent.
I see so many folks claiming crazy hardware rigs and performance numbers, so I have no idea where to begin. Any good starting points on this?
(OK, budget is TBD, but seeing "you get X for $Y" would at least help me make an informed decision.)
- 2x 4070 Ti Super (32 GB total VRAM) - $2200
- 64 GB RAM - $200-250
- Core i9/Ryzen 9 CPU - $450
- 2 TB SSD - $150
- Motherboard, cooler, case, PSU - $500-600
Total - ~$3500-3700, say $4000 with extras.
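For deciding what actually fits on a rig like this, a back-of-envelope VRAM estimate helps; the 20% overhead factor below is a rough rule of thumb for KV cache and activations, not a measured number:

```python
# Back-of-envelope VRAM needed to hold a model's weights: parameter
# count x bytes-per-parameter, plus ~20% overhead (rough assumption).

def vram_gb(params_billion: float, bits_per_param: int, overhead: float = 1.2) -> float:
    bytes_needed = params_billion * 1e9 * bits_per_param / 8
    return round(bytes_needed * overhead / 1e9, 1)

print(vram_gb(70, 4))  # 42.0 -- a 70B model at 4-bit won't fit in 32 GB
print(vram_gb(13, 8))  # 15.6 -- a 13B model at 8-bit fits the rig above
```

So a build in this price range comfortably runs quantized models in the ~13B-30B range, but the larger open models need either heavier quantization, CPU offload, or more VRAM.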
1. Switch models using /model
2. Send your message
3. Switch back to Opus using /model
Help me help you (manage usage) by allowing me to submit something like "let's commit and push our changes to github #sonnet". Tasks like these rarely need opus-level intelligence and it comes up all the time.
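The "#sonnet" routing wished for here doesn't exist in Claude Code today, but a hypothetical wrapper could look like this (the tag syntax and model names are illustrative assumptions, not a real feature):

```python
import re

# Hypothetical sketch of the suggested "#sonnet" routing: strip a trailing
# model hashtag from the prompt and pick the model accordingly.

MODELS = {"sonnet": "claude-sonnet-4", "opus": "claude-opus-4"}

def route(prompt: str, default: str = "opus") -> tuple[str, str]:
    """Return (model, cleaned prompt); fall back to the default model."""
    match = re.search(r"#(sonnet|opus)\s*$", prompt)
    if match:
        return MODELS[match.group(1)], prompt[: match.start()].rstrip()
    return MODELS[default], prompt

print(route("let's commit and push our changes to github #sonnet"))
# ('claude-sonnet-4', "let's commit and push our changes to github")
```

Routing mechanical tasks (commits, renames, formatting) to the cheaper model while reserving Opus for design work is exactly the usage-management lever the comment is asking for.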
The problem is, we have no visibility into how much we’ve actually used or how much quota we have left. So we’ll just get throttled without warning—regularly. And not because we’re truly heavy users, but because that’s the easiest lever to pull.
And I suspect many of us paying $200 a month will be left wondering, “Did I use it too much? Is this my fault?” when in reality, it never was.
That's exactly what they've done? They've even put rough numbers in the email indicating what they consider to be "abusive".
[1]: https://epoch.ai/data-insights/llm-inference-price-trends
I'll keep OpenAI, even though they don't let me use CLIs with it, because they're at least honest about their offerings.
Also, their app doesn't ever tell you to go fuck off if you're Pro.
I'd be pretty surprised if I were to get rate limited, but I do use it a fair amount and really have no feel for where I stand relative to other users. Am I in the top 5%? How should I know?
We don't know what the limits are, what conditions change the limits dynamically, and we cannot monitor our usage towards the limits.
1. 5 hour limit
2. Overall weekly limit
3. Opus weekly limit
4. Monthly limit on number of 5 hour sessions
Seems like some people are account-sharing or scripting/repackaging to such an extent that they were able to "max out" the rate limit windows.
Ultimately - this all gets priced in over time; whether that's in a subscription change or overall rate limit change, etc.
So if you want to simply use it as intended, over time stopping this kind of pattern is better for us?
Charging a low flat fee per use and still warning when certain limits hit is possible. But it's market segmentation not to do it. Just charge a flat fee, then lop off the high-end, and you maximize profit.
Waiting for higher valuations till someone pulls the trigger for acquisition.
IPOs I don't see to be successful because not everyone gets a conman like Elon as their frontman that can consistently inflate the balloon with unrealistic claims for years.
Is this way too complicated? It feels complicated to me and I worked on it, so I presume it is?
I don't want to end up in some "you can work for X number of hours" situation that seems... not useful to engineers?
How do real world devs wanna consume this stuff and pay for it so there is some predictability and it's useful still?
Thank you. :)
Anyway, I've been resigned to this for a while now (see https://x.com/doodlestein/status/1949519979629469930 ) and ready to pay more to support my usage. It was really nice while it lasted. Hopefully, it's not 5x or 10x more.
- It would be nice to have a way to know or infer, percentage-wise, how much capacity a user is currently using (rate of usage) and how much they have left, compared to available capacity. Being scared to use something is different from being mindful.
- Since usage can feel a little subjective/relative (simple things might use more tokens, or fewer) for reasons beyond a user's own behavior, it would be nice to know how much capacity is left both on the current model and a month from now, to learn from.
- If lower capacity-usage rates are available at night vs. during the day, or just at slower times, it might be worth knowing. It would help users who would like to plan around it, compared to people who might just be making the most of it.
Hopefully they sort it out and increase limits soon. Claude Code has been a game-changer for me and has quickly become a staple of my daily workflows.
I can understand setting limits, and I'd like to be aware of them as I'm using the service rather than get hit with a week long rate limit / lockout.
Notice they didn't say 5% of Max users. Or 5% of paid users. To take it to the extreme - if the free:paid:max ratio were 400:20:1 then 5% of users would mean 100% of a tier. I can't tell what they're saying.
This is also exactly why I feel this industry is sitting atop a massive bubble.
you...already were? it already had a variety of limits, they've just added one new one (total weekly use to discourage highly efficient 24/7 use of their discounted subscriptions).
This one thing that bugs me is the visibility of how far through your usage you are. Being told only when you're close to the end means I cannot plan. I'm not expecting an exact %, but a few notices at intervals (e.g. halfway through) would help a lot. Not providing this kinda makes me worry they don't want us to measure. (I don't want to closely measure, but I do want to have a sense of where I am.)
My issue is: a request made during peak usage is treated the same as a request made during low usage times even though I might not be able to get anything useful/helpful out of the LLM during those busy hours.
I've talked with coworkers and friends who say the same.
This isn't a problem with Claude specifically - seems to happen with all the coding assistants.
There are people that will always try to steal, but there may also be those that just don't understand their pricing.
Also, some people keep going forever in the same session, causing it to max out, since the whole history is sent in every request. Some prompting about things like that ("your thread has gotten long...") would probably save quite a bit of usage and prevent innocent users from getting locked out for a week.
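To see why long sessions max out so fast: if the whole history is resent on every request, total token consumption grows quadratically with message count. A toy model, assuming equally sized messages:

```python
# Request i carries all i messages so far, so total tokens over a
# session is m * (1 + 2 + ... + n) = m * n * (n + 1) / 2.

def total_tokens(n_messages: int, tokens_per_message: int) -> int:
    return tokens_per_message * n_messages * (n_messages + 1) // 2

print(total_tokens(10, 500))   # 27500   -- a short session
print(total_tokens(100, 500))  # 2525000 -- 10x the messages, ~100x the tokens
```

That quadratic growth is why starting a fresh session (or compacting) periodically saves so much quota compared to one endless thread.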
If I’m on annual Pro, does it mean these won’t apply to me until my annual plan renews, which is several months away?
Or is that a silly idea, because distillation is unlikely to be stopped by rate limits (i.e., if distillation is a worthwhile tactic, companies that want to distill from Anthropic models will gladly spend a lot more money to do it, use many, many accounts to generate synthetic data, etc.)?
What are the reasonable local alternatives? 128 GB of RAM, reasonably-newish proc, 12 GB of VRAM? I'm okay waiting for my machine to burn away on LLM experiments I'm running, but I don't want to simply stop my work and wake up at 3 AM to start working again.
I think you're just confused about what the Pro plan was, it never included being used for 168 hours/week, and was extremely clear that it was limited.
> What are the reasonable local alternatives? 128 GB of RAM, reasonably-newish proc, 12 GB of VRAM? I'm okay waiting for my machine to burn away on LLM experiments I'm running, but I don't want to simply stop my work and wake up at 3 AM to start working again.
a $10k Mac Studio with 192GB of unified memory and any model you can download still isn't close to Claude Sonnet.
Upshot - I will probably go back to api billing and cancel. For my use cases (once or twice a week coding binges) it’s probably cheaper and definitely less frustrating.
It’s an all-you-can-eat buffet; you’re just not allowed takeout!
The weekly limits are probably there to fix the account-sharing issue. For example, I wanted to ask a friend who uses the most expensive subscription for work if I could borrow the account at night and on weekends. I guess that's the kind of pattern they want to stop.
I switched to Claude Code because of Cursor’s monthly limits.
If I run out of my ability to use Claude Code, I’m going to just switch back to Cursor and stay there. I’m sick of these games.
If you think it’s OK, then make Anthropic dogfood it: put every employee on the Pro plan, keep telling them they must use it for their work but can’t upgrade, and see how they like it.
You having the same issue kills the point of using you.
That said, there's no fucking way I am getting what they claim w/Opus in hours. I may get two to three queries answered w/Opus before it switches to Sonnet in CC.
If you work on some overengineered codebase, it will produce overengineered code; this requires more tokens.
The two models are not just the best models for coding at this point (in areas like UX/UI and instruction following they are unmatched); they come packaged with possibly the best command-line tool today.
They invite developers to use them a lot. Yet for the first time ever, I can feel that I cannot 100% rely on the tool, and I feel a lot of pressure when using it. Not because I don't want to pay, but because the options are either:
> A) Pay $200 and be constantly warned by the system that you are close to hitting your quota (very bad UX)
> B) Pay $$$??? via the API and watch your bill grow past $2k per month (this is me this month via Cursor)
I guess Anthropic has the great dilemma now: should they make the models more efficient to use and lower the prices to increase limits and boost usage OR should they cash in their cash cows while they can?
I am pretty sure no other model comes even close in terms of developer-hours at this point. Gemini would be my second-best guess, but Gemini is still lagging behind Claude, and not that good at agentic workloads.
Frustrated users, who are probably the heaviest users of the tools, will try other code generation tools.
It makes no sense to me that you would tell customers “no”. Make it easy for them to give you more money.
this entire thread is people whinging about the "you get some reasonable use for a flat fee" product having the "reasonable use" level capped a bit at the extreme edge.
We're going to punish the 5% that are using our service too much.
https://openrouter.ai/z-ai/glm-4.5
It's even possible to point Claude Code CLI to it
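A rough sketch of what that repointing looks like. This assumes Claude Code honors the `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` environment variables and that the provider exposes an Anthropic-compatible endpoint; the URL below is illustrative, not verified — check the provider's docs before relying on it.

```shell
# Point Claude Code CLI at a third-party Anthropic-compatible endpoint.
# Both the endpoint URL and the key name are assumptions for illustration.
export ANTHROPIC_BASE_URL="https://example-provider.invalid/anthropic"
export ANTHROPIC_AUTH_TOKEN="your-provider-api-key"

claude   # subsequent requests go to the configured endpoint
```

If the variables aren't picked up, the same settings can usually be placed in the tool's settings file instead.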
Who does that benefit? Does number of accounts beat revenue in their investor reports?
Then again, to scale is human
How about adding a ToS clause to prevent abuse? Wouldn't that be better than a statement that has a negative effect on the other 95%?
i just found ccusage, which is very helpful. i wish i could get it straight from the source, i don't know if i can trust it... according to this i've spent more than my $200 monthly subscription basically daily in token value... 30x the supposed cost
ive been trying to learn how to make ccode use opus for planning and sonnet for execution automatically, if anyone has a good example of this please share
Anthropic seems like they need to boost up their infra as well (glad they called this out), but the insane over-use can only be hurting this.
I just can't cosign on the waves of hate that all hinges on them adding additional limits to stop people from doing things like running up $1k bills on a $100 plan or similar. Can we not agree that that's abuse? If we're harping on the term "unlimited", I get the sentiment, but it's absolutely abuse and getting to the point where you're part of the 5% likely indicates that your usage is abusive. I'm sure some innocent usage will be caught in this, but it's nonsense to get mad at a business for not taking the bath on the chunk of users that are annihilating the service.
Claude is vital to me and I want it to be a sustainable business. I won't hit these limits myself, and I'm saving many times what I would have spent in API costs - easily among the best money I've ever spent.
I'm middle aged, spending significant time on a hobby project which may or may not have commercial goals (undecided). It required long hours even with AI, but with Claude Code I am spending more time with family and in sports. If anyone from Anthropic is reading this, I wanted to say thanks.
> sounds like it affects pretty much everyone who got some value out of the tool
Feels that way.
But compared to paying the so-called API pricing (hello ccusage) Claude Code Max is still a steal. I'm expecting to have to run two CC Max plans from August onwards.
$400/mo here we come. To the moon yo.
Part of the reason there is so much usage is because using claude code is like a slot machine, where SOMETIMES it's right, but most times it needs to rework what it did, which is convenient for them. Plus their pricing is anything but transparent as to how much usage you actually get.
I'll just go back to ChatGPT. This is not worth the headache.
PS. Ah! Of course. Agents ...
Why not use the user's timezone?
Economists are having a field day.
This is the most exciting business fight of our time and I’m chomping popcorn with glee.
I think Anthropic is grossly overestimating the addressable market of a CLI tool, while also falsely believing they have a durable lead right now in their model, which I’m not so sure of. Also their treatment of their partners has been…shall we say…questionable. These are huge missteps at a time they should be instead hitting the gas harder imo.
They’re getting cocky. Would love to see a competitor swoop in and eat their lunch.
It has become a kind of goal to hit it twice a day. It means I've had a productive day and can go on and eat food, touch grass, troll HN, read books.
I'm on Claude Code after hitting Cursor Pro for the month. It makes more sense to subscribe to a bunch of different tools at $20/month than $100/month on one tool that throws overloaded errors. We'll probably get more uptime with the weekly restriction.
I assume this is the end of the viability of the fixed price options.
can someone please find a conservative, sustainable business model and stick with it for a few months please instead of this mvp moving target bs
Say an 8xB200 server costs $500,000, with 3 years depreciation, so ~$166k/year costs for a server. Say 10 people share that server full time per year, so that's going to need ~$16.7k/year/person to break even, so ~$1,389/month subscription to break even at 10 users per server.
If they get it down to 100 users per server (doubt it), then they can break even at ~$139/month.
And all of this is just server costs...
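The comment's arithmetic checks out; here it is as a quick back-of-the-envelope script, using the same assumed figures ($500k server, 3-year straight-line depreciation):

```python
# Break-even subscription math from the comment above (illustrative only;
# the $500k server price and 3-year depreciation are the comment's assumptions).
SERVER_COST = 500_000   # 8xB200 server, assumed
YEARS = 3               # straight-line depreciation
yearly_cost = SERVER_COST / YEARS   # ~$166,667/year per server

def monthly_breakeven(users_per_server: int) -> float:
    """Break-even subscription price per user per month, server cost only."""
    return yearly_cost / users_per_server / 12

print(round(monthly_breakeven(10)))    # ~$1,389/month at 10 users/server
print(round(monthly_breakeven(100)))   # ~$139/month at 100 users/server
```

Note this covers hardware depreciation only; power, networking, staff, and model training are all on top.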
Seems AI coding agents should be a lot more expensive going forward. I'm personally using 3-4 agents in parallel as well..
Still, it's a great problem for Anthropic to have. "Stop using our products so much or we'll raise prices!"
leveling the playing field i see lol