Microsoft starts canceling Claude Code licenses

(theverge.com)

474 points | robertkarl 3d ago | 458 comments

The comments I see recommending selective use of cheaper models don't match the reality I experience working in the industry. I have the constant threat hanging over my head of being fired if I don't churn out code quickly enough. I'm not willing to gamble with my livelihood by using a less effective model.

Saving money on tokens isn't something that's rewarded during performance reviews; particularly because it's difficult to quantify how much you saved versus hypothetically using a more expensive model.

iamflimflam1 2d ago

From reading the article: they offered their developers both Claude Code and Copilot.

What they wanted was for them to use both and give feedback on which was better.

The developers voted with their feet and didn’t use Copilot.

What Microsoft were hoping was that the opposite would happen...

gofreddygo 2d ago

For months, employees had the option to choose Claude Code or Copilot. Now they don't.

Underlying model choice still has no restrictions. Opus 4.6 is by far the most popular. There are still big $$$ bills going Anthropic's way.

comboy 2d ago

Curious if anyone around here stayed on 4.6 (having a choice to use 4.7)

EdwardDiego 2d ago

I went to 4.7, didn't have a choice, found it unsatisfactory, then Claude quietly added in the option to use 4.6, so I'm back on 4.6, and I'm not the only one in my company.

I had far more hallucinations with 4.7 than 4.6.

I'll try it again after a few more months for them to get it right, but 4.6 is what changed my mind on LLMs as a tool, and 4.7 felt like a step backwards. So for now I'm sticking with something that has delivered me value, instead of arguing with an ostensibly better model that was making shit up 1-2 times a day. It was really disappointing.

I can give examples if needed, I screenshotted the most aggravating ones, but what worries me is which ones I didn't recognise.

zmmmmm 2d ago

I have stuck with 4.6. I fully believe 4.7 can be smarter for truly complex and long running agentic use. But I prefer the more direct, literal mechanistic style and 4.6 seems to be peak Opus for that.

fendy3002 2d ago

Stay with 4.6 if you can; it is disabled (afaik) in the VSCode Claude Code extension.

4.7 IMO is around 10-20% worse at understanding the intention of your prompt. You need more effort to explain your intention more clearly so it doesn't diverge.

lifthrasiir 2d ago

4.7 turned out to be a disaster in multilingual settings, so I have stuck with 4.6 so far. 4.7 seems to be optimized for (a very specific slice of) coding at the expense of everything else.

pimeys 2d ago

I still use 4.6 if I need Opus. It's mostly GPT-5.5 for me. Only when I need to be sure it won't do something like push without running the tests (because AGENTS.md said so) do I switch to 4.6.

Although GPT's been acting weird since Thursday...

SequoiaHope 2d ago

I’ve stayed on 4.6. Was thinking of trying 4.7 though just today. Still, I did not jump on it day one.

nijave 2d ago

Switched back when 4.7 had an issue last week and it was wayyy faster. I assume mostly because a lot of people have moved off but might consider using it more just for the speed boost.

willtemperley 2d ago

I don't want to change from 4.6 because I'm finding it so good (I could change).

I've spent the last couple of days building Swift bindings to a monster CPP lib and I've actually had fun.

zuppy 2d ago

i use 4.6 and i've configured advisor to be on 4.7, so when something's more complex the advisor can help. at least that's how i do it with claude code, not sure if the others have implemented the concept of advisors.

vasco 2d ago

Wouldn't they be forced into API pricing instead of per-seat like that though? That would potentially be a massive cost increase. But I've discovered through talking to colleagues some companies are already doing that. I can't understand why you'd ever do that when you can get VC subsidized pricing for now. At least for all initial in-plan usage. I doubt many developers go past the limit anyway and for those you switch just the extra usage to on demand anyway.

bdavbdav 2d ago

Teams is the only one with seat pricing. Teams has a user cap of 150. Enterprise is usage based pricing only now (with a £20/user service charge)

fortran77 2d ago

I use Copilot CLI and I can pick Anthropic models. The Microsoft interface seems fine to me, and equivalent. Not sure what the big deal is.

gwerbin 2d ago

Funny I had the opposite experience. The Claude models seemed equivalent to GPT-5.4/5 in a generic harness like Copilot CLI or Opencode or Pi, but Claude Code the app/harness is so much better than all the others that I switched at work, even though I'd much prefer to use a non-proprietary harness (and eventually I do want to get Pi set up to be comparable).

krzyk 2d ago

The harness makes a difference. Also, in Copilot you get a smaller context for Claude models.

And you get token-based pricing since June 1.

verdverm 2d ago

Anthropic's Claude harness is much better than Copilot, i.e. the tools and instructions in each harness are different. Anthropic's is just that much better (for Claude models, likely owing to some amount of co-development).

Personally, I looked into Copilot's prompt and saw things that made me put it down immediately to start working on my own. I'm now using OpenCode for reasons and I like it better than any Big AI tool. Using OC with Qwen3.6-MoE (for context) and generally happy with the results.

ryanhecht 1d ago

> The developers voted with their feet and didn’t use Copilot.

This was true in January -- since then, the Copilot CLI team has spent countless hours with engineering leaders and the biggest Claude Code users at the company to understand Copilot's shortcomings, define evals to properly test them head-to-head, and close the gap between the products.

The result? Claude Code usage was organically decreasing and Copilot CLI usage was organically increasing -- when this announcement was made, internal Copilot CLI usage had been greater than Claude Code usage for weeks!

verst 2d ago

Most of us never had the option for work to pay for Claude Code -- some internal orgs did this. That being said I had a personal Claude Code subscription for a bit.

Honestly I find GitHub Copilot CLI (and now also the new GitHub Copilot app) quite decent. I mostly use it with Opus 4.7, or rarely with GPT-5.5. The VSCode extension is ok, but CLI or app are the better experience IMO.

RA_Fisher 2d ago

Do people bring their own then (considering work doesn’t pay for it)?

krzyk 2d ago

Our corp specifically prohibits that, because of code leak/training.

cfunderburg 2d ago

I wish I could understand the appeal of using Claude Code inside VScode rather than Copilot. I feel like I'm missing something obvious.

tags2k 2d ago

I'm with you there. I can't stand the CLI that wants to take you away from the mostly bad code it writes. Give me the structure, let me finesse it - to do that I need to actually see it no matter how much Anthropic pretends that it's perfect.

lanstin 2d ago

I run Claude Code inside an emacs vterm for moderately long-lived work streams, and an ever-shifting set of tmuxes for quick small features or bug fixes. The way I ensure I read the code at least a bit is the same as for wholly hand-written code: I never do git add . but only add one file at a time, and I git diff each file just prior to adding it (except sometimes for code-genned files). I also mostly arrange to do incremental dev, sort of agile where I am the client and Claude is the dev team, and I check the utility of each feature one by one, so what I end up with delights me. It does tend to do more than is needed, so I will mostly delete code it has written rather than fix things. Like, really, not every tunable module constant needs to be overridable from env vars. I am happy with the resulting systems; they have not collapsed into unmaintainable messes yet. Claude in a vterm in emacs is nice UX: I can think, run shell commands, and look at code or git history while having a longer-running discussion.
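
The diff-then-add routine described above could be scripted roughly as follows. This is only a sketch of the idea, not the commenter's actual setup: the function names are hypothetical, it assumes the plain `git status --porcelain` (v1) line format of two status characters, a space, then the path, and it does not handle paths that git quotes because of special characters.

```python
import subprocess


def changed_files(porcelain: str) -> list[str]:
    """Extract paths from `git status --porcelain` (v1) output."""
    paths = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        path = line[3:]  # skip the two status characters and the space
        # Renames are reported as "old -> new"; keep the new path.
        if " -> " in path:
            path = path.split(" -> ", 1)[1]
        paths.append(path)
    return paths


def review_and_stage() -> None:
    """Show the diff for each changed file, then stage it only on request."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    ).stdout
    for path in changed_files(status):
        # Untracked files produce an empty diff here; inspect those by hand.
        subprocess.run(["git", "diff", "--", path], check=True)
        if input(f"Stage {path}? [y/N] ").strip().lower() == "y":
            subprocess.run(["git", "add", "--", path], check=True)
```

Calling `review_and_stage()` inside a repository walks the changed files one by one. Git's built-in `git add -p` gives a similar review at the hunk level without any script.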

bakies 2d ago

I just have git diff open in another terminal. Everything I do is in the terminal.

rplnt 2d ago

Slightly related (and something I don't understand): why is Copilot in VS Code essentially just a CLI interface? Why can't it use the IDE tools (search, LSP, ...)? All it ever does is try to execute grep.

jwoq911819h ago

I'm not sure how you can believe that GitHub Copilot in VS Code is just a CLI interface when the former existed long before the CLI. It's not. For a certain amount of time, the two teams weren't even working together. The CLI was adding things that customers then wondered when they would reach VS Code. So, it's not just a CLI harness. They added the ability to call the CLI from VS Code, but GitHub Copilot in VS Code existed before the CLI and remains a separate thing that's just interfacing with the CLI now.

All I can say is I know because I know. There's been some "synergizing" among the corporation about the CLI team running off to do their own thing and adding features to the CLI that amount to trying to force a Terminal to act like a GUI.

AlexMoffat 2d ago

Claude’s prompt heavily pushes it towards grep. We have an internal cross-repo semantic search MCP, and a skill plus prompting was not enough to get Claude to use it consistently. A pre-tool-use hook is the answer. Claude will even write one for you if you describe the problem to it :)
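
A hook of this kind might look roughly like the sketch below. Everything specific here is an assumption to check against the current Claude Code hooks documentation: that the hook receives a JSON event on stdin with `tool_name` and `tool_input` fields, that the built-in search tool is called `Grep` and shell commands arrive via `Bash` with a `command` field, and that exiting with code 2 blocks the call and passes stderr back to the model. The MCP tool name is hypothetical.

```python
import json
import sys

# Hypothetical name of the internal semantic-search MCP tool.
PREFERRED_TOOL = "mcp__semantic_search__query"


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one tool-call event."""
    tool = event.get("tool_name", "")
    command = (event.get("tool_input") or {}).get("command", "")
    words = command.split() if isinstance(command, str) else []
    first_word = words[0] if words else ""
    uses_grep = tool == "Grep" or (tool == "Bash" and first_word in {"grep", "rg"})
    if uses_grep:
        # Assumed convention: exit code 2 blocks the call, stderr goes to the model.
        return 2, f"Use {PREFERRED_TOOL} for cross-repo search instead of grep."
    return 0, ""


def main() -> None:
    """Entry point when installed as a hook: read the event, report, exit."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    sys.exit(code)
```

To use it as a hook script, call `main()` at the bottom of the file. Blocking every grep is deliberately blunt; a gentler variant would only redirect searches that span more than one repository.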

a1o 2d ago

There is an option to turn on semantic indexing and search in Copilot in VSCode, although I notice no perceptible difference when I turn it on. The docs mention something about it.

https://code.visualstudio.com/docs/copilot/reference/workspa...

mattmanser 2d ago

Someone mentioned here the other day that when you try and give Claude those tools through an MCP or skill it tends to go a bit loopy.

At the moment it seems like the way it's been trained has been tightly coupled with grep.

It does feel bizarre though that it doesn't use the symbol servers.

skywhopper 2d ago

Because it’s far far easier to make a text-generation machine generate text that has decades of how-to explanations on the Internet than to correctly work an internal editor API that changes often and isn’t as well-documented.

Especially if you want effective results.

ninjagoo 2d ago

> I wish I could understand the appeal of using Claude Code inside VScode rather than Copilot

MS thinks Copilot is the Clark Griswold of LLMs when it's really Cousin Eddie...

gbro3n 2d ago

Same, with regard to TUIs in general. The VS Code Copilot chat extension has really nice integration for 'human in the loop' style agentic development. I built some tooling (https://www.agentkanban.io) to integrate a taskboard and git worktrees with Copilot chat.

RA_Fisher 2d ago

Claude Code will write the whole thing for you. Whereas doesn’t Copilot require input along the way? I.e., it doesn’t do all the programming for you.

mirekrusin 2d ago

It can code the whole thing for you, copilot in vscode is simply better, people just never tried it.

__mharrison__ 2d ago

If you give Copilot a file with a list of tasks to complete, it will try to churn through them (just like most other harnesses would do these days).

stanac 2d ago

I think they were comparing CLIs, not VS extensions.

mattmanser 2d ago

I'm a little the opposite, what's the point of using an IDE with AI? I genuinely don't get it?

These days I just use Claude Code Desktop or Claude Code in PowerShell. Standalone, not inside an IDE. Honestly, I'm using Desktop more and more as it gets more features.

The IDE is for me. No AI in it at all. If I want to get Claude to do something specific to a file I just @ the file.

vovavili 2d ago

Productivity. You generate the skeleton of the code with Codex/Claude Code/et al. and refactor it manually. It's kind of unlikely that an AI agent will be able to one-shot every bit of code in the exact way you want, even with a fat AGENTS.md file. A smart AI-native IDE like Zed will quickly pick up what manual changes you intend to make without you fully typing anything out, especially if they're repetitive. This helps enormously when you're debugging or profiling your code.

serf 2d ago

the obvious answer is because it's easier, faster, and more efficient to flip a true to false right in front of you than it is to prompt an llm.

if your response is "my prompts don't produce code that needs values flipped, ever." then I would wager you're only touching very simple things with an LLM.

for me I don't care about the token cost and prompt writing so much as the fact that it's just faster to change 0 to 1 and leaves me twiddling my thumbs for an llm output less.

Sharlin 2d ago

That’s like asking why anyone would use IDE autoformatting, linting, or build tools rather than constantly swapping to a terminal to run their command line versions. As in, why use tool integration in an integrated development environment? Because that’s the entire point. Classic IDE refactoring and code generation tools are limited to explicitly programmed operations, but a well-integrated LLM can do much more and smarter manipulations without you having to context switch and explain the context of what you want done.

subscribed 2d ago

> what's the point

Tab completion.

A smart model can dramatically cut down the time to write complex firewall YAML, relying both on the existing file and the ugly draft (eg comma delimited details of the rules I need) I put out. It makes it 5 minutes lead time and 20 presses of tab instead of writing a shell/python script full of edge cases or just copying existing rules as a template and laboriously editing them -- the smart model knows what the specific firewall needs.

But I'm not a developer, so I use both - haiku via github for tab completion and CC for cli.

harimau777 2d ago

For Windsurf at least, it makes it easier to control context. I can simply drag and drop a file from the IDE into the chat.

I can also click on a file referenced by the AI and have it open immediately in the IDE so that I can inspect it.

Finally, it is a pain to write long, multi-line prompts in a CLI where you can't easily click around to edit different parts.

The primary weakness I've found in IDE based UI is that it struggles to get through the corporate security in order to run commands.

fendy3002 2d ago

For me, I need to compare the generated code before committing. I also need to read the generated markdown plans for review before committing to execution. The VSCode CC extension also generates clickable links directly to a file if the query has something to do with it.

All of these are valid use cases of the VSCode CC extension for me.

cameronh90 2d ago

Microsoft have historically tended to dogfood their own products.

Obviously you want to be aware of what else is on the market, and use the right tool for the job -- but equally if you have a directly competing product, you'd prefer your org's telemetry and suggestions are directed towards improving your own software rather than your competitors'.

Anon1096 2d ago

This was always a little weird to me because Microsoft internally is actively hostile to cross-org collaboration. If you worked in most of Azure you basically had 0 lanes of communication with someone from the Windows team and vice versa. Triply so for stuff like Kusto or Teams which you'd be dogfooding daily. I guess if there's a horrible stop-the-world bug it'd get surfaced through telemetry, but normal user feedback is not a thing.

Compared to working at other big techs, where I was able to direct msg the engineers on the team for internal protobuf or datalake services, in addition to user groups that were generally responsive, it was just strange. Also, Microsoft doesn't have a monorepo, so you can't just commit patches to another team's service because you don't have access to their repos, which I pretty regularly do elsewhere.

ryanhecht 1d ago

> Microsoft internally is actively hostile to cross-org collaboration

The Copilot CLI has ushered in the beginning of a change in this dogma -- I've helped dozens of Microsoft engineers get access to GitHub source code so they can contribute to Copilot CLI! It's fun to subvert expectations when a Microsoft IC pitches an improvement and I can respond with "submit a PR!"

Quothling 2d ago

Maybe it's just Microsoft moving to more model-agnostic tech within their Copilot. I recently started using Microsoft 365 Copilot because corporate added Cowork, which runs on Opus 4.7 and was better than the alternative we have available. Unlike the "real" Claude Code or Cowork, this only has access to files in a specific OneDrive folder in your personal SharePoint container, so it's much more compliant with things like NIS2.

Technically we're using Copilot and we're paying for it through Microsoft licenses, but it's using Opus 4.7. Even before this, most of our custom agents within M365 Copilot were one of the GPT models.

Or maybe you're right and they want their developers to use the copilot models.

pwarner 1d ago

Copilot Cowork seems to be the best part of M365 Copilot by a huge margin.

Quothling 1d ago

I really dislike that I can't customize it with permanent config files, similar to how I can configure a regular GPT model agent. I guess it's probably because it's in the fancy word they use for "beta".

I haven't really used any other Copilot product in a while since they were so bad compared to our other corporate options, but I'm rather impressed with Cowork inside it. Exactly because we can actually use it without breaking any EU laws.

__mharrison__ 2d ago

Copilot was great when folks were semi-attempting to write their own code and needed autocomplete.

There's a large (and growing!) contingent of people who don't write code these days. (Many don't even use the keyboard.)

Insanity 2d ago

Wonder if Amazon will do the same with CC and Kiro now that we internally have access to both.

I think Kiro might have some “first mover” advantage internally, but CC feels better to use.

fg137 2d ago

I never understand why Amazon even bothers to build their own coding agent.

GitHub Copilot is in a somewhat similar place as Microsoft's toy but still different -- it was more or less the first coding agent/assistant, and GitHub/VSCode/Microsoft has enough user base and impact to influence individual users and enterprises' choices.

For Amazon's coding agent -- I just never see anyone outside Amazon even mention Kiro or Amazon Q. Maybe a little bit when Kiro was offering tons of free credits. But I don't think it's even remotely relevant these days. I don't see news about companies adopting Kiro.

To me, it's just a matter of time before they are sunset, like Chime or a bunch of AWS products.

Insanity 2d ago

In fairness, Chime had tons of internal use and I quite liked it.

For Kiro, I agree with you, it seems like wasted effort and Anthropic / OpenAI are miles ahead in their tooling.

cameronh90 2d ago

Is there any proprietary Amazon end-dev/ops facing service that's worth using? I've never had a good experience with any I've tried - CodeBuild, Cloud9, Q, SageMaker, WorkMail, WorkDocs, Chime, OpsWorks,...

I love AWS at the infrastructure level, but their PaaS tends to be meh, and their end-user directed stuff is usually atrocious.

tra3 3d ago

There's definitely a way to use Claude code that is token conscious.

I've tried throwing unsupervised agentic software factory workflows against the wall, and they burned through my tokens like nobody's business but didn't produce much.

Supervised, human-in-the-loop process on the other hand is much more productive but doesn't consume nearly as much. Maybe that's why everyone's pushing agentic approaches so much.

CoolestBeans 3d ago

The current thinking is that automated agents are what turn this from an industry in the tens of billions into a multi-trillion-dollar one. So yes, you are right on the money: agents stimulate demand for this thing they've built.

dualvariable 2d ago

"The bureaucracy is expanding to meet the needs of the expanding bureaucracy"

visarga 2d ago

AI is expanding to meet the needs of expanding AI. Why worry about jobs? AI will provide plenty of work. If anything, I worry we'll be working more, not less. All that AI will need someone to vouch for it and to scapegoat when it makes mistakes.

beardyw 2d ago

I didn't know that one. Loosely said to be Oscar Wilde.

matheusmoreira 2d ago

Yeah. Claude does good work but reviewing it all properly takes quite a bit of time. It got to the point I started having trouble maxing out my weekly allocation.

Dealt with that by going all out and making an agentic parallel code review skill. Basically an infinite TODO list generator. Now I'm definitely getting 100% of the usage I paid for. It really burns tokens like nobody's business, and catches a lot of issues while at it. I've been looping this review/fix process every week. It's dramatically reduced the amount of stuff I need to pay attention to during my human review sessions.

kixxauth 2d ago

I really don't like how the payment plans work with the providers right now. I feel this pressure to use all my tokens for the week, often just "wasting" them. But also, I want to take advantage of the subsidized tokens in Claude Code and Codex for as long as I can.

There is this real danger that our thinking, and the things we make, become bloated without constraints.

IMO software has gone to shit since both mobile phones and laptops mostly have massive amounts of compute. We always seem to use it to the limit, just because it's there.

matheusmoreira 2d ago

It's the gacha of software development. We've got periodically resetting timers. Prompting is like booster packs: we have a finite number of dice rolls before the timer resets. We might even get a Legendary Ultra Rare pull whenever Claude happens to be feeling extra motivated. Before you even know what's happening, it's hijacked the brain's reward circuits to the point you're waking up at 3 AM because that's when the timer resets. Gotta saturate those timers with pulls and minmax everything in sight.

At least it's doing something productive instead of just sinking money into literal gambling simulators. Mercifully, unlike video games, automation is not "cheating".

krzyk 2d ago

Any corp (> 150 seats) has to use API pricing, so e.g. I don't have pressure or weekly limits, just a set budget I can use each month.

jdsnape 2d ago

I’m interested in how this works in practise - I guess you’ve written a skill to do code review, then your Claude.md file tells it to use it after every change as a bg task? So does this work as a background task while Claude is working on the next ‘feature’?

matheusmoreira 2d ago

I just committed the skill to my dotfiles repository.

https://github.com/matheusmoreira/.files/tree/master/~/.clau...

There are many "critics", one for each quality I want reviewed. Correctness, consistency, maintainability, security, testing... Everything I could think of, and I keep adding more.

https://github.com/matheusmoreira/.files/tree/master/~/.clau...

The scrutinize skill is the entry point. The Opus I'm talking to becomes an agent coordinator. He explores and autodiscovers the project's structure, subdivides it into logical sections.

Then he runs a truly absurd critic x section matrix against the entire project. Literally hundreds of these agents running in parallel, each focusing on one area. Ten minutes of this is enough to exhaust my Max 5x five hour window and put a serious dent in the weekly usage numbers.

It literally takes days to run a full agent sweep. I designed it around the rate limiting. The agents do file system style journaling in order to resume cleanly. They commit all of their findings as they go into an orphan branch in the repository. Further review runs can build on it and avoid searching for known issues.
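
The resume-from-journal idea can be sketched in a few lines. This is an illustration under stated assumptions, not the linked skill: the real thing is prompt-driven and commits findings to an orphan branch, whereas this stand-in uses a local JSON-lines file, and all function names, the `run_critic` callback, and the `budget` parameter are hypothetical.

```python
import itertools
import json
from pathlib import Path


def pending_jobs(critics, sections, journal: Path):
    """Yield (critic, section) pairs that the journal has not recorded yet."""
    done = set()
    if journal.exists():
        for line in journal.read_text().splitlines():
            if line.strip():
                entry = json.loads(line)
                done.add((entry["critic"], entry["section"]))
    for pair in itertools.product(critics, sections):
        if pair not in done:
            yield pair


def record(journal: Path, critic: str, section: str, findings: list) -> None:
    """Append one finished job to the journal, one JSON object per line."""
    with journal.open("a") as f:
        entry = {"critic": critic, "section": section, "findings": findings}
        f.write(json.dumps(entry) + "\n")


def sweep(critics, sections, journal: Path, run_critic, budget: int) -> int:
    """Run at most `budget` pending jobs and return how many completed."""
    completed = 0
    for critic, section in list(pending_jobs(critics, sections, journal)):
        if completed >= budget:
            break  # stand-in for hitting the rate limit
        record(journal, critic, section, run_critic(critic, section))
        completed += 1
    return completed
```

Because each job is journaled as soon as it finishes, a later call to `sweep` with the same journal picks up where the previous window ran out instead of repeating work.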

The way it works in practice is I just run /scrutinize sweep and then go work on something else, or just go do my actual job, live my life, play video games, write an article for my blog or something. Come back five hours later to either resume the process or check the literally hundreds of issues that have been found by all the agents. Then Claude and myself will go in and evaluate and fix all of those issues one by one. Then review again. Then evaluate/fix again. I'm just gonna keep looping this over and over until zero issues are found. For all of my projects.

Going from solo hobbyist programmer to this was pretty insane. I can only imagine what these corporations with infinite money must be doing.

visarga 2d ago

I did the same thing - task oriented work, each task a md file. I have a harness based on it: https://github.com/horiacristescu/claude-playbook-plugin

jstummbillig 2d ago

I think it's great. People at a broad scale are getting first hand experience with resource management. It's a fairly cheap way of doing it too (in contrast to: learning this by managing humans) and we can all benefit from the skill transfer.

visarga 2d ago

I find myself observing how my lead manages meetings ... "ah, this is like when I do that with Claude", "this is where he wants to understand what happened, like when I ask Claude" ... it's funny.

SubiculumCode 3d ago

At the enterprise level, though, it's going to be hard to want to use a service in which costs are not predictable, and keeping those costs under control requires employee training.

mrgoldenbrown 2d ago

>...use a service in which costs are not predictable, and keeping those costs under control requires employee training.

Isn't this a (mildly exaggerated) description of AWS, which is a very successful service?

noodletheworld 2d ago

Mmm… but for AWS it's pay for external use, right?

So your costs scale with the number of users you have.

That's an opex that you can explain.

For tokens for developers it's maybe closer, cost/outcome wise, to hiring an external consulting company to write your code: money paid scales with work done, no promise of delivery, arbitrary unpredictable external price changes.

It's not quite the same, though similarly lucrative for consultants.

sidewndr46 2d ago

Am I losing my mind, aren't there multiple headlines each day about companies penalizing employees for not using AI enough?

iSnow 2d ago

That was roughly 3 weeks ago. With the repricing of Claude 4.7 and GPT 5.5, things have become more spicy.

basch 2d ago

since those headlines started ive felt it just encouraged inefficiency. "say as much as you can without saying anything." if you were accomplishing your task the need for more would end, thus there is incentive to never succeed.

jochem9 2d ago

You can put a limit on token spend and provide training (and even pre-configured workflows) on how to limit token spend.

Like the other commenter said: cloud spend can also spin out of control if you don't pay attention, yet we've found ways to keep it under control (training, guardrails, limits, transparency).

harimau777 2d ago

The problem that I see is what you do if someone runs out of tokens. It doesn't work very well to say "well I guess you just get fired because you can't work at full speed for the rest of the month".

Personally, this feels like it's just trying to push the managers' work of allocating resources onto developers, so that they have more work to do and can be blamed if anything goes wrong.

layer8 2d ago

To be fair, the cost of software development has always been fairly unpredictable. What may be different is that the cost used to be roughly proportional to man-hours spent, while now the number of agents running in parallel may be less predictable.

xienze 2d ago

> To be fair, the cost of software development has always been fairly unpredictable.

Yes, but in a "oops this is gonna take another two months to finish" kind of way, not the "oops this is the 12th time this month 8 developers have burned $2K in tokens in a single day and no one really knows how it happened" kind of way.

ilovecake1984 2d ago

The cost per month is 100% known and always has been. What has been variable is the rate of delivery. AI spend is different: it varies, and it can be substantial relative to salaries in countries with lower wages.

salawat 3d ago

There's no fucking training to mitigate a slot machine.

serf 2d ago

that analogy is so boring now with so many real world examples of actual LLM work.

people still can't get over the unreasonable effectiveness of algorithms.

LPisGood 2d ago

There’s actually been a ton of research on how to optimize “slot machines,” at least in a generalized sense. For more reading, check out the literature on multi armed bandits.
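
As a taste of that literature, here is a minimal epsilon-greedy sketch, one of the simplest bandit strategies. The `pull` callback and its `(arm, rng)` signature are assumptions made for illustration; in this thread's terms an arm might be a candidate model and the reward some measure of task success per unit of cost.

```python
import random


def epsilon_greedy(pull, n_arms: int, rounds: int, epsilon: float = 0.1, seed: int = 0):
    """Play `rounds` pulls, exploring a random arm with probability `epsilon`.

    `pull(arm, rng)` must return a numeric reward for the chosen arm.
    Returns (counts, means): pulls per arm and the running mean reward per arm.
    """
    rng = random.Random(seed)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore
        else:
            arm = max(range(n_arms), key=lambda a: means[a])  # exploit best estimate
        reward = pull(arm, rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean update
    return counts, means
```

For example, `epsilon_greedy(lambda arm, rng: float(rng.random() < [0.2, 0.5, 0.8][arm]), 3, 5000)` simulates three arms with different win rates. Fixed-epsilon exploration never stops sampling poor arms; strategies such as UCB or Thompson sampling address that.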

__mharrison__ 2d ago

Odd, I train teams (at large companies) to use harnesses effectively. So some training does exist.

I get the anti/skeptic sentiment. I've been called a lot of horrible things by a vocal contingent when they hear that I help train folks to learn software engineering best practices and then apply AI to that.

dgellow 2d ago

Games like Diablo are basically a whole bunch of slot machines, and there are strategies you can follow to optimize your run.

subscribed 2d ago

LOL, that's a sophisticated and sometimes slightly unpredictable multitool.

If this is the "analogy" you go for, you don't seem to be suited to make that comparison.

KronisLV 2d ago

> There's definitely a way to use Claude code that is token conscious.

Colleague used Sonnet 4.6 on some pretty normal agentic coding tasks through AWS Bedrock to keep the data in the EU: 100 EUR usage in a single day. In comparison, the Mistral subscription costs about 20 EUR per month; we tested it on similar tasks and it was okay, with usage reaching around 10% of the monthly limit in a single day. Or take Anthropic's own Max (5x) plan, where you get way, way more tokens to do with as you please.

I feel like the sweet spot is having a monthly subscription with any of the providers (you're subsidized a bunch), but if you have to pay per tokens, now I'd just look in the direction of what tasks DeepSeek would be okay for, sadly probably not in the situation above. For a startup, though...

On the other hand, this feels a bit hypocritical:

> It was part of an effort to get project managers, designers, and other employees to experiment with coding for the first time, and sources tell me that Claude Code has proved very popular inside Microsoft over the past six months.

They're gonna say that the future is all AI... until they get the bill.

phillc73 2d ago

I was a Mistral Le Chat Pro subscriber (the €20/month plan). Yesterday I hit my monthly limit. Switching to PAYG I burned through another €40 in one evening, working on the same project, with the same tasks.

I upgraded my plan last night to Mistral Le Chat Teams. This now costs me €60 per month for two users. Limits have been reset, but I have no idea now if my per seat limit is higher than the Pro plan, or if the limit is shared between the seats, it’s really not clear. I guess I will find out next month. The limits reset on the first of the month and I really hope I don’t hit them in the next seven days.

I use Mistral Vibe CLI and I’ve written and implemented a couple of new skills[1]. Caveman, based on an idea I found online somewhere, this skill removes all extraneous response text, including articles. Makes for some fun reading, but supposedly reduces output tokens significantly. Hash-anchors, this one is based on a concept from Dirac[2], reduces search failures and also includes multi-file dispatch. It will be hard to measure, but Vibe tells me these two should result in roughly a 40% reduction in token burn.

[1] https://codeberg.org/MimosaDev/skills

[2] https://dirac.run/

michaelbuckbee2d ago

I was trying to get a better sense of the time/cost/quality matrix of these, so I threw together a quick eval of Sonnet 4.6, Mistral's dev model, and Opus 4.7 (figuring it's what you'd use if you were on Max).

The results for implementing and testing a Levenshtein distance function in JS are pretty similar, but Mistral is 30x cheaper than Opus 4.7 and 4x faster than Sonnet 4.6.

https://5m6qnuhyde.evvl.io/
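For anyone who wants a feel for how small the task in that eval is, here is a minimal sketch of Levenshtein distance. It is written in Python rather than the JS used in the eval, and it is only an illustration of the standard two-row dynamic program, not the code any of the models produced. The function name `levenshtein` is just a label chosen here.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible

    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            substitution_cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,                      # delete ca
                current[j - 1] + 1,                   # insert cb
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

The whole thing fits in about twenty lines, which is worth keeping in mind when reading cost and speed numbers measured on it.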

kaoD2d ago

But that's not very informative.

Levenshtein distance is not only a well-understood problem, it's small, self-contained, and extremely well-represented in the training data. It's the kind of problem where even small/bad models can excel. The gold standard for those tasks is just "use a library", so no wonder the beefy models are expensive: you're chartering a commercial airplane to go grocery shopping.

My personal benchmarks are software engineering tasks (ideally spanning multiple packages in a monorepo) composed of many small decisions that, compounded, make or break the implementation and long-term maintainability.

That's where even frontier models struggle, which makes comparisons meaningful.

KronisLV2d ago

The one detail I did forget to mention is that if anyone goes with the Mistral subscription (instead of paying per-token), then the Mistral Vibe tool gives you their Medium 3.5 model by default, with a 200k token context. It will probably be enough for plenty of tasks, though there's also a noticeable difference between that and up to 1M.

dgellow2d ago

> They're gonna say that the future is all AI... until they get the bill.

I mean, they will continue to say so, they just want to be the ones being paid for the service, not Anthropic :)

tracker13d ago

My experience as well... I've only hit Anthropic's 5hr threshold a few times, and two of them were within a half hour of the window. Also, all three times I'd already accomplished a LOT.

I tend to work with the agent, and observe what's going on as well as review/test and work through results/changes. I spend a lot more time planning tasks/features than the execution, even using the agent as part of planning and pre-documentation. It works really well. I don't think people burning through the 5hr allotment in under an hour are actually reviewing/QC/QA the results of what they're doing in any meaningful way, and likely producing as much garbage as good (slop).

I'm really curious as to HOW the MS employees were using the agents as much as what they were doing.

kristjansson3d ago

I suspect subscription limits are quite a bit higher than the equivalent tokens their dollar cost could purchase. I similarly feel like I can get a lot done with a $20/mo Claude Pro subscription, but can also easily spend $10-20/day at API pricing with similar usage.

brookst2d ago

Yep. I get $6k - $8k worth of tokens (at API rates) using the $200 Max subscription.

lawn2d ago

I don't understand why people are using the API pricing instead of the Pro/Max subscriptions? What am I missing?

skeledrew2d ago

Can verify that I've gotten about $400 worth of tokens from my $20 sub.
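To put the figures quoted in this subthread side by side, here is a tiny sketch of the implied multiple (API-equivalent value divided by subscription price). The function name `value_multiple` is made up for illustration, and the inputs are simply the numbers the commenters reported, not measurements of mine.

```python
def value_multiple(api_equivalent_usd: float, subscription_usd: float) -> float:
    """How many dollars of API-priced tokens one subscription dollar bought."""
    if subscription_usd <= 0:
        raise ValueError("subscription price must be positive")
    return api_equivalent_usd / subscription_usd


if __name__ == "__main__":
    # Reported above: $6k to $8k of tokens on the $200 plan, ~$400 on the $20 plan.
    print(value_multiple(6_000, 200))  # 30.0
    print(value_multiple(8_000, 200))  # 40.0
    print(value_multiple(400, 20))     # 20.0
```

If those self-reported numbers are representative, heavy subscribers are getting somewhere between 20x and 40x what the same spend would buy per token.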

brookst2d ago

I get 98.6% cache hits on Claude code. Short of drastic arch changes it’s hard to imagine it getting much better.

gobdovan2d ago

A 98.6% cache hit rate doesn't distinguish an efficient workflow from an overly chatty linear agent repeatedly reusing the same context. Plus, it says nothing directly about whether the process makes useful progress per token.

kridsdale12d ago

We are all going to be graded by (tickets closed / tokens burned) soon enough.

hedgehog2d ago

You pay for cache hits on every turn, and even with the newest architectures longer context is slower and more energy intensive. Constructing concise turns that reuse the prefix and stopping when the new context is no longer useful both help, as does pushing generation down into cheaper models while using stronger models for verification.
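A rough way to see why a high cache hit rate and a cheap session are different things is a toy cost model like the one below. Everything in it is an assumption for illustration: the function name `session_cost`, the idea that the entire prior context is re-read from cache on every turn, and especially the per-million-token prices, which are hypothetical placeholders rather than any provider's real rates. It also ignores cache-write surcharges and cache expiry.

```python
def session_cost(turns, new_tokens_per_turn, output_tokens_per_turn,
                 input_price, cache_read_price, output_price):
    """Toy model of one linear agent session.

    Prices are USD per million tokens (hypothetical values supplied by the caller).
    Returns (total_cost_usd, cache_hit_rate).
    """
    context = 0          # tokens accumulated so far in the conversation
    cached_read = 0      # tokens billed at the cache-read rate
    fresh_input = 0      # tokens billed at the full input rate
    output = 0           # tokens billed at the output rate
    for _ in range(turns):
        cached_read += context              # prior context is re-read every turn
        fresh_input += new_tokens_per_turn
        output += output_tokens_per_turn
        context += new_tokens_per_turn + output_tokens_per_turn
    cost = (cached_read * cache_read_price
            + fresh_input * input_price
            + output * output_price) / 1_000_000
    total_input = cached_read + fresh_input
    hit_rate = cached_read / total_input if total_input else 0.0
    return cost, hit_rate


if __name__ == "__main__":
    prices = dict(input_price=3.0, cache_read_price=0.3, output_price=15.0)  # assumed
    long_cost, long_hits = session_cost(100, 500, 500, **prices)
    short_cost, short_hits = session_cost(10, 500, 500, **prices)
    print(f"one 100-turn session: ${long_cost:.3f}, hit rate {long_hits:.2%}")
    print(f"ten 10-turn sessions: ${10 * short_cost:.3f}, hit rate {short_hits:.2%}")
```

Under these made-up numbers the single long session shows a 99% hit rate and costs about $2.4, while the same 100 turns split into ten short sessions show a 90% hit rate and cost about $1.0. The cached reads grow roughly quadratically with turn count, so the "better" hit rate belongs to the more expensive workflow.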

blitzar2d ago

> There's definitely a way to use Claude code that is token conscious.

By buying a subscription and dealing with the limits. Using Claude Code and paying per token seems like the fast lane to the poor house.

nurettin2d ago

---- Before it was:

Me: We need to do this this that.

Claude: <random stuff that approximates human output>

Me: Are you sure?

Claude: Well actually there is a bug <more random stuff that looks right this time>

----- Now it is:

Me: We need to do this this that.

Claude: <random stuff that approximates human output>

Claude: Let me consult the advisor on that.

Claude: advisor came up with some advice, adjusting according to that. <more random stuff that looks right this time>

thegreatpeter2d ago

yeah, by using codex

relevant_stats2d ago

So, snippet from the article says the following:

> I understand that Microsoft is planning to remove most of its Claude Code licenses and push many of its developers to use Copilot CLI instead. While Claude Code has been a popular addition, it has also undermined Microsoft’s new GitHub Copilot CLI coding tool — a command line version of GitHub Copilot that runs outside of development apps like Visual Studio Code.

And people here are interpreting this as mainly related to Claude burning too many tokens too quickly, and suggesting Microsoft should rather use SomeOtherLLM©?

Is this Hacker News or rather Marketing Wars?

s_dev2d ago

So "Microsoft chooses to eat its own dogfood" is a more accurate title?

ninjagoo2d ago

> Is this Hacker News or rather Marketing Wars?

No public forum is naturally immune to the spread of (guerilla) marketing. [1]

[1] Internet Rule #48

johnnypangs2d ago

I don’t think people read the article, I didn’t until I saw your comment. The article feels like clickbait tbh.

righthand2d ago

It's a forum called Hacker News that's been hacked and covertly refactored into Marketing Wars. Its primary goal is to foster a space to draw in (marketing) projects/start-ups.

RobRivera2d ago

Why not both?

This message is from Carlos's son

relevant_stats2d ago

Uh, what?

proxysna3d ago

Feels about right.

I launched an internal demo of Claude Code and DeepSeek on the same day, and we burned through our monthly allowance for Claude in just over a week, with more than half of that budget being spent in one day. With DS, people are unable to go through that same amount of money in a month, not even close.

As a result, Claude feels like an expensive toy, while DS is a shovel, purely because developers do not feel like they are eating into a precious resource while using it. It also does not feel like there is much of a difference in capability between Claude and DS-pro. DS-pro and flash do feel like Sonnet/Opus and Haiku, but flash is still very, very capable.

onlyrealcuzzo2d ago

I rage canceled Claude today.

After 2 weeks of Claude getting progressively worse and worse, today was the final straw.

I don't care if they have a phone app. The model is COMPLETE garbage after you subscribe long enough and they think they've "got you".

I can't code on my phone if the model literally moves in the wrong direction and does the opposite of what I tell it to. If I wanted to make my code worse, I'd just randomly commit garbage. I don't need a mobile app for that.

couchdb_ouchdb2d ago

I've seen a lot of this sentiment over the previous six months from people on reddit. I have yet to experience this myself as a developer with over 20 years of experience.

fendy30022d ago

As always, I think this happens more to vibe coders. They don't understand that a bigger project means worse AI performance. On top of that, Opus feels like it has been nerfed at understanding prompts, so if your spec is bad you won't get a good result.

colechristensen2d ago

What it does seem like is that they're tuning some knobs up and down or releasing new versions of models or system prompts that result in the model getting dumber and smarter in waves.

Opus has been dumb this week.

Claude was having a lot of capacity problems and downtime and then this week that has been much less obvious... and the model is dumber.

It could also just be luck and my impressions are false... who knows.

Our_Benefactors2d ago

It’s because it’s not true, there’s no evidence for it that passes the sniff test. No lab is “shipping a worse model once they’ve got you”. People have a bad few days and blame the model providers instead of stepping back to fix their workflow.

dgellow2d ago

Opus 4.7 has been a real downgrade for me. I’m back to mid 2025 when I had to catch all the completely intermediary goals/assumptions the model is creating for itself

johnfink82d ago

I see a lot of the "4.7 is a downgrade" sentiment. 4.7 does (mostly) what you ask it to do. 4.6 does what it thinks it should do. As someone with 20 years writing my own code I want the former, but the loud contingent online wants the latter.

When you're on a mature codebase with 500k+ lines of code, I haven't seen anything else be as effective as 4.7.

solenoid09372d ago

It's the same phenomenon as when you learn a new vocabulary word you see it everywhere.

People heard "Claude is nerfed" and now they see it everywhere, they notice failures a lot more than they would have otherwise.

Doesn't matter that Claude is not, in fact, nerfed. Perception is powerful and most humans are not rational.

mmusc2d ago

All these tools are close to feature parity. The GitHub CLI allows remote sessions and can run Anthropic models anyway.

shomp2d ago

When you say "code on your phone" ... you don't mean what I think you mean do you? Like, are you actually using your phone to make code commits?

onlyrealcuzzo2d ago

Yes, you can do that with Claude Code.

Tell it what to do.

Commit, push to origin, review on GitHub.

Tell it to make changes, amend the commit, push --force-with-lease.

I'm attempting to make a memory safe language like Rust but with a substantially lower learning curve and added safety (but non-zero cost abstractions) fully with AI, almost entirely from my phone, commuting, getting coffee, walking the dog, between sets at the gym, replacing doom scrolling before bed and during lunch, etc.

Mostly to test how much LLMs can actually scale development.

Depending on how long it takes them to clean up some architectural slop in the MIR lowering phase, the results could either be very impressive or not.

From a purely cost basis perspective, it's hard to argue they aren't killing it.

But from a multiplier perspective, it's up in the air how great they are.

It's proven to be a really nice experiment, because much of what I wanted to solve with a language is the problems inherent to LLM development.

So at the self hosting phase, I get a great opportunity to see if the language can actually deliver on what I dream for.

kridsdale12d ago

Considered Gemini?

operatingthetan2d ago

Gemini got a big reduction in usage limits this week. There was backlash and they added 3x usage for Antigravity a day later but I haven't really tried it out to get a feel for it yet.

seabrookmx2d ago

Google rug pulled Code Assist and Gemini CLI. They're moving everything to Antigravity and we would need to reinstall all our tooling, reconfigure any automations, and the mechanism to subscribe via GCP is much clunkier.

This was all supposed to be worked out prior to Cloud Next, but it wasn't. Ironically, they mentioned Claude in a few of their presentations at next.

And that was our solution. We are a big GCP customer but our whole team is on Claude now and much happier.

saulpw2d ago

Google has burnt all of its goodwill in dev communities so no, I don't think Gemini is worth consideration.

Akamant1d ago

It seems like new holy wars will rage between two camps. On one side are those who can afford models like Opus 4.7 or GPT 5.5 Pro, which in my opinion are unlikely to have been used for anything serious, as they're simply several times more expensive than any human effort. There's a speed advantage (still questionable), and quality isn't guaranteed, but the price completely kills everything. On the other side are those who actually write the code and calculate the development costs. I won't talk about large corporations that can afford it, much less those who develop models. "Mere mortals" simply don't have such opportunities.

17 Go microservices for a serious project were written perfectly using the latest version of Qwen. The only areas where we really had to work hard were documentation and a very serious task breakdown. All of this was tested, and yes, a review was required, but everything was within reason. The deadline was 10 days of 24/7 work, including the review. When we attempted to submit the same task to Opus 4.7/4.6, it had to be stopped after three hours. If you have significant resources for experimentation, you can certainly try. For us, the choice is absolutely clear at this point.

pratikel1d ago

I’m curious to learn of your problem where you needed 17 Golang microservices (assuming these are newly created).

zkmon3d ago

My experience is that Claude Code burns way more tokens than other agents, probably to ensure high levels of perceived quality, which is, most of the time, not worth the bloat for the user. The bloat works for Anthropic as an advertisement at the cost of your tokens.

andrekandre2d ago

it's kind of weird tho, Jensen also said we should be burning tons of tokens as well... 'perceived quality' can't be the only reason these CEOs are pushing token usage so hard, can it?

verdverm2d ago

reasons for token usage beyond expectations

1. right now, usage correlates with experimentation and learning; few, if any, know how to make these things effective on their own over long sessions of activity

2. long term, you should be using more than one agent at a time, because they are running in the background based on events (new direct message / something happened in eg. github)

rnxrx2d ago

This does kind of beg the question: If developers are being laid off because AI is better/faster/cheaper or makes all their people 10x or whatever fig leaf, what happens if the required tooling ends up being more expensive? From the investor’s point of view, is the drag of employee costs better or worse than a ballooning expense item?

andrewl-hn2d ago

They lay people off and look good in front of investors. Then they hire people, talk about "growth", and once again look good in front of investors.

This would never fly if the stock market were rational. But it never is.

dawnerd2d ago

And if/when companies need to scale back their ai investments they can spin it too and the stock market will eat it up.

marcosdumay2d ago

They are just being AI efficient, and doing more while spending less on it :)

I wonder if this will happen before they have some obligatory debloating of the investors' exposure to the company.

dividedbyzero2d ago

I suppose if it all works out it'll end up way more expensive than the employees the models displaced ever were. These kinds of technologies usually end up as an oligopoly at best, and those players will have a wide moat by then, and the things these models build will be tweaked such that no other model or human being can realistically work on them anymore, and then they can price gouge everyone to the brink of unprofitability.

kridsdale12d ago

At least the models don’t need health insurance, office space, a cafeteria, or have a threat of unionizing.

dividedbyzero2d ago

The model provider would be like a union, at least if unions had absolute control over their members, could take them all away at any time forever with no substantial negative consequences to itself, and spend billions on employer lock-in so switching to the competition is worse than paying the 12% model salary raise.

thewebguyd2d ago

Shh, that's the quiet part the investors don't want to say out loud.

Applejinx2d ago

Because they are not people or alive, you can literally torture them if it gives you a mild increase in performance. For all practical purposes you can't do that to living humans. What is the price to put on being able to do that? It might weight the scales a bit for some employers.

user342832d ago

There are 10-15 labs near the frontier, and like 30 serious inference providers, over 70 total on OpenRouter.

With research and hardware near guaranteed to bring the efficiency way up, I'm not scared here of massive price hikes.

There is no moat.

JumpCrisscross2d ago

> If developers are being laid off because AI is better/faster/cheaper

This is, in my opinion, tripe. SWEs are being laid off because of post-Covid over-hiring. The only evidence for labour destruction is in junior hires. But not because anyone is being fired, but because entry-level jobs are being cannibalised.

Ekaros2d ago

In general, the economy outside the stock market is looking less and less great. The answer to this is to tighten the belt, and that means losing employees. Especially as there have not been any great new revenue sources outside AI in recent years.

visarga2d ago

> Especially as there has not been any new great revenue sources outside AI in recent years.

Nobody can make a profit with AI. Any clever idea can be cloned with AI, competition makes it unprofitable. No moat, no arbitrage opportunity. "During the gold rush, the only people making money were the men selling shovels."

We can definitely do amazing things with AI, and it makes us have superpowers, but so does everyone else. My competition also uses AI. I have to keep up with an AI powered competition now.

thewebguyd2d ago

I suspect AI would have to get drastically more expensive before it starts looking worse than payroll. If one developer using Claude Code can effectively substitute for 2 developers, you are already coming out ahead at current API pricing: assuming very heavy usage, your cost is going to be ~1.5x a developer (factoring in costs beyond salary: benefits, PTO, and the other overhead that comes with having employees).

So you're getting 2 for the price of 1.5. Scale that up to 500 devs at a big company and it's a big chunk of change saved on payroll.

Keeping your headcount or hiring humans instead, AI would have to start to cost upwards of $15k/month/developer or more before it costs more than hiring. You're looking at about 4 billion tokens per month before humans start to break even or are cheaper.

jayd162d ago

You're starting from the assumption that it's a 2x benefit. That's a massive leap.

thewebguyd2d ago

True, that was more hypothetical if it got good enough to 2x.

But even taking a more realistic 1.25x (20% time savings) gain, let's say you drop from 500 to 400 devs, you'd have to hit around $4,000/dev/month in token spend before hiring humans again would break even.

Payroll is just expensive, in most companies it's by far the biggest expense. AI still has to cost drastically more before investors would call it out as being worse than increasing headcount, from a pure dollars perspective.
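The break-even arithmetic in this subthread can be sketched as follows. The function name `break_even_ai_spend` and the $16,000/month fully loaded cost per developer are assumptions introduced here, not figures from the comment; that cost was picked only because it roughly reproduces the numbers quoted above.

```python
def break_even_ai_spend(devs_before: int, devs_after: int,
                        monthly_cost_per_dev: float) -> float:
    """Monthly AI spend per remaining developer that exactly uses up payroll savings."""
    if not 0 < devs_after <= devs_before:
        raise ValueError("need 0 < devs_after <= devs_before")
    saved_payroll = (devs_before - devs_after) * monthly_cost_per_dev
    return saved_payroll / devs_after


if __name__ == "__main__":
    ASSUMED_MONTHLY_COST = 16_000.0  # hypothetical fully loaded cost per developer

    # 1.25x productivity: 400 developers do the work of 500.
    print(break_even_ai_spend(500, 400, ASSUMED_MONTHLY_COST))  # 4000.0

    # 2x productivity: 1 developer does the work of 2.
    print(break_even_ai_spend(2, 1, ASSUMED_MONTHLY_COST))      # 16000.0
```

The result is very sensitive to the productivity multiplier: at 1.25x the break-even is a quarter of a developer's monthly cost, and it shrinks quickly as the real gain gets smaller.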

mrgoldenbrown2d ago

Also assuming that current API pricing is sustainable and not subsidized.

ilovecake19842d ago

This is economy-dependent. It’s really Indians who will take the brunt of AI job losses.

__mharrison__2d ago

Interesting point. Outsource the outsourcers...

ako2d ago

"More expensive" is a difficult calculation: faster can sometimes warrant the higher cost, if it means you can get to market faster. Also, LLMs work 24x7, and can be scaled up and down as needed. It's faster to offboard an LLM than to fire an employee (especially here in Europe). So, even if AI is more expensive than a developer, from a TCO and ROI perspective it can still make business sense.

ngc2482d ago

"AI" is just a cover for laying ppl off and saving cost. But the pendulum will swing the other way and the companies will realise that knowledgeable ppl are still required to generate and utilize the generated code. No serious company can run with vibe-coded apps generated by laymen.

ares6232d ago

There is no profit, expense, revenue. Those don't matter. Only thing that matters is stock price goes up, and laying off makes stock price go up. When laying off make stock price go down, then laying off stop.

stock_toaster2d ago

I imagine layoffs are also very much "this quarter and next quarter" with regards to investor visibility.

While LLM Opex is "some future quarter" and very easy to co-mingle with other expenses.

o104493663d ago

I switched from Anthropic to OpenAI after spending ~$40K in equivalent token costs using Claude over 3 months.

I found Opus 4.7 to be slow and wasteful with token usage. It's shocking how inefficient it is with tasks like bash tool usage and web searching, delegating them to a dozen subagents only to get stuck and never return until you esc and intervene. That, in addition to all of the broken tooling Anthropic built in to limit token usage like the broken monitoring tool made managing Claude a chore. I was happy to pay $200/month for Opus 4.5 when they had more capacity, but 4.7 felt like a huge step back and no longer worth the price and inconvenience.

I remember an OpenAI employee comment on the GPT5.5 release post about how they specifically geared it towards long-horizon tasks, and it's been a breath of fresh air in that regard. I have five two-week long sessions going right now and there's been no degradation in performance or efficiency. It's much better at carrying rules/learnings forward even in long-running sessions and grounding/refreshing itself in verified facts when it loses context.

It's funny because in two weeks I've gotten way more done with GPT5.5 with way fewer tokens and way less handholding. I think this goes to show how important the tooling and the harness are, and how a capable model like Opus 4.7 can be severely handicapped by bad product decisions.

gnat2d ago

Being able to manage context over long-running sessions is a function of the harness, not the model. Are you using Claude Code with GPT5.5? Codex? piclaw? They’ll all have different context management strategies to let you keep going when you would otherwise have filled up context and be forced to stop.

beering2d ago

It doesn’t matter how good the harness is if the model does a bad job of planning and continuing from long context. A good harness cannot overcome a weak model.

robertkarlOP3d ago

Cancellation effective June 30. This was a _pilot_ launched in December that accidentally consumed their 2026 yearly target spend on AI!

I expect the r/LocalLLaMA guys to be going nuts about this news.

thewebguyd2d ago

From the article

> It was part of an effort to get project managers, designers, and other employees to experiment with coding for the first time.

I suspect they weren't as efficient as they could be with token use either. Sounds like they were trying to encourage non-developers to vibe code stuff

xienze2d ago

I'd argue you have a lot more to worry about with developers as far as token usage goes because they're the ones who know how to rig up these wild workflows where tens of agents simulate an entire software development team. The non-developers are probably going to be sticking more in the realm of iterating via chat.

plaidfuji2d ago

Our shop is forced to use Copilot on gov cloud, and it’s so useless I usually stick to manually coding. Its syntax is messy, it randomly combines lines together, flips order, or drops a couple tokens worth of output in the middle of a line, and for some reason it consistently drops the last line of every code block. I assume we’re getting a few versions back of GPT under the hood. But it does make me appreciate how the models of the past year or so crossed the threshold from interesting to truly productivity-enhancing.

Between Copilot, Claude, and Gemini, I still actually prefer Gemini. I do a lot of scientific writing in addition to coding and Gemini is the only model I can trust to “just be right”. This trust then transfers over to its code output.

totalhack2d ago

If you are talking about the Copilot built into VS Code, that's not been my recent experience at all. It has been very capable in agent mode since GPT 5.4 came out.

siva72d ago

Especially since gpt 5.5 it's on par with Opus 4.7 or Sonnet.

cbdevidal2d ago

I’ve been quite content with Copilot’s $10/mo plan. It still offers access to Claude models (limited tokens) but has no time limits like the $20 Claude plan, so no interruptions in workflow. I use one of the free models for the more pedestrian tasks, then sic Claude on the particularly thorny problems. Works very well for me.

mellosouls2d ago

I'm not sure if you are referring to the old or new plan?

Github Copilot offered probably the best value and was IMO underappreciated for a long time; I've been an annual subscriber since day 1.

The changes announced a few days ago completely revoke that value proposition, I doubt I'll continue with it.

cbdevidal2d ago

There’s a new plan? Ugh. I signed up about five months ago.

mellosouls2d ago

Yes, unfortunately. eg. discussions below - I think I have seen multipliers of 9x cost to existing use cases:

Changes to GitHub Copilot individual plans

https://news.ycombinator.com/item?id=47838508

GitHub Copilot is moving to usage-based billing

https://news.ycombinator.com/item?id=47923357

Multipliers for annual subscribers:

https://docs.github.com/en/copilot/reference/copilot-billing...

__mharrison__2d ago

Copilot was the best deal in AI tooling for a few weeks there.

New pricing model changes that. I will still keep it around for autocompletion (for the rare times when I open up an editor).

cbdevidal2d ago

Can even buy more premium tokens for more Claude use, which I have done once. But most of the time the tokens included in the plan are sufficient.

keyle2d ago

The title is somewhat bait. It reads like MSFT is using less AI, while in fact it's just a forced swap to Copilot.

Arguably, Copilot is GPT 5? Not sure what the CLI offers behind the covers.

meowkit2d ago

Copilot is the name for the harness / wrapper of MSFT products

The CLI can swap to whatever model (/models) based on your subscriptions.

The copilots on desktop or Office Apps are likely just GPT5 nano or other tiny models with cheap inference

golf10522d ago

Employees (at least on my team) get access to the Claude models as well when using Copilot CLI.

patentlyze2d ago

I disagree. As someone who just got a new Windows laptop with Copilot baked(forced) in I've tested Copilot a lot.

It. is. so. bad.

It feels like it's at least 1-2 years behind the current top models.

gbro3n2d ago

But there isn't a Copilot model, is there? It's just a harness, and the VS Code Copilot extension is pretty good (haven't tried the TUI).

alternatex2d ago

Copilot cannot be behind any models because it's a harness, not a model. You can use any of the popular models through it, including Claude models. Though people have been saying that Claude CLI is a better experience.

keyle2d ago

Your free Copilot offering isn't the Copilot they're using within the company for coding assistance. It's confusing, I know.

tored2d ago

Copilot is not the same agent as GitHub Copilot.

andrewl-hn2d ago

I'm surprised they even had them in the first place. Doesn't Microsoft have a deep partnership with OpenAI? Aren't all Copilot things powered by various GPT models? I would assume the two companies have barter agreements of sorts.

RevEng2d ago

They do have agreements, but they aren't exclusive, and Microsoft and OpenAI have had a rather public falling out over the last year.

tyleo3d ago

Lots of these places measure employee token use with managers having dashboards. It seems like performative code production rather than making anything useful.

Speed without judgement always compounds badly.

andrewl-hn2d ago

Tokens are the current era's "lines of code per month"

https://www.folklore.org/Negative_2000_Lines_Of_Code.html

skeledrew2d ago

Well, that's the inevitable outcome of token-maxxing :shrugs:

thisislife22d ago

More here: Microsoft reports are exposing AI's real cost problem: Using the tech is more expensive than paying human employees - https://fortune.com/2026/05/22/microsoft-ai-cost-problem-tok...

guluarte3d ago

I think tech companies are doing layoffs partly because they need to cover AI operating expenses.

stock_toaster2d ago

I think so too, otherwise why wouldn't you put that (purported) increased capacity/output into improving your existing products or creating new ones, with the headcount that you already have?

maxignol2d ago

This might actually be clever, since Microsoft devs will be longing for Claude Code features, which might result in Copilot getting way better.

ryanhecht1d ago

That's what we've spent the last five months doing! Have you tried the Copilot CLI recently? We've onboarded loads of feedback from Microsoft devs who were switching from Claude Code -- I'm proud of how far the team has come! This announcement comes at a time where Copilot CLI usage has been greater than Claude Code usage at Microsoft for several weeks; we've been winning hearts and minds!

gradientsrneat2d ago

Related: Microsoft-owned GitHub recently switched to token-based billing:

https://github.blog/news-insights/company-news/github-copilo...

Claude tokens are priced by GitHub at a disproportionately premium price compared to Gemini and OpenAI. I wonder why?

https://docs.github.com/en/copilot/reference/copilot-billing...

dsagent3d ago

I think what's funny is that employees were most likely already covering the cost of these tools themselves because they are useful. Companies didn't believe employees were using these tools, so they have now forced their usage and no longer have the costs subsidized.

Similarly companies seem to reward high token usage as a sign of someone willing to play ball with AI and again have forced higher costs on themselves for people reward hacking or using tokens out of spite.

QuiEgo2d ago

There is no world where I can put my company’s data through an external site without their express consent and security sign off. I suspect at most companies there’s zero path for people to have been paying for it themselves.

kridsdale12d ago

An enormous percentage of America’s white collar work force has been doing this since 2023.

Fun fact, up until you face a consequence for crime, all crime is free! Have fun and go win the competition game against your co-workers.

cityofdelusion2d ago

At none of the 5 places I have worked is this possible, but they are also all in highly regulated industries. Firewalls block virtually everything by default.

QuiEgo2d ago

Fair, but I assume everything on my work laptop is key logged. Surely they would notice Claude phoning home from my company laptop? I suspect a network rule to look for that traffic is trivial?

InsideOutSanta2d ago

My guess is that at most companies, employees are prohibited from doing this, but not prevented.

uniclaude3d ago

That's very interesting to reconcile with the fact that, not too far away, Amazon employees feel incentivized to use as many tokens as possible.

HDThoreaun2d ago

"incentivized to use as many tokens as possible" = "Upper management knows people don't like change, so we are forcing them to come up with ways to use this thing". It does not mean that management will encourage wastefulness in the future, and it also doesn't mean that token usage from now won't be reviewed in the future. What's to stop them from dinging your performance in November because you wasted a hundred thousand on tokens with nothing to show for it?

boelboel2d ago

Makes sense why Anthropic wants to IPO as soon as possible as the growth right now comes from temporary wastefulness. Makes all the investments more risky.

sreekanth8502d ago

If you properly keep documents, architecture, and decision records, token consumption can be pretty low. I am managing everything with two Codex Plus subs. The repo size is 300k LOC (backend).

usernametaken292d ago

I switched to OpenRouter and OpenCode a while ago. It is much cheaper, much much cheaper, and A LOT more reliable. Particularly Gemini was a piece of trash when it came to uptime.

zabil2d ago

I switched from Claude Code to the GitHub Copilot app recently. Since our repositories are hosted on GitHub, I find the Copilot app better integrated for the PR workflow, with PR management available in the app. I don’t think I miss any of the features of Claude Code. I never thought I would make the switch, but Copilot upped the game.

Also it became very hard to convince management to keep both Claude code and GitHub Copilot enterprise licenses.

loloquwowndueo2d ago

Reminds me of when Steve Ballmer forbade his children to use iPods and pushed towards the Zune instead. Hahaha

andyfilms13d ago

Surely a company as large as Microsoft is actively attempting to build their own models. They couldn't possibly have expected to stake the future of their software development on the conditions of a third party company?

mrweasel3d ago

Okay, but what if you're not Microsoft's size and don't have an R&D budget large enough to fund development of your own models and tools?

This is a warning to any company, not building their own AI, that AI assisted development could become really expensive really fast and most likely won't pay off. What Microsoft is suggesting is that the current price is too high, but it's still not high enough for e.g. Anthropic to be profitable, or AI coding tools are only as good as the developers using them. So you can't meaningfully do layoffs by replacing the developers with AIs, because the cost is too high.

How does Microsoft plan to fix CoPilot, so that the cost will be so much lower than Claude, that budget overruns won't be a problem for their own customers?

andyfilms12d ago

I expect in the next year or so, we'll stop seeing headlines like "Anthropic buys $15b of compute from SpaceX" and we'll start seeing headlines like "Uber's AI department licenses GPT 6.2 as the foundation for their internal model," or something like that.

Smaller companies will have departments that distill larger models into something more specifically manageable and useful for them. At least, that's my personal prediction :)

mrweasel2d ago

How would that help with pricing? The cost of hardware is already subsidized to hell and back by investors and that's not dropping costs enough. I'm not concerned about Uber, they are way too big. I'm thinking sub 1000 employees in total and maybe 50 - 100 people in the IT department. Are they just going to be cut off from AI tools, because the cost of running them would ruin the company?

I do think your prediction makes sense, because the AI really isn't the product, it needs to be baked into something and licensing the models saves you the R&D and cost of implementing your own.

jcgrillo2d ago

> Smaller companies will have departments that distill larger models into something more specifically manageable and useful for them.

In order to do that they'd have to make a concrete business case to justify the headcount and compute costs. They'd be facing the same fundamental economic problems Anthropic, OpenAI, MSFT, etc are facing just at a department level instead of a megacorp level. I hope they try it, sunlight is the best disinfectant.

However, when the pressure is turned up and people have to actually show results--and, like, be accountable--instead of just buying a subscription and externalizing the accountability, I don't think we'll see so much enthusiasm about AI coding. Whether or not an engineer is actually more or less productive with AI (not merely whether they feel more productive) will begin to matter a lot more. I don't see how people continue using AI in this hypothetical small company under those adverse conditions.

kridsdale12d ago

Giving your workforce Claude is like giving everyone in the USPS a Ferrari.

There may be a spot of “good enough to pay for and make a profit” that exists.

onlyrealcuzzo2d ago

MSFT and Apple are taking the same approach.

The frontier model space costs 1000x as much to develop as the small language models, and is only 1.5 years ahead.

Factually, the frontier models have not paid for themselves. So, if you're MSFT and Apple, you don't need to run in a race where even the winner loses massively.

You can try to train models 1.5 years behind that are highly likely to be profitable, given your market position.

The average person is lagging behind what AI is capable of by 3+ years anyway...

So you can save 1000x on training and 10x on inference and just use SOTA small models.

Why spend $5B training a model that's for sure not going to make $5B (after inference costs) when you can spend $5M building one that WILL make far more than that after inference costs?

NitpickLawyer3d ago

> attempting to build their own models.

At one point there were rumours that they'd do that. They also have the rights to oAI models for a few more years still, so they could always use that, but apparently they're also compute starved (like everyone else).

kridsdale12d ago

MSFT does have a frontier AI Lab. My friend works there. I don’t know what they’re doing. But MSFT is one of like 5 entities that actually have the talent and physical infrastructure to compete in model-building.

rglover3d ago

Curb Your Enthusiasm theme starts playing.

andrekandre2d ago

i was thinking more arrested development but that works as well

goldylochness2d ago

after having used claude for quite some time, i would buy puts on microsoft

visualphoenix1d ago

Good luck to them! I recently had the misfortune of fighting Copilot on a Github PR and it made me want to never contribute to the project again.

killerstorm3d ago

The way coding agents work is fantastically wasteful. All the megabytes of code are processed over and over and over, sometimes within just one session.

There are papers describing KV cache precomputation for commonly used documents (e.g. KVLink), but, of course, it's not a priority for model providers: they'd rather sell you more tokens, also they would rather get to AGI/ASI first than optimize usage of existing models...

brookst2d ago

Claude code gets >98% KV cache hits. It’s not reprocessing unless you let the cache go cold (5 minutes, which is annoyingly short).

killerstorm2d ago

I meant caching on a bigger level. If you're an organization with 100 developers each doing 10 sessions a day, you're paying for 1000x the tokens in frequently used documents even if you had 100% KV cache hits within one session. Apparently that's too costly even for companies with trillion dollar market cap...

Normally KV cache works only if your context prefix is identical, but there are papers which demonstrate documents can be cached between different contexts.
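
To make the prefix point concrete, here is a rough Python sketch. It is a toy illustration under my own assumptions, not how any provider actually implements caching: a prefix cache can only reuse the leading tokens that two contexts have in common, so a shared document that sits after a per-user preamble gets no reuse. The token lists and the `reusable_prefix_len` helper are hypothetical.

```python
def reusable_prefix_len(cached_tokens, new_tokens):
    """Count the leading tokens shared by both contexts.

    In a plain prefix cache, only this leading run can reuse cached KV entries.
    """
    n = 0
    for a, b in zip(cached_tokens, new_tokens):
        if a != b:
            break
        n += 1
    return n


# A made-up 1000-token shared document (e.g. an architecture guide).
shared_doc = [f"doc_{i}" for i in range(1000)]

# Layout 1: per-user content comes before the shared document.
session_a = ["sys", "alice"] + shared_doc
session_b = ["sys", "bob"] + shared_doc

# Layout 2: the shared document comes first, per-user content last.
session_c = ["sys"] + shared_doc + ["alice"]
session_d = ["sys"] + shared_doc + ["bob"]

hits_user_first = reusable_prefix_len(session_a, session_b)
hits_doc_first = reusable_prefix_len(session_c, session_d)
print(hits_user_first, hits_doc_first)
```

In the first layout only the single system token is shared, while in the second the whole document is. The papers mentioned above go further and try to reuse a document's cache even when the prefix differs, which this sketch does not attempt.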

brookst2d ago

Ah, understood, and thanks for the clarification!

beoberha2d ago

I believe OP is talking about new sessions or after compaction. He’s getting at the fact that LLMs are stateless and have to rediscover your codebase on every new session.

iainmerrick2d ago

To be fair, on the Monday morning after a holiday, that’s exactly what I’m like too.

dgellow2d ago

Are you sure that hitting the cache means you’re not paying for those tokens?

brookst2d ago

You pay, at 10% the price (in quota or dollars) for non-cached. See https://platform.claude.com/docs/en/about-claude/pricing
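
As a back-of-the-envelope sketch of what that does to a bill: the 10% cached rate is the figure stated above, while the $5 per million input tokens base price and the `effective_input_cost` helper are hypothetical, and output tokens and any cache-write surcharge are ignored.

```python
def effective_input_cost(total_tokens, hit_rate, base_price_per_mtok, cached_discount=0.10):
    """Input cost in dollars when a fraction of tokens is billed at the cached rate."""
    cached = total_tokens * hit_rate
    uncached = total_tokens - cached
    billable = uncached + cached * cached_discount
    return billable * base_price_per_mtok / 1_000_000


# 100M input tokens at an assumed $5 per million tokens.
cold = effective_input_cost(100_000_000, 0.0, 5.0)   # no cache hits at all
warm = effective_input_cost(100_000_000, 0.98, 5.0)  # 98% cache hits
print(f"cold: ${cold:.2f}, warm: ${warm:.2f}, ratio: {warm / cold:.3f}")
```

Under these assumptions a 98% hit rate brings input cost down to roughly 12% of the uncached figure, so cached tokens are cheap but not free.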

geoffbp2d ago

How efficient is Claude at cleaning up unused code and making things more simple - as good as it is at adding code / features?

wg02d ago

Microsoft should host DeepseekV4 internally for its developers. And you're welcome.

chris_money2022d ago

Microsoft does self host claude and gpt for GHCP

rvz2d ago

This is the smartest solution: self-host the model locally, on premises.

kridsdale12d ago

And by that, you mean, in Azure, surely.

wolvoleo2d ago

What's the point of eating your own dog food when the only thing you are doing is reselling other people's dog food? Microsoft don't have any competing LLM.

fredcallagan1d ago

I have noticed particularly in recent weeks and maybe the last couple of months that token costs are just ridiculous. I can understand the upcoming IPOs and instinctive pressure to show profits ... but let's be honest, showcasing burning 1.3 million USD in tokens by a single developer in a month is the most ridiculous thing I have seen in my entire life. The general principles still apply. You expect to invest X and get a return on that investment. Unfortunately that's not so easy to promise or expect. There's no real 1 to 1 correlation between amount of code written and returns, and even less between tokens burned and returns. I'm starting to believe that the current token pricing approach, followed at the moment by all leading labs (especially considering OS models' capabilities), is borderline delusional ...

gmerc2d ago

They got DeepSeek on Azure, would cut costs by 10x … if they ran it on Huawei

matt32102d ago

Tokens aren’t that much of an issue when you're not evaluated on the usage

sergiomattei2d ago

My impression is they're being cancelled in favor of full internal adoption of Copilot CLI, which has got much better over the past few months.

Shalomboy2d ago

I'm also a big fan of Copilot CLI, especially after demoing it to a coworker who liked Claude Code.

Kapura2d ago

"everybody needs to use these new AI tools or you will be left behind. no! not like that! the cheap, worser ones!"

heisenbit2d ago

How would one call such a strategy? Embrace and extend comes to mind.

lou13062d ago

This has really little to do with embrace and extend. They are not taking over an open standard or anything like that.

If anything, it's forced dogfooding, i.e., forcing their own workforce to beta-test their product.

dminik2d ago

To be fair, Microsoft dogfooding something for once would be great.

la647102d ago

It seems that people are using LLMs to generate code but many complain of subpar code. I recall the early days of virtualization, when folks would use it but complain about performance. HW capacity continued to improve until virtualization became the de facto standard. I wonder if subpar code will get better as more powerful agents, models, or compute become available.

jgalt2122d ago

What per cent of internal Microsoft IP runs through Anthropic? Do they not care about trade secrets, or certain groups allowed or not allowed to use tools that expose IP to external vendors?

nobodywillobsrv2d ago

This feels like the kind of bad incentive problem we always hear about on here ... Like bugs and vipers.

DeathArrow2d ago

Doesn't MS have the compute to run GPT 5.5 for all its employees?

jadar2d ago

It's been said that technologies are not products. CC might be better, but at the end of the day M$ is going to want to cut costs and have employees use their own technology. Perhaps Copilot CLI is close enough, and the CC product doesn't justify the cost of the Claude (technology) license when M$ has their own technology to leverage.

Side note, it's so frustrating that The Verge puts a paywall at the fold. It makes me feel like the rest of the story is not worth reading. I'm not inclined to pay $2 to read a link that was posted on an aggregator.

ndiddy3d ago

This is an AI generated summary of a blog post (https://www.thelowdownblog.com/2026/05/microsoft-cancels-int...) which is a summary of an AI generated article (https://blazetrends.com/microsoft-cancels-claude-code-pilot-...) which is a summary of another AI generated article (https://www.themodelwire.com/article/microsoft-starts-cancel...) which is a summary of an article from The Verge (https://www.theverge.com/tech/930447/microsoft-claude-code-d...). I guess it would be better to link the Verge article instead.

m1323d ago

The absolute state of the Hacker News main page in 2026. Thank you for taking your time to put it all together.

fishtoaster3d ago

https://archive.is/WfCta

ajd5553d ago

2nd link doesn't work. That would be a neat tool, to find the original article and see how many levels of AI summary it has gone through, a game of AI telephone!

OnionBlender3d ago

I had thought about creating something like that for finding comments for articles. For a given article, display links to comments for HN, lobsters, reddit, etc. However, I feel I already waste too much time reading comments. I shouldn't make it easier and more tempting.

robertkarlOP3d ago

My bad. I had trouble finding the original source when I googled for it and grabbed a link. I was originally shown a screenshot of an x.com post.

robertkarlOP3d ago

I emailed dang to politely ask to make the link point to the Verge article since I can't update it.

q3k3d ago

i swear i'm going to start an amish community and internet where we forbid any technological development past 2019

call me a luddite, i'll be wearing it as a badge of honor

BoiledCabbage2d ago

Man, maybe it's time for me to give the verge a subscription. They're the only ones actually doing any journalism here, with a bunch of AI blogs skimming off the top.

siva73d ago

boy i'm leaving the internet. sun is shining. was a good time here while it lasted.

scarmig3d ago

The artificial centipede.

sashank_15093d ago

Welp, this is the future we live in now

josefritzishere2d ago

AI slop ruined a story about AI? This thread is a story about itself.

thadk3d ago

Microsoft poorly manages token use of most expensive models in a pilot. Then they use that failure to advertise/position their own Github Copilot agents to procurement teams, over the now widely validated Claude Code-based agents.

At least Codex is trying to win validation on merit.

zmmmmm2d ago

fendy30022d ago

Stay with 4.6 if you can, it is disabled (afaik) on vscode claude code extension.

4.7 IMO is around 10-20% worse at understanding your prompt intention. You need more effort to explain your intention more clearly so it doesn't divert.

lifthrasiir2d ago

4.7 turned out to be a disaster in multilingual settings, so I have stuck with 4.6 so far. 4.7 seemed to be optimized for (a very specific slice of) coding at the expense of everything else.

pimeys2d ago

I still use 4.6 if I need Opus. It's mostly GPT-5.5 for me. Only if I know it cannot do something, like push without running the tests (because AGENTS.md said so), do I switch to 4.6.

Although GPT's been acting weird since Thursday...

SequoiaHope2d ago

I’ve stayed on 4.6. Was thinking of trying 4.7 though just today. Still, I did not jump on it day one.

nijave2d ago

Switched back when 4.7 had an issue last week and it was wayyy faster. I assume mostly because a lot of people have moved off but might consider using it more just for the speed boost.

willtemperley2d ago

I don't want to change from 4.6 because I'm finding it so good (I could change).

I've spent the last couple of days building Swift bindings to a monster CPP lib and I've actually had fun.

zuppy2d ago

vasco2d ago

bdavbdav2d ago

Teams is the only one with seat pricing. Teams has a user cap of 150. Enterprise is usage based pricing only now (with a £20/user service charge)

fortran772d ago

I use copilot cli and I can pick Anthropic models. The Microsoft interface seems fine to me, and equivalent. Not sure what the big deal is.

gwerbin2d ago

krzyk2d ago

Harness makes a difference. Also in copilot you have smaller context for Claude models.

And you get token-based pricing since June 1.

verdverm2d ago

ryanhecht1d ago

> The developers voted with their feet and didn’t use Copilot.

verst2d ago

Most of us never had the option for work to pay for Claude Code -- some internal orgs did this. That being said I had a personal Claude Code subscription for a bit.

RA_Fisher2d ago

Do people bring their own then (considering work doesn’t pay for it)?

krzyk2d ago

Our corp specifically prohibits that, because of code leak/training.

cfunderburg2d ago

I wish I could understand the appeal of using Claude Code inside VScode rather than Copilot. I feel like I'm missing something obvious.

tags2k2d ago

lanstin2d ago

bakies2d ago

I just have git diff open in another terminal. Everything I do is in the terminal.

rplnt2d ago

Slightly related (me not understanding) is why Copilot in VS Code is essentially just a CLI interface. Why can't it use the IDE tools (search, LSP, ...)? All it ever does is try to execute grep.

jwoq911819h ago

AlexMoffat2d ago

a1o2d ago

There is an option to turn on semantic indexing and search on Copilot in VS Code, although I notice no perceptible difference when I turn it on. The docs mention something about it.

https://code.visualstudio.com/docs/copilot/reference/workspa...

mattmanser2d ago

Someone mentioned here the other day that when you try to give Claude those tools through an MCP or skill it tends to go a bit loopy.

At the moment it seems like the way it's been trained has been tightly coupled with grep.

It does feel bizarre though that it doesn't use the symbol servers.

skywhopper2d ago

Especially if you want effective results.

ninjagoo2d ago

> I wish I could understand the appeal of using Claude Code inside VScode rather than Copilot

MS thinks CoPilot is the Clark Griswold of LLMs when it's really Cousin Eddie...

gbro3n2d ago

RA_Fisher2d ago

Claude Code will write the whole thing for you. Whereas doesn’t Copilot require input along the way of coding? i.e., it doesn’t do all the programming for you.

mirekrusin2d ago

It can code the whole thing for you, copilot in vscode is simply better, people just never tried it.

__mharrison__2d ago

If you give Copilot a file with a list of tasks to complete, it will try to churn through them (just like most other harnesses would do these days).

stanac2d ago

I think they were comparing CLIs, not VS extensions.

mattmanser2d ago

I'm a little the opposite, what's the point of using an IDE with AI? I genuinely don't get it?

These days I just use Claude Code Desktop or Claude Code in powershell. Standalone, not inside an IDE. Honestly, I'm using Desktop more and more as it gets more features.

The IDE is for me. No AI in it at all. If I want to get Claude to do something specific to a file I just @ the file.

vovavili2d ago

serf2d ago

the obvious answer is that it's easier, faster, and more efficient to flip a true to false right in front of you than it is to prompt an llm.

if your response is "my prompts don't produce code that needs values flipped, ever." then I would wager you're only touching very simple things with an LLM.

for me I don't care about the token cost and prompt writing so much as the fact that it's just faster to change 0 to 1 and leaves me twiddling my thumbs for an llm output less.

Sharlin2d ago

subscribed2d ago

> what's the point

Tab completion.

But I'm not a developer, so I use both - haiku via github for tab completion and CC for cli.

harimau7772d ago

For Windsurf at least, it makes it easier to control context. I can simply drag and drop a file from the IDE into the chat.

I can also click on a file referenced by the AI and have it open immediately in the IDE so that I can inspect it.

Finally, it is a pain to write long, multi-line prompts in a CLI where you can't easily click around to edit different parts.

The primary weakness I've found in IDE based UI is that it struggles to get through the corporate security in order to run commands.

fendy30022d ago

All of them are valid use cases of the VSCode CC extension for me.

cameronh902d ago

Microsoft have historically tended to dogfood their own products.

Anon10962d ago

ryanhecht1d ago

> Microsoft internally is actively hostile to cross-org collaboration

Quothling2d ago

Or maybe you're right and they want their developers to use the copilot models.

pwarner1d ago

Copilot Cowork seems to be the best part of M365 Copilot by a huge margin.

Quothling1d ago

__mharrison__2d ago

Copilot was great when folks were semi-attempting to write their own code and needed autocomplete.

There's a large (and growing!) contingent of people who don't write code these days. (Many don't even use the keyboard.)

Insanity2d ago

Wonder if Amazon will do the same with CC and Kiro now that we internally have access to both.

I think Kiro might have some “first mover” advantage internally, but CC feels better to use.

fg1372d ago

I never understand why Amazon even bothers to build their own coding agent.

To me, it's just a matter of time before they are sunset, like Chime or a bunch of AWS products.

Insanity2d ago

In fairness, Chime had tons of internal use and I quite liked it.

For Kiro, I agree with you, it seems like wasted effort and Anthropic / OpenAI are miles ahead in their tooling.

cameronh902d ago

I love AWS at the infrastructure level, but their PaaS tends to be meh, and their end-user directed stuff is usually atrocious.

tra33d ago

There's definitely a way to use Claude code that is token conscious.

I've tried throwing unsupervised agentic software factory workflows against the wall, and they burned through my tokens like nobody's business but didn't produce much.

Supervised, human-in-the-loop process on the other hand is much more productive but doesn't consume nearly as much. Maybe that's why everyone's pushing agentic approaches so much.

CoolestBeans3d ago

dualvariable2d ago

"The bureaucracy is expanding to meet the needs of the expanding bureaucracy"

visarga2d ago

beardyw2d ago

I didn't know that one. Loosely said to be Oscar Wilde.

matheusmoreira2d ago

Yeah. Claude does good work but reviewing it all properly takes quite a bit of time. It got to the point I started having trouble maxing out my weekly allocation.

kixxauth2d ago

There is this real danger that our thinking, and the things we make, become bloated without constraints.

IMO software has gone to shit since both mobile phones and laptops mostly have massive amounts of compute. We always seem to use it to the limit, just because it's there.

matheusmoreira2d ago

At least it's doing something productive instead of just sinking money into literal gambling simulators. Mercifully, unlike video games, automation is not "cheating".

krzyk2d ago

Any corp (> 150 seats) has to use API pricing, so e.g. I don't have pressure or weekly limits, just a set budget I can use each month.

jdsnape2d ago

matheusmoreira2d ago

I just committed the skill to my dotfiles repository.

https://github.com/matheusmoreira/.files/tree/master/~/.clau...

There are many "critics", one for each quality I want reviewed. Correctness, consistency, maintainability, security, testing... Everything I could think of, and I keep adding more.

https://github.com/matheusmoreira/.files/tree/master/~/.clau...

The scrutinize skill is the entry point. The Opus I'm talking to becomes an agent coordinator. He explores and autodiscovers the project's structure, subdivides it into logical sections.

Going from solo hobbyist programmer to this was pretty insane. I can only imagine what these corporations with infinite money must be doing.

visarga2d ago

I did the same thing - task oriented work, each task a md file. I have a harness based on it: https://github.com/horiacristescu/claude-playbook-plugin

jstummbillig2d ago

visarga2d ago

I find myself observing how my lead manages meetings ... "ah, this is like when I do that with Claude", "this is where he wants to understand what happened, like when I ask Claude" ... it's funny.

SubiculumCode3d ago

At the enterprise level though, it's going to be hard to want to use a service in which costs are not predictable, and keeping those costs under control requires employee training.

mrgoldenbrown2d ago

>...use a service in which costs are not predictable, and keeping those costs under control requires employee training.

Isn't this a (mildly exaggerated) description of AWS, which is a very successful service?

noodletheworld2d ago

Mmm… but for AWS it's pay for external use, right?

So your costs scale with the number of users you have.

That's an opex that you can explain.

It's not quite the same; though, similarly lucrative for consultants.

sidewndr462d ago

Am I losing my mind, aren't there multiple headlines each day about companies penalizing employees for not using AI enough?

iSnow2d ago

That was roughly 3 weeks ago. With the repricing of Claude 4.7 and GPT 5.5, things have become more spicy.

basch2d ago

jochem92d ago

You can put a limit on token spend and provide training (and even pre-configured workflows) on how to limit token spend.

Like the other commenter said: cloud spend can also spin out of control if you don't pay attention, yet we've found ways to keep it under control (training, guardrails, limits, transparency).
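
For what a limit like that might look like in practice, here is a minimal Python sketch of a per-developer monthly spend guard. Everything in it is an assumption for illustration (the `TokenBudget` class, the dollar figures, the 80% warning threshold); real tools expose their own limit settings.

```python
class BudgetExceeded(Exception):
    """Raised when a request would push spend past the monthly cap."""


class TokenBudget:
    """Hypothetical per-developer monthly spend guard (all names are illustrative)."""

    def __init__(self, monthly_limit_usd: float, warn_fraction: float = 0.8):
        self.limit = monthly_limit_usd
        self.warn_at = monthly_limit_usd * warn_fraction
        self.spent = 0.0

    def record(self, cost_usd: float) -> str:
        """Record one request's cost; return "ok" or "warn", or raise if over the cap."""
        if self.spent + cost_usd > self.limit:
            raise BudgetExceeded(f"would exceed ${self.limit:.2f} monthly cap")
        self.spent += cost_usd
        return "warn" if self.spent >= self.warn_at else "ok"


budget = TokenBudget(monthly_limit_usd=100.0)
first = budget.record(50.0)   # still under the warning threshold
second = budget.record(35.0)  # now past 80% of the cap
print(first, second, budget.spent)
```

The warning state is the transparency part: a developer finds out they are close to the cap before a request is refused, rather than after.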

harimau7772d ago

The problem that I see is what you do if someone runs out of tokens. It doesn't work very well to say "well I guess you just get fired because you can't work at full speed for the rest of the month".

Personally, this feels like it's just trying to push the work of managers in allocating resources onto developers so that they have more work to do and can be blamed if anything goes wrong.

layer82d ago

xienze2d ago

> To be fair, the cost of software development has always been fairly unpredictable.

ilovecake19842d ago

The cost per month is 100% known and always has been. What has been variable is the rate of delivery. AI is different and can be substantial in countries with lower wages.

salawat3d ago

There's no fucking training to mitigate a slot machine.

serf2d ago

that analogy is so boring now with so many real world examples of actual LLM work.

people still can't get over the unreasonable effectiveness of algorithms.

LPisGood2d ago

There’s actually been a ton of research on how to optimize “slot machines,” at least in a generalized sense. For more reading, check out the literature on multi armed bandits.
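
For anyone who hasn't seen that literature, a minimal sketch of the simplest strategy in it, epsilon-greedy, looks like this in Python. The payout probabilities, step count, and seed are made up for illustration; this is one textbook approach, not a claim about how anyone tunes LLM usage.

```python
import random


def epsilon_greedy(true_probs, steps=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli multi-armed bandit with an epsilon-greedy policy.

    With probability epsilon, pull a random arm (explore); otherwise pull the
    arm with the highest estimated payout so far (exploit).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms    # pulls per arm
    values = [0.0] * n_arms  # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda i: values[i])
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean for this arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values


# Three hypothetical slot machines with hidden payout rates.
counts, values = epsilon_greedy([0.1, 0.5, 0.9])
best_arm = max(range(len(values)), key=lambda i: values[i])
print(counts, [round(v, 2) for v in values], best_arm)
```

The policy never sees the true payout rates, only the rewards it collects, yet most pulls end up on the best machine. Fancier strategies (UCB, Thompson sampling) mainly improve how the exploration is spent.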

__mharrison__2d ago

Odd, I train teams (at large companies) to use harnesses effectively. So some training does exist.

dgellow2d ago

Games like Diablo are basically a whole bunch of slot machines, and there are strategies you can follow to optimize your run.

subscribed2d ago

LOL, that's a sophisticated and sometimes slightly unpredictable multitool.

If this is the "analogy" you go for, you don't seem to be suited to make that comparison.

KronisLV2d ago

> There's definitely a way to use Claude code that is token conscious.

On the other hand, this feels a bit hypocritical:

They're gonna say that the future is all AI... until they get the bill.

phillc732d ago

[1] https://codeberg.org/MimosaDev/skills

[2] https://dirac.run/

michaelbuckbee2d ago

The results for a function implementation and test of levenshtein distance in js are pretty similar but Mistral is 30x cheaper than Opus 4.7 and 4x faster than Sonnet 4.6.

https://5m6qnuhyde.evvl.io/
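
For a sense of how small that benchmark task is, here is a minimal Python sketch of Levenshtein distance with a few checks. The linked comparison used JS; this is my own illustration and not the code any of the models produced.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match for free
            ))
        previous = current
    return previous[-1]


assert levenshtein("kitten", "sitting") == 3
assert levenshtein("flaw", "lawn") == 2
assert levenshtein("", "abc") == 3
```

A task of this size is well covered in training data, which is worth keeping in mind when reading cost and speed comparisons based on it.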

kaoD2d ago

But that's not very informative.

There's where even frontier models struggle, which makes comparisons meaningful.

KronisLV2d ago

dgellow2d ago

> They're gonna say that the future is all AI... until they get the bill.

I mean, they will continue to say so, they just want to be the ones being paid for the service, not anthropic :)

tracker13d ago

My experience as well... I've only hit Anthropic's 5hr threshold a few times, and two of them were within a half hour of the window. Also, all three times I'd already accomplished a LOT.

I'm really curious as to HOW the MS employees were using the agents as much as what they were doing.

kristjansson3d ago

brookst2d ago

Yep. I get $6k - $8k worth of tokens (at api rates) using the $200 max subscription.

lawn2d ago

I don't understand why people are using the API pricing instead of the Pro/Max subscriptions? What am I missing?

skeledrew2d ago

Can verify that I've gotten about $400 worth of tokens from my $20 sub.

brookst2d ago

I get 98.6% cache hits on Claude code. Short of drastic arch changes it’s hard to imagine it getting much better.

gobdovan2d ago

kridsdale12d ago

We are all going to be graded by (tickets closed / tokens burned) soon enough.

hedgehog2d ago

blitzar2d ago

> There's definitely a way to use Claude code that is token conscious.

By buying a subscription and dealing with the limits. Using Claude Code and paying per token seems like the fast lane to the poor house.

nurettin2d ago

---- Before it was:

Me: We need to do this this that.

Claude: <random stuff that approximates human output>

Me: Are you sure?

Claude: Well actually there is a bug <more random stuff that looks right this time>

----- Now it is:

Me: We need to do this this that.

Claude: <random stuff that approximates human output>

Claude: Let me consult the advisor on that.

Claude: advisor came up with some advice, adjusting according to that. <more random stuff that looks right this time>

thegreatpeter2d ago

yeah, by using codex

relevant_stats2d ago

So, snippet from the article says the following:

And people here are interpreting this as related mainly to Claude burning too many tokens too quickly and suggesting Microsoft should rather use SomeOtherLLM©?

Is this Hacker News or rather Marketing Wars?

s_dev2d ago

So "Microsoft chooses to eat its own dogfood" is a more accurate title?

ninjagoo2d ago

> Is this Hacker News or rather Marketing Wars?

No public forum is naturally immune to the spread of (guerilla) marketing. [1]

[1] Internet Rule #48

johnnypangs2d ago

I don’t think people read the article, I didn’t until I saw your comment. The article feels like clickbait tbh.

righthand2d ago

It's a forum called Hacker News that's been hacked and covertly refactored into Marketing Wars, their primary goal being to foster a space to draw in (marketing) projects/start-ups.

RobRivera2d ago

Why not both?

That message is from Carlos's son

relevant_stats2d ago

Uh, what?

proxysna3d ago

Feels about right.

onlyrealcuzzo2d ago

I rage canceled Claude today.

After 2 weeks of Claude getting progressively worse and worse, today was the final straw.

I don't care if they have a phone app. The model is COMPLETE garbage after you subscribe long enough and they think they've "got you".

couchdb_ouchdb2d ago

I've seen a lot of this sentiment over the previous six months from people on reddit. I have yet to experience this myself as a developer with over 20 years of experience.

fendy30022d ago

colechristensen2d ago

What it does seem like is that they're tuning some knobs up and down or releasing new versions of models or system prompts that result in the model getting dumber and smarter in waves.

Opus has been dumb this week.

Claude was having a lot of capacity problems and downtime and then this week that has been much less obvious... and the model is dumber.

It could also just be luck and my impressions are false... who knows.

Our_Benefactors2d ago

dgellow2d ago

Opus 4.7 has been a real downgrade for me. I’m back to mid 2025 when I had to catch all the completely intermediary goals/assumptions the model is creating for itself

johnfink82d ago

When you're on a mature codebase with 500k+ lines of code, I haven't seen anything else be as effective as 4.7.

solenoid09372d ago

It's the same phenomenon as when you learn a new vocabulary word and then see it everywhere.

People heard "Claude is nerfed" and now they see it everywhere, they notice failures a lot more than they would have otherwise.

Doesn't matter that Claude is not, in fact, nerfed. Perception is powerful and most humans are not rational.

mmusc2d ago

All these tools are near feature parity. The GitHub CLI allows remote sessions and can run Anthropic models anyway

shomp2d ago

When you say "code on your phone" ... you don't mean what I think you mean do you? Like, are you actually using your phone to make code commits?

onlyrealcuzzo2d ago

Yes, you can do that with Claude Code.

Tell it what to do.

Commit, push to origin, review on GitHub.

Tell it to make changes, amend the commit, push --force-with-lease.

Mostly to test how much LLMs can actually scale development.

Depending on how long it takes them to clean up some architectural slop in the MIR lowering phase, the results could either be very impressive or not.

From a purely cost basis perspective, it's hard to argue they aren't killing it.

But from a multiplier perspective, it's up in the air how great they are.

It's proven to be a really nice experiment, because much of what I wanted to solve with a language are the problems inherent to LLM development.

So at the self hosting phase, I get a great opportunity to see if the language can actually deliver on what I dream for.

kridsdale12d ago

Considered Gemini?

operatingthetan2d ago

Gemini got a big reduction in usage limits this week. There was backlash and they added 3x usage for Antigravity a day later but I haven't really tried it out to get a feel for it yet.

seabrookmx2d ago

This was all supposed to be worked out prior to Cloud Next, but it wasn't. Ironically, they mentioned Claude in a few of their presentations at next.

And that was our solution. We are a big GCP customer but our whole team is on Claude now and much happier.

saulpw2d ago

Google has burnt all of its goodwill in dev communities so no, I don't think Gemini is worth consideration.

Akamant1d ago

pratikel1d ago

I’m curious to learn of your problem where you needed 17 Golang microservices (assuming these are newly created).

zkmon3d ago

andrekandre2d ago

it's kind of weird tho, jensen also said we should be burning tons of tokens as well... 'perceived quality' can't be the only reason these ceos are pushing token usage so hard, can it?

verdverm2d ago

reasons for token usage beyond expectations

1. right now, usage correlates with experimentation and learning; few, if any, know how to make these things effective on their own over long sessions of activity

2. long term, you should be using more than one agent at a time, because they are running in the background based on events (new direct message / something happened in e.g. GitHub)

rnxrx2d ago

andrewl-hn2d ago

They lay people off and look good in front of investors. Then they hire people, talk about "growth", and once again look good in front of investors.

This would never fly if stock market was rational. But it never is.

dawnerd2d ago

And if/when companies need to scale back their ai investments they can spin it too and the stock market will eat it up.

marcosdumay2d ago

They are just being AI efficient, and doing more while spending less on it :)

I wonder if this will happen before they have some obligatory debloating of the investors' exposure to the company.

dividedbyzero2d ago

kridsdale12d ago

At least the models don’t need health insurance, office space, a cafeteria, or have a threat of unionizing.

dividedbyzero2d ago

thewebguyd2d ago

Shh, that's the quiet part the investors don't want to say out loud.

Applejinx2d ago

user342832d ago

There's 10-15 labs near the frontier, and like 30 serious inference providers, over 70 total on OpenRouter.

With research and hardware near guaranteed to bring the efficiency way up, I'm not scared here of massive price hikes.

There is no moat.

JumpCrisscross2d ago

> If developers are being laid off because AI is better/faster/cheaper

Ekaros2d ago

visarga2d ago

> Especially as there has not been any new great revenue sources outside AI in recent years.

We can definitely do amazing things with AI, and it makes us have superpowers, but so does everyone else. My competition also uses AI. I have to keep up with an AI powered competition now.

thewebguyd2d ago

So you're getting 2 for the price of 1.5. Scale that up to 500 devs at a big company and it's a big chunk of change saved on payroll.

jayd162d ago

You're starting from the assumption that it's a 2x benefit. That's a massive leap.

thewebguyd2d ago

True, that was more hypothetical if it got good enough to 2x.

mrgoldenbrown2d ago

Also assuming that current API pricing is sustainable and not subsidized.

ilovecake19842d ago

This is economy dependent. It’s really Indians who will take the brunt of AI job losses.

__mharrison__2d ago

Interesting point. Outsource the outsourcers...

ako2d ago

ngc2482d ago

ares6232d ago

stock_toaster2d ago

I imagine layoffs are also very much "this quarter and next quarter" with regards to investor visibility.

While LLM Opex is "some future quarter" and very easy to co-mingle with other expenses.

o104493663d ago

I switched from Anthropic to OpenAI after spending ~$40K in equivalent token costs using Claude over 3 months.

gnat2d ago

beering2d ago

It doesn’t matter how good the harness is if the model does a bad job of planning and continuing from long context. A good harness cannot overcome a weak model.

robertkarlOP3d ago

Cancellation effective June 30. This was a _pilot_ launched in December that accidentally consumed their 2026 yearly target spend on AI!

I expect the r/LocalLLaMA guys to be going nuts about this news.

thewebguyd2d ago

From the article

> It was part of an effort to get project managers, designers, and other employees to experiment with coding for the first time.

I suspect they weren't as efficient as they could be with token use either. Sounds like they were trying to encourage non-developers to vibe code stuff

xienze2d ago

plaidfuji2d ago

totalhack2d ago

If you are talking about the Copilot built into VS Code, that's not been my recent experience at all. Very capable in agent mode since GPT 5.4 came out.

siva72d ago

Especially since gpt 5.5 it's on par with Opus 4.7 or Sonnet.

cbdevidal2d ago

mellosouls2d ago

I'm not sure if you are referring to the old or new plan?

Github Copilot offered probably the best value and was IMO underappreciated for a long time; I've been an annual subscriber since day 1.

The changes announced a few days ago completely revoke that value proposition, I doubt I'll continue with it.

cbdevidal2d ago

There’s a new plan? Ugh. I signed up about five months ago.

mellosouls2d ago

Yes, unfortunately. See e.g. the discussions below. I think I have seen multipliers of 9x cost for existing use cases:

Changes to GitHub Copilot individual plans

https://news.ycombinator.com/item?id=47838508

GitHub Copilot is moving to usage-based billing

https://news.ycombinator.com/item?id=47923357

Multipliers for annual subscribers:

https://docs.github.com/en/copilot/reference/copilot-billing...

__mharrison__2d ago

Copilot was the best deal in AI tooling for a few weeks there.

New pricing model changes that. I will still keep it around for autocompletion (for the rare times when I open up an editor).

cbdevidal2d ago

Can even buy more premium tokens for more Claude use, which I have done once. But most of the time the tokens included in the plan are sufficient.

keyle2d ago

The title is somewhat bait. It reads like MSFT is using less AI, while in fact it's just a forced swap to Copilot.

Arguably, Copilot is GPT 5? Not sure what the CLI offers behind the covers.

meowkit2d ago

Copilot is the name for the harness / wrapper of MSFT products

The CLI can swap to whatever model (/models) based on your subscriptions.

The copilots on desktop or Office Apps are likely just GPT5 nano or other tiny models with cheap inference

golf10522d ago

Employees (at least on my team) get access to the Claude models as well when using Copilot CLI.

patentlyze2d ago

I disagree. As someone who just got a new Windows laptop with Copilot baked (forced) in, I've tested Copilot a lot.

It. is. so. bad.

It feels like it's at least 1-2 years behind the current top models.

gbro3n2d ago

But there isn't a Copilot model, is there? Just a harness, and the VS Code Copilot extension is pretty good (haven't tried the TUI)

alternatex2d ago

keyle2d ago

Your free Copilot offering isn't the Copilot they're using within the company for coding assistance. It's confusing, I know.

tored2d ago

Copilot is not the same agent as GitHub Copilot.

andrewl-hn2d ago

RevEng2d ago

They do have agreements, but they aren't exclusive, and Microsoft and OpenAI have had a rather public falling out over the last year.

tyleo3d ago

Lots of these places measure employee token use with managers having dashboards. It seems like performative code production rather than making anything useful.

Speed without judgement always compounds badly.

andrewl-hn2d ago

Tokens are the current era's "lines of code per month"

https://www.folklore.org/Negative_2000_Lines_Of_Code.html

skeledrew2d ago

Well, that's the inevitable outcome of token-maxxing :shrugs:

thisislife22d ago

More here: Microsoft reports are exposing AI's real cost problem: Using the tech is more expensive than paying human employees - https://fortune.com/2026/05/22/microsoft-ai-cost-problem-tok...

guluarte3d ago

I think tech companies are doing layoffs partly because they need to cover AI operating expenses.

stock_toaster2d ago

I think so too, otherwise why wouldn't you put that (purported) increased capacity/output into improving your existing products or creating new ones, with the headcount that you already have?

maxignol2d ago

This might actually be clever, since Microsoft devs will be longing for Claude Code features, which might result in Copilot getting way better

ryanhecht1d ago

gradientsrneat2d ago

Related: Microsoft-owned GitHub recently switched to token-based billing:

https://github.blog/news-insights/company-news/github-copilo...

Claude tokens are priced by GitHub at a disproportionately premium price compared to Gemini and OpenAI. I wonder why?

https://docs.github.com/en/copilot/reference/copilot-billing...

dsagent3d ago

QuiEgo2d ago

kridsdale12d ago

An enormous percentage of America’s white collar work force has been doing this since 2023.

Fun fact, up until you face a consequence for crime, all crime is free! Have fun and go win the competition game against your co-workers.

cityofdelusion2d ago

At none of the 5 places I have worked is this possible, but they are also all in highly regulated industries. Firewalls block virtually everything by default.

QuiEgo2d ago

Fair, but I assume everything on my work laptop is key logged. Surely they would notice Claude phoning home from my company laptop? I suspect a network rule to look for that traffic is trivial?

InsideOutSanta2d ago

My guess is that at most companies, employees are prohibited from doing this, but not prevented.

uniclaude3d ago

That's very interesting to reconcile with the fact that, not too far away, Amazon employees feel incentivized to use as many tokens as possible.

HDThoreaun2d ago

boelboel2d ago

Makes sense why Anthropic wants to IPO as soon as possible as the growth right now comes from temporary wastefulness. Makes all the investments more risky.

sreekanth8502d ago

If you properly keep documents, architecture, and decision records, token consumption can be pretty low. I am managing everything with two Codex Plus subscriptions. Repo size is 300k LOC (backend).

usernametaken292d ago

I switched to OpenRouter and OpenCode a while ago. It is much cheaper, much much cheaper, and A LOT more reliable. Gemini in particular was a piece of trash when it came to uptime

zabil2d ago

Also it became very hard to convince management to keep both Claude code and GitHub Copilot enterprise licenses.

loloquwowndueo2d ago

Reminds me of when Steve Ballmer forbade his children to use iPods and pushed towards the Zune instead. Hahaha

andyfilms13d ago

mrweasel3d ago

Okay, but what if you're not Microsoft's size and don't have an R&D budget large enough to fund development of your own models and tools?

How does Microsoft plan to fix Copilot so that the cost will be so much lower than Claude that budget overruns won't be a problem for their own customers?

andyfilms12d ago

Smaller companies will have departments that distill larger models into something more specifically manageable and useful for them. At least, that's my personal prediction :)

mrweasel2d ago

I do think your prediction makes sense, because the AI really isn't the product, it needs to be baked into something and licensing the models saves you the R&D and cost of implementing your own.

jcgrillo2d ago

> Smaller companies will have departments that distill larger models into something more specifically manageable and useful for them.

kridsdale12d ago

Giving your workforce Claude is like giving everyone in the USPS a Ferrari.

There may be a spot of “good enough to pay for and make a profit” that exists.

onlyrealcuzzo2d ago

MSFT and Apple are taking the same approach.

The frontier model space costs 1000x as much to develop as the small language models, and is only 1.5 years ahead.

Factually, the frontier models have not paid for themselves. So, if you're MSFT and Apple, you don't need to run in a race where even the winner loses massively.

You can try to train models 1.5 years behind that are highly likely to be profitable, given your market position.

The average person is lagging behind what AI is capable of by 3+ years anyway...

So you can save 1000x on training and 10x on inference and just use SOTA small models.

Why spend $5B training a model that's for sure not going to make $5B (after inference costs) when you can spend $5M building one that WILL make far more than that after inference costs?

NitpickLawyer3d ago

> attempting to build their own models.

kridsdale12d ago

rglover3d ago

Curb Your Enthusiasm theme starts playing.

andrekandre2d ago

i was thinking more arrested development but that works as well

goldylochness2d ago

after having used claude for quite some time, i would buy puts on microsoft

visualphoenix1d ago

Good luck to them! I recently had the misfortune of fighting Copilot on a Github PR and it made me want to never contribute to the project again.

killerstorm3d ago

The way coding agents work is fantastically wasteful. All the megabytes of code are processed over and over and over, sometimes within just one session.

brookst2d ago

Claude code gets >98% KV cache hits. It’s not reprocessing unless you let the cache go cold (5 minutes, which is annoyingly short).

killerstorm2d ago

Normally KV cache works only if your context prefix is identical, but there are papers which demonstrate documents can be cached between different contexts.
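To make the prefix point concrete, here is a minimal Python sketch. It is a toy stand-in, not how any real inference server stores KV state: `PrefixCache`, its `process` method, and the token names are all hypothetical. It only illustrates that reuse stops at the first token where two requests diverge, so the same documents in a different order mostly miss the cache.

```python
from typing import List, Set, Tuple


class PrefixCache:
    """Toy model of a cache keyed on exact token prefixes (hypothetical)."""

    def __init__(self) -> None:
        # Every prefix that has already been processed by an earlier request.
        self._seen: Set[Tuple[str, ...]] = set()

    def process(self, tokens: List[str]) -> Tuple[int, int]:
        """Return (cached_tokens, newly_processed_tokens) for one request."""
        # Find the longest prefix of this request that was seen before.
        cached = 0
        for end in range(len(tokens), 0, -1):
            if tuple(tokens[:end]) in self._seen:
                cached = end
                break
        # Record every prefix of this request so later requests can reuse it.
        for end in range(1, len(tokens) + 1):
            self._seen.add(tuple(tokens[:end]))
        return cached, len(tokens) - cached


cache = PrefixCache()
print(cache.process(["sys", "file_a", "file_b", "q1"]))  # (0, 4): cold cache
print(cache.process(["sys", "file_a", "file_b", "q2"]))  # (3, 1): shared prefix reused
print(cache.process(["sys", "file_b", "file_a", "q1"]))  # (1, 3): same files, new order
```

The third request contains the same documents as the first, but only the leading `sys` token is reused, which is the limitation the papers mentioned above try to work around.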

brookst2d ago

Ah, understood, and thanks for the clarification!

beoberha2d ago

I believe OP is talking about new sessions or after compaction. He’s getting at the fact that LLMs are stateless and have to rediscover your codebase on every new session.

iainmerrick2d ago

To be fair, on the Monday morning after a holiday, that’s exactly what I’m like too.

dgellow2d ago

Are you sure that hitting the cache means you’re not paying for those tokens?

brookst2d ago

You pay 10% of the non-cached price (in quota or dollars). See https://platform.claude.com/docs/en/about-claude/pricing
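As a rough worked example of what those two numbers imply together, the Python sketch below blends a 98% cache hit rate with cached tokens billed at 10% of the normal input price. The function name `effective_input_cost` and the price of $1 per million tokens are assumptions for illustration only, and the sketch ignores any separate charge for writing to the cache.

```python
def effective_input_cost(
    total_tokens: int,
    cache_hit_rate: float,
    price_per_token: float,
    cached_price_ratio: float = 0.10,  # cached tokens billed at 10% (from the thread)
) -> float:
    """Blend full-price and cached-price input tokens into one cost."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    cached = total_tokens * cache_hit_rate
    uncached = total_tokens - cached
    return uncached * price_per_token + cached * price_per_token * cached_price_ratio


# Hypothetical price: $1 per million input tokens.
PRICE = 1.0 / 1_000_000

with_cache = effective_input_cost(1_000_000, 0.98, PRICE)
without_cache = effective_input_cost(1_000_000, 0.0, PRICE)
print(f"{with_cache / without_cache:.1%} of the uncached cost")
```

Under these assumptions the input bill comes to roughly 11.8% of the uncached cost (0.02 at full price plus 0.98 at one tenth), so cached tokens are cheap but not free.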

geoffbp2d ago

How efficient is Claude at cleaning up unused code and making things more simple - as good as it is at adding code / features?

wg02d ago

Microsoft should host DeepseekV4 internally for its developers. And you're welcome.

chris_money2022d ago

Microsoft does self-host Claude and GPT for GHCP

rvz2d ago

This is the smartest solution: self-host the model locally, on premises.

kridsdale12d ago

And by that, you mean, in Azure, surely.

wolvoleo2d ago

What's the point of eating your own dog food when the only thing you are doing is reselling other people's dog food? Microsoft don't have any competing LLM.

fredcallagan1d ago

gmerc2d ago

They got DeepSeek on Azure, would cut costs by 10x … if they ran it on Huawei

matt32102d ago

Tokens aren’t that much of an issue when you're not evaluated on the usage

sergiomattei2d ago

My impression is they're being cancelled in favor of full internal adoption of Copilot CLI, which has got much better over the past few months.

Shalomboy2d ago

I'm also a big fan of Copilot CLI, especially after demoing it to a coworker who liked Claude Code.

Kapura2d ago

"everybody needs to use these new AI tools or you will be left behind. no! not like that! the cheap, worser ones!"

heisenbit2d ago

How would one call such a strategy? Embrace and extend comes to mind.

lou13062d ago

This has really little to do with embrace and extend. They are not taking over an open standard or anything like that.

If anything, it's forced dogfooding, i.e., forcing their own workforce to beta-test their product.

dminik2d ago

To be fair, Microsoft dogfooding something for once would be great.

la647102d ago

jgalt2122d ago

What percent of internal Microsoft IP runs through Anthropic? Do they not care about trade secrets, or are only certain groups allowed to use tools that expose IP to external vendors?

nobodywillobsrv2d ago

This feels like the kind of bad-incentive problem we always hear about on here... like bugs and vipers.

DeathArrow2d ago

Doesn't MS have the compute to run GPT 5.5 for all its employees?

jadar2d ago

ndiddy3d ago

m1323d ago

The absolute state of the Hacker News main page in 2026. Thank you for taking your time to put it all together.

fishtoaster3d ago

https://archive.is/WfCta

ajd5553d ago

2nd link doesn't work. That would be a neat tool, to find the original article and see how many levels of AI summary it has gone through, a game of AI telephone!

OnionBlender3d ago

robertkarlOP3d ago

My bad. I had trouble finding the original source when I googled for it and grabbed a link. I was originally shown a screenshot of an x.com post.

robertkarlOP3d ago

I emailed dang to politely ask to make the link point to the Verge article since I can't update it.

q3k3d ago

i swear i'm going to start an amish community and internet where we forbid any technological development past 2019

call me a luddite, i'll be wearing it as a badge of honor

BoiledCabbage2d ago

Man, maybe it's time for me to give The Verge a subscription. They're the only ones actually doing any journalism here, with a bunch of AI blogs skimming off the top.

siva73d ago

boy i'm leaving the internet. sun is shining. was a good time here while it lasted.

scarmig3d ago

The artificial centipede.

sashank_15093d ago

Welp, this is the future we live in now

josefritzishere2d ago

AI slop ruined a story about AI? This thread is a story about itself.

thadk3d ago

At least Codex is trying to win validation on merit.
