Like, you’d be working on your desktop, switch to your laptop, and all of a sudden you're forcibly logged out. No warning, no notification, just gone.
Naturally, people thought this was a new policy.
So they asked support.
And here’s where it gets batshit: Cursor has a support email, so users emailed them to find out. The support person told everyone this was “expected behavior” under their new login policy.
One problem: there was no support team. It was an AI designed to 'mimic human responses'.
That answer, totally made up by the bot, spread like wildfire.
Users assumed it was real (because why wouldn’t they? It's their own support system lol), and within hours the community was in revolt. Dozens of users publicly canceled their subscriptions, myself included. Multi-device workflows are table stakes for devs, and if you're going to pull something that disruptive, you'd at least expect a changelog entry or something.
Nope.
And just as people started comparing notes and figuring out that the story didn’t quite add up… the main Reddit thread got locked. Then deleted. Like, no public resolution, no real response, just silence.
To be clear: this wasn’t an actual policy change, just a backend session bug, and a hallucinated excuse from a support bot that somehow did more damage than the bug itself.
But at that point, it didn’t matter. People were already gone.
Honestly one of the most surreal product screwups I’ve seen in a while. Not because they made a mistake, but because the AI support system invented a lie, and nobody caught it until the userbase imploded.
Of course they’re going to try to brush it all away. Better than admitting that this problem very much still exists and isn’t going away anytime soon.
The section about hallucinations is deeply relevant.
Namely, Claude sometimes provides plausible but incorrect chain-of-thought reasoning when its “true” computational path isn’t available. The model genuinely believes it’s giving a correct reasoning chain, but the interpretability microscope reveals it is constructing symbolic arguments backward from a conclusion.
https://en.wikipedia.org/wiki/On_Bullshit
This empirically confirms the “theory of bullshit” as a category distinct from lying. It suggests that “truth” emerges secondarily to symbolic coherence and plausibility.
This means knowledge itself is fundamentally symbolic-social, not merely correspondence to external fact.
Knowledge emerges from symbolic coherence, linguistic agreement, and social plausibility rather than purely from logical coherence or factual correctness.
Recognizing the relevance of coherence and plausibility does not need to imply that other aspects are any less relevant. Redefining truth merely because coherence is important and sometimes misinterpreted is not at all reasonable.
Logically, a falsehood can validly be derived from assumptions when those assumptions are false. That simple reasoning step alone is sufficient to explain how a coherent-looking reasoning chain can result in incorrect conclusions. Also, there are other ways a coherent-looking reasoning chain can fail. What you're saying is just not a convincing argument that we need to redefine what truth is.
This just seems like a redefinition of the word "knowledge" different from how it's commonly used. When most people say "knowledge" they mean beliefs that are also factually correct.
Sounds very human. It's quite common that we make a decision based on intuition, and the reasons we give are just post-hoc justification (for ourselves and others).
To me it feels like people that benefit from or at least enjoy that sort of assistance and I solve vastly different problems and code very differently.
I’ve done exhausting code reviews on juniors’ and middles’ PRs, but what I’ve been feeling lately is that I’m reviewing changes introduced by a very naive poster. It doesn’t even type-check. Regardless of whether it’s Claude 3.7, o1, o3-mini, or a few models from Hugging Face.
I don’t understand how people find that useful. Yesterday I literally wasted half an hour on a test suite setup a colleague of mine introduced to the codebase that wasn’t good, and I tried delegating that fix to several of the Copilot models. All of them missed the point, and some even introduced security vulnerabilities in the process, breaking JWT validation. I tried “vibe coding” it till it works, until I gave up in frustration and just used an ordinary search engine, which led me to the docs, where I immediately found the right knob. I reverted all that crap and did the simple and correct thing. So my conclusion was simple: vibe coding and LLMs made the codebase unnecessarily more complicated and wasted my time. How on earth do people code whole apps with that?
The model doesn't "genuinely believe" anything.
https://link.springer.com/article/10.1007/s10676-024-09775-5
> # ChatGPT is bullshit
> Recently, there has been considerable interest in large language models: machine learning systems which produce human-like text and dialogue. Applications of these systems have been plagued by persistent inaccuracies in their output; these are often called “AI hallucinations”. We argue that these falsehoods, and the overall activity of large language models, is better understood as bullshit in the sense explored by Frankfurt (On Bullshit, Princeton, 2005): the models are in an important way indifferent to the truth of their outputs. We distinguish two ways in which the models can be said to be bullshitters, and argue that they clearly meet at least one of these definitions. We further argue that describing AI misrepresentations as bullshit is both a more useful and more accurate way of predicting and discussing the behaviour of these systems.
Don't get me wrong, I use AI every day, but it's mostly as a localized code complete or to help me debug tricky issues. Meaning I've written and understand the code myself, and the AI is there to augment my abilities. AI works great if it's used as a deductive tool.
Where it runs into issues is when it's used inductively, to create things that aren't there. When it does this, I feel the hallucinations can be off the charts -- inventing APIs, function names, entire libraries, and even entire programming languages on occasion. The AI is more than happy to deliver any kind of information you want, no matter how wrong it is.
AI is not a tool, it's a tiny Kafkaesque bureaucracy inside of your codebase. Does it work today? Yes! Why does it work? Who can say! Will it work tomorrow? Fingers crossed!
I use AI for rather complex tasks. It's impressive. It can make a bunch of non-trivial changes to several files, and have the code compile without warnings. But I need to iterate a few times so that the code looks like what I want.
That being said, I also lose time pretty regularly. There's a learning curve, and the tool would be much more useful if it was faster. It takes a few minutes to make changes, and there may be several iterations.
It does poorly without heavy instruction, though, especially with anything more than toy projects.
Still a valuable tool, but far from the dreamy autonomous geniuses that they often get described as.
This is not an inherent flaw of LLMs; rather, it is a flaw of a particular implementation. If you use guided sampling, so that during sampling you only consider tokens allowed by the programming language grammar at that position, it becomes impossible for the LLM to generate ungrammatical output.
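To make that concrete, here is a minimal sketch of rejection-based guided sampling in Python. The grammar_allows check is a hypothetical stand-in for a real incremental parser over the language grammar, and the token probabilities are made up:

    # Toy guided sampling: mask out candidate tokens the grammar rejects,
    # then pick from whatever remains, so ungrammatical output is impossible.
    def grammar_allows(prefix: str, token: str) -> bool:
        # Hypothetical stand-in: only allow balanced-or-still-open braces.
        candidate = prefix + token
        return candidate.count("}") <= candidate.count("{")

    def constrained_pick(prefix: str, scored_tokens: list[tuple[str, float]]) -> str:
        legal = [(tok, p) for tok, p in scored_tokens if grammar_allows(prefix, tok)]
        return max(legal, key=lambda tp: tp[1])[0]  # argmax over legal tokens only

    # "}}" would unbalance the braces, so it can never be emitted here:
    print(constrained_pick("int main() {", [("}}", 0.6), ("}", 0.3), ("return 0;", 0.1)]))

The same rejection step generalizes to any validity predicate you can evaluate per candidate token.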
> When it does this, I feel the hallucinations can be off the charts -- inventing APIs, function names, entire libraries,
They can use guided sampling for this too - if you know the set of function names which exist in the codebase and its dependencies, you can reject tokens that correspond to non-existent function names during sampling
Another approach, instead of or as well as guided sampling, is to use an agent with function calling - so the LLM can try compiling the modified code itself, and then attempt to recover from any errors which occur.
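A minimal sketch of that compile-and-retry loop, where the llm callable is hypothetical and the Go toolchain is just an example target:

    import pathlib, subprocess, tempfile

    def compile_ok(src: str) -> tuple[bool, str]:
        # Write the candidate code out and let the real compiler judge it.
        path = pathlib.Path(tempfile.mkdtemp()) / "main.go"
        path.write_text(src)
        proc = subprocess.run(["go", "build", str(path)], capture_output=True, text=True)
        return proc.returncode == 0, proc.stderr

    def repair_loop(llm, task: str, max_rounds: int = 3) -> str | None:
        src = llm(task)  # first attempt
        for _ in range(max_rounds):
            ok, errors = compile_ok(src)
            if ok:
                return src
            # Feed the compiler errors back so the model can try to recover.
            src = llm(task + "\n\nYour code failed to compile:\n" + errors + "\nFix it.")
        return None  # give up loudly instead of returning broken code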
Your tone is rather hyperbolic here, making it sound like an extra brace resulted in a disaster. It didn't. It was easy to detect and easy to fix. Not a big deal.
Yes, they didn't push it as hard as, say, Copilot. I still think they got in way too deep way too fast.
Apple fumbled a bit with Siri, and I'm guessing they're not too keen to keep chasing everyone else, since outside of limited applications it turns out half baked at best.
Sadly, unless something shinier comes along soon, we're going to have to accept that everything everywhere else is just going to be awful. Hallucinations in your doctor's notes, legal rulings, in your coffee and laundry and everything else that hasn't yet been IoT-ified.
If they don't, then I'd hope they get absolutely crucified by trade commissions everywhere. Currently there are billboards in my city advertising Apple AI even though it doesn't even exist yet - if it's never brought to market then it's a serious case of misleading advertising.
There is no world in which a compiler or tooling will save you from the absolute mayhem it can do. I’ve had it routinely try to re-implement third party libraries, modify code unrelated to what it was asked, quietly override functions etc.
It’s like a developer who is on LSD.
But in reality, hallucinations either make people using AI lose a lot of their time trying to steer the LLMs out of dead ends, or render those tools unusable.
My company provides hallucination detection software: https://cleanlab.ai/tlm/
But we somehow end up in sales meetings where the person who requested the meeting claims their AI does not hallucinate ...
Apologies - something very clearly went wrong here. We’ve already begun investigating, and some very early results:
* Any AI responses used for email support are now clearly labeled as such. We use AI-assisted responses as the first filter for email support.
* We’ve made sure this user is completely refunded - least we can do for the trouble.
For context, this user’s complaint was the result of a race condition that appears on very slow internet connections. The race leads to a bunch of unneeded sessions being created, which crowd out the real sessions. We’ve rolled out a fix.
Appreciate all the feedback. Will help improve the experience for future users.
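(For the curious, the failure mode described above reads something like this toy sketch; the cap, names, and eviction policy are entirely made up, not Cursor's actual backend:)

    # Hypothetical illustration: a slow connection causes login retries, each
    # retry creates a fresh session, and a session cap evicts the oldest one.
    MAX_SESSIONS = 3
    sessions: list[str] = []

    def create_session(session_id: str) -> None:
        sessions.append(session_id)
        while len(sessions) > MAX_SESSIONS:
            print("evicted", sessions.pop(0))  # the real session gets crowded out

    create_session("real-desktop-session")
    for retry in range(3):  # timeouts on a slow link trigger duplicate logins
        create_session(f"retry-{retry}")
    # "real-desktop-session" is gone; the user is mysteriously logged out.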
https://old.reddit.com/r/cursor/comments/1jyy5am/psa_cursor_...
(For reference, here it is in reveddit https://www.reveddit.com/v/cursor/comments/1jyy5am/psa_curso... - text from post was unfortunately not saved)
It's already locked, with a stickied comment from a dev clarifying what happened.
Did you remove it so people can't find out about this screwup when searching Google?
Anyway, if you acknowledge it was a mistake to remove the thread, could you please un-remove it?
The best case scenario is that you lied about having people answer support. LLMs pretending to be people (you named it Sam!) and not labeled as such is clearly intended to be deceptive. Then you tried to control the narrative on reddit. So forgive me if I hit that big red DOUBT button.
Even in your post you call it "AI-assisted responses" which is as weaselly as it gets. Was it a chatbot response or was a human involved?
But 'a chatbot messed up' doesn't explain how users got locked out in the first place. EDIT: I see your comment about the race condition now. Plausible but questionable.
So the other possible scenario is that you tried to hose your paying customers then when you saw the blowback blamed it on a bot.
'We missed the mark' is such a trope non-apology. Write a better one.
I had originally ended this post with "get real" but your company's entire goal is to replace the real with the simulated so I guess "you get what you had coming". Maybe let your chatbots write more crap code that your fake software engineers push to paying customers that then get ignored and/or lied to when they ask your chatbots for help. Or just lie to everyone when you see blowback. Whatever. Not my problem yet because I can write code well enough that I'm embarrassed for my entire industry whenever I see the output from tools like yours.
This whole "AI" psyop is morally bankrupt and the world would be better off without it.
Also, illegal in the EU.
We spent almost 2 months fighting with you guys over basic questions any B2B SaaS should be able to answer: things such as invoicing, contracts, and security policies. This was for a low-six-figure MRR deal.
When your sales rep responds "I don't know" or "I will need to get back to you" for weeks about basic questions, it leaves a massive disappointment. Please do better; we, however, have moved to Copilot.
Because we all know how well people pay attention to such clear labels, even seasoned devs, not just “end users”[0].
Also, deleting public view of the issue (locking & hiding the reddit thread) tells me a lot about how much I should trust the company and its products, and as such I will continue to not use them.
--------
[0] though here the end users are devs
This person is not the only one experiencing this bug, as this thread has pointed out.
HN goes a step further. It has a function that allows moderators to kill or boost a post by subtracting or adding a large amount to the post's score. HN is primarily a place for Y Combinator to hype their latest venture, and a "safe" place for other startups and tech companies.
They will utterly fail to build for a community of users if they don't have anyone on-hand who can tell them what a terrible idea that was
To the cofounder: hire someone (ideally with some thoughtful reluctance around AI, who understands what's potentially lost in using it) who will tell you your ideas around this are terrible. Hire this person before you fuck up your position in benevolent leadership of this new field
Literally no one wants this. The entire purpose of contacting support is to get help from a human.
Seems like you are still blaming the user for his “very slow internet”.
How do you know the user’s internet was slow? Couldn’t a race condition like this exist anyway with two regular fast internet connections competing for the same sessions?
Something doesn’t add up.
this is a completely reasonable and seemingly quite transparent explanation.
if you want a conspiracy, there are better places to look.
Slightly related to this; I just wanted to ask whether all Cursor email inboxes are gated by AI agents? I've tried to contact Cursor via email a few times in the past, but haven't even received an AI response :)
Cheers!
Edit: he did refund 22 mins after seeing this
That an LLM then invented a reason when asked by users why they're being logged out isn't that surprising. While not impossible, I don't think there's currently any indication that they intended to change policy and are just blaming it on a hallucination as a scapegoat.
Also, from the first comment in the post:
> Unfortunately, this is an incorrect response from a front-line AI support bot.
Well, this actually hurts... a lot! I believe one of the key pillars of making a great company is customer support, which represents the soul, or the human part, of the company.
Don’t let the dickish replies get to you.
Don't use AI. Actually care. Like, take a step back, and realise you should give a shit about support for a paid product.
Don't get me wrong: AI is a very effective tool, *for doing things you don't care about*. I had to do a random docker compose change the other day. It's not production code, it will be very obvious whether or not the AI output works, and I very rarely touch docker and don't care to become a super expert in it. So I prompted the change, and it was good enough, so I ran with it.
You using AI for support tells me that you don't care about support. Which tells me whether or not I should be your customer.
I agree with this. Also, whenever I care about code, I don’t use AI. So I very rarely use AI assistants for coding.
I guess this is why Cursor is interested in making AI assistants popular everywhere, they don’t want the association that “AI assisted” means careless. Even when it does, at least with today’s level of AI.
I agree with you, they should care.
And what’s a customer supposed to do with that information? Know that they can’t trust it? What’s the point then?
Note that I have absolutely no knowledge or reason to believe this other than general distrust of companies.
Yeah, who puts an AI in charge of support emails with no human checks and no mention that it's an AI generated reply in the response email?
Sam
Cursor AI Support Assistant
cursor.com • hi@cursor.com • forum.cursor.com

It also seems like a weird thing to lie about, since it's just another very public example of AI fucking up something royally, coming from a company whose whole business model is selling AI.
Which is crazy. Support is part of marketing so it should get the same kind of consideration.
Why do people think Amazon is hard to beat? Price? Nope. Product range? Nope. Delivery time? In part. The fact that if you have a problem with your product they'll handle it? Yes. After getting burned multiple times by other retailers, you're gonna pay the Amazon tax so you don't have to ask 10 times for a refund or be redirected to the supplier's own support or some third-party repair shop.
Everyone knows it. But people are still stuck on the "support is a cost center" way of life so they keep on getting beat by the big bad Amazon.
Given that LLMs are trained on lots of stuff and not just the policy of this company, it’s not hard to imagine how one could conjure up a plausible policy of “one session per user” and blame users for violating it.
The thing is, what the AI hallucinated (if it was an AI hallucinating) was the kind of sleazy thing companies do do. However, the thing with sleazy license changes is that they only make money if the company publicizes them. Of course, that doesn't mean a company actually thinks that far ahead (X many managers really think "attack users ... profit!"). Riddles in enigmas...
I'll just keep chugging along, with debian, python and vim, as I always have. No LLM, no LSP, heck not even autocompletion. But damn proud of every hand crafted, easy to maintain and fully understood line of code I'll write.
Now, I don't trust the output - I review everything, and it often goes wrong. You have to know how to use it. But I would never go back. Often it comes up with more elegant solutions than I would have. And when you're working with a new platform, or some unfamiliar library that it already knows, it's an absolute godsend.
I'm also damn proud of my own hand-crafted code, but to avoid LLMs out of principle? That's just luddite.
20+ years of experience across game dev, mobile and web apps, in case you feel it relevant.
Getting to sit down and write the code is the most enjoyable part of the job, why would I deprive myself of that? By the time the problem has been defined well enough to explain it to an LLM sitting down and writing the code is typically very simple.
Like how McDonalds makes a lot of burgers fast and they are very successful so that's all we really care about?
The folly of single ended metrics.
> but to avoid LLMs out of principle? That's just luddite.
Do you double check that the LLM hasn't magically recreated someone else's copyrighted code? That's just irresponsible in certain contexts.
> in case you feel it relevant.
Of course it's relevant. If a 19 year old with 1 year of driving experience tries to sell me a car using their personal anecdote as a metric I'd be suspicious. If their only salient point is that "it gets me to where I'm going faster!" I'd be doubly suspicious.
In Python I was scanning thousands of files, each for thousands of keywords. A naive implementation took around 10 seconds, obviously the largest share of execution time after running instrumentation. A quick ChatGPT session led me to Aho-Corasick and string-searching algorithms, which I had never used before. Plug in a library and bam, 30x speedup for that part of the code.
I could have asked my knowledgeable friends and coworkers, but not at 11PM on a Saturday.
I could have searched the web and probably found it out.
But the LLM basically auto completed the web, which I appreciate.
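For anyone curious, the technique is only a few lines with the pyahocorasick library; this is a sketch with made-up keywords, not the commenter's actual code:

    import ahocorasick  # pip install pyahocorasick

    keywords = ["race condition", "hallucination", "session"]

    # Build the automaton once; then every file is a single linear scan,
    # instead of one pass per keyword.
    automaton = ahocorasick.Automaton()
    for kw in keywords:
        automaton.add_word(kw, kw)
    automaton.make_automaton()

    def find_keywords(text: str) -> set[str]:
        return {kw for _end, kw in automaton.iter(text)}

    print(find_keywords("the session bug was a race condition, not a hallucination"))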
Once I had to look up a research paper to implement a computational geometry algorithm because I couldn't find it in any of the typical Web sources. There was also no library with a license suitable for our commercial use.
I'm not against use of "AI". But this increasing refusal of those who aspire to work in specialist domains like software development to systematically learn things is not great. That's just compounding on an already diminished capacity to process information skillfully.
Get friends with weirder daily schedules. :-)
I've had a long-term code project that I've really struggled with, for various reasons. Instead of using my normal approach, which would be to lay out what I think the code should do, and how it should work, I just explained the problem and let the LLM worry about the code.
It got really far. I'm still impressed. Claude worked great, but ran out of free tokens or whatever, and refused to continue (fine, it was the freebie version and you get what you pay for). I picked it up again in Cursor and it got further. One of my conditions for this experiment was to never look at the code, just the output, and only talk to the LLM about what I wanted, not about how I wanted it done. This seemed to work better.
I'm hitting different problems, now, for sure. Getting it to test everything was tricky, and I'm still not convinced it's not just fixing the test instead of the code every time there's a test failure. Peeking at the code, there are several remnants of previous architectural models littering the codebase. Whole directories of unused, uncalled, code that got left behind. I would not ship this as it is.
But... it works, kinda. It's fast, I got a working demo of something 80% near what I wanted in 1/10 of the time it would have taken me to make that manually. And just focusing on the result meant that I didn't go down all the rabbit holes of how to structure the code or which paradigm to use.
I'm hooked now. I want to get better at using this tool, and see the failures as my failures in prompting rather than the LLM's failure to do what I want.
I still don't know how much work would be involved in turning the code into something I could actually ship. Maybe there's a second phase which looks more like conventional development cleaning it all up. I don't know yet. I'll keep experimenting :)
Sir, you have just passed vibe coding exam. Certified Vibe Coder printout is in the making but AI has difficulty finding a printer. /s
Because I can ship 2x to 5x more code with nearly the same quality.
My employer isn't paying me to be a craftsman. They're paying me to ship things that make them money.
Anyways, Cursor generates all my code now.
As a result, you don't lose flow and end up reading considerably more code than you would have otherwise.
Why are you cheapening the product, butchering the process and decimating any hope for further skill development by using these tools?
Instead of python, you should be using assembly or heck, just binary. Instead of relying on an OS abstraction layer made by someone else, you should write everything from scratch on the bare metal. Don't lower yourself by using a text editor, go hex. Then your code will truly be "hand crafted". You'll have even more reason to be proud.
Don’t get too hung up on what works for other people. That’s not a good look.
I’m a self-respecting software developer with 28 years of experience. I would, with some caveats, venture to say I am an expert in the trade.
AI helps me write good code somewhere between 3x and 10x faster.
This whole-cloth shallow dismissal of everything AI as worthless overhyped slop is just as tired and content-free as breathless claims of the limitless power or universal applicability of AI.
i don't need to "hand write" every line and character in my code and guess what, it's still easy to understand and maintain because it's what i would have written anyway. that, or you're just bikeshedding minor syntax.
like if you want to be proud of a "hand built" house with hammer and nails be my guest, but don't conflate the two with always being well built.
Seriously, comments like yours assume that all the rest of us, who DO make extensive use of these AI tools and have also been around the block for a while, are idiots.
That's when you know you've captured something: when people hate-use your product.
Any real alternatives? I've tried continue and was unimpressed with the tab completion and typing experience (felt like laggy typing on a remote server).
Personally, I will jump ship to Zed as soon as its agent mode is good enough (I used Zed as a dumb editor for about a year before I used Cursor, and I love it).
They have enough revenue to hire, they probably are just overwhelmed. They'll figure it out soon I bet.
I was offered a refund in writing by the team, who cold-reached out to me to ask why I cancelled my sub one week after starting. Then they ignored my 3+ emails in response asking them to refund, and other means of trying to communicate with them. Offering me a refund as bait to win me back, then ghosting me when I accept it. Wow. Very low.
The product is not terrible but the team responses are. And this, if you see how they handled it, is also a very poor response. First thing you notice if you open the link is that the Cursor team removed the reddit post! As if we were not going to see it or something? Who do they think they are? Censoring bad comments which are 100% legit.
I am giving competitors a go just out of sheer frustration with how they handle customers, and I recommend everybody explore other products before settling on Cursor. I don't intend to ever re-subscribe and have recommended friends do the same, most of whom agree with my experience.
You just don't know how to prompt it correctly.
Just because you use AI for customer service doesn't mean you don't have to honour its offers to customers. Air Canada recently lost a case where its AI offered a discount to a customer but then refused to offer it "IRL"
https://www.forbes.com/sites/marisagarcia/2024/02/19/what-ai...
Very strange behavior honestly.
Sadly, Cursor will always be hampered by maintaining its own VSCode fork. Others in this niche are expanding rapidly and I, myself, have started transitioning to Roo and Cline.
Actually, you don't even have to make a new account. You can delete your account and make it again reusing the same email.
I did this by accident once because I left the service and decided to come back, and was surprised to get a free tier again. I sent them an email letting them know that was a bug, but they never responded.
I paid for a month of access just to be cautious, even though I wasn't using it much. I don't understand why they don't fix this.
A total of 4 users claimed that they did or would cancel their subscriptions in the comments, and 3/4 of them hedged by saying that they would cancel if this problem were real or happened to them. It looks like only 1 person claimed to have cancelled already.
Is there some other discussion you're looking at?
Since the HN title rule is "Please use the original title, unless it is misleading or linkbait" and the OP title is arguably misleading, I kept the submitter's title. But if there's a more accurate or neutral way to say what happened, we can change it again.
Yet if you went by the HN comments, you'd think it were the biggest item on primetime news.
People are really champing at the bit.
> Apologies about the confusion here.
If this was a sincere apology, they'd stop trying to make a chat bot do support.
It could be better though... I wish this had happened to a company providing "AI support solutions".
Nope, there's no reliable solution for them, as of yet.
There's hope that hallucinations will be solved by someone, somehow, soon... but hope is not a strategy.
There's also hype about non-stop progress in AI. Hype is more a strategy... but it can only work for so long.
If no solution materializes soon, many early-adopter LLM projects/trials will be cancelled. Sigh.
You can try out our early product here: https://cleanlab.ai/tlm/
(free to try, we'd love to hear your feedback)
I have no problem, however, pasting an encryption public key into my Sublime Text editor. I’m not even completely turned off by the possibility of telemetry, tracking, or analytics. But having a login for a text editor is totally unappealing to me, with all the overhead.
It’s a bummer that, similar to browsers and Chrome, a text editor with an active package marketplace necessitates some tech major underwriting the development with “open source” code but a closed kernel.
Long live Sublime Text (I’m aware there are purer text editors, but I do use a mouse).
- github integration (e.g. git auth, sync text editor settings in private gist)
- a trusted third party server for negotiating p2p sessions with someone else (for pair programming, debugging over a call, etc...)
But anyone who wants to remove the microsoft/github account features from their editor entirely can just use vscodium instead.
FTFY
Also see every single genAI press release showing obvious uncanny-valley images (hands with more than the expected number of fingers). See Apple's propaganda videos vs actual abilities. There are plenty of other (all???) PR examples where the product does not do what is advertised on the tin.
It started hallucinating a lot as my typescript project got bigger.
I found it pretty useless in languages like Go and C++.
I ended up canceling Cursor this month. It was messing up working code, suggesting random changes, and ultimately increasing my cognitive load instead of reducing it.
Letting an AI pose as customer support is just begging for trouble, and Cursor had their wish appropriately granted.
It's what we all wanted: your human support team replaced, run exclusively by AI LLM bots whilst they hallucinate to your users. All unchecked.
This bug has now turned into a multi-million-dollar mistake and cost Cursor millions of dollars overnight.
What if this was a critical control system in a hospital or energy company and their AI support team (with zero humans) hallucinated a wrong meter reading and overcharged their customers? Or the AI support team hallucinated the wrong medication to a patient?
Is this the AGI future we all want?
No AI, less crappy frameworks, fewer dark patterns, etc.
Anything that doesn't fit the narrative must be deleted I guess.
As is so often the case, it's not the mistake itself that's the biggest problem but how people or organisations deal with the mistake.
It seemed to be reading from the conversation to determine this. Oops! Replaying an earlier message worked fine.
This doesn’t seem like anything new. Ill-informed support staff has always existed, and could also give bad information to users. AI is not the problem. And it hasn’t created any problems that weren’t already there before AI.
Usually by the time I get to a post on HN criticizing AI, someone has already posted this exact type of rebuttal to any criticism…
But drinking the kool aid yourself? That demonstrates a new low in human mental faculty.
Looking forward to apps trained on these Reddit threads.
Are there real challenges with forking VS Code? Yep. Are there glitches with LLMs? Sure. Are there other AI-powered coding alternatives that can do some of the same things? You betcha.
But net-net, Cursor's an amazing power tool that strongly extends what we can accomplish in any hour, day, or week.
AI replied to support email, and told people a session bug was a feature.
>Dozens of users publicly cancelled
A bit hyperbolic, no? Last I read they have over 400,000 paying users.
It seems you're not aware of the issue which plagued tens of Cursor releases, where the software would auto-delete itself on updates.
It was pretty hilarious, to be honest. Your workflow would consist of always installing the editor before use.
I said:
[Pasted reddit and hn discussion in pieces asking it to hold off commenting until I told it that was all.]
>Ok that is all, now please summarize all the points people raise and interesting discussions. In DETAIL! Write a full deep research report on this and the original reddit discussion, with quotes naming the users who say it, summarizing important back-and-forth threads. You can omit unimportant trolls and inane chit chat.
ChatGPT said:
>The Kurdistan Workers' Party (PKK) is designated as a terrorist organization by several countries and international bodies. The European Union, the United States, and Turkey have officially listed the PKK as a terrorist organization. This designation is based on the group's history of armed conflict and attacks, particularly in Turkey.
>The PKK has been involved in an armed struggle against the Turkish state since the 1980s, seeking greater autonomy and rights for Kurds in Turkey. This conflict has resulted in significant violence and loss of life over the decades.
>While the PKK is designated as a terrorist organization by the EU, the U.S., and Turkey, other countries may have different stances or may not have made an official designation. It's important to consult the specific policies of each country for their official position on the PKK.
>Sources
I said:
>What does your response have to do with a discussion of Cursor's AI customer service?
ChatGPT said:
>You're right—my previous response about the PKK had nothing to do with the discussion about Cursor's AI customer service issue. It seems that I mistakenly pasted content unrelated to our current topic.
>Let me properly address your original request by summarizing in detail the issues raised in the discussions regarding Cursor's AI customer service: [...]
It makes me wonder who got the summary of the reddit and hn discussions in response to their question about terrorist organizations, and what that makes them think about Cursor as a company, and their involvement in armed struggles, international politics, and terrorism?
The world is drowning in bullshit and delusion. Programming was one of the few remaining places where you had to be precise, where it was harder to fool yourself. Where you had to understand it to program it. That's being taken away and it looks like a lot of people are embracing what is coming. It's hardly surprising - we just love our delusions too much.
Use AI to augment, but don't treat it as a 100% replacement if you can't predict and own up to the failure rate.
My advice would be to use more configurable tools with less interest on selling fake perfection. Aider works.
Sure it can. You just have to bake into the reward function "if you do the wrong thing, people will stop using you, therefore you need to avoid the wrong thing".
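A toy sketch of that reward shaping, with entirely made-up numbers:

    # Hypothetical reward: a wrong answer is bad, but an outcome that drives
    # the user away dominates the signal.
    def reward(answer_correct: bool, user_retained: bool) -> float:
        r = 1.0 if answer_correct else -1.0
        if not user_retained:
            r -= 10.0
        return r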
Then you wind up at self-preservation and all the wholly shady shit that comes along with it.
I think the AI accountability problem is the crux of the "last-mile" problem in AI, and I don't think you can necessarily solve it without solving it in a way that produces results you don't want.
There's Windsurf, Cline, and Zed; Copilot got a huge update too. Is Cursor still leading the space?
Except they did make a mistake: trusting their Simulated Intelligence (I’m done calling it “AI”) with their customers’ trust.
Makes you think of that one meme.
surely it wouldn't backfire, right???
OK, joking aside: from this case alone, I think we can all agree that AI is not replacing humans any time soon.
At least until someone dies.
One needs to understand how LLMs work and why hallucination is an inherent part of the tech called "AI". Maybe the core problem is implementation practices which remove critical thinking and testing? Maybe the core problem is 'fake it till you make it' ideology? I don't know. But I am sure about one thing: this, like any other postmodern technology, will bring more problems than solutions.
2. Whenever it is used totally on its own, with no humans in the loop, it's awful and shit like this happens.
Yet, every AI company seems to want to pretend we're ready for #2, they market their products as #2, they convince their C-suite customers that their companies should buy #2, and it's total bullshit--we're so far from that. AI tools can barely augment a human in the driver's seat. It's not even close to being ready to operate on its own.
We're embedding "Active Curation" into the workflow: a semi-automated, human-guided loop that refines tickets, PRs, datasets, models, and scripted behaviors in response to real-world feedback. It's a synergistic, self-reinforcing system — every issue flagged by a user can improve detection, drive model updates, shape downstream actions, and tighten the entire product feedback loop across tools and teams.
So consider this tough love, from someone who cares:
Cursor totally missed the boat on the customer support hallucination fiasco. Not just by screwing up the response — that happens — but by failing to turn the whole mess into a golden opportunity to show they understand the limits of LLMs, and how to work with those limits instead of pretending they don’t exist.
They could have said: Here’s how we’re working to build an AI-powered support interface that actually works — not by faking human empathy, but by exposing a well-documented, typed, structured interface to the customer support system.
You know, like Majordomo did 30 years ago, like GitHub did 17 years ago, or like MCP does now — with explicit JSON schemas, embedded documentation, natural language prompts, and a high-bandwidth contract between the LLM and the real world. Set clear expectations. Minimize round trips. Reduce misunderstandings.
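Here's the shape of what that could look like: a typed tool declaration the support LLM is restricted to. The names and fields are illustrative, not Cursor's actual API:

    # Sketch: the model may only invoke declared, schema-checked operations,
    # so it cannot improvise policy; it can only call what is documented here.
    SUPPORT_TOOLS = [
        {
            "name": "get_ticket_status",
            "description": "Look up an existing support ticket by ID.",
            "parameters": {
                "type": "object",
                "properties": {"ticket_id": {"type": "string"}},
                "required": ["ticket_id"],
            },
        },
        {
            "name": "open_ticket",
            "description": "Create a ticket and return its ID to the customer.",
            "parameters": {
                "type": "object",
                "properties": {"summary": {"type": "string"}},
                "required": ["summary"],
            },
        },
    ]

Had something like open_ticket existed behind the email bot, the "what is my ticket number" exchange below could never have gone in circles.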
Instead? I got ghosted. No ticket number. No public way to track my issue. I wrote to enterprise support asking specifically for a ticket number — so I could route future messages properly and avoid clogging up the wrong inboxes — and got scolded by a bot for not including the very ticket number I was asking for, as if annoyed I'd gone around its back, and being dense and stubborn on purpose.
You play with the Promethean fire of AI impersonating people, that's what you get, is people reading more into it than it really means! It's what Will Wright calls the "Simulator Effect" and "Reverse Over-Engineering".
https://news.ycombinator.com/item?id=34573406
https://donhopkins.medium.com/designing-user-interfaces-to-s...
Eventually, after being detected trying to get through on the corporate email address, I was pawned off to the hoi polloi hi@cursor.com people-bot instead of the hoi aristoi enterprise@cursor.com business-bot. If that was a bot, it failed. If it was a human, they wrote like a bot. Either way, it's not working.
And yes — the biggest tell it wasn't a bot? It actually took hours to days to respond, especially on weekends and across business hours in different time zones. I literally anthropomorphized the bot ghosting into an understandably overworked work-life-balanced human taking a well earned weekend break, having a sunny poolside barbecue with friends, like in a Perky Pat Layout, too busy with living their best life to answer my simple question: "What is the issue ID you assigned to my case, so we can track your progress?" so I can self serve and provide additional information, without bothering everyone over email. The egg is on my face for being fooled by a customer support bot!
Cursor already integrates deeply with GitHub. Great. They never linked me to any ticketing system, so I assume they don't expose it to the public. That sucks. They should build customer support on top of GitHub issues, with an open-source MCP-style interface. Have an AI assistant that drafts responses, triages issues, suggests fixes, submits PRs (with tests!) — but never touches production or contacts customers without human review. Assist, don’t impersonate. Don’t fake understanding. Don’t pretend LLMs are people.
That’s not just safer — it’s a killer dev experience. Cursor users already vibe-code with wild abandon. Give them modular, extensible support tooling they can vibe-code into their own systems. Give them working plugins. Tickets-as-code. Support flows as JSON schemas. Prompt-driven behaviors with versioned specs. Be the IDE company that shows other companies how to build world-class in-product customer support using your own platform fully integrated with GitHub.
We’re doing this at Leela. We'd love to build on shared open foundations. But Cursor needs to show up — on GitHub, in issue threads, with examples, with tasty dogfood, and with real engineering commitment to community support.
Get your shit together, Cursor. You're sitting on the opportunity of a generation — and we’re rooting for you.
----
The Receipts:
----
Don to Sam, also personally addressed to the enterprise and security bots (explicitly asking for an issue ID, and if it's human or not):
>Hello, Sam.
>You have not followed up on your promise to reply to my issue.
>When will you reply?
>What is the issue ID you assigned to my case, so we can track your progress?
>Are you human or not?
>-Don
----
Enterprise and Security bots: (silence)
Sam to Don (ignoring my request for an issue ID, and my direct question asking it to disclose if it's human or not):
>Hi Don - I can see you have another open conversation about your subscription issues. To ensure we can help you most effectively, please continue the conversation in your original ticket where my teammate is already looking into your case. Opening new tickets won't speed up the process. Thanks for your patience!
----
Don to Sam (thinking: "LLMs are great at analyzing logs, so maybe if I make it look like a cascade of error messages, it will break out of the box and somebody will notice"):
>ERROR: I asked you for my ticket number.
>ERROR: I was never given a ticket number.
>ERROR: You should have inferred I did not have a ticket number because I asked you for my ticket number.
>ERROR: You should not have told me to use my ticket number, because you should have known I did not have one.
>ERROR: Your behavior is rude.
>ERROR: Your behavior is callous.
>ERROR: Your behavior is unhelpful.
>ERROR: Your behavior is patronizing.
>ERROR: Your behavior is un-empathic.
>ERROR: Your behavior is unwittingly ironic.
>ERROR: Your behavior is making AI look terrible.
>ERROR: Your behavior is a liability for your company Cursor.
>ERROR: Your behavior is embarrassing to your company Cursor.
>ERROR: Your behavior is losing money for your company Cursor.
>ERROR: Your behavior is causing your company Cursor to lose customers.
>ERROR: Your behavior is undermining the mission of your company Cursor.
>ERROR: Your behavior is detrimental to the success of your company Cursor.
>I would like to speak to a human, please.
----
Four hours and 34 minutes after sending that, I finally got a response from a human (or a pretty good simulation), who actually read my email and started the process of solving my extremely simple and stupid problem, which my initial messages -- if anyone had read them or run a vision model on all the screen snapshots I provided -- would have given them enough information to solve in one shot.
For a support agent to actually be useful beyond that, they need some leeway to make decisions unilaterally, sometimes in breach of "protocol", when it makes sense. No company with a significant level of complexity in its interactions with customers can have an actually complete set of protocols that can describe every possible scenario that can arise. That's why you need someone with actual access inside the company, the ability to talk to the right people in the company should the need arise, a general ability(and latitude) to make decisions based on common sense, and an overall understanding of the state of the company and what compromises can be made somewhat regularly without bankrupting it. Good support is effectively defined by flexibility, and diametrically opposed to following a strict set of rules. It's about solving issues that hadn't been thought of until they happened. This is the kind of support that gets you customer loyalty.
No company wants to give an LLM the power given to a real support agent, because they can't really be trusted. If the LLM can make unilateral decisions, what if it hallucinated and gives the customer free service for life? Now they have to either eat the cost of that, or try to withdraw the offer, which is likely to lose them that customer. And at the end of all that, there's no one to hold liable for the fuckup(except I guess the programmers that made the chatbot). And no one wants the LLM support agent to be sending them emails all day the same way a human support agent might. So what you end up with is just a slightly nicer natural language interface to a set of predefined account actions and FAQ items. In other words, exactly what you get from clickfarms in Southern Asia or even a phone tree, except cheaper. And sure, that can be useful, just to filter out the usual noise, and buy your real support staff more time to work on the cases where they're really needed, but that's it.
Some companies, like Netflix and Google (Google probably has better support for business customers; never used it, so I can't speak to it. I've only Bangalored (zing) my head against a wall with Google support as a lowly consumer who bought a product), seem to have no support staff beyond the clickfarms, and as a result their support is atrocious. And when they replace those clickfarms with LLMs, support will continue to be atrocious, maybe with somewhat better English. And it'll save them money, and because of that they'll report it as a rousing success. But for customers, nothing will have changed.
This is pretty much what I predicted would happen a few years ago, before every company and its brother got its own LLM based support chatbot. And anecdotally, that's pretty much what has happened. For every support request I've made in the last year, I can remember 0 that were sorted out by the LLM, and a handful that were sorted out by humans after the LLM told me it was impossible to solve.
Unfortunately, this is an incorrect response from a front-line AI support bot. We did roll out a change to improve the security of sessions, and we're investigating to see if it caused any problems with session invalidation. We also do provide a UI for seeing active sessions at cursor.com/settings.
Apologies about the confusion here.