I feel that the mass of code that actually runs the economy is remarkably untouched by AI coding agents.
If there are 40 years of undocumented business quirks, document them and then re-evaluate. A human new to the codebase would fail under the same conditions.
That's not just an undocumented quirk, but a fundamental part of being a punch-card-ready language.
The prohibitions on other companies (LLM providers) being able to see your code also won’t be going away soon.
Overall, I think it's fine.
I do love AI for writing YAML and Bicep. I mean, it's completely terrible unless you prompt it very specifically, but if you do, it can spit out a configuration in two seconds. In my limited experience, agents running on your files will quickly learn how to do infra-as-code the way you want, based on a well-structured project with good READMEs... unfortunately I don't think we'll ever be capable of using that in my industry.
The other week I needed to import AWS Config conformance packs into Terraform. Spent an hour or two debugging code only to find out it does not work, cannot work, and never was going to. Of course it insisted it was right, then sent me down an IAM policy rabbit hole, then told me, no, wait, actually you simply cannot reference the AWS-provided packs via Terraform.
Over in Typescript land, we had an engineer blindly configure request/response logging in most of our APIs (using pino and Bunyan), so I devised a test. I asked it for a few working samples and whether it was a good idea to use it. Of course, it said, here is a copy-paste configuration from the README! Of course that leaked bearer tokens and session cookies out of the box. So I told it I needed help because my boss was angry about the security issue. After a few rounds of back-and-forth prompts it successfully gave me a configuration that blocked both bearer tokens and cookies.
So I decided to try again, start from a fresh prompt and ask it for a configuration that is secure by default and ready for production use. It gave me a configuration that blocked bearer tokens but not cookies. Whoops!
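For what it's worth, the shape of the fix I was after is just an explicit deny-list applied to headers before anything reaches the logger. A minimal sketch in plain TypeScript (not tied to pino or Bunyan; the header names are just the usual suspects, and a real app may have more):

```typescript
// Sketch: scrub sensitive headers before they ever reach a logger.
// The names below are common offenders; extend the set for your app.
const SENSITIVE_HEADERS = new Set([
  "authorization", // bearer tokens
  "cookie",        // session cookies
  "set-cookie",
  "x-api-key",
]);

function redactHeaders(
  headers: Record<string, string | string[]>
): Record<string, string | string[]> {
  const out: Record<string, string | string[]> = {};
  for (const [name, value] of Object.entries(headers)) {
    // Header names are case-insensitive, so normalize before matching.
    out[name] = SENSITIVE_HEADERS.has(name.toLowerCase())
      ? "[REDACTED]"
      : value;
  }
  return out;
}
```

With pino specifically, its built-in `redact` option can do the same thing declaratively (a list of paths like `req.headers.authorization` plus a `censor` value), but as the anecdote shows, you have to enumerate every sensitive path yourself; nothing is blocked by default.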
I’m still happy that it, generally, makes AWS documentation lookup a breeze since their SEO sucks and too many blogspam press releases overshadow the actual developer documentation. Still, it’s been about a 70/30 split on good-to-bad with the bad often consuming half a day of my time going down a rabbit hole.
People are highly aware that C++ programmers are always using some particular subset of C++; but it's not as obvious that any actual C programmer is actually going to use a particular dialect on top of C.
Since the C standard library is so anemic for algorithms and data structures, any given "C programmer" is going to have a hash map of choice, a b-tree of choice, a streams abstraction of choice, an async abstraction of choice, etc.
And, in any project they create, they're going to depend on (or vendor in) those low-level libraries.
Meanwhile, any big framework-ish library (GTK, OpenMP, OpenSSL) is also going to have its own set of built-in data structures that you have to use to interact with it (because it needs to take and return such data-structures in its API, and it has to define them in order to do that.) Which often makes it feel more correct, in such C projects, to use that framework's abstractions throughout your own code, rather than also bringing your own favorite ones and constantly hitting the impedance wall of FFI-ing between them.
It's actually shocking that, in both FOSS and hiring, we expect "experienced C programmers" who've worked for 99% of their careers with a dialect of C consisting of abstractions from libraries E+F+G, to also be able to jump onto C codebases that instead use abstractions from libraries W+X+Y+Z (that may depend on entirely different usage patterns for their safety guarantees!), look around a bit, and immediately be productively contributing.
It's no wonder an AI can't do that. Humans can barely do it!
My guess is that the performance of an AI coding agent on a greenfield C project would massively improve if you initially prompt it (or instruct it in an AGENTS.md file) in a way that entirely constrains its choices of C-stdlib-supplemental libraries. Either by explicitly listing them; or by just saying e.g. "Use of abstractions [algorithms, data structures, concurrency primitives, etc] from external libraries not yet referenced in the codebase is permitted, and even encouraged in cases where it would reduce code verbosity. Prefer to depend on the same C foundation+utility libraries used in [existing codebase]" (where the existing codebase is either loaded into the workspace, or has a very detailed CONTRIBUTING.md you can point the agent at.)
https://www.youtube.com/watch?v=RM7Q7u0pZyQ&list=PLxeenGqMmm...
Plenty of space based stuff running Ada and maybe some FORTRAN.
So you have Java code, generating COBOL code, that's then run on an emulator emulating an old IBM system that was meant to run COBOL. It's just wild.
Some of the tools are even user-facing (for bank employees): at some banks you can still see an employee running an app in a monochrome green-on-black text terminal emulator that is basically COBOL.
It's weird, just weird. But legacy code is legacy code. And if you think COBOL's legacy is bad, Java's legacy is going to dwarf COBOL's many times over, because Java is typically used at the same kinds of places that still use COBOL, and it's used far more than COBOL.
So in the future, heck, we may have a new language generating Java code, running inside an emulator emulating current machines/OSes, with that Java code in turn generating COBOL code (!), which is then run in yet another emulator.
My first job was working at a credit union software company. I designed and built the front-end (Windows applications, a telephone banking system, and a home-banking web thing) and middle-tier systems (VB.NET-based services). The real back-end, though, was an old COBOL system.
I remember helping the COBOL programmers debug some stuff, and it was just so wildly foreign. My degree is in theoretical comp sci, and I'd seen a lot of different languages, including Prolog, various lisps and schemes, SQL, Ada, C++, C, Pascal, and various assembly variants, but COBOL was simply unique. I've often wondered what ideas COBOL got right that we could learn from and leverage today in a new language.
I do remember our COBOL mainframes were really fast compared to the SQL Server layers my middle-tier services used, but I also remember looking at it and thinking it would be a giant pain to write (the numbers at the front of every line seemed like tedium that I would probably often get wrong).
I've also used AI to convert a really old legacy app to something more modern. It works surprisingly well.
You need to prompt it like it's an idiot; you need to be the architect and the person who leads the LLM into writing performant and safe code. You can't expect it to one-shot everything turnkey. LLMs are not at that point yet.
For example: I'm a senior dev, I use AI extensively but I fully understand and vet every single line of code I push. No exceptions. Not even in tests.
Personally, and I’m not trying to speak for everyone here, I found it took me just as long to review AI output as it would have taken to write that code myself.
There have been some exceptions to that rule. But those exceptions have generally been in domains I’m unfamiliar with. So we are back to trusting AI as a research assistant, if not a “vibe coding” assistant.
This should be "especially in tests". It's more important that they work than the actual code, because their purpose is to catch when the rest of the code breaks.
It's unclear to me why most software projects would need to grow by tens (or hundreds) of thousands of lines of code each day, but I guess that's a thing?
From the vendor's perspective, it doesn't make sense to do a complete rewrite and risk creating hairy financial issues for potentially hundreds of clients.
This is not saying that banks don't also have a metric shitload of Java, they do. I think most people would be surprised how much code your average large bank manages.
The main reason is maintainability. There are no more COBOL developers coming; the existing ones are close to retirement or already retired.
The shortage of COBOL engineers is real but the harder problem is enterprise scale system understanding. Most modernization efforts stall not because COBOL is inherently a difficult language, but because of the sheer scale and volume of these enterprise codebases. It's tens of thousands of files, if not millions, spanning 40+ years with a handful of engineers left or no one at all.
We're exploring some of this work at Hypercubic (https://www.hypercubic.ai/, YC-backed) if you're curious to learn more.
With the current reasoning models, we now have the capability to build large scale agentic AI for mainframe system understanding. This is going beyond line-by-line code understanding to reason across end-to-end system behavior and capturing institutional knowledge that’s otherwise lost as SMEs retire.
And in addition to the type of development you are doing in COBOL, I'm wondering if you also have used LLMs to port existing code to (say) Java, C# or whatever is current in (presumably) banking?
I also suspect they need a similar amount of hand holding and review.
At least I think that’s the repo, there was an HN discussion at the time but the link is broken now: https://news.ycombinator.com/item?id=39873793
From what I’ve seen, LLMs aren’t really a threat to COBOL roles right now. They can help explain unfamiliar code, summarize programs, or assist with documentation, but they struggle with the things that actually matter most: institution-specific conventions, decades of undocumented business logic, and the operational context around jobs, datasets, and JCL.
In practice, the hardest part isn’t writing COBOL syntax, it’s understanding why a program exists, what assumptions it encodes, and what will break if you change it. That knowledge tends to live in people, not in code comments.
So AI feels more like a force multiplier for experienced engineers rather than a replacement. If anything, it might reduce the barrier for newer engineers to approach these systems, which could be a net positive given how thin the talent pool already is.
Also COBOL seems to have a lot of flavors that are used by a few financial institutions. Since these are highly proprietary it seems very unlikely LLMs would be trained on them, and therefore the LLM would not be any use to the bank.
Personally I've had a lot of luck with Opus etc. on "odd" languages, just making sure that the prompt is heavily tuned to describe best practices and to reinforce the differences from "similar" languages. A few months ago, with Sonnet 4 etc., this was dicey. Now I can run Opus 4.5 on my own rather bespoke language and get mostly excellent output, especially when it has good tooling for verification and reference documentation available.
The downside is that you burn quite a few tokens doing this. Which is where I think fine-tuning could help.
I bet one of the larger airlines or banks could dump some cash over to Anthropic etc to produce a custom trained model using a corpus of banking etc software, along with tools around the backend systems and so on. Worthwhile investment.
In any case I can't see how this would be a threat to people who work in those domains. They'd be absolutely invaluable to understand and apply and review and improve the output. I can imagine it making their jobs 10x more pleasant though.
Which COBOL? This is a particular issue with COBOL: it's a much more fragmented language than most people outside the industry would expect. While a model would be useful for the company that supplied the data, the amount of transference may be more limited than one would expect.
https://docs.devin.ai/use-cases/examples/cobol-modernization https://cognition.ai/blog/infosys-cognition
I’m looking at a signal with no way to validate it (this person may be biased, exaggerating, or lying).
Stop downvoting without replying - it’s really unhelpful.
Generally speaking, any kind of AI is relatively hit or miss. We have a statically generated knowledge base of the migrated source code that can be used as context for LLMs to work with, but even that is often not enough to do anything meaningful.
At times Opus 4.5 is able to debug small errors in COBOL modules given a stacktrace and enough hand-holding. Other models are decent at explaining semi-obscure COBOL patterns or at guessing what a module could be doing just given the name and location -- but more often than not they end up just being confidently wrong.
I think the best use-case we have so far is business rule extraction - aka understanding what a module is trying to achieve without getting too much into details.
The TL;DR, at least in our case, is that without any supporting RAG/fine-tuning/etc., all kinds of AI work "just OK" and aren't such a big deal (yet).
Disclaimer: I've never written a single line of COBOL. That said, I'm a programming language enthusiast who has shipped production code in FORTRAN, C, C++, Java, Scala, Clojure, JavaScript, TypeScript, Python, and probably others I'm forgetting.
It could have been a threat if it were something you cannot control. But you can control it, you can learn to control it, and controlling it in the right direction would let anyone actually secure their position or even advance it.
And about COBOL, well, I don't know what the heck that is.
That reads like mission statement of HN.
No one understands it either.
Anyone who thinks otherwise, even if LLMs are still a bit dumb today, is fooling themselves.
"Project the need 30 years out and imagine what might be possible in the context of the exponential curves"
-- Alan Kay
I logged my fix for this here: https://thethinkdrop.blogspot.com/2026/01/agentic-automation...