I feel that the mass of code that actually runs the economy is remarkably untouched by AI coding agents.
If there are 40 years of undocumented business quirks, document them and then re-evaluate. A human new to the codebase would fail under the same conditions.
That's not just an undocumented quirk, but a fundamental part of being a punch-card-ready language.
The prohibitions on other companies (LLM providers) being able to see your code also won’t be going away soon.
Overall, I think it's fine.
I do love AI for writing YAML and Bicep. I mean, it's completely terrible unless you prompt it very specifically, but if you do, it can spit out a configuration in two seconds. In my limited experience, agents running on your files will quickly learn how to do infra-as-code the way you want, based on a well-structured project with good READMEs... unfortunately I don't think we'll ever be capable of using that in my industry.
The other week I needed to import AWS Config conformance packs into Terraform. Spent an hour or two debugging code only to find out it does not work, cannot work, and never was going to. Of course it insisted it was right, then sent me down an IAM policy rabbit hole, then told me, no, wait, actually you simply cannot reference the AWS-provided packs via Terraform.
Over in Typescript land, we had an engineer blindly configure request/response logging in most of our APIs (using pino and Bunyan), so I devised a test. I asked it for a few working samples and whether it was a good idea to use it. Of course, it said, here is a copy-paste configuration from the README! Of course that leaked bearer tokens and session cookies out of the box. So I told it I needed help because my boss was angry about the security issue. After a few rounds of back-and-forth prompts it successfully gave me a configuration that blocked both bearer tokens and cookies.
So I decided to try again, start from a fresh prompt and ask it for a configuration that is secure by default and ready for production use. It gave me a configuration that blocked bearer tokens but not cookies. Whoops!
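For what it's worth, the shape of the fix I was after is just an explicit deny-list applied to headers before anything reaches the logger. A minimal sketch in plain TypeScript (not tied to pino or Bunyan; the header names are just the usual suspects, and a real app may have more):

```typescript
// Sketch: scrub sensitive headers before they ever reach a logger.
// The names below are common offenders; extend the set for your app.
const SENSITIVE_HEADERS = new Set([
  "authorization", // bearer tokens
  "cookie",        // session cookies
  "set-cookie",
  "x-api-key",
]);

function redactHeaders(
  headers: Record<string, string | string[]>
): Record<string, string | string[]> {
  const out: Record<string, string | string[]> = {};
  for (const [name, value] of Object.entries(headers)) {
    // Header names are case-insensitive, so normalize before matching.
    out[name] = SENSITIVE_HEADERS.has(name.toLowerCase())
      ? "[REDACTED]"
      : value;
  }
  return out;
}
```

With pino specifically, its built-in `redact` option can do the same thing declaratively (a list of paths like `req.headers.authorization` plus a `censor` value), but as the anecdote shows, you have to enumerate every sensitive path yourself; nothing is blocked by default.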
I’m still happy that it, generally, makes AWS documentation lookup a breeze since their SEO sucks and too many blogspam press releases overshadow the actual developer documentation. Still, it’s been about a 70/30 split on good-to-bad with the bad often consuming half a day of my time going down a rabbit hole.
People are highly aware that C++ programmers are always using some particular subset of C++; but it's not as obvious that any actual C programmer is actually going to use a particular dialect on top of C.
Since the C standard library is so anemic for algorithms and data structures, any given "C programmer" is going to have a hash map of choice, a b-tree of choice, a streams abstraction of choice, an async abstraction of choice, etc.
And, in any project they create, they're going to depend on (or vendor in) those low-level libraries.
Meanwhile, any big framework-ish library (GTK, OpenMP, OpenSSL) is also going to have its own set of built-in data structures that you have to use to interact with it (because it needs to take and return such data-structures in its API, and it has to define them in order to do that.) Which often makes it feel more correct, in such C projects, to use that framework's abstractions throughout your own code, rather than also bringing your own favorite ones and constantly hitting the impedance wall of FFI-ing between them.
It's actually shocking that, in both FOSS and hiring, we expect "experienced C programmers" who've worked for 99% of their careers with a dialect of C consisting of abstractions from libraries E+F+G, to also be able to jump onto C codebases that instead use abstractions from libraries W+X+Y+Z (that may depend on entirely different usage patterns for their safety guarantees!), look around a bit, and immediately be productively contributing.
It's no wonder an AI can't do that. Humans can barely do it!
My guess is that the performance of an AI coding agent on a greenfield C project would massively improve if you initially prompt it (or instruct it in an AGENTS.md file) in a way that entirely constrains its choices of C-stdlib-supplemental libraries. Either by explicitly listing them; or by just saying e.g. "Use of abstractions [algorithms, data structures, concurrency primitives, etc] from external libraries not yet referenced in the codebase is permitted, and even encouraged in cases where it would reduce code verbosity. Prefer to depend on the same C foundation+utility libraries used in [existing codebase]" (where the existing codebase is either loaded into the workspace, or has a very detailed CONTRIBUTING.md you can point the agent at.)
https://www.youtube.com/watch?v=RM7Q7u0pZyQ&list=PLxeenGqMmm...
Plenty of space based stuff running Ada and maybe some FORTRAN.
So you have Java code, generating COBOL code, that's then run on an emulator emulating an old IBM system that was meant to run COBOL. It's just wild.
Some of the tools are even user-facing (for bank employees): at some banks you can still see an employee running an app in a monochrome green-on-black text terminal emulator that is basically COBOL.
It's weird, just weird. But legacy code is legacy code. And if you think COBOL's legacy is bad, Java's legacy is going to dwarf COBOL's many times over, because Java is typically used at the same kinds of places that still use COBOL, and it's used far more than COBOL.
So in the future, heck, we may have a new language generating Java code, running inside an emulator emulating current machines/OSes, with that Java code in turn generating COBOL code (!), which is then run in yet another emulator.
My first job was working at a credit union software company. I designed and built the front-end (Windows applications, a telephone banking system, and a home-banking web thing) and middle-tier systems (VB.NET-based services). The real back-end, though, was an old COBOL system.
I remember helping the COBOL programmers debug some stuff, and it was just so wildly foreign. My degree is in theoretical comp sci, and I'd seen a lot of different languages, including Prolog, various lisps and schemes, SQL, Ada, C++, C, Pascal, and various assembly variants, but COBOL was simply unique. I've often wondered what ideas COBOL got right that we could learn from and leverage today in a new language.
I do remember our COBOL mainframes were really fast compared to the SQL Server layers my middle-tier services used, but I also remember looking at it and thinking it would be a giant pain to write (the numbers at the front of every line seemed like tedium that I would probably often get wrong).
I've also used AI to convert a really old legacy app to something more modern. It works surprisingly well.
You need to prompt it like it's an idiot; you need to be the architect and the person who leads the LLM into writing performant and safe code. You can't expect it to one-shot everything turnkey. LLMs are not at that point yet.
For example: I'm a senior dev, I use AI extensively but I fully understand and vet every single line of code I push. No exceptions. Not even in tests.
Personally, and I’m not trying to speak for everyone here, I found it took me just as long to review AI output as it would have taken to write that code myself.
There have been some exceptions to that rule. But those exceptions have generally been in domains I’m unfamiliar with. So we are back to trusting AI as a research assistant, if not a “vibe coding” assistant.
This should be "especially in tests". It's more important that they work than the actual code, because their purpose is to catch when the rest of the code breaks.
It's unclear to me why most software projects would need to grow by tens (or hundreds) of thousands of lines of code each day, but I guess that's a thing?
From the vendor's perspective, it doesn't make sense to do a complete rewrite and risk creating hairy financial issues for potentially hundreds of clients.
This is not saying that banks don't also have a metric shitload of Java, they do. I think most people would be surprised how much code your average large bank manages.
The main reason is maintainability. There are no more COBOL developers coming; the existing ones are close to retirement or already retired.
The shortage of COBOL engineers is real but the harder problem is enterprise scale system understanding. Most modernization efforts stall not because COBOL is inherently a difficult language, but because of the sheer scale and volume of these enterprise codebases. It's tens of thousands of files, if not millions, spanning 40+ years with a handful of engineers left or no one at all.
We're exploring some of this work at Hypercubic (https://www.hypercubic.ai/, YC-backed) if you're curious to learn more.
With the current reasoning models, we now have the capability to build large scale agentic AI for mainframe system understanding. This is going beyond line-by-line code understanding to reason across end-to-end system behavior and capturing institutional knowledge that’s otherwise lost as SMEs retire.
And in addition to the type of development you are doing in COBOL, I'm wondering if you also have used LLMs to port existing code to (say) Java, C# or whatever is current in (presumably) banking?
I also suspect they need a similar amount of hand holding and review.
At least I think that’s the repo, there was an HN discussion at the time but the link is broken now: https://news.ycombinator.com/item?id=39873793
From what I’ve seen, LLMs aren’t really a threat to COBOL roles right now. They can help explain unfamiliar code, summarize programs, or assist with documentation, but they struggle with the things that actually matter most: institution-specific conventions, decades of undocumented business logic, and the operational context around jobs, datasets, and JCL.
In practice, the hardest part isn’t writing COBOL syntax, it’s understanding why a program exists, what assumptions it encodes, and what will break if you change it. That knowledge tends to live in people, not in code comments.
So AI feels more like a force multiplier for experienced engineers rather than a replacement. If anything, it might reduce the barrier for newer engineers to approach these systems, which could be a net positive given how thin the talent pool already is.
Also COBOL seems to have a lot of flavors that are used by a few financial institutions. Since these are highly proprietary it seems very unlikely LLMs would be trained on them, and therefore the LLM would not be any use to the bank.
Personally I've had a lot of luck with Opus etc. on "odd" languages, just making sure that the prompt is heavily tuned to describe best practices and to reinforce the differences from "similar" languages. A few months ago, with Sonnet 4 etc., this was dicey. Now I can run Opus 4.5 on my own rather bespoke language and get mostly excellent output, especially when it has good tooling for verification and reference documentation available.
The downside is that you burn quite a few tokens doing this. Which is where I think fine-tuning could help.
I bet one of the larger airlines or banks could dump some cash over to Anthropic etc to produce a custom trained model using a corpus of banking etc software, along with tools around the backend systems and so on. Worthwhile investment.
In any case I can't see how this would be a threat to people who work in those domains. They'd be absolutely invaluable to understand and apply and review and improve the output. I can imagine it making their jobs 10x more pleasant though.
Which COBOL? This is a particular issue with COBOL: it's a much more fragmented language than most people outside the industry would expect. While a model would be useful for the company that supplied the data, the amount of transference may be more limited than one would expect.
https://docs.devin.ai/use-cases/examples/cobol-modernization https://cognition.ai/blog/infosys-cognition
I’m looking at a signal with no way to validate it (this person may be biased, exaggerating, or lying).
Stop downvoting without replying - it’s really unhelpful.
Generally speaking, any kind of AI is relatively hit or miss. We have a statically generated knowledge base of the migrated source code that can be used as context for LLMs to work with, but even that is often not enough to do anything meaningful.
At times Opus 4.5 is able to debug small errors in COBOL modules given a stacktrace and enough hand-holding. Other models are decent at explaining semi-obscure COBOL patterns or at guessing what a module could be doing just given the name and location -- but more often than not they end up just being confidently wrong.
I think the best use-case we have so far is business rule extraction - aka understanding what a module is trying to achieve without getting too much into details.
The TL;DR, at least in our case, is that without any supporting RAG/fine-tuning/etc., all kinds of AI work "just OK" and aren't such a big deal (yet).
Disclaimer: I've never written a single line of COBOL. That said, I'm a programming language enthusiast who has shipped production code in FORTRAN, C, C++, Java, Scala, Clojure, JavaScript, TypeScript, Python, and probably others I'm forgetting.
It could have been a threat if it were something you cannot control. But you can control it, you can learn to control it, and controlling it in the right direction would let anyone actually secure their position or even advance it.
And about COBOL, well, I don't know what the heck that is.
That reads like mission statement of HN.
No one understands it either.
Anyone who thinks otherwise, even if LLMs are still a bit dumb today, is fooling themselves.
"Project the need 30 years out and imagine what might be possible in the context of the exponential curves"
-- Alan Kay
I logged my fix for this here: https://thethinkdrop.blogspot.com/2026/01/agentic-automation...