What if it wrote all of the boilerplate and just let you focus on the important bit that deserves your scrutiny?
You could say I failed every single project I ever built because it took many iterations to get to the final deliverable stage with lots of errors along the way. So more nuance would be needed.
But when it comes to LLMs suddenly we get all gleeful about how negatively we can frame the experience. Even among HN tech scholars.
Actually, I can tell; I ran splint on the C source and got things like this:
disk_space.c:144:16: Only storage bin.ref_bin (type void *) derived from variable declared in this scope is not released (memory leak)
So I'm looking into a Rust version with Rustler now.
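For anyone unfamiliar with that class of warning (its wording matches the output of the Splint static checker): it means a pointer was allocated inside a function and at least one exit path returns without freeing it. Here is a minimal sketch of the pattern and its fix. Only the field name `ref_bin` comes from the warning quoted above; the `bin_t` struct and the `path_length` function are hypothetical and are not taken from the actual `disk_space.c`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the struct named in the warning. */
typedef struct {
    void  *ref_bin;  /* heap buffer owned by this struct */
    size_t size;     /* number of bytes allocated */
} bin_t;

/* Copies `path` into a scratch buffer and returns its length,
   or -1 for a NULL or empty path or a failed allocation.
   Every exit path taken after malloc() succeeds releases bin.ref_bin. */
static long path_length(const char *path)
{
    bin_t bin;
    long  len;

    if (path == NULL) {
        return -1;              /* nothing allocated yet, nothing to free */
    }

    bin.size = strlen(path) + 1;
    bin.ref_bin = malloc(bin.size);
    if (bin.ref_bin == NULL) {
        return -1;              /* allocation failed, nothing to free */
    }
    memcpy(bin.ref_bin, path, bin.size);

    if (bin.size == 1) {
        /* Early return for an empty path. Dropping this free() is the
           kind of leak the checker reports. */
        free(bin.ref_bin);
        return -1;
    }

    len = (long)(bin.size - 1);
    free(bin.ref_bin);          /* normal path: release before returning */
    return len;
}
```

In a real NIF the buffer would more likely be handed to the BEAM (a binary or a resource) so the VM owns its lifetime, but the rule is the same: every early return needs to account for what was allocated before it.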
As you said, the very title of the article acknowledged that it didn’t produce a working product.
This is just outrage for the sake of outrage.
Then why not say "mostly didn't work"? I read the article and that's the impression I got.
The OP's comment isn't an outrage; it's more like you intentionally painted it as one, with a comment that itself reads more like an outrage.
At the very least, it's fine for personal projects, which is something I'm getting into more and more: remembering that computers were meant to create convenience, and writing small programs to make life easier.
I'd say absent some temporary hack to do something, my bad experiences won't let me say something is low risk. I worked at Microsoft years ago, and after zillions of vulnerabilities were attacked around the time of Windows 95 and computers getting on the net, we did serious code reviews in my team of the data access libraries. There were vast numbers of vulnerabilities. A group of 3 or 4 of us would sit in a room for 3 hours a day, one person a scribe, and we'd go over this C code that was ancient even then. We found problems everywhere; it was exhausting and shocking. The entire data access infrastructure was riddled with memory leaks, strings that were not length limited, input parameters that were not checked or sanitized, etc. I'm sure it was endemic across all components, not just there. We fixed some things, but we found so much shit.
Thank god I wasn't on the team trying to figure out what to do about those problems. I think they end-of-lifed a lot of stuff.
It's less a spectrum and more that it's relative. It depends on the attacker and what they seek to gain.
An unsecured server is an unsecured server. But there is a world of difference between being attacked by the CIA and being attacked by local script kiddies.
I had a bunch of fun getting ChatGPT Code Interpreter to write (and compile and test) C extensions for SQLite last year: https://simonwillison.net/2024/Mar/23/building-c-extensions-...
If you want to use a GC language for NIFs, you'd need to hook up your runtime somehow.
IMHO, it makes more sense to lean into the BEAM and use its resource management... my NIFs have all been pretty straightforward to write. All the boilerplate is what it is, and figuring out how to cooperate with the scheduler for long-running code or I/O can be a bit tricky, but if you can do a lot in a BEAM language, the native code ends up being like:
Check the arguments, do the thing, return the results.
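To make that three-step shape concrete, here is a rough sketch in plain C of a disk-space lookup, since that is what the NIF in this thread does. It deliberately leaves out `erl_nif.h` so that it compiles on its own; the `disk_result_t` struct and the `disk_space` function are made-up names for illustration, not the library's actual code. It assumes a POSIX system, because `statvfs` is not available on Windows.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/statvfs.h>

/* Hypothetical result type; a real NIF would build an Erlang term instead. */
typedef struct {
    int      ok;               /* 1 on success, 0 on failure */
    int      err;              /* errno-style code when ok == 0 */
    uint64_t total_bytes;      /* size of the filesystem */
    uint64_t free_bytes;       /* free space, including reserved blocks */
    uint64_t available_bytes;  /* free space for unprivileged users */
} disk_result_t;

static disk_result_t disk_space(const char *path)
{
    disk_result_t res = {0, 0, 0, 0, 0};
    struct statvfs vfs;

    /* 1. Check the arguments. */
    if (path == NULL || path[0] == '\0') {
        res.err = EINVAL;
        return res;
    }

    /* 2. Do the thing. */
    if (statvfs(path, &vfs) != 0) {
        res.err = errno;
        return res;
    }

    /* 3. Return the results. */
    res.ok = 1;
    res.total_bytes     = (uint64_t)vfs.f_blocks * vfs.f_frsize;
    res.free_bytes      = (uint64_t)vfs.f_bfree  * vfs.f_frsize;
    res.available_bytes = (uint64_t)vfs.f_bavail * vfs.f_frsize;
    return res;
}
```

In an actual NIF, step 1 would use calls like `enif_inspect_binary` and return `enif_make_badarg(env)` on bad input, and step 3 would build a tuple or map. Since `statvfs` can block on a slow or network filesystem, it would also be reasonable to register the function with `ERL_NIF_DIRTY_JOB_IO_BOUND` so it stays off the normal schedulers.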
I essentially ran out of patience and tried another approach. It involved an LLM running C code so I could compare the library's output against my implementation and make sure they matched byte-for-byte.
The C will never ship. I don't have practice writing C so I am very inefficient at it. I read it okay. LLMs are pretty decent help for this type of scrap code.
Then I noticed that some of the tests that failed were failing in really odd ways. Upon closer inspection, the generated processor had made lots of crazy assumptions about what it should be doing based upon specific values in YAML keys that were obviously unrelated to instructions.
Yeah, I agree with the author. This stuff can be incredibly useful, but it definitely isn't anything like an AGI in its current form.
I'm now re-vibe-coding it into Rust with the same process, but also using Grok 4 to get better results. It now builds and passes the tests on Elixir 1.14 to 1.18 on macOS and Ubuntu, but I'm still trying to get Grok 3 and 4 to fix the Windows-specific parts of the Rust code.
A lot of the choices here are made at the expense of the VM's health.
Also, why wouldn't anyone just use :disksup.get_disk_info/1? (That's immediate.) Calling :disksup.get_disk_info/1 won’t mess with the scheduler in the way a custom NIF or a big blocking port might.
I see the above code/lib and just see red flags all over the place.
Also, is Claude Code free to use?
The manual process has the upside that you get to see how the sausage is (badly) made. Otherwise, just YOLO it and put your trust in GenAI completely.
Furthermore, if there is the interim step of pushing to GitHub to trigger the build & test workflow and see if it works on something other than Linux, is the choice of Vibe-Coding IDE really the limiting factor in the entire process?
However, this NIF also returns more fields than the disksup function.