Languages and domains that have leaned too far into package managers and small libraries are prone to fragility and security nightmares.
For any "serious" application or critical codebase, every library used needs to be vetted and verified to be maintained and secure.
I'd much rather deal with a bug in our code than a deprecated library or a breaking version update.
If we are to use a library outside of standard Unix tools or the stdlib in my field, better expect a nightmarish code review and a meeting.
Besides being fun, implementing it ourselves improves our skill level for the future, something vibe coding works against as well.
A project only becomes serious once legal is breathing down engineering's neck. Before that, it's usually the Wild West. After, it becomes a security circus trying to patch the technology deficiency (custom registries, complex linting and other analysis tooling, ...)
It's kinda like project-specific semantic monomorphization.
> If there is a battle tested, well known package that can help us, then recommend it BEFORE implementing large swaths of custom code.
Why does this reduce your attack surface? Can the functions in the library, unrelated to the ones you're using, be triggered by user input somehow?
And yes, I agree.
https://www.npmjs.com/package/boolean
>converts lots of things to boolean.
>3 million weekly downloads
This is insane.
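For scale, the core of such a conversion fits in a few lines of any language. A minimal Python sketch (the function name and the set of truthy spellings are my own choices, not that package's API):

```python
# Minimal value-to-bool coercion; the truthy spellings are an assumption.
_TRUTHY = {"true", "t", "yes", "y", "on", "1"}

def to_bool(value) -> bool:
    """Coerce common representations ("yes", "on", 1, ...) to a bool."""
    if isinstance(value, str):
        return value.strip().lower() in _TRUTHY
    return bool(value)

print(to_bool(" True "))  # True
print(to_bool("off"))     # False
```

Ten lines, no dependency, and the truthy set is explicit in your own code instead of buried in someone else's package.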
My Celery/RabbitMQ-based web crawler failed because of Cloudflare CAPTCHAs, so I figured it was best to empty out the queue and archive it. I asked Copilot what to do and it told me to use a CLI program. “Does that come with RabbitMQ?” “No, you download it from GitHub.” It offered to write me a Python script, but the CLI program did exactly what I needed. It got an option wrong, but I’d expect the same if I asked a friend for help.
This is analogous to folks who claim nobody is going to be able to learn software engineering any more. I think it is just the opposite. LLMs can be an awesome tool for learning.
I want people in the company to use it, but it's big and complicated (lots of chipsets and Bluetooth to boot).
I'm trying to design the library so the MCP can tell the LLM to pull it from our repo, read the prompt file for instructions, and automatically integrate it with the code.
I can't get it to do it consistently. There is a big gap in current LLM tech: there is no standard, consistent way to tell an LLM how to interface with a library (C/Python/Java/etc.).
The LLM more often than not will read the library and then start writing duplicate code.
Maddening.
I'm still not clear on what the best patterns for this are myself. I've been experimenting with dumping my entire documentation into the model as a single file - see https://github.com/simonw/docs-for-llms and https://github.com/simonw/llm-docs - but I'd like to produce shorter, optimized documentation (probably with a whole bunch of illustrative examples) that use fewer tokens and get better results.
In my experience the big problem is that the documentation is always terrible, you can't ask open-ended questions on stack overflow, the library's reddit (if any) has zero users, and anything asked on their discord is not searchable.
It's incredible that we still don't have a stack overflow that is just a forum.
> learning the library's design
without solid documentation. And if I am reading the library implementation thoroughly, I might as well implement what I need myself.
Invoking the smarter-than-thou effect is not a great starting point.
See e.g. https://www.sciencedirect.com/science/article/abs/pii/S01602...
If we’re considering a library, it would be prudent of us to take a look at the source code to see what exactly we’re pulling in. In the process, we would learn about the lay of the land, the API and the internals, and get at least an overview of the complexity of the problem it solves.
Anyways...I've had a few recurring issues with libraries. Note that the language is framed on a case-by-case basis...not general rules.
1. The essential implementation is a small amount of code...wrapped in structures just for packaging essential code. The wrapping code can be larger & more complex than the essential code.
2. There are small differences between what's needed & what's provided, which requires workarounds for the desired outcome. These workarounds muddy the logic & can be pervasive at scale.
3. There can be dissonance between the app architecture & the library api.
4. Popular libraries in particular...create a culture of thinking in terms of the library/framework. Leading to resource inefficiencies...And outright dismissing solutions that are a better match for the domain. In short, the library/framework api frames the problem & solution...Which may not match the actual problem & optimal solution.
5. The library/framework authors are concerned about promoting the library/framework, not solving the actual problem. Many problems need to be solved. The library/framework may just be the "Golden Hammer" to pound in your screw.
With all that being said...there are many useful libraries that define & solve problems in their particular domain. Particularly with common, well defined, appropriately scoped requirements.
Though the addition of pipes to the base language is helping fix that.
I don't think DK has anything to do with people releasing libraries that nobody should use.
(The quote comes from a different context, but works quite well here as well.)
(or "naïve")
On the article: some use cases, e.g. handling dates or fault-tolerant queues, have so many edge cases and are so mission-critical that relying on a battle-tested tool makes a lot of sense.
However, in my career I’ve seen a lot of examples of a package being installed to avoid 40-50 lines of well-thought-out code, and now a dependency is forever embedded in the system.
I think there is a catch with replacing libraries with LLM-generated code. Part of the benefit of skipping third-party libraries is the domain knowledge that gets built up: this is potentially lost with LLM-generated code.