Languages and domains that have leaned too far into package managers and small libraries are prone to fragility and security nightmares.
For any "serious" application or critical codebase, every library used needs to be vetted and verified to be maintained and secure.
I'd much rather deal with a bug in our code than a deprecated library or a breaking version update.
If we are to use a library outside of standard Unix tools or the stdlib in my field, better expect a nightmarish code review and a meeting.
Besides being fun, implementing it ourselves improves our skill level for the future, something vibe coding works against as well.
A project only becomes serious once legal is breathing down engineering's neck. Before that, it's usually the Wild West. After, it becomes a security circus trying to patch the technology deficiency (custom registries, complex linting and other analysis tooling, ...)
It's kinda like project-specific semantic monomorphization.
> If there is a battle tested, well known package that can help us, then recommend it BEFORE implementing large swaths of custom code.
Why does this reduce your attack surface? Can the functions in the library, unrelated to the ones you're using, be triggered by user input somehow?
And yes, I agree.
https://www.npmjs.com/package/boolean
>converts lots of things to boolean.
>3 million weekly downloads
This is insane.
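For scale, the core of such a conversion fits in a few lines of any language. A minimal Python sketch (the function name and the set of truthy spellings are my own choices, not that package's API):

```python
# Minimal value-to-bool coercion; the truthy spellings are an assumption.
_TRUTHY = {"true", "t", "yes", "y", "on", "1"}

def to_bool(value) -> bool:
    """Coerce common representations ("yes", "on", 1, ...) to a bool."""
    if isinstance(value, str):
        return value.strip().lower() in _TRUTHY
    return bool(value)

print(to_bool(" True "))  # True
print(to_bool("off"))     # False
```

Ten lines, no dependency, and the truthy set is explicit in your own code instead of buried in someone else's package.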
My Celery/RabbitMQ-based web crawler failed because of Cloudflare CAPTCHAs, so I figured it was best to empty out the queue and archive it. I asked Copilot what to do and it told me to use a CLI program. “Does that come with RabbitMQ?” “No, you download it from GitHub.” It offered to write me a Python script, but the CLI program did exactly what I needed. It got an option wrong, but I’d expect the same if I asked a friend for help.
This is analogous to folks who claim nobody is going to be able to learn software engineering any more. I think it is just the opposite. LLMs can be an awesome tool for learning.
I want people in the company to use it, but it's big and complicated (lots of chipsets and Bluetooth to boot).
I'm trying to design the library so the MCP can tell the LLM to pull it from our repo, read the prompt file for instructions, and automatically integrate it with the code.
I can't get it to do it consistently. There is a big gap in current LLM tech: there is no standard, consistent way to tell an LLM how to interface with a library (C/Python/Java/etc.).
The LLM more often than not will read the library and then start writing duplicate code.
Maddening.
I'm still not clear on what the best patterns for this are myself. I've been experimenting with dumping my entire documentation into the model as a single file - see https://github.com/simonw/docs-for-llms and https://github.com/simonw/llm-docs - but I'd like to produce shorter, optimized documentation (probably with a whole bunch of illustrative examples) that use fewer tokens and get better results.
In my experience the big problem is that the documentation is always terrible, you can't ask open-ended questions on stack overflow, the library's reddit (if any) has zero users, and anything asked on their discord is not searchable.
It's incredible that we still don't have a stack overflow that is just a forum.
> learning the library's design
without solid documentation. And if I am reading the library implementation thoroughly, I might as well implement what I need myself.
Invoking the smarter-than-thou effect is not a great starting point.
See e.g. https://www.sciencedirect.com/science/article/abs/pii/S01602...
If we’re considering a library, it would be prudent of us to take a look at the source code to see what exactly we’re pulling in. In the process, we would learn about the lay of the land, the API and the internals, and get at least an overview of the complexity of the problem it solves.
Anyways...I've had a few recurring issues with libraries. Note that the language is framed on a case-by-case basis...not general rules.
1. The essential implementation is a small amount of code...wrapped in structures just for packaging essential code. The wrapping code can be larger & more complex than the essential code.
2. There are small differences between what's needed & what's provided, which requires workarounds for the desired outcome. These workarounds muddy the logic & can be pervasive at scale.
3. There can be dissonance between the app architecture & the library api.
4. Popular libraries in particular...create a culture of thinking in terms of the library/framework. Leading to resource inefficiencies...And outright dismissing solutions that are a better match for the domain. In short, the library/framework api frames the problem & solution...Which may not match the actual problem & optimal solution.
5. The library/framework authors are concerned about promoting the library/framework, not solving the actual problem. Many problems need to be solved. The library/framework may just be the "Golden Hammer" to pound in your screw.
With all that being said...there are many useful libraries that define & solve problems in their particular domain. Particularly with common, well defined, appropriately scoped requirements.
Though the addition of pipes to the base language is helping fix that.
I don't think DK has anything to do with people releasing libraries that nobody should use.
(The quote comes from a different context, but works quite well here as well.)
(or "naïve")
On the article: some use cases, e.g. handling dates or fault-tolerant queues, have so many edge cases and are so mission-critical that relying on a battle-tested tool makes a lot of sense.
However, in my career I’ve seen a lot of examples of a package being installed to avoid 40-50 lines of well-thought-out code, and now a dependency is forever embedded in the system.
I think there is a catch with replacing libraries with LLM-generated code. Part of the benefit of skipping third-party libraries is the domain knowledge that gets built up: this is potentially lost with LLM-generated code.