- why developers/maintainers choose the package granularity they currently do. E.g., you can have tiny granular packages today (npm famously has single-function packages, which is widely derided, BTW). Developers break down packages in a way that makes sense to them to best develop, test, maintain, and release the package. If you reduce the overhead of small "grains" of packages, developers might choose to go a little more granular, but not a lot.
- why people want or need to update. People want or need security updates. People want or need new features and functionality.
So even with this magically fully in place (there's some tooling implied here), I don't think there would be much impact on updating.
(And people who tried to implement it or use packages that implemented it would be getting burned by version update mistakes -- this seems almost pathologically error-prone -- and when something does go wrong, it will take some new class of tool to even diagnose what went wrong where. People will end up with issues triggered by their personal upgrade path.)
BTW, patch updates don't have to be done at a source or function level at all. (E.g. the upgrade from version x to x+1 could be expressed as a delta. Or x to x+2, for that matter.) This idea has been popping up for decades, but the practical value must not be worth the trouble, because it never seems to catch on in a big way.
- either version the data structures/classes/shapes of dictionaries/whatever that a function accepts/returns;
- or have converters between different data versions and use them inside your functions.
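A minimal TypeScript sketch of the second option, with invented shapes (`UserV1`, `UserV2`) and an assumed rule for splitting the old `name` field:

```typescript
// Hypothetical: two versions of a "user" shape and an explicit
// converter, so code updated for V2 can still accept V1 data
// produced before the upgrade.
interface UserV1 { name: string; }
interface UserV2 { firstName: string; lastName: string; }

// Converter between data versions; the field-splitting rule is assumed.
function upgradeUser(u: UserV1): UserV2 {
  const [firstName, ...rest] = u.name.split(" ");
  return { firstName, lastName: rest.join(" ") };
}

// Inside the function, normalize to the latest data version first.
function greet(u: UserV1 | UserV2): string {
  const v2 = "name" in u ? upgradeUser(u) : u;
  return `Hello, ${v2.firstName} ${v2.lastName}`;
}

console.log(greet({ name: "Ada Lovelace" }));                  // old-shaped data
console.log(greet({ firstName: "Ada", lastName: "Lovelace" })); // new-shaped data
```

The converter approach keeps old data usable, but note it only goes forward; rolling back (the hard case mentioned below) would need an inverse converter, which may be lossy.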
As I said in another topic on HN about a project that hoped to bring hot-code reloading to a C REPL: changing the code inside the running program is the least of the problems. Flawlessly updating the data inside the running program so that the new code can proceed to work on it, that's the hard problem (think e.g. about rolling back an update that threw away a bunch of fields).
Unless these sorts of things are dealt with, any framework like this will just be solving the part of the problem that isn't really a problem.
The solution to dependency chaos is grouping dependencies together and versioning the larger group, not splitting into even more dependencies.
Let's take a fictional example: I import D3.js to use the parseDSV() function, after 2 years the method has not received any updates, but the package has gone from version 1.0.2 to 5.0.2. With a granular system, my function would still be on version 1.0.2 (because no changes were made), but with the current system I would have received an unnecessary update.
So, in this case, granular versioning would actually help to put an end to the chaos of dependencies.
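To make that concrete, here's a hypothetical sketch (in TypeScript, with invented fields and dates) of what per-function lock data could look like; nothing like this exists in npm today:

```typescript
// Hypothetical per-function lock data for the D3 example above:
// the package as a whole is at 5.0.2, but parseDSV has not changed
// since 1.0.2, so a consumer pinned to it sees no update.
const functionLock = {
  package: "d3-dsv",
  packageVersion: "5.0.2",
  functions: {
    parseDSV: { version: "1.0.2", lastChanged: "2016-06-01" },
  },
} as const;

// A granular updater would only flag functions whose own version moved.
function needsUpdate(pinned: string, current: string): boolean {
  return pinned !== current;
}

console.log(needsUpdate("1.0.2", functionLock.functions.parseDSV.version)); // false
```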
The problem is in NPM culture, and how much churn there is in packages and especially unnecessary breaking changes.
Avoid that and then the problem is reduced from constantly fighting to play API keepup to simply letting security updates flow through.
Let your patch version number go to the moon (which is no real problem in practice; computers do big numbers, and it is easily automatable).
This is a human culture problem if anything. Things cannot be left alone and be called "done" anymore, everything has to constantly "improve", breakage be damned. New connectors HAVE to be invented, even though the improvements are marginal, and now everything "old" doesn't work anymore.
How many times have you opened up a tool you use daily or weekly only to find the UI has shifted so much that you cannot figure out how to do the task you came for?
With SaaS, this has become much more prevalent than before. And it's not just the "npm culture" or even JavaScript; this exists everywhere in society, from cars to doors to chairs to airplanes and everything in between. Obviously, some sectors are better with standards than others, but it seems to be happening more and more, everywhere.
The problem of tracking changes across dependencies exists whether or not this is true. Perhaps the problem is more evident because feature development and software change processes have become more efficient (e.g. detecting the need for changes, shipping the new features expected to stay competitive, etc.). These efficient processes highlight a mismatch that was easily managed (or ignored) in the past.
It's "semver" (with an "e"), short for Semantic Versioning.
You could have different variable semantics for different namespaces or partitions!
:)
Joe Armstrong made a proposal for this (I’m pretty sure half tongue in cheek).
https://joearms.github.io/published/2015-03-12-The_web_of_na...
Once a node in this graph (a function, a type) changes, it may require a version change of anything that depends on it (a function, a type), because the behavior / contract may materially change even if the code itself did not change!
I suppose this is handled by changing the module version, because that module likely also contains the stateful object whose behavior is now different.
But equally the module version should change once its dependencies change, because the summary behavior of the functions inside the module is now different, as it incorporates the changed behavior of its dependencies.
Because of that I suspect we'll end up with situation similar to today's, with constant updates of our dependencies because their (distant transitive) dependencies changed.
Theoretically we could track dependencies at an individual function level. Then the version of a function may stay the same even if its module's dependencies have changed, because we can prove that those changes did not affect the function in any way. I don't think it's realistic for TypeScript specifically though, and I don't think it would bring much practical benefit.
E.g. suppose there is a new version of pthread_mutex_lock(&mutex) which relies on a larger structure with new members in it. The problem is that compiled programs call pthread_mutex_lock(&mutex) with a pointer to the older, smaller structure. If the library worked with that structure using the new definition, it would access out of bounds. Versioning takes care of this; the old clients call a backwards-compatible function. It might work with the new definition, but it avoids touching the new members that didn't exist in the old library.
But this is a very low-level motivation; this same problem of low-level layout information being baked into the contract shouldn't exist in a higher level language.
Regarding "nothing stopping us from making this versioning system completely automated" it seems like that depends on whether your language's type system supports that, and whether programmers follow the rules. For example, if you're relying on varargs/kwargs too much, it's going to be difficult to tell before runtime whether you've broken something.
Find a set of versions that is self-compatible and works, and pin all your versions to those specific versions, with a hash if possible. Upgrade on your schedule, not someone else's. Thoughts?
In practice, it will stay pinned for years until a CVE forces a patch upgrade that ends up triggering a dependency avalanche and weeks or months of headaches.
Package spec puts down what it should work with, you pin a specific version in that range for your app that you've tested.
Otherwise updating things will never happen. Unless you have full separation between upstream dependencies (so you can have multiple versions at the same time), which brings huge questions of its own, a single dep 3 steps away can stop you upgrading.
Ranges also communicate "this doesn't work with later than X" as well.
I don’t know what everyone else’s experience is, but I was updating dependencies either because of bugs identified in old versions, because I wanted a new feature, or because the old version was no longer supported. Pinning a dependency to a fixed version was not an option. Pinning the version of an individual function you use in your code seems just as problematic.
During updates the problem was to update all other dependencies as a result of the update. I can’t see how the proposed approach would solve it.
Another problem which I sometimes faced (less annoying) was an API change, i.e. having to call function B instead of function A, with slightly different parameters. That kind of refactor could be automated and supplied with library upgrades (some libraries already come with automatic migration “scripts”).
(It will of course also take some garbage collection mechanism to eventually remove old, disused versions when nobody depends on them any more.)
Version information is essentially a lossy compression: all the changes that go into a given release are summarized into a handful of numbers. Whether this happens at the component level or the function level only changes how lossy the versioning step is. I am not convinced it improves the workflow described above.
(This problem is not a technical problem.)
To actually get dependencies for our software, we need two mechanisms:
- (a) Some way to precisely specify what we depend on
- (b) Some mechanism to fetch those dependencies
Many package managers (NPM, Maven, etc.) use a third-party server for both, e.g.
- (a) We depend on whatever npm.org returns when we ask for FOO
- (b) Fetch dependency FOO by attempting to HTTP GET https://npm.org/FOO; fail if it's not 200 OK
Delegating so much trust to a HTTP call isn't great; so there's an alternative approach based on "lock files":
- (a) We depend on the name FOO with this hash (usually 'trust on first use', where we find the hash by doing an initial HTTP GET, etc. and store the resulting hash)
- (b) Fetch dependency FOO by looking in these local folders, or checking out these git repos, or doing a HTTP GET against these caches, or against these mirrors, or leeching this torrent, etc. Fail if we can't find anything which matches our hash.
The interesting thing about using lock files and hashes, is that our hash of dependency FOO depends on the contents of its lock file; and that content depends on the contents of FOO's dependencies, including their lock files; and so on.
Hence a lock file is a Merkle tree, which pins all of the transitive dependencies of a package: changing any of those dependencies (e.g. to update) requires altering all of the lock files in-between that dependency and our package. That, in turn, alters our lock file, and hence our package's hash.
The author is complaining that such dependency-cascades require a whole bunch of version numbers to get updated. I think it's better to keep track of these things separately: use your version number as documentation, of major/minor/patch changes; and keep track of dependency trees using a separate, cryptographically-secure hash. The thing is, we already have such hashes: they're called git commit IDs!
Other advantages of identifying transitive dependencies with hashes:
- They're not sequential. Our package isn't "out of date" just because we're using hash 1234 instead of 1235. All that matters are the version numbers. In other words, we're distinguishing between "real" updates (a version number changed) and "propagation" (version numbers stayed the same, but a dependency hash changed).
- They're unstructured; e.g. they give us no information about "major" versus "minor" changes, etc. (and hence no need to decide whether an update is one or the other!)
- They can be auto-generated; e.g. we might forget to update our version number, but there's no way we can forget to update our git commit ID!
- They're eventually-consistent: it doesn't matter how updates 'propagate' through each package; each sub-tree will converge to the same hash (NOTE: for this to work we must only take the content hash, not the full history like a git commit ID!).
For example, take the following ("diamond") dependency tree:
+--> B --+
| |
Our package --> A --+ +--> D
| |
+--> C --+
When D publishes a new version, B and C should update their lock-files; then A should update its lock-file; then we should update our lock-file. However, this may happen in multiple ways:
- B and C update; A updates (getting new hashes from B and C)
- B updates; A updates; C updates; A updates
- C updates; A updates; B updates; A updates
Using version-numbers (or git commit IDs!) would result in different A packages (one increment versus two increments; or commit IDs with different histories). Using content hashes will give A the same hash/lock-file in all three cases. This also means we're free to propagate updates whenever we like, rather than waiting for things to 'stabilise'; and it's safe to use private forks/patches for propagating updates if we like, without fear of colliding version numbers.
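The convergence claim can be sketched in a few lines of TypeScript (package contents invented; using Node's crypto module): each package's hash is a pure function of its own content plus its direct dependencies' hashes, so propagation order cannot matter:

```typescript
import { createHash } from "crypto";

// Merkle-style package hash: own content plus sorted direct-dep hashes.
function pkgHash(content: string, depHashes: string[]): string {
  return createHash("sha256")
    .update(content)
    .update(depHashes.slice().sort().join(","))
    .digest("hex");
}

// Diamond: our package -> A -> {B, C} -> D. D publishes new content:
const d = pkgHash("D v2 contents", []);
const b = pkgHash("B v1 contents", [d]);
const c = pkgHash("C v1 contents", [d]);

// Whether B and C updated together, or A rebuilt twice in between,
// the final lock-file hash of A is the same:
const a1 = pkgHash("A v1 contents", [b, c]);
const a2 = pkgHash("A v1 contents", [c, b]); // different propagation order
console.log(a1 === a2); // true
```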
Note that some of this propagation can be avoided if our build picks a single version of each dependency (e.g. Python requires this for entries in its site-packages directory; and Nixpkgs uses laziness and a fixed-point to defer choosing dependencies until the whole set of packages has been defined)
For example:
GET /user/9893
Accept: application/json; charset=utf8; version=1
No semantic versioning, just bumped the version number for each significant change. And yup, "significant" is in the eye of the caller, but it worked out well.

Now this is a bit different from TFA, because the server supported all the versions at the same time, so the caller could choose whatever mix of versions it wanted. This proposal is about assigning version numbers to individual functions rather than the library as a whole - essentially just a documentation/metadata change, with support from package managers.
Here's why this is relevant: the fact that the API was versioned this way had a big impact on how it evolved over time. At first it was pretty much the same as the usual `v1/user/9893` design. But as new versions of specific resources were added, it forced a decoupling of the underlying data model from the schema that were exposed in the interface. Each endpoint-version became an adaptor layer between the contract it offered to the caller and the more generalized, more abstract functionality offered by the data layer. That had costs as well as benefits. New endpoint versions often required an update to the data layer, which in turn required refactoring of older versions to work with the new data layer while continuing to adhere to their contracts. It worked out well, but it did require a change in implementation strategy.
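A hypothetical TypeScript sketch of that adaptor-layer shape (all names and the data-layer stub are invented): each endpoint version converts between its frozen contract and the shared, more general data layer:

```typescript
// The generalized data layer's record shape (stubbed for illustration).
interface UserRecord { id: number; givenName: string; familyName: string; }

function fetchUser(id: number): UserRecord {
  return { id, givenName: "Ada", familyName: "Lovelace" };
}

// Each endpoint version is a thin adaptor over the data layer.
// version=1 promised a single "name" field; keep honoring that contract.
const handlers: Record<number, (id: number) => object> = {
  1: (id) => {
    const u = fetchUser(id);
    return { id: u.id, name: `${u.givenName} ${u.familyName}` };
  },
  2: (id) => fetchUser(id), // version=2 exposes the newer shape directly
};

// Dispatch on the version parsed from the Accept header.
function getUser(id: number, version: number): object {
  return (handlers[version] ?? handlers[2])(id);
}

console.log(getUser(9893, 1)); // v1 caller still gets a single "name" field
```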
I think the lesson for this proposal is that changing the way package metadata is handled is just the first step. Adopting it could then create pressure for mix and match packaging of the interface functions - "Hey can I get a version of this library with addFunction 1.2.16 and divFunction 2.0.1? I don't want to change all my addition code just to get ZeroDiv protection." That could be done with the right tooling and library design.
Or maybe it makes DLL hell worse because now you have to solve semantic versioning compatibility for every function in a library and that's slower and more sensitive to semantic versioning mistakes. You could get work-arounds like "only ever change one function when you release a new version of the library" or "just bump all the major versions even if they haven't changed."
Or maybe linkers would get built that can do the logic, like "when package A calls package B, use addFunction 1.2.16, but when package C calls package B, use 1.3.1"
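That linker logic could be sketched as a simple per-caller resolution table (TypeScript; package names and versions invented to match the example above):

```typescript
// Hypothetical per-caller pinning: the same function in package B
// resolves to different versions depending on who is calling.
const resolutions: Record<string, string> = {
  "A->B.addFunction": "1.2.16",
  "C->B.addFunction": "1.3.1",
};

function resolve(caller: string, fn: string): string {
  const key = `${caller}->${fn}`;
  if (!(key in resolutions)) throw new Error(`unresolved: ${key}`);
  return resolutions[key];
}

console.log(resolve("A", "B.addFunction")); // "1.2.16"
console.log(resolve("C", "B.addFunction")); // "1.3.1"
```

Of course, the table itself is the hard part: something has to decide these pins and prove the two versions can safely coexist in one process.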
Anyway, I don't think this proposal is sufficient on its own. It would either have ripple effects throughout the language ecosystem, or be ineffective because of developers working around it, or not be adopted at all.
Stopped reading there.
Who cares about pure velocity if you are really trying to communicate something? We shouldn't measure the written word by pure word count, or by how quickly you can ship it. Not everything needs to be just some kind of hyper-advertising.
It just gives the impression that they care only so much about what they wrote, and therefore only so much about their readers!
In this case it may be ok because we may assume the author looked over the result and agrees with it. They could remove the citation as far as I'm concerned, the same way they don't have to cite their spell checker.
But a summary is a distillation of an understanding.
ChatGPT does not understand anything; it is merely pattern-matching against and recomposing other texts.
The only reason the result is even half way sensible is because as of today, most other text that it is matching against and recomposing was written by people who did understand what they were writing and writing about.
So I would perhaps agree that a person using it as part of the process of their own writing is a good use case. But I would not agree that ChatGPT can summarize things, and would not say that letting it do the entire job of interpreting and restating is a good use case.