So there are constantly URLs that wouldn't resolve to anything if the full URL path were used.
But you just use the product id part only and then either redirect to the new categories or keep the same URL. Up to you.
I feel like all cases are really solvable and slug ids or ids with meaning are actually great.
So I am talking about URLs like /electronics/smartphones/apple-iphone-blue-32gb etc.
This is very good for UX and usability as well.
I don't think any of us would prefer a meaningless number to a username? But you can create a new account if you wish.
And of course we use meaningful names in source code.
One scheme I like for URLs is a meaningless id followed by an SEO string, where only the number is used for lookup and the request redirects if the SEO string doesn't match.
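A minimal sketch of that id-plus-slug scheme, to make the redirect rule concrete. The `/p/<id>-<slug>` path layout, the `PRODUCTS` table, and the `Response` type are all assumptions made up for illustration, not something from the thread.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical catalogue: opaque numeric id -> current canonical slug.
PRODUCTS = {
    1042: "apple-iphone-blue-32gb",
}


class Response(NamedTuple):
    status: int
    location: Optional[str] = None


# Assumed path layout: /p/<numeric id>, optionally followed by -<slug>.
_PATH = re.compile(r"/p/(\d+)(?:-([a-z0-9-]*))?")


def resolve(path: str) -> Response:
    """Look up by id only; the slug is decoration that gets corrected."""
    match = _PATH.fullmatch(path)
    if match is None:
        return Response(404)
    product_id = int(match.group(1))
    slug = PRODUCTS.get(product_id)
    if slug is None:
        return Response(404)
    if match.group(2) != slug:
        # Stale, missing, or mistyped slug: redirect to the canonical URL.
        return Response(301, f"/p/{product_id}-{slug}")
    return Response(200)
```

Renaming a product then only means changing the slug in the table; old links keep working because the lookup never depended on the slug.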
We also made the decision that the geographical cluster doing the processing would be embedded in the id. This may have been a mistake. We never got to the point where it caused problems, but if there had been a major reorganization then yeah, it could have been problematic.
It was also very difficult to work with. I had to refactor the Lovecraftian mess that generated them for an entire project of thousands of nodes. If they didn't come out exactly the same, a technician would have to spend days manually updating physical units to match the new ids. Thank God for unit tests.
I agree with the author. But practically speaking, for every example where semantic IDs broke down, there are probably 10 examples where it worked out fine. Just a thought.
The author provides several examples, and they were all chosen because there was a benefit, although some were poorly chosen, like a group name rather than a function (everyone remembers individuals' emails rather than functions). It is easy to dispute whether a choice was worth it after it caused a headache, but that is "sore loser bias": it doesn't account for all the worthwhile, effective choices elsewhere. Nor does it account for the effectiveness it bought the project in the meantime.
All the way to the extreme: do we prefer blog URLs that mention at least some category, date and a few words of the subject line? Or do we go with opaque machine-generated ones? Several lines long for good measure? Does the fact that you will never have to rename the opaque ones justify inflicting them on the users? How likely are you to ever rename? Some people will still choose the opaque URL! Do they earn points with their readers?
This is a great point! I should have mentioned it in the article, but in the examples mentioned, part of the nuisance was that the people involved could envision alternative solutions that would also have been effective without causing most of the long-lasting trouble.
But sometimes that's not possible (I do mention that they are not always avoidable).
I wonder, now that I am retired, if we should perhaps just give up the fight and instead concentrate on mitigation.
I think if the identifier just adds what type of data it is identifying, the extra meaning will only become obsolete at the same time that data does. And the extra type information can help avoid/debug problems where the wrong type of id was used.
Compare with strongly typed ids/type branding:
- https://hw.leftium.com/#/item/39174998
- https://www.peakscale.com/strongly-typed-ids/
- https://andrewlock.net/using-strongly-typed-entity-ids-to-av...
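A rough sketch of the "id carries only its type" idea, in the spirit of the strongly typed ids linked above. The `kind_value` string format and the names `TypedId`, `new_id`, and `parse` are assumptions for illustration; the linked posts may do this differently.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedId:
    kind: str   # what the id identifies, e.g. "user" or "order"
    value: str  # opaque part with no meaning of its own

    def __str__(self) -> str:
        return f"{self.kind}_{self.value}"

    @classmethod
    def parse(cls, text: str, expected_kind: str) -> "TypedId":
        # Assumes kinds never contain "_", so the first "_" is the separator.
        kind, sep, value = text.partition("_")
        if not sep or not value:
            raise ValueError(f"not a typed id: {text!r}")
        if kind != expected_kind:
            raise ValueError(f"expected a {expected_kind} id, got a {kind} id")
        return cls(kind, value)


def new_id(kind: str) -> TypedId:
    """Mint a fresh id whose only semantic content is its type."""
    return TypedId(kind, uuid.uuid4().hex)
```

Passing an order id where a user id is expected then fails loudly at `parse` instead of silently looking up the wrong row.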
I think a key point in the article is that "models become obsolete faster than we’d like".
On the other hand, you could argue that it's simply necessary to put in the effort to keep the data model up to date with your current needs, through API versioning, database migrations, etc.
Honestly I'm not sure which approach is less messy; maybe it depends on the team.
The difference is that IDs are part of the public API.
Your database schema (KV or otherwise) is not.
[0] This means no comma-separated lists in strings, JSON columns, serialized PHP objects, and so on.
In the backend, use your own primary key and put a leading index on that other thing. It can even be shitty and long, but if you have the first 50 chars indexed it will be fast as hell to look up in 99.999% of cases.
Natural keys serve as a great primary key when contextual meaning is important. A surrogate key is a key which does not have any contextual or business meaning.
If you are going to add semantic structure to an identifier, which is frequently useful and a good idea, best practice is usually to encrypt it before sending it to the external world. Encrypting a UUID-like structure is approximately free on modern computers.
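To illustrate the shape of this, here is a toy sketch that scrambles a 128-bit id with a keyed Feistel network built from HMAC-SHA256. This construction is an assumption chosen only to stay within the Python standard library; it is not a vetted cipher, and a real system would more likely use a block cipher such as AES from an audited crypto library. The names `encode`, `decode`, and the round count are likewise made up.

```python
import hashlib
import hmac

MASK64 = (1 << 64) - 1


def _round(key: bytes, round_no: int, half: int) -> int:
    """Keyed round function: HMAC over (round number, 64-bit half)."""
    msg = bytes([round_no]) + half.to_bytes(8, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big")


def encode(key: bytes, internal_id: int, rounds: int = 4) -> int:
    """Map a structured 128-bit internal id to an opaque 128-bit external id."""
    if not 0 <= internal_id < (1 << 128):
        raise ValueError("id must fit in 128 bits")
    left, right = internal_id >> 64, internal_id & MASK64
    for i in range(rounds):
        left, right = right, left ^ _round(key, i, right)
    return (left << 64) | right


def decode(key: bytes, external_id: int, rounds: int = 4) -> int:
    """Invert encode() by running the rounds backwards."""
    if not 0 <= external_id < (1 << 128):
        raise ValueError("id must fit in 128 bits")
    left, right = external_id >> 64, external_id & MASK64
    for i in reversed(range(rounds)):
        left, right = right ^ _round(key, i, left), left
    return (left << 64) | right
```

Whatever structure the internal id has (cluster, type, counter) stays readable to services holding the key, while the outside world only sees the scrambled value.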
The essence of a thing still changes after its meaningful identifier was assigned, yet it's a problem to change a thing's identifier.
The identifier should be nothing other than an identifier. The thing's properties are both infinite and mutable.
As a baby admin I had the genius idea to name servers after the state they were in once we started renting racks scattered around the country. Completely stupid. oh3 and pa6 etc continued to exist as entities long after they had been migrated or failed-over to their hot backups in other locations.
I'm thick, so I still didn't get it when I realized the state names were wrong, and so the next plan was hostnames/cnames based on roles instead of physical location. Exactly the same problem.
Super simple baby example, and applies the same to everything else. It wasn't only stupid for that one case and reasonable in other cases. It's the same wrong in all cases.
Been there. Problem is, now you can’t rotate your keys without breaking users and everyone and everything needs access to this key. This means the key is going to leak sooner or later. Also, someone will inevitably create an endpoint that does not encrypt, nay obfuscate, the identifier. Might as well not have bothered to obfuscate the ID to begin with.
Yeah, although encryption is basically a way of hiding semantics from the external world (or, everyone else but whoever generates them) no?
At my company, we often ran into the same question over and over, namely, whether a convention should go one way or the other. And in almost every case, we found it's better to just make a general implementation with the options supplied at runtime or in a configuration file. In other words, don't choose; implement a more general solution. That has become the policy in our company.
Bad architects make decisions. Good architects make deciding harmless.
Those systems are actually useful, you know. I have a friend who used to live at a place with an address like "Northern Living Block, 157". There were about 300 total buildings in that block, and the numbers were assigned to the buildings pretty randomly, so it was impossible to navigate unless you were either given explicit directions or had a map with you.
The routing info has to live somewhere, you know. Pushing it into the IDs means that you don't need to "have an entry in some database saying '1|INFRA|HOST|12' RUNS '1|APM|APPLICATION|23'", you don't have to update/delete it as needed, you don't need to look it up and deal with caching issues, etc.
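A small sketch of what "routing info in the id" could look like for ids shaped like '1|INFRA|HOST|12'. The meaning given to the four fields (account, domain, kind, local id) and the hostname pattern in `backend_for` are guesses made for illustration; the comment does not define them.

```python
from typing import NamedTuple


class EntityId(NamedTuple):
    account: int   # assumed meaning of the first field
    domain: str    # e.g. "INFRA" or "APM"
    kind: str      # e.g. "HOST" or "APPLICATION"
    local_id: int  # assumed meaning of the last field


def parse_entity_id(text: str) -> EntityId:
    parts = text.split("|")
    if len(parts) != 4:
        raise ValueError(f"expected 4 '|'-separated fields, got {text!r}")
    account, domain, kind, local_id = parts
    return EntityId(int(account), domain, kind, int(local_id))


def backend_for(entity: EntityId) -> str:
    """Derive where to send a request from the id alone, with no lookup.

    The hostname scheme is hypothetical.
    """
    return f"{entity.domain.lower()}-{entity.account}.internal.example"
```

For example, `backend_for(parse_entity_id("1|INFRA|HOST|12"))` gives `"infra-1.internal.example"` without touching a database or a cache.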