- SQL/PGQ - A Graph Query Language
- JSON improvements (a JSON type, simplified notations)
Peter Eisentraut gives a nice overview here: https://peter.eisentraut.org/blog/2023/04/04/sql-2023-is-fin...
SQL:2023 is finished: Here is what's new - https://news.ycombinator.com/item?id=35562430 - April 2023 (153 comments)
SELECT t.j.foo.bar[2], ... FROM tbl t ...Using a graph DB with many underlying KV-store nodes, you can have a single graph spread over many machines representing e.g. Facebook's social graph, and run a query which "chases around" edges between vertices that live on different nodes, to solve that query, while ensuring that as little of that has to happen as possible — both by rebalancing vertices so that data is sharded at low-connection-degree points in the graph; and by consolidating the steps of queries that occur on the same node into single batch queries, such that the whole thing becomes (close to) a single map/reduce step.
There's nothing in Postgres that knows how to do that; if you had e.g. a graph stored in a Citus hypertable, and did a recursive CTE over it to do graph search, then you'd get pretty dang bad perf.
What you really want is to apply graph processing to data as it is. The SQL 2023 additions are a step in the right direction. I need to find a good detailed description of the constraints and semantics to assess how good it is.
I suspect if Postgres had a solid implementation of SQL/PCQ it would be a similar story for Neo4j.
In particular it is nice to see that a core dev views JSON dot accessing and PCQ as "sensible" future additions to Postgres.
[1] - https://peter.eisentraut.org/blog/2023/04/18/postgresql-and-...
For example, OK, I realize auto-incrementing IDs are not the most important thing in the world, and arguably not even a good approach in many cases. But sometimes you want them, and helpfully almost every database engine I know of has some kind of support for this, even if the semantics may differ. It's a super basic thing to want a unique ID that roughly counts upward on a table. You might have specific needs about re-using numbers and whatnot, but the general idea is very simple.
However: in practice, there is not an excellent way to do it that I can see. The closest thing I could find is `GENERATED BY DEFAULT AS IDENTITY` which, well, works. However, none of SQLite3, MSSQL, nor MariaDB support this to my knowledge.
This is relentlessly annoying.
Is it the standards fault, or the implementations? I honestly can't say. However, I definitely find this annoying, since I was really hoping that by this time, we'd at least have a nice clean subset of standard SQL you could count on anywhere, for popular database engines. Unfortunately, it's not quite there yet, necessitating ugly hacks to this day.
I assume this new standard doesn't really change anything on this regard, since it's a desync with implementations that is a problem, and it does not seem the standards committee really cares too much about this kind of thing. (I could be wrong, though, as I am saying this based on feel and not evidence.)
My mental model is that it's a mixture, but my life experience has been that a "standard" without a test harness or (at bare minimum) a reference implementation is just a bunch of navel gazing. For SQL specifically, that problem is even worse given the number of existing engines that move faster than the specification, so in the absence of leadership they just make stuff up
Natural language is also a catastrophically horrible specification mechanism, since your black/blue is my white/gold
As a result, for databases to compete on features, they must arbitrarily extend the SQL language standard; these modifications to the language then get backfilled into the standard, and runs headfirst into backwards compatibility, and suddenly no one agrees on really what the feature should precisely be so they it becomes an optional part of the standard, which really just means that it isn't standardized.
In any sane language, you wouldn't need different databases to add specific support for GENERATED BY DEFAULT AS IDENTITY, and especially not for 15 different syntax's used in 15 different databases to specify kinda sorta not really the same thing -- it would simply be a function, one you could write yourself, or provided by the standard library. It wouldn't be up to the RDBMS to offer support beyond actual language features -- it'd just be up to you to update your libraries.
Very good point. Non-composable mean non-scalable (language).
I wonder if it would be possible to define a very small robust standard subset of SQL with a standard for extending it. But that would take away the customer-lock-in which is what all big DB-vendors desire.
SQLite SQL != MySQL SQL != ISO SQL
---
As @rgbgraph points out below, the price is actually several times that. There are several parts to the standard, and that US$230 is per part.
Not all specs are particularly accessible (e.g. ECMAScript is often hard to follow if you haven’t spent a fair bit of time around it—it’s mostly natural-language code that works with a ECMAScript virtual machine), but most of the time, I would much rather have the actual spec over someone’s digested summary that covers what they thought was important, but regularly lacks details important for my situation. Some specs are absolutely magnificent. The HTML Standard is my very favourite as both a piece of spec work and as reference material.
Seriously, specs are really good stuff and it makes me sad how people often ignore them because they assume they’ll be inscrutable. (Similar remarks apply to legislative texts. They’re normally pretty easy to understand, and you find all kinds of sometimes fun and sometimes useful gems that potted summaries exclude.)
My only issue is with the width. While whitespace between the sides and the centre content is very useful, this isn’t the 1990s anymore with its 1024×768 monitors. You _can_ make the centre column responsive to the overall width of the screen. Doing so can also give you a lot more room to do things, and make it easier to read. Your line-height is already great, and is perfect for text blocks a good half again wider if not twice as wide. Even on my vertical monitors, which are only 1500px wide, that centre column is pretty much claustrophobically narrow.
How we could we add Snowflake to the compatibility matrices?
It's popular enough to deserve it, IMHO.
Which is even more hilarious, considering the ISO is already being funded by the tax dollars of member countries.
A bit like having to pay a journal to get access to research papers: ridiculous.
Trying to throw together a list of them all, I'm only finding 5 though:
• https://www.iso.org/standard/76583.html
• https://www.iso.org/standard/76584.html
• https://www.iso.org/standard/76585.html
• https://www.iso.org/standard/76586.html
• https://www.iso.org/standard/76587.html
Any idea about the others?
"Inclusive and sustainable industrialization, together with innovation and infrastructure, can unleash dynamic and competitive economic forces that generate employment and income"
Not clear how an arbitrary levy as this can contribute to an SDG goal. It is the definition of exclusion. A bright new thing can use the $230 to get several good database books.
Surely ISO has enough corporate stakeholders that can defray the cost of running the standardization process.
Asking for a friend, of course.
Does anyone else find value in what's in an ISO standard?
This isn't SQL-specific, but this is 100% the problem for me. There's such a big culture gap between the way that we do things in most of the tech world and ISO, and one of the biggest clashes is this weird $180 PDF thing.
If I want to implement a new standards-compliant HTML parser, I can hop right onto whatwg.org and view the complete standard instantly [0]. It's massive and complicated, but it's freely accessible to anyone interested.
In contrast, if I want to implement an ISO 8601-compliant date parser, ISO wants me to buy their PDF for CHF166 (~$180 USD). This spec is for a standard that is orders of magnitude less complex, and they're charging through the nose for it.
I'm unclear what makes the difference between a standard that can be maintained by a community for the benefit of everyone and a standard that needs to be locked behind a paywall.
The only real way of fixing it is for enough people to ignore ISO so they become irrelevant.
If you are building a new DB engine (toy or not), don't use SQL. Either design a new spec or use something that's more openly specified (maybe GraphQL or EdgeQL).
Ridiculous.
ISO defends its own model by arguing that it produces greater independence from vendors, being less reliant on them for funding and thus making it easier for them to say "no" to them. One big difference is ISO is not an industry-specific standards body, it has standards for all kinds of things that have nothing to do with computers – screw threads, metallurgy, analytical chemistry, oil and gas pipelines, you name it. Possibly their defence makes more sense for some of those other industries than it does for ours.
If the major SQL players got together (major proprietary vendors and leading open source projects), they could create their own SQL standards process to supersede the ISO one, and release the standard freely. The ISO standard could still exist, but it could turn into one of these ISO standards where ISO just adopts the text of an existing standards organisation - e.g. the ECMAScript standard is developed by ECMA. Initially ISO republished ECMA's standards under their own number (ISO/IEC 16262), now instead they publish a 3 page standard which just incorporates ECMA's by reference (ISO/IEC 22275). They offer it for free download [0], but are also willing to sell you a copy if you are desperate to give them money. ISO's SQL standard could turn into the same thing – but, that would require the SQL community to decide to push for that, and I'm not sure any of the players feel sufficiently motivated to do it.
[0] https://standards.iso.org/ittf/PubliclyAvailableStandards/in...
Somewhat ironically the standards-body also benefits from such bloat because that is the product they are selling. The bigger the standard the more money they can ask for it.
Finally, for the folks who need professional access to this document, such as RDBMS implementors or professional developers using an RDBMS, the price is a pittance. To anyone for whom the price is a problem - perhaps someone writing a free software application - the lack of access to the standard isn’t a problem, because what they really need is documentation on how their implementations work, and two superb implementations - PostgreSQL and SQLite - have voluminous documentation and they’re 100% free of charge.
The problem is that the standards have to be bought. They should be in the public domain. But someone can still be paying for it (eg a government, or an international organization funded by various public bodies).
While obviously standards don't have virgin birth (some costs are involved) it does smack as rent extraction: By definition there is only one standard so you can't ask for competitive pressures to lower the price.
It might be interesting to have a comparison of where major databases stand (or plan to be) with respect to SQL/PGQ
Ridiculous.
I think tax-payer money should pay for standards, because they benefit us all. It is like the highway system, or clean air, and water.
PostgreSQL does mostly aim at conforming to the standard. They will invent new syntax when needed, But compared to the those previously mentioned, Postgres seems to prefer to stick closer to the standard whenever possible, including adding standard syntax for existing features whenever possible.
PostgreSQL does have some places where there is deliberate non-conformance (beyond just incompletely implemented features). They document many deliberate deviations (other than unimplemented or partially implemented features) and if they think they can be fixed in the future or not: https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL_Standard . Looking at the list I'd say only one especially likely to bite a developer is the default escape character for LIKE clauses, or the non-standard trailing space behavior for character(n) datatypes (but who used fixed length character datatypes instead of varchar or text?). And obviously not yet implemented features could bite people, but many such features are optional to implement anyway, so...
I cannot speak about MySQL or MariaDB, due to insufficient familiarity.
It makes no sense to have a standard SQL when nonsensical implementations like MSSQL or MySQL exist.