Google's new pipe syntax in SQL

(simonwillison.net)

328 points heydenberk 1y ago 182 comments

Richard Hipp, creator of SQLite, has implemented this in an experimental branch: https://sqlite.org/forum/forumpost/5f218012b6e1a9db

Worth reading the thread, there are some good insights. It looks like he will be waiting on Postgres to take the initiative on implementing this before it makes it into a release.

simonw1y ago

That comment where he explains why he's not rushing to add new unproven SQL syntax to SQLite is fascinating:

> My goal is to keep SQLite relevant and viable through the year 2050. That's a long time from now. If I knew that standard SQL was not going to change any between now and then, I'd go ahead and make non-standard extensions that allowed for FROM-clause-first queries, as that seems like a useful extension. The problem is that standard SQL will not remain static. Probably some future version of "standard SQL" will support some kind of FROM-clause-first query format. I need to ensure that whatever SQLite supports will be compatible with the standard, whenever it drops. And the only way to do that is to support nothing until after the standard appears.

anitil1y ago

It's so ambitious in an almost boring way, exactly the right steward for a project like this

maxbond1y ago

Dr. Hipp is one of my heroes. He seems to labor quietly in semi-obscurity for decades, and at the end of it he's produced some amazing software. I was tickled by the kerfuffle over his use of a set of guidelines for living in a Christian monastery as SQLite's code of ethics for the purpose of checking a box on an RFQ (part of the fallout of the libsql fork), because he does seem like a sort of programmer monk. (For what it's worth, as an agnostic, I've read them several times and found them unobjectionable. While I think the drama was unnecessary, the libsql people are doing interesting work.)

I choose never to meet this man and be disabused of this notion. Shine on, doctor.

Blackthorn1y ago

FROM first would be nothing short of incredible. I can only hope that Postgres and others can find it within themselves to get together and standardize on such an extension!

pradeepchhetri1y ago

This syntax looks a lot like PRQL. ClickHouse supports writing queries in the PRQL dialect, and it supports the Kusto dialect as well.

https://clickhouse.com/docs/en/guides/developer/alternative-...

willvarfar1y ago

Yeap I didn't know DuckDB supported it already!

Being able to do SELECT FROM WHERE in any order and allowing multiple WHEREs and AGGREGATE etc, combined with supporting trailing commas, makes copy-pasting, templating, reusing, and code-generating SQL so much easier.

  FROM table  <-- at this point there is an implicit SELECT *
  SELECT whatever
  WHERE some_filter
  WHERE another_filter <-- this is like AND
  AGGREGATE something
  WHERE a_filter_that_is_after_grouping <-- is like HAVING
  ORDER BY ALL <-- group-by-all is great in engines that support it; want it for ordering too
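
To make that mapping concrete, here is a rough sketch using Python's stdlib sqlite3 and a made-up `orders` table. It runs the standard-SQL equivalent of a pipe-style query: stacked WHEREs collapse into AND, and a WHERE placed after the aggregation turns into HAVING. The table, columns, and values are invented for illustration, and the pipe form appears only as a comment, since a stock SQLite build is not assumed to accept it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, region TEXT, amount INTEGER);
    INSERT INTO orders VALUES
        ('ann', 'eu', 50), ('ann', 'eu', 70),
        ('bob', 'eu', 10), ('cat', 'us', 500), ('dan', 'eu', 5);
""")

# Pipe-style reading (hypothetical, shown for comparison only):
#   FROM orders
#   WHERE region = 'eu'
#   WHERE amount >= 10      <-- stacked WHERE, acts like AND
#   AGGREGATE SUM(amount) AS total GROUP BY customer
#   WHERE total > 20        <-- WHERE after aggregation, acts like HAVING
#   ORDER BY customer
standard_sql = """
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE region = 'eu' AND amount >= 10
    GROUP BY customer
    HAVING SUM(amount) > 20
    ORDER BY customer
"""
rows = conn.execute(standard_sql).fetchall()
print(rows)
```

The standard form needs three different spellings (AND, WHERE, HAVING) for what the pipe form writes as the same WHERE step at different points in the chain.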

...

croes1y ago

A special keyword like HAVING prevents errors caused by typing on the wrong line.

How is OR done with these WHEREs?

aidos1y ago

What’s group-by-all? Sounds like distinct?
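
For reference, in engines that support `GROUP BY ALL` (DuckDB is one), it groups by every non-aggregated column in the SELECT list. A small sketch of how that differs from DISTINCT, using stdlib sqlite3 and an invented `sales` table. SQLite is not assumed to accept the `ALL` spelling, so the expansion is written out by hand:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, product TEXT, qty INTEGER);
    INSERT INTO sales VALUES
        ('eu', 'apple', 1), ('eu', 'apple', 2), ('us', 'pear', 5);
""")

# Hand-written expansion of
#   SELECT region, product, SUM(qty) FROM sales GROUP BY ALL
# i.e. group by every SELECT column that is not an aggregate.
grouped = conn.execute("""
    SELECT region, product, SUM(qty)
    FROM sales
    GROUP BY region, product
    ORDER BY region, product
""").fetchall()

# DISTINCT only removes duplicate rows; it computes nothing per group.
distinct = conn.execute("""
    SELECT DISTINCT region, product
    FROM sales
    ORDER BY region, product
""").fetchall()

print(grouped)
print(distinct)
```

So the two overlap only when the query has no aggregates; once there is a SUM or COUNT, GROUP BY ALL saves repeating the column list.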

quartesixte1y ago

What exactly is the history of having FROM be the second item, and not the first? Because FROM first seems more intuitive and actually the way you write out queries.

Really hope this takes off and gets more widespread adoption because I really want to stop doing:

  SELECT *
  FROM all_the_joins

into

  SELECT {my statements here}
  FROM all_the_joins

bvrmn1y ago

It's funny how he addresses the new syntax as "from-clause-first". Like a very minor change with a low value.

Cthulhu_1y ago

I think that's important, because a lot of concepts are presented as prohibitively complicated; for example, functional programming makes sense in my head, but if you present it as lambda calculus and write it in concise form with new operators, you lost me.

tehlike1y ago

LINQ, PRQL, and Kusto have all preceded this.

While LINQ is mostly restricted to .NET, PRQL is not. https://prql-lang.org/

It's a welcome change in the industry.

I made this prediction a couple years back: https://x.com/tehlike/status/1517533067497201666

numbsafari1y ago

The paper directly references PRQL and Kusto. The main goal here is to take lessons learned from earlier efforts and try and find a syntax that works inside and alongside the existing SQL grammar, rather than as a wholly separate language.

hn_throwaway_991y ago

I've been following PRQL for some time now since it first got good traction on HN and I like it a lot, but I'm really hoping this pipe syntax from Google takes off for a couple of reasons:

1. Similar to what you mention, while I think PRQL is pretty easy to learn if you know SQL, it still "feels" like a brand new language. This piped SQL syntax immediately felt awesome to me - it mapped how my brain likes to think about queries (essentially putting data through a chain of sieves and transforms), but all my knowledge of SQL felt like it just transferred over as-is.

2. I feel like I'm old enough now to know that the most critical thing for adoption of new technologies that are incremental improvements over existing technologies is to make the upgrade path as easy as possible. I shouldn't have to overhaul everything at once, but I just want to be able to take in small pieces a chunk at a time. While not 100% the same thing, if you look at the famously abysmal uptake of things like IPv6 and the pain it takes to use ES module-only distributions from NPM, the biggest pain point was these technologies made you do "all or nothing" migrations - they didn't have an easy, simple way to get from point A to point B. The thing I like about this piped SQL syntax is that in a large, existing code base I could easily just start adding this in new queries, but I wouldn't really feel the need to overhaul everything at once. With PRQL I'd feel a lot less enthusiastic about using that in existing projects where I'd have a mix of SQL and PRQL.

lupire1y ago

It's wild that the enterprise and connected world has moved on from forcing COBOL compatibility for modern projects, but still insists on SQL compatibility.

andrewguy91y ago

I’m a big kusto user, and it’s wonderful to have pipes in a query language.

If you haven’t tried it, it’s great!

tehlike1y ago

I have not tried it, but I used to be a .net developer and worked a lot with LINQ (and contributed a bit to NHibernate and its Linq provider) and I am a big fan of the approach.

Kusto does seem interesting too, and i think some of the stuff i want to build will find a use for it!

Salgat1y ago

LINQ is so incredibly intuitive. I wonder if this will make creating C# LINQ providers for databases that support this syntax easier.

kbouck1y ago

Indeed. Elastic has also recently released a piped query language called ES|QL. Feels similar to Kusto.

I find piped queries both easier to write, and read.

anonzzzies1y ago

Not having LINQ is a terrible inconvenience everywhere. Most languages have libs that try to hack something similar, but it usually simply isn't.

mrits1y ago

It's a lot easier to design a good DSL when it doesn't have to be compatible with anything

anonzzzies1y ago

Well, .NET was already used a lot when it was built in a few decades ago.

oaiey1y ago

Is "from" keyword originating from .NET (Framework 3.5 in 2007) or is this pre-existing somewhere in research?

aragonite1y ago

> This remains a long-standing pet peeve of mine. PDFs like this are horrible to read on mobile phones, hard to copy-and-paste from ...

I've never understood why copying text from digitally native PDFs (created directly from digital source files, rather than by OCR-ing scanned images) is so often such a poor experience. Even PDFs produced from LaTeX often contain undesirable ligatures in the copied text like ﬁ and ﬂ. Text copied from some Springer journals sometimes lacks space between words or introduces unwanted space between letters in a word ... Is it due to something inherent in PDF technology?
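
On the ligature part specifically, one after-the-fact workaround is Unicode compatibility normalization. A minimal sketch, assuming the pasted text contains the precomposed ligature code points (U+FB01, U+FB02, U+FB03); the function name is made up:

```python
import unicodedata

def expand_ligatures(text: str) -> str:
    """Replace compatibility characters such as the fi/fl ligatures with plain letters."""
    return unicodedata.normalize("NFKC", text)

# "The first floor office" as it might be pasted from a PDF, ligatures intact.
copied = "The \ufb01rst \ufb02oor o\ufb03ce"
cleaned = expand_ligatures(copied)
print(cleaned)
```

NFKC also rewrites other compatibility characters (superscript digits, full-width forms), so it is a blunt tool if the rest of the text needs to stay byte-for-byte intact.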

crazygringo1y ago

> Is it due to something inherent in PDF technology?

Exactly. PDF doesn't have instructions to say "render this paragraph of text in this box", it has instructions to say "render each of these glyphs at each of these x,y coordinates".

It was never designed to have text extracted from it. So trying to turn it back into text involves a lot of heuristics and guesswork, like where enough separation between characters should be considered a space.
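
A toy version of that guesswork, as a sketch only: it assumes each glyph arrives as a character plus a left edge and an advance width (the `Glyph` type and the `gap_ratio` threshold are invented here, not anything from the PDF spec), and it guesses a space wherever the horizontal gap is large relative to the previous glyph.

```python
from dataclasses import dataclass

@dataclass
class Glyph:
    char: str
    x: float      # left edge, in page units
    width: float  # advance width

def glyphs_to_text(glyphs: list[Glyph], gap_ratio: float = 0.3) -> str:
    """Rebuild one line of text, guessing a space wherever the gap is large."""
    out = []
    prev = None
    for g in sorted(glyphs, key=lambda g: g.x):
        if prev is not None:
            gap = g.x - (prev.x + prev.width)
            if gap > gap_ratio * prev.width:
                out.append(" ")  # heuristic: big gap means a word break
        out.append(g.char)
        prev = g
    return "".join(out)

line = [
    Glyph("h", 0.0, 5.0), Glyph("i", 5.0, 2.0),
    Glyph("y", 10.0, 5.0), Glyph("o", 15.0, 5.0),
]
text = glyphs_to_text(line)
print(text)
```

Pick the threshold too high and words run together; too low and kerned or letter-spaced text gets spaces inside words, which matches the two failure modes described upthread.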

A lot also depends on what software produced the PDF, which can make it easier or harder to extract the text.

vips7L1y ago

My favorite is when they do bold by duplicating and slightly shifting the letters. Bboolldd. PDFs are hell.
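
An extractor has to undo that with yet another heuristic. A minimal sketch, assuming glyphs come as hypothetical `(char, x)` pairs and that a repeat of the same character within `max_shift` units is an overprinted copy rather than a real double letter:

```python
def drop_fake_bold(glyphs, max_shift=0.5):
    """Drop a glyph that repeats the previous character at nearly the same position."""
    kept = []
    for char, x in sorted(glyphs, key=lambda g: g[1]):
        if kept and kept[-1][0] == char and abs(x - kept[-1][1]) <= max_shift:
            continue  # overprinted copy used to fake bold
        kept.append((char, x))
    return "".join(char for char, _ in kept)

# "Bold" drawn twice with a 0.2-unit offset.
bold = [("B", 0.0), ("B", 0.2), ("o", 6.0), ("o", 6.2),
        ("l", 12.0), ("l", 12.2), ("d", 15.0), ("d", 15.2)]
# "all" has a genuine double letter, spaced a full glyph apart.
plain = [("a", 0.0), ("l", 5.0), ("l", 8.0)]

print(drop_fake_bold(bold))
print(drop_fake_bold(plain))
```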

lupire1y ago

That's inherited from the original Portable Document Format for machines - the typewriter instructions.

spatulon1y ago

I've never looked into the PDF format, but does it not allow for annotations that say, "the glyphs in the rectangle ((x0, y0), (x1, y1)) represent the text 'foobar'"? That's been my mental model for how they are text-searchable.

kccqzy1y ago

They do but such annotations are optional.

jonathanyc1y ago

PDF natively supports selectable/extractable text. Section 9.10 of ISO 32000 is literally “Extraction of Text Content.” I’ve implemented it myself in production software.

There are many good reasons why PDF has a “render glyph” instruction instead of a “render string”. In particular your printer and your PDF viewer should not need to have the same text shaping and layout algorithms in order for the PDF to render the same. Oops, your printer runs a different version of Harfbuzz!

The sibling comment is right that a lot depends on the software that produced the PDF. It’s important to be accurate about where the blame lies. I don’t blame the x86 ISA or the C++ standards committee when an Electron app uses too much memory.

jahewson1y ago

It’s due to poor choices made in the implementation of pdfTeX. For example the TeX engine does not associate the original space characters with the inter-word “glue” that replaces them, so pdfTeX happily omits them. This was fixed a few years back, finally. But there’s millions(?) of papers out there with no spaces.

mjevans1y ago

ligatures like fi fl ffi ffl etc are for changes in fonts specific to rendering correctly on a screen or printer. It's intended to be a _rendered_ format, rather than a parse-able format.

Well formatted epub and HTML generally are usually intended to update to end user needs and better fit available layout space.

WorldMaker1y ago

Though it's also a stuck legacy throwback. Modern advice would be to not send ligatures directly to the renderer and instead let the renderer poll OpenType features (and Unicode/ICU algorithms) to build them itself. PDF's baking of some ligatures in its files seems something of a backwards compatibility legacy mistake to still support ancient "dumb" PostScript fonts and pre-Unicode font encodings (or least pre-Unicode Normalization Forms). It's also a bit of the fact that PDF has always been confused about if it is the final renderer in a stack or not.

jahewson1y ago

That wouldn’t work for PDF’s use case of being an arbitrary paper-like format because the various Unicode and OpenType algorithms don’t provide sufficient functionality for rendering arbitrary text: there are no one-size-fits all rules! The standards are a set of generic “best effort” guidelines for lowest-common-denominator text layout that are constantly being extended.

Even for English the exact tweaking of line breaking and hyphenation is a problem that requires manual intervention from time to time. In mathematics research papers it’s not uncommon to see symbols that haven’t yet made it into Unicode. Look at the state of text on the web and you’ll encounter all these problems; even Google Docs gave in and now renders to a canvas.

PDF’s Unicode handling is indeed a big mess but it does have the ability to associate any glyph with an arbitrary Unicode string, for text extraction purposes, so there’s nothing to stop the program that generates the PDF from mapping the fi ligature glyph to the two-character string “fi”.

lupire1y ago

That's fine, but a good compiled format should also include a source map for accessibility.

0cf8612b2e1e1y ago

It is a shame that CSS pagination is still a mess. Not that I like CSS, but it would go a long way towards unlocking some layouts from PDF.

jamesfinlayson1y ago

Agreed - I used CSS to lay out a book a couple of years ago and it wasn't too bad, but the things that have poor support/don't work at all (like page numbers) are a pain to hack around.

ericjmorey1y ago

XPS solved a lot of the problems with PDF, but Microsoft couldn't reach a critical level of adoption to let network effects take hold.

However, I don't know if XPS handles the copying of text better.

meindnoch1y ago

If a PDF doesn't support text extraction, it's the fault of the software that created it. Most likely the software didn't include the glyph → Unicode character mapping in the PDF.
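
As a mental model of that mapping (a sketch, not a PDF parser): the page content refers to glyph ids, and a per-font table says which Unicode string each id stands for. The dict and the glyph ids below are invented for illustration.

```python
def extract_text(glyph_ids, to_unicode):
    """Map glyph ids to text; return None when any glyph lacks a mapping."""
    chars = []
    for gid in glyph_ids:
        if gid not in to_unicode:
            return None  # without the mapping, extraction needs guesswork or OCR
        chars.append(to_unicode[gid])
    return "".join(chars)

# Hypothetical font: glyph 17 is a single "fi" ligature shape that the
# producing software mapped to the two-character string "fi".
to_unicode = {17: "fi", 40: "n", 41: "d"}

word = extract_text([17, 40, 41], to_unicode)
missing = extract_text([17, 99], to_unicode)
print(word, missing)
```

With the table present, the ligature costs nothing at copy time; with it absent, the viewer only knows which shapes were drawn.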

summerlight1y ago

Previous submissions on the paper itself:

https://news.ycombinator.com/item?id=41321876 (first) https://news.ycombinator.com/item?id=41338877 (plenty of discussions)

I tried this new syntax, and it seems a reasonable proposal for complex analytical queries. It probably does not change most simple transactional queries, though. The syntax matches the execution semantics more closely, which means you are less likely to need to formulate a query in a weird form to make the query planner work as expected; usually users only need to move some pipe operators to more appropriate places.

FridgeSeal1y ago

Kinda looks like a half-assed version of what PRQL does. Like, if we’re going to have nonstandard sql, let’s just fix a whole bunch of things, not just one or two?

summerlight1y ago

> Like, if we’re going to have nonstandard sql, let’s just fix a whole bunch of things, not just one or two?

I think they intentionally kept themselves away from a massive redesign of the language, which has a good chance of becoming a frustrating, multi-decade death march. I know a number of such cases from C++ standard proposals, and the team probably wanted to avoid that.

chubot1y ago

This is addressed in the paper -- it's nice to have something deployable in existing SQL languages, and it also doesn't rule out using PRQL

hn_throwaway_991y ago

> Kinda looks like a half-assed version of what PRQL does. Like, if we’re going to have nonstandard sql, let’s just fix a whole bunch of things, not just one or two?

To be honest, this feels exactly like the kind of mistake that IPv6 made. It wasn't just "let's extend the IPv4 address space and provide an upgrade path that's as incremental as possible", it was "IPv4 has all these problems, let's solve the address space issue with a completely new address space, and while we're at it let's fix 20 other things!" Meanwhile, over a quarter century later, IPv4 shows no signs of going away any time soon.

I'd much rather have an incremental improvement that solves 90% of my pain points than to reach for some "Let's throw all the old stuff away for this new nirvana!" And I say this as someone that really likes PRQL.

andrewshadura1y ago

You can't "just" extend the IPv4 address space while keeping the compatibility.

scrlk1y ago

There was a second submission of the paper, which attracted more comments: https://news.ycombinator.com/item?id=41338877

summerlight1y ago

Thank you, added it to my comment. I missed all the discussions!

BeefWellington1y ago

Every time this FROM-first syntax style crops up it's always the most basic simple query (one table, no projections / subselects / consideration to SP/Views).

Just for once I want to see complete examples of the syntax on an actual advanced query of any kind right away. Sure, toss out one simple case, but then show me how it looks when I have to join 4-5 reference tables to a fact table and then filter based on those things.

Once you do that, it becomes clear why SELECT first won out originally: legibility and troubleshooting.

As long as DBs continue to support standard SQL they can add whatever additional syntax support they want but based on history this'll wind up being a whole new generation of emacs vs vi style holy war.

dietr1ch1y ago

Sounds a bit like "new thing scary" unless you show why having select in front actually avoids problems, and I don't think there's a clear problem they avoid, but it does make it really hard to autocomplete (can you even do it properly?) while something along the lines of just swap select for from is well defined.

garrettgarcia1y ago

> Sounds a bit like "new thing scary" unless you show why having select in front actually avoids problems

This isn't really fair. BeefWellington gave a reason why SQL is how it is (and how it has been for ~50 years). It's reasonable to ask for a compelling reason to change the clause order. Simon's post says it "has always been confusing", but doesn't really explain why except by linking to a blog post that says that the SQL engine (sort of but not really) executes the clauses in a different order.

I think the onus of proof that SQL clauses are in the wrong order is on the people who claim they're in the wrong order.

Sankozi1y ago

But it has been explained many times from many angles.

* SELECT first makes autocomplete hard

* SELECT first is the only out of order clause in the SQL statement when you look at it from execution perspective

* you cannot use aliases defined in SELECT in following clauses

* in some places SELECT is pointless but it is still required (to keep things consistent?)

Probably many more.
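
The execution-order and alias points can be illustrated with a toy pipeline over plain Python dicts. This is a sketch of the usual logical order (FROM, WHERE, GROUP BY, then SELECT), not of how any real engine plans a query; the data and names are invented.

```python
from collections import defaultdict

rows = [
    {"name": "ann", "dept": "eng", "salary": 120},
    {"name": "bob", "dept": "eng", "salary": 100},
    {"name": "cat", "dept": "ops", "salary": 90},
]

# SELECT dept, SUM(salary) AS total FROM rows WHERE salary >= 100 GROUP BY dept
# walked through in logical order rather than written order.
step_from = rows
step_where = [r for r in step_from if r["salary"] >= 100]

groups = defaultdict(list)
for r in step_where:
    groups[r["dept"]].append(r)

# The alias "total" only comes into existence here, in the SELECT step,
# so the WHERE step above had no way to refer to it.
step_select = [
    {"dept": dept, "total": sum(r["salary"] for r in members)}
    for dept, members in groups.items()
]
print(step_select)
```

Written this way the query reads top to bottom in the order it is evaluated, which is the ordering the FROM-first proposals put into the syntax itself.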

mnsc1y ago

This is a case where stating your opinion and credentials will make you sound really old and conservative so it will be easy to take cheap shots like "you are just afraid of change".

At my previous gig I worked for a decade with an application that meant creating and maintaining large hairy sql that was created to offload application logic to the database (_very_ original) And we used to talk about this "wrong order" often but I never once actually missed it. It was at the most a bit annoying when you jumped in a server to troubleshoot and you knew the two columns you were interested in and you could have saved two seconds. But when working with maintaining those massive queries it always felt good to have the projection up top because that is the end result and what the query is all about. I would not have liked if the method signature in eg Java was just the parameters and the return type was after the final brace. This analogy falls apart of course since params are all over the place but swapping things around wouldn't help.

So just go 'SELECT *...' and go back and expand later, I want my sql syntax "simple". /old developer

BeefWellington1y ago

It really isn't. I've been working in this field for ages and did a lot of those years as a DBA and data modeler. I've worked with other syntaxes too, mostly MDX but some others specific to Hadoop/Spark. I'm not afraid of new things. I just want them to improve on what we have. I want them to be honest about situations where their solution isn't great.

SQL has lots of warts, e.g.: the fact that you can write SQL that joins tables without including those tables in a JOIN, which leads to confusion. It's fragmented too -- the other example I posted shows two different syntaxes for TOP N / LIMIT N because different vendors went different ways. The fact that some RDBMSes provide locking hint mechanics and some don't (at least not reliably). The fact that there's no standard set of "library" functions defined anywhere, so porting between databases requires a lot of validation work. It makes portability hard, and some of those features are missing from standards.
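
The first wart, joining without a JOIN, looks like this in practice. A small sketch with stdlib sqlite3 and two invented tables; the comma form and the explicit JOIN return the same rows, but drop the WHERE from the comma form and you silently get a cross product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp (name TEXT, dept_id INTEGER);
    CREATE TABLE dept (id INTEGER, title TEXT);
    INSERT INTO emp VALUES ('ann', 1), ('bob', 2);
    INSERT INTO dept VALUES (1, 'eng'), (2, 'ops');
""")

# Implicit join: tables listed with a comma, join condition hidden in WHERE.
implicit = conn.execute("""
    SELECT emp.name, dept.title
    FROM emp, dept
    WHERE emp.dept_id = dept.id
    ORDER BY emp.name
""").fetchall()

# Explicit join: same result, with the relationship visible in the JOIN clause.
explicit = conn.execute("""
    SELECT emp.name, dept.title
    FROM emp
    JOIN dept ON emp.dept_id = dept.id
    ORDER BY emp.name
""").fetchall()

# Forgetting the WHERE in the comma form yields every pairing of rows.
cross = conn.execute("SELECT COUNT(*) FROM emp, dept").fetchone()[0]
print(implicit, explicit, cross)
```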

You'll note I also mentioned that if they want to add it that's fine but it's gonna wind up being a point of contention in a lot of places. That's because I've seen the same thing happen with the "Big Data" vs "what we have works" crowd.

Having select up front avoids problems in a couple key ways:

1. App devs who are working on their application can immediately see what fields they should expect in their resultset. For CRUD, it's probably usually just whatever fields they selected or `*` because everyone's in the habit of asking for every field they'll never use.

2. Troubleshooting problems is far easier because they almost always stem from a field in the projection. Seeing the projected field list (and thus, table aliases that field comes from) are literally the first pieces of information you need (what field is it and where does that field come from) to start troubleshooting. This is why SELECT ... FROM makes the most sense -- it's literally the two most crucial pieces of information right up front.

3. Query planners already optimize and essentially compile the entire thing anyways, so legibility trumps other options IME.

Another point I'd make to you and everyone else bringing up autocomplete: If you need it, nothing is stopping you from writing your FROM clause first and then moving a line up to write your SELECT. Kinda like how you might stub out a function definition and later add arguments. This doesn't affect the final form for legibility.

nsonha1y ago

> becomes clear why SELECT first won out originally: legibility and troubleshooting

nothing "becomes clear" just by you claiming so, better elaborate

jshute44441y ago

For examples of larger queries, see here for all TPC-H queries in standard syntax and converted to pipe syntax: https://github.com/google/zetasql/blob/master/zetasql/exampl...

And several more examples with pipe syntax here: https://github.com/google/zetasql/blob/master/zetasql/exampl...

WorldMaker1y ago

> Once you do that, it becomes clear why SELECT first won out originally: legibility and troubleshooting.

Select first was as much an accident of "it sounded better as an English sentence" to the early SQL designers. Plus also they were working with early era parsers with very limited look ahead and putting the primary "verb" up front was important at the time.

But English is very flexible, especially in "command syntax" and From first is surprisingly common: "From the middle cupboard, grab a plate". SQL trying to sound like English here only shows how inflexible it still is in comparison to actual English.

I've been using C#'s LINQ since it was added to the language in 2007 and the from/where/join/group by/select order feels great, is very legible especially because it gives you great autocomplete support, and troubleshooting is easier than people think.

mixedCase1y ago

https://prql-lang.org/ has a bunch of good examples on its home page.

If you engage the syntax with your System 2 thinking (prefrontal cortex, slow, the part of thinking we're naturally lazy to engage) rather than System 1 (automated, instinctual, optimized brain path to things we're used to) you'll most likely find that it is simpler, makes more logical sense so that you're filtering down things naturally like a sieve and composes far better than SQL as complexity grows.

After you've internalized that, imagine the kind of developer tooling we can build on top of that logical structure.

meepmorp1y ago

> If you engage the syntax with your System 2 thinking (prefrontal cortex, slow, the part of thinking we're naturally lazy to engage) rather than System 1 (automated, instinctual, optimized brain path to things we're used to)

You might not have intended it this way, but your choice of phrasing is very condescending.

mixedCase1y ago

Re-reading it I can see how it could be perceived by some people as such, thanks for pointing it out. There's probably better phrasing or adding more context could make it more amicable:

The goal was to explicitly tell people not to bother "just reading it" as one (and by one I mean myself and most people I know, surely there are exceptions) is naturally inclined to do unless something is particularly piquing our interest.

Without engaging in active, conscious effort, syntax that is different than what we're used to (especially something as established as SQL) where the changes aren't groundbreaking at first glance can easily make us dismissive without realizing the benefits. And after seeing it too many times with all kinds of technologies that stray away from the familiar, I just want to prepare the reader so that their judgment can be formed with full use of their faculties rather than a reflex response.

BeefWellington1y ago

Edit: In my pre-coffee rush this morning I completely missed the grouping by role (which is not that much harder FWIW). This unfortunately invalidates my entire post as it was posted and I don't want to spread misinfo.

fader1y ago

I don't think your alternatives actually solve the same problem. Your alternatives would give you the single most recently joined employee. The actual problem being solved is to find the most recently joined employee in each role.

You'd need to do some grouping in there to be able to get one employee per role instead of a single employee out of the whole data set.

summerlight1y ago

As a test, I refactored a 500 line-ish analytical query that joins more than 20 tables with tens of complex CTEs and I can say that this FROM-first syntax is superior to the legacy syntax in almost every single aspect.

bvrmn1y ago

> SELECT first won out originally: legibility and troubleshooting.

It's quite interesting to dive into the history of SQL alternatives in the '70s and '80s.

WesolyKubeczek1y ago

> Once you do that, it becomes clear why SELECT first won out originally: legibility and troubleshooting.

Also, tools can trivially tell DQL from DML by the first word they encounter, barring data-modifying functions (o great heavens, no!).
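
A sketch of what such a first-word check might look like, and what a FROM-first dialect would add to it. The keyword sets and the function are invented for illustration; real tools do considerably more than this.

```python
DQL_KEYWORDS = {"SELECT", "FROM"}   # FROM added here for FROM-first dialects
DML_KEYWORDS = {"INSERT", "UPDATE", "DELETE"}

def classify(statement: str) -> str:
    """Classify a statement by its first keyword only (a deliberately naive check)."""
    words = statement.split()
    first = words[0].upper() if words else ""
    if first in DQL_KEYWORDS:
        return "DQL"
    if first in DML_KEYWORDS:
        return "DML"
    return "unknown"

print(classify("select * from t"))
print(classify("FROM t |> WHERE x > 1"))
print(classify("delete from t where x = 1"))
print(classify("WITH c AS (SELECT 1) SELECT * FROM c"))
```

The last case is worth noting: a leading WITH can already precede either a query or a data-modifying statement in some databases, so the first word alone is not a complete signal even today.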

otabdeveloper41y ago

FROM order is, like, the least offensive and least wrong thing about SQL.

Bikeshedding par excellence.

urbandw311er1y ago

Title should probably be changed, since the article is about using AI to convert a PDF to semantic HTML.

simonw1y ago

A surprising problem I'm seeing with maintaining a link blog is that articles from it occasionally get submitted to Hacker News, where people inevitably call them out as not being as appropriate as the source they are linking to - which is fair enough! That's why I don't tend to submit them myself.

This particular post quickly turned into a very thinly veiled excuse for me to complain about PDFs, then demonstrate a Gemini Pro trick.

In this case I converted to HTML - I've since tried converting a paper to Markdown and sharing in a Gist, which I think worked even better: https://gist.github.com/simonw/46a33d66e069efe5c10b63625fdab... - notes here https://simonwillison.net/2024/Aug/27/distro/

llimllib1y ago

Have you seen gist.io?

If you replace `gist.github.com/<user>/<id>` -> `https://gist.io/@<user>/<id>`, you get a gist with nice typography.

https://gist.io/@simonw/46a33d66e069efe5c10b63625fdabb4e is the same gist you linked, but nicer to read
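
The rewrite is mechanical enough to script. A minimal sketch of the substitution described above; the function name and the regex are my own, and it only handles the plain `gist.github.com/<user>/<id>` shape:

```python
import re

GIST_URL = re.compile(r"^(?:https?://)?gist\.github\.com/([^/]+)/([^/?#]+)")

def to_gist_io(url: str) -> str:
    """Rewrite gist.github.com/<user>/<id> to https://gist.io/@<user>/<id>."""
    match = GIST_URL.match(url)
    if match is None:
        raise ValueError(f"not a gist.github.com URL: {url}")
    user, gist_id = match.groups()
    return f"https://gist.io/@{user}/{gist_id}"

print(to_gist_io("https://gist.github.com/simonw/46a33d66e069efe5c10b63625fdabb4e"))
```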

simonw1y ago

That's pretty neat! I like that it's run by a GitHub employee too (presumably as a side-project, but still) - makes me less nervous about the domain name blinking out of existence one day.

yarg1y ago

This reminds me of .NET's short-lived Linq to SQL:

There was a talk at the time, but I can't find the video: http://jaoo.dk/aarhus2007/presentation/Using+LINQ+to+SQL+to+....

Basically, it was a way to cleanly plug SQL queries into C# code.

It used this sort of ordering (where the constraints come after the thing being constrained); it needed to do so for IntelliSense to work.

cyberax1y ago

"Short-lived"? LINQ is very much alive in the C# ecosystem.

And FROM-first syntax absolutely makes more sense, regardless of autocomplete. You should put the "what I need to select" after the "what I'm selecting from", in general.

yarg1y ago

LINQ yes, but they killed off the component not long after introducing it.

jiggawatts1y ago

It was replaced by Entity Framework.

BartjeD1y ago

Linq to sql still lives

dragonwriter1y ago

> This reminds me .NET's short lived Linq to SQL;

"Short lived"? It's still alive, AFAIK, and the more popular newer thing for the same use case, Linq to Entities, has the same salient features but (because it is tied to Entity Framework and not SQL Server specific) is more broadly usable.

yarg1y ago

It was in 3.5 only.

If they've replaced it with something else in the last decade and a half that does not mean that they didn't get rid of it, or that it wasn't short lived.

https://learn.microsoft.com/en-us/dotnet/framework/data/adon...

LeonB1y ago

Yeh. Linq to sql was a much more lightweight extension than EF, and was killed due to internal warring at MS.

Database people were investing a lot of time and energy on doing things “properly” with EF, and this scrappy little useful tool, linq to sql, was seen as a competitor.

plusplusungood1y ago

LINQ is not the same as LINQ-to-SQL. The former is a language feature, the latter a library (one of many) that uses that feature.

neonsunset1y ago

There is https://github.com/linq2db/linq2db which is LINQ to SQL reincarnated.

Of course there's EF Core too.

WorldMaker1y ago

And NHibernate.Linq and Dapper.Extensions.Linq… Most ORMs in the ecosystem have at least one Linq support library, even if just a third-party extension.

Also, there are fun things that support Linq syntax for non-ORM uses, too, such as System.Reactive.Linq and LanguageExt: https://github.com/louthy/language-ext/wiki/How-to-deal-with...

mav3ri3k1y ago

The first piped query language I used was Nushell's implementation of wide-column tables. PRQL offers a very similar approach, which I have loved dearly. It also maps to different SQL dialects. There is also a proposal to work on a type system: https://github.com/PRQL/prql/issues/381.

Google has now proposed a syntax inspired by these approaches. However, I am unsure how well it will be adopted. As someone new to SQL, I find that nearly every DB seems to provide its own SQL dialect, which becomes cumbersome very quickly.

Whereas PRQL feels something like Apache Arrow which can map to other dialects.

0xbadcafebee1y ago

As to the writer's problem with PDFs on the web: they aren't for reactive web app viewing on mobile phones. Not everything has to be. If you reeeeeeeally need to read that research paper, find a screen that's bigger than 3" wide.

jillesvangurp1y ago

I think his point is that Google is a web company. And a mobile phone company. And they publish a lot of stuff in a format that's basically optimized for print and kind of useless for anything else.

I did my PhD more than 20 years ago and it was annoying then to be working with all these postscript and pdf documents. It's still annoying. These days people publish content in PDF form on websites and mostly not in printed media. People might print these or not. Twenty years ago, I definitely did. But it's weird how we stick with this. And PDFs are of course very unstructured and hard to make sense of programmatically as well.

I bet a lot of modern day scientists don't actually print the articles they read anymore and instead read them on screen or maybe on some ipad or e-reader. Print has become an edge case. Reading a pdf on a small e-reader is not ideal. Anything with columns is kind of awkward to deal with. There's a reason why most websites don't use columns: it kind of sucks as a UX. The optimal form to deliver text is in a responsive form that can adapt to any screen size where you can change the font size as well. A lot of scientific paper layouts are optimized to conserve a resource that is no longer relevant: paper real estate. Tiny fonts, multiple columns, etc.

Anyway, I like Simon's solution and how it kind of works. It's kind of funny how some of these LLMs can be so lazy. The thing with the references being omitted is hilarious. I see the same with ChatGPT, where it goes out of its way to never do exactly as you asked and instead just gives you bits and pieces of what you ask for until you beg it to just please FFing do as you're told?! I guess they are trying to save some tokens or GPU time.

simonw1y ago

Why shouldn’t I read research papers on my phone? That’s where I read almost everything else.

adrian_b1y ago

Even when reading on the phone, I do not understand the complaint against the two-column format.

The one-column format is fine on a large monitor, but on a small phone I prefer narrower columns, because a wide column would either make the text too small or it would require horizontal panning while reading.

So I consider the two-column format as better for phones, not worse.

9dev1y ago

One of the most complex and battle-tested open source projects is essentially a rendering engine for semantic text that has supported reflowing text to fit the screen for decades. And now you’re seriously considering having to zoom in on a column, then scrolling all the way back up and right to the next column, then down to the footnotes at the bottom, then to a random figure, to be a solution?

slaymaker19071y ago

I actually work on SQL Server, but I also write a lot of KQL queries which also work this way and I totally agree that the sequential pipe stuff is easier to write. I haven't read through the whole paper, but one aspect that I really like is that I think it's easier to guide the query optimization in this sequential style.

beart1y ago

Is there any internal inertia for such changes to SQL server?

WorldMaker1y ago

Given how Entity Framework is quite ubiquitous as "the ORM of choice" for SQL Server and its usage of C# Linq, there's certainly external momentum, whether or not SQL Server devs themselves are paying attention to how the majority of their users are writing queries today.

donatj1y ago

I've been writing SQL for something like 25 years and always thought the columns being SELECTed should have come last, not first. Naming your sources before what you're trying to get from them makes much more logical sense, to me at least. Referring to aliased table names before I have done the aliasing is weird.

Also it would make autocomplete in intelligent IDEs much more helpful when typing a query out from nothing.

victorbjorklund1y ago

Looks just like writing sql using Ecto in Elixir:

"users" |> where([u], u.age > 18) |> select([u], u.name)

https://hexdocs.pm/ecto/Ecto.Query.html

h0l0cube1y ago

Thought this too. The example queries look very much like Ecto statements. I miss the ergonomics and flexibility of Ecto when I use database wrappers on other platforms.

chubot1y ago

The next thing I would like is to define a function / macro that has a bunch of |> terms.

I pointed out that you can do this with shell:

Pipelines Support Vectorized, Point-Free, and Imperative Style https://www.oilshell.org/blog/2017/01/15.html

e.g.

    hist() {
      sort | uniq -c | sort -n -r
    }

    $ { echo a; echo bb; echo a; } | hist
      2 a
      1 bb

    $ foo | hist
    ...

Something like that should be possible in SQL!

jshute44441y ago

It is, using table-valued functions (TVFs).

There's an example at the bottom of this file:

https://github.com/google/zetasql/blob/master/zetasql/exampl...

chubot1y ago

That's cool, thanks!

What about scalar valued functions? :) So I can reuse an expression in a WHERE and so forth

(and I appreciate that HAVING can be generalized/removed)

wvenable1y ago

I didn't see this the first time:

    GROUP AND ORDER BY component_id DESC;

Is this kind of syntax combining grouping and ordering really necessary in addition to the pipe operator? My advice would be to add the pipe operator and not get fancy adding other syntax to SQL as well.

bvrmn1y ago

It could be a custom zetasql extension leaked into the paper.

minkles1y ago

That is basically R with tidyverse.

  flights |>
    filter(
      carrier == "UA",
      dest %in% c("IAH", "HOU"),
      sched_dep_time > 0900,
      sched_arr_time < 2000
      ) |>
    group_by(flight) |>
    summarize(
      delay = mean(arr_delay, na.rm = TRUE),
      cancelled = sum(is.na(arr_delay)),
      n = n()
      ) |>
    filter(n > 10)

If you haven't used R, it has some serious data manipulation legs built into it.

dan-robertson1y ago

An interesting thing to me about all these dplyr-style syntaxes is that Wickham thinks the group_by operator was a design mistake. In modern dplyr you can often specify a .by on an operation instead. I found switching to this style a pretty easy adjustment, and I think it’s a bit better. Example:

  d |> filter(id==max(id),.by=orderId)

I think PRQL were thinking a bit about ways to avoid a group_by operation and I think what they have is a kind of ‘scoped’ or ‘higher order’ group_by operation which takes your grouping keys and a pipeline and outputs a pipeline step that applies the inner pipeline to each group.

_Wintermute1y ago

Given 10 more years dplyr syntax might resemble data.table's

countrymile1y ago

My thoughts exactly, it even uses the same pipe syntax, though I do prefer `%>%`. I've been avoiding SQL for a while now as it feels so clunky next to the tidyverse

AdieuToLogic1y ago

If anyone is interested in the theoretical background to the thrush combinator, a.k.a. "|>", here is one using Ruby as the implementation language:

https://leanpub.com/combinators/read#leanpub-auto-the-thrush

Being a concept which transcends programming languages, a search for "thrush combinator" will yield examples in several languages.

wslh1y ago

I found this [1] via this [2]. Seems like a good explanation. It doesn't exist on Wikipedia though.

[1] https://github.com/raganwald-deprecated/homoiconic/blob/mast...

[2] https://stackoverflow.com/a/285973/88231

AdieuToLogic1y ago

A key thing to keep in mind is that the thrush combinator is a fancy name for a simple construct. The semantics it provides is a declarative form of traditional function composition.

For example, given the expression:

  f (g (h (x)))

The same can be expressed in languages which support the "|>" infix operator as:

  h (x) |> g |> f

There are other, equivalent, constructs such as the Cats Arrow[0] type class available in Scala, the same Arrow[1] concept available in Haskell, and the `andThen` method commonly available in many modern programming languages.

0 - https://typelevel.org/cats/typeclasses/arrow.html

1 - https://wiki.haskell.org/Arrow_tutorial
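
As a rough illustration of the equivalence above, here is a minimal Python sketch. Python has no `|>` operator, so the hypothetical `pipe` helper stands in for it, and `h`, `g`, and `f` are arbitrary placeholder functions chosen only so the result is easy to check.

```python
from functools import reduce


def pipe(value, *funcs):
    """Thread `value` through `funcs` left to right, like value |> f1 |> f2."""
    return reduce(lambda acc, fn: fn(acc), funcs, value)


# Placeholder functions (assumptions for illustration only).
def h(x):
    return x + 1


def g(x):
    return x * 2


def f(x):
    return x - 3


nested = f(g(h(5)))        # written inside-out: h, then g, then f
piped = pipe(h(5), g, f)   # same order of application, read left to right

print(nested, piped)
```

Both forms apply the functions in the same order; only the reading direction changes.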

Ericson23141y ago

We should really standardize a core language for SQL. Rust has MIR, Clang is making a CIR for C/C++. Once we have that, we'll be able to communicate much better.

Right now, it's everyone faffing around with different mental models and ugly single pass compilers (my understanding is that parsing-->query planning is not nearly as well-separated in most DBs as parsing-->optimize-->codegen in most compilers).

anothername121y ago

> We should really standardize a core language for SQL

Do you mean something other than ISO/IEC 9075:2023 (the 9th edition of the SQL standard)?

roenxi1y ago

It costs 194 CHF to read. There is room for improvement.

Ericson23141y ago

A core language is a minimal AST without surface syntax (and thus no bikeshedding of that) that distills the surface language to its essence.
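
To make the idea a bit more concrete, here is a small Python sketch of what "one core form, several surface syntaxes" could look like. The `Query` class and both printer functions are hypothetical names invented for this example, not part of any standard or existing library, and filters are treated as opaque strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """Hypothetical core form: a source table, filters, and output columns."""
    source: str
    filters: tuple = ()
    columns: tuple = ("*",)


def to_standard_sql(q):
    """Render the core form in SELECT-first order."""
    sql = f"SELECT {', '.join(q.columns)} FROM {q.source}"
    if q.filters:
        sql += " WHERE " + " AND ".join(q.filters)
    return sql


def to_pipe_sql(q):
    """Render the same core form in FROM-first pipe order."""
    steps = [f"FROM {q.source}"]
    steps += [f"WHERE {cond}" for cond in q.filters]
    steps.append(f"SELECT {', '.join(q.columns)}")
    return " |> ".join(steps)


q = Query("users", ("age > 18", "active = 1"), ("name",))
print(to_standard_sql(q))
print(to_pipe_sql(q))
```

The two printers disagree only on surface order; the underlying structure they print is identical, which is the part a core language would pin down.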

Ericson23141y ago

SQL is basically the list monad, with various quotients / refinements:

- Sometimes the order doesn't matter

- Sometimes there are functional dependencies

- Sometimes one knows the length of the list in question is 1 (foreign key constraints)
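
One way to read the list-monad comparison: a join with a filter and a projection has the same shape as a nested list comprehension. The sketch below uses made-up tables and column names purely for illustration.

```python
# Made-up example data; each "table" is a plain list of rows.
employees = [
    {"name": "Ada", "dept_id": 1},
    {"name": "Grace", "dept_id": 2},
    {"name": "Linus", "dept_id": 1},
]
departments = [
    {"id": 1, "title": "Engineering"},
    {"id": 2, "title": "Research"},
]

# Roughly: SELECT e.name, d.title
#          FROM employees e, departments d
#          WHERE e.dept_id = d.id
joined = [
    (e["name"], d["title"])
    for e in employees           # bind over the first list
    for d in departments         # bind over the second list
    if e["dept_id"] == d["id"]   # guard, playing the role of WHERE
]

print(joined)
```

Here each employee matches exactly one department, which is the "length of the list is 1" case from the last bullet.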

rrrrrrrrrrrryan1y ago

ANSI SQL is very much a thing, and you should strive to keep your queries as close as possible to standard SQL as your database engine allows, if you want those queries to be portable to other database technology in the future.

yencabulator1y ago

Blackthorn1y ago

FROM first would be nothing short of incredible. I can only hope that Postgres and others can find it within themselves to get together and standardize on such an extension!

pradeepchhetri1y ago

This syntax looks a lot like PRQL. ClickHouse supports writing queries in the PRQL dialect. Moreover, ClickHouse supports the Kusto dialect too.

https://clickhouse.com/docs/en/guides/developer/alternative-...

willvarfar1y ago

Yeap I didn't know DuckDB supported it already!

  FROM table  <-- at this point there is an implicit SELECT *
  SELECT whatever
  WHERE some_filter
  WHERE another_filter <-- this is like AND
  AGGREGATE something
  WHERE a_filter_that_is_after_grouping <-- is like HAVING
  ORDER BY ALL <-- group-by-all is great in engines that support it; want it for ordering too

...
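
The claim that a WHERE placed after aggregation behaves like HAVING can be checked in plain SQL. The sketch below uses Python's built-in `sqlite3` module; SQLite releases do not ship the pipe syntax, so the "staged" version is emulated with a subquery. The `orders` table and its data are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 10), ("ann", 40), ("bob", 5), ("cy", 70), ("cy", 1)],
)

# Standard SQL: rows are filtered with WHERE, groups with HAVING.
having_rows = conn.execute(
    """
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE amount > 1
    GROUP BY customer
    HAVING SUM(amount) >= 50
    ORDER BY customer
    """
).fetchall()

# Pipeline reading: aggregate first, then apply an ordinary WHERE
# to the aggregated result (emulated here with a subquery).
staged_rows = conn.execute(
    """
    SELECT customer, total
    FROM (
        SELECT customer, SUM(amount) AS total
        FROM orders
        WHERE amount > 1
        GROUP BY customer
    )
    WHERE total >= 50
    ORDER BY customer
    """
).fetchall()

print(having_rows)
print(staged_rows)
```

Both queries return the same rows, which is why a pipeline-style language can get by with a single filtering keyword applied at different stages.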

croes1y ago

A special keyword like HAVING prevents errors from typing on the wrong line.

How is OR done with these WHEREs?

aidos1y ago

What’s group-by-all? Sounds like distinct?

quartesixte1y ago

What exactly is the history of having FROM be the second item, and not the first? Because FROM first seems more intuitive and actually the way you write out queries.

Really hope this takes off and gets more widespread adoption because I really want to stop having to turn:

  SELECT *
  FROM all_the_joins

into

  SELECT {my statements here}
  FROM all_the_joins

bvrmn1y ago

It's funny how he addresses the new syntax as "from-clause-first". Like a very minor change with a low value.

Cthulhu_1y ago

tehlike1y ago

LINQ, PRQL, and Kusto have all preceded this.

While LINQ is mostly restricted to .NET, PRQL is not. https://prql-lang.org/

It's a welcome change in the industry.

I made this prediction a couple years back: https://x.com/tehlike/status/1517533067497201666

numbsafari1y ago

hn_throwaway_991y ago

I've been following PRQL for some time now since it first got good traction on HN and I like it a lot, but I'm really hoping this pipe syntax from Google takes off for a couple of reasons:

lupire1y ago

It's wild that the enterprise and connected world has moved on from forcing COBOL compatibility for modern projects, but still insists on SQL compatibility.

andrewguy91y ago

I’m a big kusto user, and it’s wonderful to have pipes in a query language.

If you haven’t tried it, it’s great!

tehlike1y ago

I have not tried it, but I used to be a .net developer and worked a lot with LINQ (and contributed a bit to NHibernate and its Linq provider) and I am a big fan of the approach.

Kusto does seem interesting too, and i think some of the stuff i want to build will find a use for it!

Salgat1y ago

LINQ is so incredibly intuitive. I wonder if this will make creating C# LINQ providers for databases that support this syntax easier.

kbouck1y ago

Indeed. Elastic has also recently released a piped query language called ES|QL. Feels similar to Kusto.

I find piped queries both easier to write, and read.

anonzzzies1y ago

Not having LINQ is a terrible inconvenience everywhere. Most languages have libs that try to hack something similar, but it usually simply isn't.

mrits1y ago

It's a lot easier to design a good DSL when it doesn't have to be compatible with anything

anonzzzies1y ago

Well, .NET was already widely used when LINQ was built into it, almost two decades ago.

oaiey1y ago

Is "from" keyword originating from .NET (Framework 3.5 in 2007) or is this pre-existing somewhere in research?

aragonite1y ago

> This remains a long-standing pet peeve of mine. PDFs like this are horrible to read on mobile phones, hard to copy-and-paste from ...

crazygringo1y ago

> Is it due to something inherent in PDF technology?

Exactly. PDF doesn't have instructions to say "render this paragraph of text in this box", it has instructions to say "render each of these glyphs at each of these x,y coordinates".

A lot also depends on what software produced the PDF, which can make it easier or harder to extract the text.
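
To show why position-only output makes extraction a guessing game, here is a toy Python sketch. The glyph list, the `(x, y, character)` layout, and the `extract_text` helper are all assumptions made up for illustration; real PDF content streams and real extractors are considerably messier.

```python
# Hypothetical positioned glyphs: (x, y, character), in arbitrary drawing order.
glyphs = [
    (30, 700, "l"), (10, 700, "H"), (20, 700, "e"), (40, 700, "l"), (50, 700, "o"),
    (10, 680, "P"), (20, 680, "D"), (30, 680, "F"),
]


def extract_text(glyphs, line_tolerance=2):
    """Group glyphs into lines by similar y, then order each line by x."""
    lines = []  # list of (y, [(x, char), ...]) from top of page to bottom
    for x, y, ch in sorted(glyphs, key=lambda g: (-g[1], g[0])):
        if lines and abs(lines[-1][0] - y) <= line_tolerance:
            lines[-1][1].append((x, ch))
        else:
            lines.append((y, [(x, ch)]))
    return "\n".join(
        "".join(ch for _, ch in sorted(row)) for _, row in lines
    )


print(extract_text(glyphs))
```

Even this toy version has to guess a line tolerance, and it has no notion of word spacing, columns, or reading order, which are the things a reflowable format would state explicitly.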

vips7L1y ago

My favorite is when they do bold by duplicating and slightly shifting the letters. Bboolldd. PDFs are hell.

lupire1y ago

That's inherited from the original Portable Document Format for machines - the typewriter instructions.

spatulon1y ago

kccqzy1y ago

They do but such annotations are optional.

jonathanyc1y ago

PDF natively supports selectable/extractable text. Section 9.10 of ISO 32000 is literally “Extraction of Text Content.” I’ve implemented it myself in production software.

jahewson1y ago

mjevans1y ago

ligatures like fi fl ffi ffl etc are for changes in fonts specific to rendering correctly on a screen or printer. It's intended to be a _rendered_ format, rather than a parse-able format.

Well-formatted EPUB and HTML are usually intended to adapt to end-user needs and better fit the available layout space.

WorldMaker1y ago

jahewson1y ago

lupire1y ago

That's fine, but a good compiled format should also include a source map for accessibility.

0cf8612b2e1e1y ago

It is a shame that CSS pagination is still a mess. Not that I like CSS, but it would go a long way towards unlocking some layouts from PDF.

jamesfinlayson1y ago

Agreed - I used CSS to lay out a book a couple of years ago and it wasn't too bad, but the things that have poor support/don't work at all (like page numbers) are a pain to hack around.

ericjmorey1y ago

XPS solved a lot of the problems with PDF, but Microsoft couldn't reach a critical level of adoption to let network effects take hold.

However, I don't know if XPS handles the copying of text better.

meindnoch1y ago

If a PDF doesn't support text extraction, it's the fault of the software that created it. Most likely the software didn't include the glyph → Unicode character mapping in the PDF.

summerlight1y ago

Previous submissions on the paper itself:

https://news.ycombinator.com/item?id=41321876 (first)

https://news.ycombinator.com/item?id=41338877 (plenty of discussions)

FridgeSeal1y ago

Kinda looks like a half-assed version of what PRQL does. Like, if we’re going to have nonstandard sql, let’s just fix a whole bunch of things, not just one or two?

summerlight1y ago

> Like, if we’re going to have nonstandard sql, let’s just fix a whole bunch of things, not just one or two?

chubot1y ago

This is addressed in the paper -- it's nice to have something deployable in existing SQL languages, and it also doesn't rule out using PRQL

hn_throwaway_991y ago

> Kinda looks like a half-assed version of what PRQL does. Like, if we’re going to have nonstandard sql, let’s just fix a whole bunch of things, not just one or two?

andrewshadura1y ago

You can't "just" extend the IPv4 address space while keeping the compatibility.

scrlk1y ago

There was a second submission of the paper, which attracted more comments: https://news.ycombinator.com/item?id=41338877

summerlight1y ago

Thank you, added it to my comment. I missed all the discussions!

BeefWellington1y ago

Every time this FROM-first syntax style crops up it's always the most basic simple query (one table, no projections / subselects / consideration to SP/Views).

Once you do that, it becomes clear why SELECT first won out originally: legibility and troubleshooting.

dietr1ch1y ago

garrettgarcia1y ago

> Sounds a bit like "new thing scary" unless you show why having select in front actually avoids problems

I think the onus of proof that SQL clauses are in the wrong order is on the people who claim they're in the wrong order.

Sankozi1y ago

But it has been explained many times from many angles.

* SELECT first makes autocomplete hard

* SELECT first is the only out of order clause in the SQL statement when you look at it from execution perspective

* you cannot use aliases defined in SELECT in following clauses

* in some places SELECT is pointless but it is still required (to keep things consistent?)

Probably many more.

mnsc1y ago

This is a case where stating your opinion and credentials will make you sound really old and conservative so it will be easy to take cheap shots like "you are just afraid of change".

So just go 'SELECT *...' and go back and expand later, I want my sql syntax "simple". /old developer

BeefWellington1y ago

Having select up front avoids problems in a couple key ways:

3. Query planners already optimize and essentially compile the entire thing anyways, so legibility trumps other options IME.

nsonha1y ago

> becomes clear why SELECT first won out originally: legibility and troubleshooting

nothing "becomes clear" just by you claiming so, better elaborate

jshute44441y ago

For examples of larger queries, see here for all TPC-H queries in standard syntax and converted to pipe syntax: https://github.com/google/zetasql/blob/master/zetasql/exampl...

And several more examples with pipe syntax here: https://github.com/google/zetasql/blob/master/zetasql/exampl...

WorldMaker1y ago

> Once you do that, it becomes clear why SELECT first won out originally: legibility and troubleshooting.

mixedCase1y ago

https://prql-lang.org/ has a bunch of good examples on its home page.

After you've internalized that, imagine the kind of developer tooling we can build on top of that logical structure.

meepmorp1y ago

You might not have intended it this way, but your choice of phrasing is very condescending.

mixedCase1y ago

Re-reading it I can see how it could be perceived by some people as such, thanks for pointing it out. There's probably better phrasing or adding more context could make it more amicable:

BeefWellington1y ago

fader1y ago

You'd need to do some grouping in there to be able to get one employee per role instead of a single employee out of the whole data set.

summerlight1y ago

bvrmn1y ago

> SELECT first won out originally: legibility and troubleshooting.

It's quite interesting to dive into the history of SQL alternatives in the '70s/'80s.

WesolyKubeczek1y ago

> Once you do that, it becomes clear why SELECT first won out originally: legibility and troubleshooting.

Also, tools can trivially tell DQL from DML by the first word they encounter, barring data-modifying functions (o great heavens, no!).

otabdeveloper41y ago

FROM order is, like, the least offensive and least wrong thing about SQL.

Bikeshedding par excellence.

urbandw311er1y ago

Title should probably be changed, since the article is about using AI to convert a PDF to semantic HTML.

simonw1y ago

This particular post quickly turned into a very thinly veiled excuse for me to complain about PDFs, then demonstrate a Gemini Pro trick.

llimllib1y ago

Have you seen gist.io?

If you replace `gist.github.com/<user>/<id>` -> `https://gist.io/@<user>/<id>`, you get a gist with nice typography.

https://gist.io/@simonw/46a33d66e069efe5c10b63625fdabb4e is the same gist you linked, but nicer to read

simonw1y ago

That's pretty neat! I like that it's run by a GitHub employee too (presumably as a side-project, but still) - makes me less nervous about the domain name blinking out of existence one day.

yarg1y ago

This reminds me of .NET's short-lived LINQ to SQL:

There was a talk at the time, but I can't find the video: http://jaoo.dk/aarhus2007/presentation/Using+LINQ+to+SQL+to+....

Basically, it was a way to cleanly plug SQL queries into C# code.

It used this sort of ordering (where the constraints come after the thing being constrained); it needed to do so for IntelliSense to work.

cyberax1y ago

"Short-lived"? LINQ is very much alive in the C# ecosystem.

And FROM-first syntax absolutely makes more sense, regardless of autocomplete. You should put the "what I need to select" after the "what I'm selecting from", in general.

yarg1y ago

LINQ yes, but they killed off the component not long after introducing it.

jiggawatts1y ago

It was replaced by Entity Framework.

BartjeD1y ago

Linq to sql still lives

dragonwriter1y ago

> This reminds me of .NET's short-lived LINQ to SQL:

yarg1y ago

It was in 3.5 only.

If they've replaced it with something else in the last decade and a half that does not mean that they didn't get rid of it, or that it wasn't short lived.

https://learn.microsoft.com/en-us/dotnet/framework/data/adon...

LeonB1y ago

Yeh. Linq to sql was a much more lightweight extension than EF, and was killed due to internal warring at MS.

Database people were investing a lot of time and energy on doing things “properly” with EF, and this scrappy little useful tool, linq to sql, was seen as a competitor.

plusplusungood1y ago

LINQ is not the same as LINQ-to-SQL. The former is a language feature, the latter a library (one of many) that uses that feature.

neonsunset1y ago

There is https://github.com/linq2db/linq2db which is LINQ to SQL reincarnated.

Of course there's EF Core too.

WorldMaker1y ago

And NHibernate.Linq and Dapper.Extensions.Linq… Most ORMs in the ecosystem have at least one Linq support library, even if just a third-party extension.

Also, there are fun things that support Linq syntax for non-ORM uses, too, such as System.Reactive.Linq and LanguageExt: https://github.com/louthy/language-ext/wiki/How-to-deal-with...
