Table Oriented Programming (2002) (opens in new tab)

(web.archive.org)

126 pointsmabynogy4y ago87 comments

87 comments

I remember reading this essay when it first came out. To try and reword it using modern terms: The author wishes that programming languages had database persistence capabilities as 1st-class built-in syntax instead of cumbersome bolted-on API functions.

Examples where database syntax (i.e. SQL syntax) is 1st-class without noisy syntax of function calls, without command strings in quotes, etc :

- business languages like COBOL

- programming languages in ERP systems like SAP ABAP, Oracle Financials

- stored procedural languages inside the RDBMS engine such as T-SQL in MS SQL Server, PL/SQL in Oracle, sp in MySQL

In the above, the "database" is the world the programming language is working in.

The more general purpose programming languages like C++, Java, Javascript, Python omit db manipulation as a core language feature. This 2nd-class status requires 3rd-party libs, which means have extra ceremony syntax of #includes, imports, function calls with parentheses, etc. Some try to reduce the cumbersome syntax friction with ORMs. In contrast with something like SAP ABAP, the so-called "ORM" is already built in to process data tables without any friction.

The author works a lot on CRUD apps so a language that has inherent db syntax would enable "Table-Oriented-Programming".

But we can also twist the author's thesis around. A programmer is coding in SAP ABAP or a stored-procedure in MySQL and wonders why raw memory access where contiguous blocks of RAM can be changed with pointers is not easy to code in those languages. So an essay is written about advantages of "pointer-oriented-programming" because direct memory writes are really convenient for frame buffers in video games, etc.

In any case, I don't see any trend where a general-purpose programming language will include DB SQL as 1st-class. Even the recent languages like Rust and Zig don't have basic SQLite3 db persistence as convenient built-in syntax. If anyone proposed to add such db syntax, they would most likely reject it.

mwexler4y ago

I think you are missing the big twist. It's not just tables as 1st class citizens, but allowing logic to be driven by the tables.

Instead of config files, you update the table. Changes to the processing flow? Update the table, including dates for when the new rules apply. The tables held code which drove the processing, along with tables holding data.

It's not just orm or persistence, and not just programming in the database as stored procedures. It was an odd melange of all of this.

I ran into this in the 90s, and it was great for RAD. But it felt odd to have to code into tables, and each tool was proprietary such that moving off of the table system was a full rebuild. They usually allowed migration to new databases systems to scale, but that was all they had.

I don't expect to see a language like this come around again anytime soon, but the ideas were really interesting in a world before git-ops and yaml configs.

jasode4y ago

>, but allowing logic to be driven by the tables. Instead of config files, you update the table. Changes to the processing flow? Update the table, [...]

I didn't miss that angle and I think it's actually a minor part of his thesis. If you look at the entire essay, the vast majority of his bullet points and supporting examples are mostly about ergonomics of builtin syntax to manipulate tables. If his ideal _language_ (aka the syntax) did that, it would naturally support table-oriented-programming (aka the philosophy). He starts the essay with critique of OOP-the-syntax.

But to your point about config and code itself being persisted in the database, the SAP ABAP environment already works like that. SAP has over 10,000 db tables for configuration -- instead of YAML or JSON files. Change the values in the config tables to alter behavior instead of modifying IF/THEN/ENDIF statements in code. And when ABAP programmers hit "save", the code gets saved to a database table instead of a text file. So if one squints a certain way, the SAP system is a giant million-line stored procedure in the database.

mwexler4y ago

Well put. SAP is the big survivor from that mindset. And perhaps the best example of why we won't see another. :-)

ModernMech4y ago

> In any case, I don't see any trend where a general-purpose programming language will include DB SQL as 1st-class. Even the recent languages like Rust and Zig don't have basic SQLite3 db persistence as convenient built-in syntax.

There is a trend, but you’ll have to look further off the beaten path than even Zig to find it. Languages like Eve [0] tried to do this circa 2015 in the tradition of Datalog. Code was written in “blocks” that resembled Prolog horn clauses, but which featured set semantics on selected records. Natural joins happened automatically on records using identifiers. The whole language was actually a database!

Eve died [1], but you’ll see many such projects that have the same ethos in communities in the web, such as this one [2].

There aren’t a lot of users of these languages, but this is where a lot of big ideas are percolating right now.

And we can verify it’s a trend because the hallmark of all CS trends, the formation of a conference, has made itself known in this area [3].

0: http://witheve.com/

1: https://groups.google.com/g/eve-talk/c/YFguOGkNrBo?pli=1

2: https://futureofcoding.org/catalog/

3: https://www.hytradboi.com/

civilized4y ago

> I remember reading this essay when it first came out. To try and reword it using modern terms: The author wishes that programming languages had database persistence capabilities as 1st-class built-in syntax instead of cumbersome bolted-on API functions.

This reminds me of M/MUMPS, used by Epic to power the biggest EHR system by market share in the US.

Perhaps the big difference is that the M "database" is key-value structured. True tables are flat and do not distinguish part of the tuple as the "key" and part as the "value".

I wonder if this is the source of the oft-discussed "mismatch" between the programmer's model of data and the relational model of data. Programmers like to assign values to things, while relational DBs like to do CRUD operations on records. (This is sometimes called the "object-relational impedance mismatch" but I've always found this term silly - needlessly jargon-laden and scoped overly narrowly to the OO paradigm.)

There's clearly some kind of "isomorphism" or translation between the two models, but they're not quite the same.

Is this what ORMs are about? Translating between the programmer model of data and the relational DB model?

CRConrad4y ago

> Is this what ORMs are about? Translating between the programmer model of data and the relational DB model?

Pretty much, AFAICS.

With varying degrees of success.

nine_k4y ago

Such tools existed and were popular in 1990s: DBase, Clipper, FoxPro.

They worked pretty well in their domain: data entry and report generation, with lightweight transaction processing and general computation.

Then happened the internet and client-server architectures, and these do not map as neatly onto local, single-user, single-transaction tables.

CRConrad4y ago

> Such tools existed and were popular in 1990s: DBase, Clipper, FoxPro.

Also Crystal Reports, Paradox...

> They worked pretty well in their domain: data entry and report generation, with lightweight transaction processing and general computation.

Having bought Ashton Tate, Borland got DBase and Interbase in addition to Paradox, and built data access components into the VCL class library (in effect "almost-first-class citizens" of the language), which IMO made Delphi the natural and superior successor to those languages: Not just "lightweight", but fully advanced (i.e, ~C++-level) general computation. (And with transaction processing built into the RDBMS connection components.)

> Then happened the internet and client-server architectures, and these do not map as neatly onto local, single-user, single-transaction tables.

Weeelll... Seen the spate of recent posts on here about how SQLite is good enough for pretty much anything? :-) And arguably, that's where Delphi was at too, over twenty years ago: AFAICR, there was a "Fishbase" (facts about tropical fish) demo included with Delphi, which in one variant could be built as a standalone Web service / server.

Also, AFAICS, that's where Free Pascal / Lazarus is at now, only using SQLite / Firebird / MySQL / PostgreSQL (and lots of other DBMSes) in stead of DBase / Clipper / FoxPro / Crystal Reports / Paradox. (I've been planning to look into that a bit closer myself, but haven't got around to it. Procrastinating away too much of my time on Hacker News, I suppose. :-( )

CRConrad4y ago

Duh, forgot to mention: "Fishbase" was based on a Paradox (IIRC; could have been DBase) file.

Edit: Also, for FP/Lazarus that's probably SQLite and newer RDBMSes not "in stead of" but in addition to the old file formats and RDBMSes.

molsongolden4y ago

One of my favorite programs ever was built on DBase in the mid-90s.

Installing or moving the entire application and database to a new PC was as easy as "drag folder onto flash drive" -> "drag folder off of flash drive".

Instant loading, tiny file sizes, but it became too complicated for users when 64-bit windows pushed 16-bit usage into VMs.

badsectoracula4y ago

I have a ton of games, bought or downloaded from a variety of places, some even for different systems and some more than once (e.g. Steam or GOG or sometimes some Humble Bundle bundle giving a game i already had) and in some cases a different version. Most of them are on my external HDD.

One thing i want to do is to make a database of all of them since they're easily more than a thousand and i want it on my PC. I want to be able to have a title, description, tags, screenshots, overall category, target platform info, links to the setup/archive files, the versions i have, extras like wallpapers, ringtones, music or even perhaps any box pictures or ads if they are available (some sites like Zoom Platform do provide those with the games you buy), reviews, patches (official and unofficial), links to folders with any mods i might have, etc.

This sounds like something that back in the day would be perfect for dBase or Fox Pro, except for the part where i want it to be graphical (remember: i want screenshots, wallpapers, etc). But if those were made with a GUI in mind, i'd expect them to be perfect. So basically, i think Access would be it. But AFAIK Access is now dying, last time i checked the UI it seemed like a massive downgrade compared to what i remember from when i first saw it back in Access 97 days (look, some stuff really work better with nested windows / MDI applications - having a two field form or 5 field table take up the entire available application space when 90% of is empty space makes no sense when you could have a bunch of windows with those things visible at the same time). And it is limited on Windows and TBH it looks like a massive program anyway.

Some alternatives mentioned are too much "programming" and not enough "database-ing" IMO. Ideally i should only need to bother with code to cover for any missing functionality not provided by the GUI but most of the DB and UI design (for the forms, etc) and ideally most simple behavioral stuff should be done from the GUI without using any code.

But i don't think there is anything really like that and the modern computing world feels way too "incompatible" with what i have in mind - e.g. whenever i mention this to some friends of mine they start thinking in terms of wiring together stuff like SQLite (or even worse, PostgreSQL), Python/PHP/whatever, some web-based framework, etc and other "mass-of-unrelated-software-held-together-by-duct-tape" solutions, when what i really want is a completely self contained program with no external dependencies (aside from the basic stuff for showing a GUI, etc, i mean not requiring stuff like setting up a PostgreSQL DB), no servers, etc, just a binary/EXE, saving the DB somewhere on the filesystem (ideally in a single file like Access so it is easy to copy/backup/pass around), being able to interact with the rest of the OS (remember that bit about keeping track of setup programs / archives /etc ? I'd like being able to run those directly from the DB GUI), etc.

Well, one of the 28378423 things i want to work on at some point in the future. Hopefully those life extension studies that are posted on HN now and then will eventually move on from rats :-P

dijit274y ago

I'd be curious to see with other options exist, but the one that I know is Filemaker Pro. I'm only familiar with the stand alone version, and while expensive, to me it sounds exactly like what you're talking about wanting.

1 more reply

marcle4y ago

Another relevant software category is the statistical analysis languages, including SAS, Stata and SPSS.

Old-school SAS included only two data types (floats and character strings), but allowed for SQL and sequential data-steps to live together. Persistence was baked in. The floats could be used to represent dates, datetimes and other formats. I particularly appreciated being able to use macros to define a data-step view to split the follow-up for an individual from a table. Such a view could then be collapsed using SQL. More recently, R tools such as dplyr have brought together data-frames and relational operations. However, I miss the sequential coding in SAS, using macros as higher-level tools to define the logic, including corner cases.

For strictly typed records, I have always wanted to spend more time with SML# [0] this allows for record updating, with close ties to SQL -- an under-appreciated version of SML.

[0] https://github.com/smlsharp/smlsharp

hdjjhhvvhga4y ago

> In any case, I don't see any trend where a general-purpose programming language will include DB SQL as 1st-class.

Which is a pity I'd say. The new languages you mention would benefit most from such a feature. I believe they won't do it not because it's not useful, but because it's difficult to make it right (efficient, safe, natural, scalable) and then maintain forever.

There are many benefits of having this functionality working out of the box (and a few disadvantages, obviously). Many (if not most) apps are just CRUD apps with some added functionality. But a standard way of connecting the language to a database is still missing. The great success of the ActiveRecord back in the day shows that this is something many developers would benefit from (it was/is good, but still not ideal). And I don't believe patching the situation with a multitude of incompatible ORMs solves anything.

Groxx4y ago

Rust has rather sophisticated macros, which let you do stuff like this outside the core language implementation, which is IMO very much where such things belong.

E.g. a linq clone in rust can look like this: https://github.com/StardustDL/Linq-in-Rust

    linq!(from p in 1..100, where p <= &5, orderby -p, select p * 2).collect();

CRConrad4y ago

What poor old BottomFeeder missed was that with a good object library / framework, you can get so close that it almost doesn't matter. I tried to convince him to even try Delphi, with its marvellous TDataSet descendants in the VCL... But AFAIK he never even downloaded the free version I pointed him to, whatever it may have been called back then.

gpderetta4y ago

The K language of kdb also count, but being proprietary and having a fairly impenetrable and alien syntax haven't helped it branching out of the niche where it is very successfully.

xwolfi4y ago

Isn't Q the language, and Kx the company ?

I have to use it at work, helping quants try to actually maintain production code rather than just vomit horrible one-time scripts they patch together in an endless stream of layers on top of layers.

This thing should be banned. Everything is one-letter, it's impossible to Google, it takes the opposite decision of every imaginable convention, I have yet to see someone who can read his own stuff 2 weeks later. Notwithstanding free tools to query force-expire every 3 months (QPad grrr) and as you said it's so closed they have to take webdevs like me to just help them maintain it all eventually.

It's risky for the bank (and we're not a small one :s), it's expensive for the programmer, and it's misery for the quants (but they all feel like geniuses spending weeks on simple stuff on kdb, so their misery is only in people looking at them wasting away their brain like that).

tluyben24y ago

> but they all feel like geniuses spending weeks on simple stuff on kdb

They are not experienced in the environment then? It is not that hard. Sure it is spartan as the tooling is not what you expect in 2021, but this seems a bit over the top?

> I have yet to see someone who can read his own stuff 2 weeks

http://nsl.com/ can, years later even (as can I but he has an impressive portfolio which makes it obvious).

rak15074y ago

Why is it still used then? Are there just no good alternatives? Or legacy reasons?

1 more reply

gpderetta4y ago

You are right. As far as I understand Q is K plus the additional sql-like syntax.

And yes, it is nigh-unreadable. I have yet to learn it, but I think there is value in succinctness at least for throwaway scripts.

2 more replies

animal_spirits4y ago

The "V" programming language has these features built in. https://github.com/vlang/v/blob/master/doc/docs.md#orm

duped4y ago

Does this doc misuse present tense or is it actually implemented? V's author has done this a bunch, so it's tough to take a doc at face value.

animal_spirits4y ago

I just copy+pasted the example given, it compiles and runs

naasking4y ago

> Examples where database syntax (i.e. SQL syntax) is 1st-class without noisy syntax of function calls, without command strings in quotes

There more examples which I think qualify but don't quite fit into your categories:

* The E language runs all code in "Vats", each of which is a single threaded compartment with transparent persistence.

* Taking inspiration from E, the Waterken server did this for Java, but required annotating mutable fields in a certain way so the persistence layer could track them.

* FoxPro doesn't neatly fit into your categories.

DenisM4y ago

How about LINQ, in particular LINQ2SQL?

srcreigh4y ago

Racket has a first class SQL query DSL. [0]

With some kinda dynamic naming system, you could probably load SQLite table schemas at runtime and provide bindings to columns automatically. Or maybe that's best done once at the compile step. Either is possible with racket :-)

[0]: https://docs.racket-lang.org/sql/index.html

BeFlatXIII4y ago

Although it's not a relational database, MUMPS is another example of a language where there is nothing special whatsoever about manipulating the database compared to manipulating the same data but stored in a variable in RAM.

agumonkey4y ago

Didn't pascal have a way to persist/reload records. Not SQL-like but still something.

Also it's interesting you mention cobol. It really was a cool feature.

badsectoracula4y ago

Pascal's records are basically like C's structs: you can use them to make a database (and many did) but there isn't any real support from the language, though some implementations may have had their own extensions since the original Wirth Pascal had no support for files at all and the standard had some very limited support.

int_19h4y ago

Not as any kind of standard, or even a popular extension.

Microsoft BASIC implementations for DOS however had ISAM in some of the editions (PDS and VB for DOS) which were similar: you'd declare a struct type, and then open a file with that type as the record type.

mikewarot4y ago

I wrote CRUD apps that did this in Turbo Pascal back in the days of MS-DOS. I copied the idea of screen fields that I had seen in DBASE II, and make a set of field editors for each record type, I could put together a database pretty darned quick back in the day.

sethhovestol4y ago

I actually work in a table oriented language, harbour, a child of clipper/xBase mentioned in the article. There are a few issues I've found with a table oriented architecture:

1. Managing state is a bit of a nightmare. Harbour is based off of DBF databases, which are essentially flat files of a 2d array, and keeps your record number within any given db. You can then query a field with the arrow operator (table->field) but you have no guarantee that any subfunction is not changing state.

2. DBMS lock in. Because you're operating is totally different paradigm moving dbs is actually rather challenging. Harbour has a really nice system of replaceable database drivers(rdd), but when your code is all written assuming movement in a flat file, switching to a SQL based system is challenging. I'm currently in the process of writing a rdd to switch us to postgres, but translating the logic of holding state to the paradigm of gathering data then operating on it in an established code base is quite a challenge.

mamcx4y ago

For people like me, that worked in FoxPro, this is the dream.

Despite the claim this kind of tools is for "basic CRUD" they could do much more, much better, precisely because can deal MUCH better with the most challenged kind of programming:

CRUD apps.

Making apps in finance, erps, bussines, etc, are far more complex and challenging than build chat apps, where the scope is MORE clear and the features, reduced.

"Simple" crud apps NEVER stay simple.

NEVER.

If you allow it, in no time you are building a mix of your owm RDBMs, programming language, API orchestation, authorization framework, inference engines, hardware interfaces and more...

then, it must run in "Windows, Linux, Mac, Android, iOS, Web, Rasperry, that computer that is only know here in this industry", "please?"... and it will chases, also, all fads, all the time.

The request/features pipeline never end. The info about what to do is sketchy at best.

The turnaround to bring results is measure in HOURS/DAYs.

So, no.

No language without this, is in fact, good for the niche.

chrisaycock4y ago

I built my own table-oriented language out of frustrations I had with with time-series analysis:

https://www.empirical-soft.com

Empirical has statically typed Dataframes. It can infer the type of a file's contents at compile time using a ton of metaprogramming techniques.

  >>> let trades = load("trades.csv")
  
  >>> trades
   symbol                  timestamp    price size
     AAPL 2019-05-01 09:30:00.578802 210.5200  780
     AAPL 2019-05-01 09:30:00.580485 210.8100  390
      BAC 2019-05-01 09:30:00.629205  30.2500  510
      CVX 2019-05-01 09:30:00.944122 117.8000 5860
     AAPL 2019-05-01 09:30:01.002405 211.1300  320
     AAPL 2019-05-01 09:30:01.066917 211.1186  310
     AAPL 2019-05-01 09:30:01.118968 211.0000  730
      BAC 2019-05-01 09:30:01.186416  30.2450  380
      CVX 2019-05-01 09:30:01.639577 118.2550 2880
      ...                        ...      ...  ...

Functions have generic typing by default; the caller determines the type instantiation. Here is a weighted average:

  >>> func wavg(ws, vs) = sum(ws * vs) / sum(ws)

Queries are built into the language. Here is a five-minute volume-weighted average price:

  >>> from trades select vwap = wavg(size, price) by symbol, bar(timestamp, 5m)
   symbol           timestamp       vwap
     AAPL 2019-05-01 09:30:00 210.305724
      BAC 2019-05-01 09:30:00  30.483875
      CVX 2019-05-01 09:30:00 119.427733
     AAPL 2019-05-01 09:35:00 202.972440
      BAC 2019-05-01 09:35:00  30.848397
      CVX 2019-05-01 09:35:00 119.431601
     AAPL 2019-05-01 09:40:00 204.671388
      BAC 2019-05-01 09:40:00  30.217362
      CVX 2019-05-01 09:40:00 117.224763
      ...                 ...        ...

Everything is statically typed. Misspelled column names, for example, result in an error before the script is even run!

beaumayns4y ago

This is pretty cool. I've had thoughts (or dreams, more accurately :) of a language like this every time I get a runtime 'type error in q. I gotta say, I prefer q's syntax, though :)

maest4y ago

q with static typing and a sensible pricing model would be amazing.

I do think that q's main strength is not its speed, but the fact that qSQL statements are a first class citizen in the language - no network hops, no awkward marshalling and unmarshalling of data, no awkward mismatch around how to use nulls, nans, tz-aware timestamps etc.

chrisaycock4y ago

I started Empirical with the goal of "q like Haskell". The end result went in a radically different direction, but the guiding light has always been to have a statically typed language where tables and queries are a first-class operation.

The source code is publicly available under AGPL with the Commons Clause:

https://github.com/empirical-soft/empirical-lang

rscho4y ago

How does it handle dirty data? Does it assign an "any" type?

Also, why do you think embedding data frames is not possible?

chrisaycock4y ago

Missing and poorly formatted input is given a type-specific value. Eg., Float64 is nan and Int64 is nil.

  >>> Int64("5")
  5

  >>> Int64("5b")
  nil

If inferencing cannot determine a consistent type from a CSV file, then the column will just be a String.

I don't know what you mean by "embedding" a Dataframe.

rscho4y ago

On the website you linked:

"Embedding Dataframes into an existing language would not be possible."

I don't think it would be an issue for languages with good metaprogramming facilities.

1 more reply

zokier4y ago

I'm in the opinion that tables would make a lot of sense as first-class citizens for shell environments. Lots of data typically handled in shells is inherently tabular in nature (for example the outputs of ls and ps etc) and some of the common tools also are intended for tables (awk in the forefront, but also cut and sort as examples). But in practice lot of it is currently very ad-hoc, and handles any sort of edge cases poorly.

osquery already demonstrates that lot of info can be structured into tables, but what I feel is missing is more convenient, shell-like language environment to work with such data.

tkindy4y ago

I think Microsoft Powershell [0] sort of approaches what you’re describing. It’s not exactly table-oriented, but object-oriented such that there’s a lot more structure to data than in traditional command line environments. For example, their equivalent of ls returns an array of objects (i.e. rows) which you can filter, sort, etc. based on the properties of those objects.

[0]: https://docs.microsoft.com/en-us/powershell/scripting/overvi...

tyingq4y ago

Seconded. It also comes with Import-CSV and Export-CSV. And cmdlets like Select-Object and Where-Object.

  Get-Service | Where-Object {$_.Status -eq "Stopped"}

Looks pretty close to what's being described.

DemocracyFTW4y ago

The proliferation of field types has made data more difficult to transfer or share data between different applications and generates confusion. ITOP has only two fundamental data types: numeric and character, and perhaps a byte type for conversion purposes. (I have been kicking around ideas for having only one type.) The pre- and post-validators give any special handling needed by the field. A format string can be provided for various items like dates ("99/99/99"), Social-Security-Numbers ("999-99-9999"), and so forth. (Input formats are not shown in our sample DD.) Types like dates and SSN's can be internally represented (stored) just fine with characters or possibly integers. For example, December 31, 1998 could be represented as "19981231". This provides a natural sort order.

This is very nineties and I must disagree. The datetime-as-string example shows it most clearly: wanting to sort by full date is only one thing you want to do with calendar data; often you will want to compare, say, things that happened on Mondays vs things that happened over the weekend, or things that happened within so-and-so many hours around a given point in time and so, not to mention the complexities of DST and timezones. You can do all that with text-based strings but you'd have to write quite a bit of logic that will get applied to strings over and over again, or else you can store the results of parsing a date string into separate fields. Dates expressed as text also don't allow you to validate "19990229" or "20020631" in a very straightforward manner.

I think our collective and by now decades-old experience with duck/weakly-typed languages like Python, JavaScript, Ruby and so on clearly shows that what you gain in simplicity you lose in terms of assured correctness.

WalterBright4y ago

The way to deal with dates is not by having separate fields. It's by having a single value represent the time (in Linux it's time_t). Every other format gets translated to time_t, all processing is done with time_t, and then the time_t gets translated to the desired output format.

Any other scheme is doomed to working 99% of the time, and that last 1% will be impossible to fix.

ShroudedNight4y ago

> It's by having a single value represent the time

I have limited experience dealing with human originated time references, but from my encounters, the various idiosyncratic forms of date storage often seem to arise out of an aversion to commit to well defined intervals of uncertainty / margins of error. Coercing people of limited mental bandwidth or interest beyond immediate gratification to go through the pain of constricting their mental models to time_t levels of precision seems to basically be a non-starter.

DemocracyFTW4y ago

This is of course exactly what I've been meaning to say here. We want more specific datatypes (e.g. a true date(time) ADT) with better functionality (say interval computation) and constraint checking (AKA domains, such as 'let n be an even, positive integer gt zero').

kerblang4y ago

This idea of "the database should be invisible!" abstraction was widely pursued back in the 90's when people were still obsessed with "the network should be invisible!" and Remote Procedure Calls (RPC). A lot of ORM's still reflect this obsession, and some programmers still get angry that they should have to deal with "this low-level SQL nonsense!"

Attempts to make I/O invisible failed and failed and failed again, and continue failing and failing again because it turns out that I/O is incredibly fundamental and not something you can just wave off as "low-level details". A networked database is a massive abstraction in its own right, and if invisible I/O is a doomed abstraction, forget invisible databases. Well, first go fail a few more times, then forget it, because we're not quite there yet on this one, are we...

The bigger the abstraction, the more it leaks. Sometimes you have enough headroom to go further, and sometimes you have to recognize that you've gone way too far.

scotty794y ago

> Fundamental and Consistent Collection Operations

I recently discovered that Scala collection library was designed with this exact goal in mind.

Interface of collections is highly consistent between various types and you can create custom collections using the same interface with very little custom code.

I found this very insightful https://docs.scala-lang.org/overviews/core/architecture-of-s...

Slick library pretty much turns database access into first class part of the Scala through this collections api

https://scala-slick.org/doc/3.3.3/introduction.html#what-is-...

bob10294y ago

We do a thing where we project all of the domain state (i.e. for a given user's session/work) into an in-memory database and then execute the business's SQL queries against it in order to determine logical outcomes.

I wouldn't really call it low/no code, since developing effective queries is non-trivial for many cases, but it does make it much more feasible for a non-developer to add incremental value to our product.

kragen4y ago

I'm glad this got posted! I wanted to reread this a couple of years ago and couldn't find it. Any idea what happened to TopMind?

CRConrad4y ago

My history with ****e *****s goes way back... we must have had more than a decade of pro-vs-contra-OOP flame wars on various Web fora, starting over twenty years ago; but in the last ten or so I haven't heard (directly) from him.

In the mean time, I have softened my stance and can admit that traditional inheritance-based OOP may not be ultimate panacea, but I doubt he has softened his anti-OOP stance at all. :-)

He is / was present on at least Slashdot and, I noticed, the original (now archived, i.e. read-only) C2 Wiki, and probably a few others I'm forgetting right now, under the names "Tablizer", "TopMind" (or sometimes, IIRC, just "Top".)

gpderetta4y ago

My wish-list for my ideal (non-system) programming language:

- first class tables and named tuples as the primary datastructure. Includes the full set of relational operations, and transaction support. Optional persistence. Everything is not a table though. Tables are great but pragmatism trumps dogmatism.

- structural typing (ties neatly with the above) and support for row polymorphism

- shared nothing, distributed, multiprocessing, except for explicitly shared tables as transactions allow for safe controlled mutation of shared tables. Messages are just named tuples and row polymorphism should allow for protocol evolution. Message queues and stream can be abstracted as one pass tables.

- Async as in Cilk not JS. No red/green functions. Multiprocessing can be cheap, just spawn an user thread. The compiler will use whatever compilation strategy is the best (cactus stacks, full CPS transform, whatever).

- seamless job management, pipelines, graphs. Ideally this language should be a perfectly fine shell replacement. But with transparent support for running processes on multiple machines. And better error management.

A bit more nebulous and needs more thoughts:

- exceptions, error codes and optional/variant results are all faces of the same medal and can look the same with the right syntactic sugar.

- custom table representation. You can optionally decide how your table should be physically represented in memory or disk. Explicit pointers to speed up joins. Nested representation for naturally hierarchical data. Denormalized

- first class graphs. Graphs and relational tables are dual. And with the above point it should be possible to represent them efficiently. What operations we need?

- capabilities. All dependencies are passed to each function, no global data and code. You can tell if your function does IO or allocates by looking at its signature. Subsumes dependency injection. Implicit parameters and other syntactic sugar should make this bearable.

- staged compilation via partial evaluation. This should subsume macros. Variables are a tuple of (value, type), where type is a dictionary of operation-name->operation-implementation. First stage is a fully dynamic language, but by fixing the shape of the dictionary you get interfaces/traits/protocol with dynamic dispatch, by fixing the implementation you get static dispatch. Again, significant sugar is needed to make this workable.

edit:

missed an important element: - transparent remote code execution: run your code where your data is. Capabilities are pretty much a requirement for security.

naasking4y ago

I'm no longer convinced of the need for row or record polymorphism. It encourages passing around types that have no clear domain or purpose, so I think it inhibits understanding in general. Do you have any examples where it's indispensable?

gpderetta4y ago

I don't think it is indispensable, I think it is convenient and still better than what is done today were types without clear domain and purpose are already passed around.

At the very least with row polymorphism, a function can declare which subset of the type it actually care about instead of taking an unwanted dependency on the whole blob.

In particular I'm considering the scenario were a large application (or better a collection of applications) evolve without a central plan and messages tend to grow to accommodate orthogonal requirements (the alternative is splitting the messages, but it has performance, complexity and organizational overhead).

In theory the alternative is message inheritance, but in my experience it has never worked well and it is very hard to retrofit anyway.

naasking4y ago

> At the very least with row polymorphism, a function can declare which subset of the type it actually care about instead of taking an unwanted dependency on the whole blob.

This is the argument I no longer find convincing. Do you have an example where this is so much clearer than alternate, simpler ways of doing it?

For instance, in principle you could easily rewrite a function that works on a record with 3 fields to just accept 3 parameters. The only additional "burden" is that the caller has to pass in those 3 fields, where before they could just pass in the record.

1 more reply

marcosdumay4y ago

Nothing is indispensable as long as you have a Turing complete language. That is a really bad mindset to use.

Anyway, are you complaining that the types are abstract? (That is as bad a complaint as it sounds.) Or do you have something different in mind?

naasking4y ago

You're taking indispensable too literally. If you have to commonly write 1,000 lines of code without a feature, but the feature permits you to to reduce this to 1 line of code, I'd consider that to be pretty indispensable.

Where the indispensable line is is debatable, hence my request for an example.

Avshalom4y ago

This idea (or at the least nostalgia for xBase) pops up every now and then and while it certainly isn't describing Prolog I think the idea would be a lot more interesting if the authors had enough familiarity to compare and contrast.

Animats4y ago

Oh, that kind of table. I was expecting decision tables.[1]

"Smart contracts" for Etherium should have been decision tables. But no, they had to make it Turing-complete. A good thing about decision tables is that there's a finite and small number of cases, so they can be exhaustively tested. Also, they're readable. That's what you want for contracts. Not Solidity programs, which are expensively insecure.

[1] https://en.wikipedia.org/wiki/Decision_table

abss4y ago

I remember this page from geocites... Opend my eye about some ugly aspects of OOP. But, without proper marketing and without some luck a lot of ideas should be rediscovered again and again. And maybe the table oriented programming ideas are too common sense and therefore not a good kind of diferentatior compared with other smart ppl...

slowmovintarget4y ago

I recall debating this on Slashdot back in 2002. (I was a Bertrand Meyer OO convert back then). Good memories.

Functions and data are like spacetime and gravity. Beneath the emergent behavior in any software system, they are the things you find lurking underneath.

teleforce4y ago

Just wondering about the meaning of the satements from the article "Arrays are evil! Arrays are the Goto of the collections world". Anyone know exactly what it means? Is it referring to the raw array with pointers in C/C++ or array in C++ collections?

timemachine4y ago

I took the meaning as: Array processing can cause hidden bugs (like GOTO). The bugs are usually introduced when the size of the array changes while processing.

Also ‘find’ing and ‘filter’ing arrays is more error prone (like not properly handling a miss on a find).

I don’t completely agree with the author, but the idea that more care needs taken when dealing with arrays is worth considering.

Arrays will always have important uses. Tuples, enumerations, and the like. However, ‘n’ database records shoved into an array and then iterated over in a for-loop is a cumbersome way to structure a program.

That’s my take on it.

RandyRanderson4y ago

I stopped reading shortly after:

"a = (b * c) + e + f"

Something like this would have been a better ex:

a = b(c+e) + f

This guy maybe hasn't heard of operator overloading as no one would do as he suggests in most 'OOP' languages:

"a = ((b.times(c)).plus(e)).plus(f) // sillier"

tomcooks4y ago

Might be due to personal preferences, but after having worked on a legacy TOP codebase i must unapologetically say that it sucks.

rsrsrs864y ago

Surprised no one mentioned the relationship of this to Alloy

j / k navigate · click thread line to collapse

87 comments

jasode4y ago

Examples where database syntax (i.e. SQL syntax) is 1st-class without noisy syntax of function calls, without command strings in quotes, etc :

- business languages like COBOL

- programming languages in ERP systems like SAP ABAP, Oracle Financials

- stored procedural languages inside the RDBMS engine such as T-SQL in MS SQL Server, PL/SQL in Oracle, sp in MySQL

In the above, the "database" is the world the programming language is working in.

The author works a lot on CRUD apps so a language that has inherent db syntax would enable "Table-Oriented-Programming".

mwexler4y ago

I think you are missing the big twist. It's not just tables as 1st class citizens, but allowing logic to be driven by the tables.

It's not just orm or persistence, and not just programming in the database as stored procedures. It was an odd melange of all of this.

I don't expect to see a language like this come around again anytime soon, but the ideas were really interesting in a world before git-ops and yaml configs.

jasode4y ago

>, but allowing logic to be driven by the tables. Instead of config files, you update the table. Changes to the processing flow? Update the table, [...]

mwexler4y ago

Well put. SAP is the big survivor from that mindset. And perhaps the best example of why we won't see another. :-)

ModernMech4y ago

Eve died [1], but you’ll see many such projects that have the same ethos in communities in the web, such as this one [2].

There aren’t a lot of users of these languages, but this is where a lot of big ideas are percolating right now.

And we can verify it’s a trend because the hallmark of all CS trends, the formation of a conference, has made itself known in this area [3].

0: http://witheve.com/

1: https://groups.google.com/g/eve-talk/c/YFguOGkNrBo?pli=1

2: https://futureofcoding.org/catalog/

3: https://www.hytradboi.com/

civilized4y ago

This reminds me of M/MUMPS, used by Epic to power the biggest EHR system by market share in the US.

Perhaps the big difference is that the M "database" is key-value structured. True tables are flat and do not distinguish part of the tuple as the "key" and part as the "value".

There's clearly some kind of "isomorphism" or translation between the two models, but they're not quite the same.

Is this what ORMs are about? Translating between the programmer model of data and the relational DB model?

CRConrad4y ago

> Is this what ORMs are about? Translating between the programmer model of data and the relational DB model?

Pretty much, AFAICS.

With varying degrees of success.

nine_k4y ago

Such tools existed and were popular in 1990s: DBase, Clipper, FoxPro.

They worked pretty well in their domain: data entry and report generation, with lightweight transaction processing and general computation.

Then happened the internet and client-server architectures, and these do not map as neatly onto local, single-user, single-transaction tables.

CRConrad4y ago

> Such tools existed and were popular in 1990s: DBase, Clipper, FoxPro.

Also Crystal Reports, Paradox...

> They worked pretty well in their domain: data entry and report generation, with lightweight transaction processing and general computation.

> Then happened the internet and client-server architectures, and these do not map as neatly onto local, single-user, single-transaction tables.

CRConrad4y ago

Duh, forgot to mention: "Fishbase" was based on a Paradox (IIRC; could have been DBase) file.

Edit: Also, for FP/Lazarus that's probably SQLite and newer RDBMSes not "in stead of" but in addition to the old file formats and RDBMSes.

molsongolden4y ago

One of my favorite programs ever was built on DBase in the mid-90s.

Installing or moving the entire application and database to a new PC was as easy as "drag folder onto flash drive" -> "drag folder off of flash drive".

Instant loading, tiny file sizes, but it became too complicated for users when 64-bit windows pushed 16-bit usage into VMs.

badsectoracula4y ago

Well, one of the 28378423 things i want to work on at some point in the future. Hopefully those life extension studies that are posted on HN now and then will eventually move on from rats :-P

dijit274y ago

1 more reply

marcle4y ago

Another relevant software category is the statistical analysis languages, including SAS, Stata and SPSS.

For strictly typed records, I have always wanted to spend more time with SML# [0] this allows for record updating, with close ties to SQL -- an under-appreciated version of SML.

[0] https://github.com/smlsharp/smlsharp

hdjjhhvvhga4y ago

> In any case, I don't see any trend where a general-purpose programming language will include DB SQL as 1st-class.

Groxx4y ago

Rust has rather sophisticated macros, which let you do stuff like this outside the core language implementation, which is IMO very much where such things belong.

E.g. a linq clone in rust can look like this: https://github.com/StardustDL/Linq-in-Rust

    linq!(from p in 1..100, where p <= &5, orderby -p, select p * 2).collect();

CRConrad4y ago

gpderetta4y ago

The K language of kdb also count, but being proprietary and having a fairly impenetrable and alien syntax haven't helped it branching out of the niche where it is very successfully.

xwolfi4y ago

Isn't Q the language, and Kx the company ?

I have to use it at work, helping quants try to actually maintain production code rather than just vomit horrible one-time scripts they patch together in an endless stream of layers on top of layers.

tluyben24y ago

> but they all feel like geniuses spending weeks on simple stuff on kdb

They are not experienced in the environment then? It is not that hard. Sure it is spartan as the tooling is not what you expect in 2021, but this seems a bit over the top?

> I have yet to see someone who can read his own stuff 2 weeks

http://nsl.com/ can, years later even (as can I but he has an impressive portfolio which makes it obvious).

rak15074y ago

Why is it still used then? Are there just no good alternatives? Or legacy reasons?

1 more reply

gpderetta4y ago

You are right. As far as I understand Q is K plus the additional sql-like syntax.

And yes, it is nigh-unreadable. I have yet to learn it, but I think there is value in succinctness at least for throwaway scripts.

2 more replies

animal_spirits4y ago

The "V" programming language has these features built in. https://github.com/vlang/v/blob/master/doc/docs.md#orm

duped4y ago

Does this doc misuse present tense or is it actually implemented? V's author has done this a bunch, so it's tough to take a doc at face value.

animal_spirits4y ago

I just copy+pasted the example given, it compiles and runs

naasking4y ago

> Examples where database syntax (i.e. SQL syntax) is 1st-class without noisy syntax of function calls, without command strings in quotes

There more examples which I think qualify but don't quite fit into your categories:

* The E language runs all code in "Vats", each of which is a single threaded compartment with transparent persistence.

* Taking inspiration from E, the Waterken server did this for Java, but required annotating mutable fields in a certain way so the persistence layer could track them.

* FoxPro doesn't neatly fit into your categories.

DenisM4y ago

How about LINQ, in particular LINQ2SQL?

srcreigh4y ago

Racket has a first class SQL query DSL. [0]

[0]: https://docs.racket-lang.org/sql/index.html

BeFlatXIII4y ago

agumonkey4y ago

Didn't pascal have a way to persist/reload records. Not SQL-like but still something.

Also it's interesting you mention cobol. It really was a cool feature.

badsectoracula4y ago

int_19h4y ago

Not as any kind of standard, or even a popular extension.

mikewarot4y ago

sethhovestol4y ago

I actually work in a table oriented language, harbour, a child of clipper/xBase mentioned in the article. There are a few issues I've found with a table oriented architecture:

mamcx4y ago

For people like me, that worked in FoxPro, this is the dream.

Despite the claim this kind of tools is for "basic CRUD" they could do much more, much better, precisely because can deal MUCH better with the most challenged kind of programming:

CRUD apps.

Making apps in finance, erps, bussines, etc, are far more complex and challenging than build chat apps, where the scope is MORE clear and the features, reduced.

"Simple" crud apps NEVER stay simple.

NEVER.

If you allow it, in no time you are building a mix of your owm RDBMs, programming language, API orchestation, authorization framework, inference engines, hardware interfaces and more...

then, it must run in "Windows, Linux, Mac, Android, iOS, Web, Rasperry, that computer that is only know here in this industry", "please?"... and it will chases, also, all fads, all the time.

The request/features pipeline never end. The info about what to do is sketchy at best.

The turnaround to bring results is measure in HOURS/DAYs.

So, no.

No language without this, is in fact, good for the niche.

chrisaycock4y ago

I built my own table-oriented language out of frustrations I had with with time-series analysis:

https://www.empirical-soft.com

Empirical has statically typed Dataframes. It can infer the type of a file's contents at compile time using a ton of metaprogramming techniques.

  >>> let trades = load("trades.csv")
  
  >>> trades
   symbol                  timestamp    price size
     AAPL 2019-05-01 09:30:00.578802 210.5200  780
     AAPL 2019-05-01 09:30:00.580485 210.8100  390
      BAC 2019-05-01 09:30:00.629205  30.2500  510
      CVX 2019-05-01 09:30:00.944122 117.8000 5860
     AAPL 2019-05-01 09:30:01.002405 211.1300  320
     AAPL 2019-05-01 09:30:01.066917 211.1186  310
     AAPL 2019-05-01 09:30:01.118968 211.0000  730
      BAC 2019-05-01 09:30:01.186416  30.2450  380
      CVX 2019-05-01 09:30:01.639577 118.2550 2880
      ...                        ...      ...  ...

Functions have generic typing by default; the caller determines the type instantiation. Here is a weighted average:

  >>> func wavg(ws, vs) = sum(ws * vs) / sum(ws)

Queries are built into the language. Here is a five-minute volume-weighted average price:

  >>> from trades select vwap = wavg(size, price) by symbol, bar(timestamp, 5m)
   symbol           timestamp       vwap
     AAPL 2019-05-01 09:30:00 210.305724
      BAC 2019-05-01 09:30:00  30.483875
      CVX 2019-05-01 09:30:00 119.427733
     AAPL 2019-05-01 09:35:00 202.972440
      BAC 2019-05-01 09:35:00  30.848397
      CVX 2019-05-01 09:35:00 119.431601
     AAPL 2019-05-01 09:40:00 204.671388
      BAC 2019-05-01 09:40:00  30.217362
      CVX 2019-05-01 09:40:00 117.224763
      ...                 ...        ...

Everything is statically typed. Misspelled column names, for example, result in an error before the script is even run!

beaumayns4y ago

This is pretty cool. I've had thoughts (or dreams, more accurately :) of a language like this every time I get a runtime 'type error in q. I gotta say, I prefer q's syntax, though :)

maest4y ago

q with static typing and a sensible pricing model would be amazing.

chrisaycock4y ago

The source code is publicly available under AGPL with the Commons Clause:

https://github.com/empirical-soft/empirical-lang

rscho4y ago

How does it handle dirty data? Does it assign an "any" type?

Also, why do you think embedding data frames is not possible?

chrisaycock4y ago

Missing and poorly formatted input is given a type-specific value. Eg., Float64 is nan and Int64 is nil.

  >>> Int64("5")
  5

  >>> Int64("5b")
  nil

If inferencing cannot determine a consistent type from a CSV file, then the column will just be a String.

I don't know what you mean by "embedding" a Dataframe.

rscho4y ago

On the website you linked:

"Embedding Dataframes into an existing language would not be possible."

I don't think it would be an issue for languages with good metaprogramming facilities.

1 more reply

zokier4y ago

osquery already demonstrates that lot of info can be structured into tables, but what I feel is missing is more convenient, shell-like language environment to work with such data.

tkindy4y ago

[0]: https://docs.microsoft.com/en-us/powershell/scripting/overvi...

tyingq4y ago

Seconded. It also comes with Import-CSV and Export-CSV. And cmdlets like Select-Object and Where-Object.

  Get-Service | Where-Object {$_.Status -eq "Stopped"}

Looks pretty close to what's being described.

DemocracyFTW4y ago

WalterBright4y ago

Any other scheme is doomed to working 99% of the time, and that last 1% will be impossible to fix.

ShroudedNight4y ago

> It's by having a single value represent the time

DemocracyFTW4y ago

kerblang4y ago

The bigger the abstraction, the more it leaks. Sometimes you have enough headroom to go further, and sometimes you have to recognize that you've gone way too far.

scotty794y ago

> Fundamental and Consistent Collection Operations

I recently discovered that Scala collection library was designed with this exact goal in mind.

Interface of collections is highly consistent between various types and you can create custom collections using the same interface with very little custom code.

I found this very insightful https://docs.scala-lang.org/overviews/core/architecture-of-s...

Slick library pretty much turns database access into first class part of the Scala through this collections api

https://scala-slick.org/doc/3.3.3/introduction.html#what-is-...

bob10294y ago

kragen4y ago

I'm glad this got posted! I wanted to reread this a couple of years ago and couldn't find it. Any idea what happened to TopMind?

CRConrad4y ago

In the mean time, I have softened my stance and can admit that traditional inheritance-based OOP may not be ultimate panacea, but I doubt he has softened his anti-OOP stance at all. :-)

gpderetta4y ago

My wish-list for my ideal (non-system) programming language:

- structural typing (ties neatly with the above) and support for row polymorphism

A bit more nebulous and needs more thoughts:

- exceptions, error codes and optional/variant results are all faces of the same medal and can look the same with the right syntactic sugar.

- first class graphs. Graphs and relational tables are dual. And with the above point it should be possible to represent them efficiently. What operations we need?

edit:

missed an important element: - transparent remote code execution: run your code where your data is. Capabilities are pretty much a requirement for security.

naasking4y ago

gpderetta4y ago

I don't think it is indispensable, I think it is convenient and still better than what is done today were types without clear domain and purpose are already passed around.

At the very least with row polymorphism, a function can declare which subset of the type it actually care about instead of taking an unwanted dependency on the whole blob.

In theory the alternative is message inheritance, but in my experience it has never worked well and it is very hard to retrofit anyway.

naasking4y ago

> At the very least with row polymorphism, a function can declare which subset of the type it actually care about instead of taking an unwanted dependency on the whole blob.

This is the argument I no longer find convincing. Do you have an example where this is so much clearer than alternate, simpler ways of doing it?

1 more reply

marcosdumay4y ago

Nothing is indispensable as long as you have a Turing complete language. That is a really bad mindset to use.

Anyway, are you complaining that the types are abstract? (That is as bad a complaint as it sounds.) Or do you have something different in mind?

naasking4y ago

Where the indispensable line is is debatable, hence my request for an example.

Avshalom4y ago

Animats4y ago

Oh, that kind of table. I was expecting decision tables.[1]

[1] https://en.wikipedia.org/wiki/Decision_table

abss4y ago

slowmovintarget4y ago

I recall debating this on Slashdot back in 2002. (I was a Bertrand Meyer OO convert back then). Good memories.

Functions and data are like spacetime and gravity. Beneath the emergent behavior in any software system, they are the things you find lurking underneath.

teleforce4y ago

timemachine4y ago

I took the meaning as: Array processing can cause hidden bugs (like GOTO). The bugs are usually introduced when the size of the array changes while processing.

Also ‘find’ing and ‘filter’ing arrays is more error prone (like not properly handling a miss on a find).

I don’t completely agree with the author, but the idea that more care needs taken when dealing with arrays is worth considering.

That’s my take on it.

RandyRanderson4y ago

I stopped reading shortly after:

"a = (b * c) + e + f"

Something like this would have been a better ex:

a = b(c+e) + f

This guy maybe hasn't heard of operator overloading as no one would do as he suggests in most 'OOP' languages:

"a = ((b.times(c)).plus(e)).plus(f) // sillier"

tomcooks4y ago

Might be due to personal preferences, but after having worked on a legacy TOP codebase i must unapologetically say that it sucks.

rsrsrs864y ago

Surprised no one mentioned the relationship of this to Alloy

j / k navigate · click thread line to collapse