undefined | Better HN

0 pointsbrightball3y ago0 comments

I've honestly never understood why people have such a distaste for SQL. SQL and Linux/Unix have been the biggest constants of my entire programming career to this point (20ish years). I always know I can count on them.

0 comments

zvrba3y ago

I love the data model of RDBMS / SQL. I hate SQL _the language_. Verbose, irregular, attempting to mimic English and thus making it unintuitive and next to impossible to remember. 80% of my lookups into reference documentation is about syntax for things I don't use that often.

Examples: GRANT TO, REVOKE FROM, DENY TO (oh yes, what's the difference between REVOKE and DENY? and did you know that you can REVOKE a DENY?), arbitrary requirements for punctuation (e.g., = in BACKUP DATABASE AdventureWorks2012 TO DISK = 'X:\SQLServerBackups\AdventureWorks1.bak'), dubious context-sensitive "magical identifiers" (e.g., SELECT DATEPART(year, @t)), non-composability (how do you dynamically give a part specification to DATEPART?), etc, etc, etc.

It's possible to write a novel about just how unstructured and irregular "structured" query language is.

tpetry3y ago

All of your examples are vendor specific extensions to SQL. Nothing of that is part of the standard. So you are effectively hating a specific implementation which is not that logical.

extrememacaroni3y ago

Imagine hating the things that you use rather than a pdf with text, fascinating.

zvrba3y ago

Yes, ANSI syntax is surely composable or dynamic: EXTRACT(YEAR FROM DATE '1998-03-07')

oarabbus_3y ago

Honestly, I think it's because a lot of folks go their entire careers avoiding set-based thinking. Many of these folks are talented programmers so I would assume they'd become quite decent at SQL if they devoted time to it. I'm speaking more on the DML side than the DDL side here.

In my experience <20% of developers are good enough to be dangerous with SQL, and maybe 5% what I'd consider adept.

The rest range from "SELECT * is as far as I'll go; where's the ORM" to "why the hell are you using cursors to aggregate"

SQL is powerful, elegant, and reliable. With modern DB support of JSON, ARRAY, statistical functions, and much more, SQL is probably the #1 most underutilized/improperly leveraged tool today. SQL-killers have been coming out for 40 years now and (for its domain!) SQL's lead is pulling farther away if anything.

*yes there are some questionable implementations, so please replace "SQL" with "PostgreSQL" if nonstandard SQL implementations have caused nightmares for you in the past.

IgorPartola3y ago

The reason isn’t set based thinking avoidance. I think it is because fundamentally we want to work with smart objects and not rows of data in most cases.

oarabbus_3y ago

More often than not that's a good example of set-based thinking avoidance. The application graveyard is full of projects that felt their transactional data was too sophisticated for "dumb rows" and reinvented an inefficient relational model. I'd love to hear more about these "smart objects" - they're usually not that smart and typically can be represented relationally.

Even for use-cases like graph based models you still find Twitter and Facebook using MySQL to build the graph on top of. It’s simply heavily abstracted for the majority of engineers at those companies (where I'd wager the <20% proficiency in set-based thinking holds true) but it still fundamentally relies on SQL.

omginternets3y ago

Where can I learn about the relationship between set-theory and SQL?

oarabbus_3y ago

I'm not too well-versed in the academic/mathematical background of SQL, other than than to say it's closer to relational algebra than Set theory, and the resource to read is probably the 1970 Codd paper: https://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf

SQL does implement the set operations of UNION, INTERSECT, and EXCEPT, but I meant "thinking in sets" more colloquially.

edit: this is more practical than theoretical but the author here actually does a nice job of discussing "set-based" thinking vs. procedural https://www.itprotoday.com/open-source-sql/fangraphs-tags-cl...

aahortwwy3y ago

Discrete Mathematics with Applications by Epp. Chapters 5, 7, 10.

Fundamentals of Database Systems by Elmasri & Navathe. Part 2.

jstarfish3y ago

I wholly agree with you, but I'll say this:

When it comes to prototyping, I'm not going to fuck with something like Java-- I'm going to reach for Python. If I don't know what I'm doing to begin with, I don't need something telling me all the ways in which I'm doing it wrong.

Same goes for SQL/NoSQL. I loosely know what I need to model and may revise it arbitrarily. SQL does not lend itself to this. NoSQL was designed for it.

NoSQL is the "fuck you dad you can't tell me what to do" of the database world. SQL is your annoying dad that tries to institute curfews and won't let you go to underage drinking parties. In the end, it's the latter you're going to be calling from a holding cell, but there's a lot more fun you can have with the former.

funcDropShadow3y ago

In the case of SQL/NoSQL I think it is the other way around. In order to use a NoSQL database to its strength, you have to know all access path up front [1]. Because you are designing your table structure with all the queries in mind. In SQL on the other hand, you can always go for a normalized structure and you are pretty safe to be prepared for most requirements that are coming in the future.

Additionally, I think drafting a DDL schema is a great way to learn about a new application domain. It forces me to ask hard questions. And that improves my understanding of the domain. I guess that is similar to some people who like to prototype using Haskell type signatures.

[1] Unless you mean specifically a document store without schemas.

Aeolun3y ago

Hmm, it’s the other way around for me. The crazy friend may be fun to go drinking with, but I’d never rely on him for anything.

jrop3y ago

> If I don't know what I'm doing to begin with, I don't need something telling me all the ways in which I'm doing it wrong.

This is unrelated to this conversation, but this is my main beef with Rust. I love Rust (like a lot), but it's just not good for prototyping. End of non-pertinent rant.

bakuninsbart3y ago

SQL is great at what it is designed to do, and absolutely horrible for anything else. Sometimes I have had to use tools that only allow (a subset of) SQL for querying data. (Especially BI) Doing regex date validation in a json derived from a string in an sql dialect without function-definitions is horrendous. These kinds of "why the f do i have to use sql for this"-moments happened surprisingly often to me working as a data analyst.

But then I'm trying to do relatively simple things in Pandas or R, like changing column values based on multiple field value conditions, and it is a struggle requiring at least 17 brackets, while in SQL it would be a simple statement that anyone can understand after reading it once.

oarabbus_3y ago

>Doing regex date validation in a json derived from a string in an sql dialect without function-definitions is horrendous.

Doing regex is horrendous, but doing it on SQL in a modern database is no more difficult than in a full-fledged programming language. Most modern DBs have strong JSON support and built-in regex functions

1. JSON_CAST/JSON_PARSE your data

2. REGEXP_EXTRACT() on the result, here's several date validator regex from a SO post (https://stackoverflow.com/questions/15491894/regex-to-valida...)

And that's it. In fact in many cases it's probably faster to do it natively in SQL than to export it to python or R and parse there.

funcDropShadow3y ago

The problem you are describing is probably rooted in the specific SQL dialect you had to use. Selecting from inside JSON strings and matching regular expressions should be a no brainer. And it is straight forward in PostgreSQL, e.g.

nequo3y ago

> simple things in Pandas or R, like changing column values based on multiple field value conditions, and it is a struggle requiring at least 17 brackets

With dplyr, wouldn’t this be mutate(x = case_when(… ~ y, TRUE ~ x)) or the same but with ifelse?

oarabbus_3y ago

I haven't used much R, but how would you do something like:

  CASE WHEN
  SUM(daily_revenue) OVER (PARTITION BY department, TRIM(SUBSTR(region, 5)) IN ('North','West','Misc')) > 
  AVG(revenue) OVER (ORDER BY sale_time ASC rows BETWEEN 28 PRECEDING AND CURRENT ROW)
  AND NOT COALSECE(had_prev_month_party, FALSE)

  THEN pizza_party_points + 5

  WHEN <above> AND had_prev_month_party THEN pizza_party_points + 3

  WHEN MIN(sale_time) over (PARTITION BY department) = DATE_TRUNC('month', current_date) then 5

  ELSE GREATEST(pizza_party_points - 1, 0)

  END as pizza_party_performance_points_current

this example may be a bit esoteric but it actually draws from lots of real-world cases (comparing a partitioned sum with an overall average, checking conditions on other columns, messy data parsing and manipulation, implicit waterfall nature of CASE WHEN vs. explicit elseif, etc)

SkeuomorphicBee3y ago

SQL is also a big constant of my programming career, I know it can do everything I need, but sometimes it is frustrating how its limitations make some simple things complicated.

For example, one of the biggest gripes I have with SQL is the fact that the table is the only output format, it doesn't allow outputting data with different cardinalities in a single query. The internal relational model of a RDBMS is extremely rich and powerful, but when extracting that data using SQL you have to leave all the richness behind and flatten the data into a simple dumb table. So as result we have to do much back and forth with separate queries based on results of previous ones (or even ugly hacks like concatenate higher cardinality data into a single field to then re-separate it in the application). A suitable successor for SQL should have a way to output a subset of the relational model with the relational part intact.

randomdata3y ago

I'm not sure a successor to SQL is even necessary, rather there would be benefit to some higher level interfaces provided by the RDMBS that compliment SQL to cut down on the egregious application boilerplate that becomes necessary to achieve common tasks using SQL directly.

SQL is essentially the assembly language of the database. That makes it very powerful, but sometimes you just want a language that gives you a "garbage collector" for free, while still being able to drop down to assembly when you need additional power.

There is no technical reason why an RDBMS can't support different modes for accessing data. We are just so used to putting that work into the application that it has become a hard sell to want to move it to the right place.

Akronymus3y ago

> For example, one of the biggest gripes I have with SQL is the fact that the table is the only output format, it doesn't allow outputting data with different cardinalities in a single query.

With MSSQL at least you can return multiple result sets. Not sure about other database vendors.

randomdata3y ago

SQL has some annoying quirks but on balance I am not sure there is the distaste for SQL that you claim. It's fine, even good in many circumstances, but also too low level for some tasks which pushes common problems onto application developers to solve over and over again or build up layers of dependencies to utilize someone else's solution. That's not the fault of SQL but the fault of the RDBMS.

Our layers of abstractions atop SQL along with approaches taken by DBMSes outside of the common relational names have shown that there is room for improvement within those tasks, able to be done natively by the DBMS, and it does not have to come at the expense of losing SQL when you need to be able to describe lower level questions. Different tools for different jobs, but little will within the RDMBS space to explore those other tools.

zwkrt3y ago

I don’t think that people have a distaste for SQL, I think that large companies have a vested interest in trying to sell no SQL solutions because they more tightly integrate into a “cloud” ecosystem. A database solves that problem domain too well, it’s not marketable.

eurasiantiger3y ago

Can you write an SQL query to return arbitrary JSON? Returning queried data in a nested form is a must nowadays.

ElectricalUnion3y ago

> Can you write an SQL query to return arbitrary JSON? Returning queried data in a nested form is a must nowadays.

Yes. https://www.sqlite.org/json1.html

TheaomBen3y ago

Also https://www.postgresql.org/docs/current/functions-json.html

1 more reply

ithrow3y ago

Make more than a single query and nest on the application layer.

robonerd3y ago

I blame ORMs. ORMs are promoted by scare mongering novice developers away from learning SQL in the first place. I'm ashamed to say I fell for it for a few years. When I eventually learned SQL it was like a fog being lifted from my mind that I hadn't even noticed before.

reidjs3y ago

I tried not to use an ORM for my last project, but I you ended up rolling my own ORM for the application anyways because it was easier to keep the structures in the web application tied to the database. How else do you keep the two consistent?

pfarrell3y ago

The irony is, to effectively use an ORM, you need to be able to debug the SQL it generates when it’s not performing correctly (eg operating on a collection, row by row). I like to use ORM for row based, transactional operations in an OLTP. But usually look for the way to write straight SQL when doing OLAP style reporting stuff. Both of which happen in almost every business.

j / k navigate · click thread line to collapse

0 comments

zvrba3y ago

It's possible to write a novel about just how unstructured and irregular "structured" query language is.

tpetry3y ago

All of your examples are vendor specific extensions to SQL. Nothing of that is part of the standard. So you are effectively hating a specific implementation which is not that logical.

extrememacaroni3y ago

Imagine hating the things that you use rather than a pdf with text, fascinating.

zvrba3y ago

Yes, ANSI syntax is surely composable or dynamic: EXTRACT(YEAR FROM DATE '1998-03-07')

oarabbus_3y ago

In my experience <20% of developers are good enough to be dangerous with SQL, and maybe 5% what I'd consider adept.

The rest range from "SELECT * is as far as I'll go; where's the ORM" to "why the hell are you using cursors to aggregate"

*yes there are some questionable implementations, so please replace "SQL" with "PostgreSQL" if nonstandard SQL implementations have caused nightmares for you in the past.

IgorPartola3y ago

The reason isn’t set based thinking avoidance. I think it is because fundamentally we want to work with smart objects and not rows of data in most cases.

oarabbus_3y ago

omginternets3y ago

Where can I learn about the relationship between set-theory and SQL?

oarabbus_3y ago

SQL does implement the set operations of UNION, INTERSECT, and EXCEPT, but I meant "thinking in sets" more colloquially.

aahortwwy3y ago

Discrete Mathematics with Applications by Epp. Chapters 5, 7, 10.

Fundamentals of Database Systems by Elmasri & Navathe. Part 2.

jstarfish3y ago

I wholly agree with you, but I'll say this:

Same goes for SQL/NoSQL. I loosely know what I need to model and may revise it arbitrarily. SQL does not lend itself to this. NoSQL was designed for it.

funcDropShadow3y ago

[1] Unless you mean specifically a document store without schemas.

Aeolun3y ago

Hmm, it’s the other way around for me. The crazy friend may be fun to go drinking with, but I’d never rely on him for anything.

jrop3y ago

> If I don't know what I'm doing to begin with, I don't need something telling me all the ways in which I'm doing it wrong.

This is unrelated to this conversation, but this is my main beef with Rust. I love Rust (like a lot), but it's just not good for prototyping. End of non-pertinent rant.

bakuninsbart3y ago

oarabbus_3y ago

>Doing regex date validation in a json derived from a string in an sql dialect without function-definitions is horrendous.

1. JSON_CAST/JSON_PARSE your data

2. REGEXP_EXTRACT() on the result, here's several date validator regex from a SO post (https://stackoverflow.com/questions/15491894/regex-to-valida...)

And that's it. In fact in many cases it's probably faster to do it natively in SQL than to export it to python or R and parse there.

funcDropShadow3y ago

nequo3y ago

> simple things in Pandas or R, like changing column values based on multiple field value conditions, and it is a struggle requiring at least 17 brackets

With dplyr, wouldn’t this be mutate(x = case_when(… ~ y, TRUE ~ x)) or the same but with ifelse?

oarabbus_3y ago

I haven't used much R, but how would you do something like:

  CASE WHEN
  SUM(daily_revenue) OVER (PARTITION BY department, TRIM(SUBSTR(region, 5)) IN ('North','West','Misc')) > 
  AVG(revenue) OVER (ORDER BY sale_time ASC rows BETWEEN 28 PRECEDING AND CURRENT ROW)
  AND NOT COALSECE(had_prev_month_party, FALSE)

  THEN pizza_party_points + 5

  WHEN <above> AND had_prev_month_party THEN pizza_party_points + 3

  WHEN MIN(sale_time) over (PARTITION BY department) = DATE_TRUNC('month', current_date) then 5

  ELSE GREATEST(pizza_party_points - 1, 0)

  END as pizza_party_performance_points_current

SkeuomorphicBee3y ago

SQL is also a big constant of my programming career, I know it can do everything I need, but sometimes it is frustrating how its limitations make some simple things complicated.

randomdata3y ago

Akronymus3y ago

> For example, one of the biggest gripes I have with SQL is the fact that the table is the only output format, it doesn't allow outputting data with different cardinalities in a single query.

With MSSQL at least you can return multiple result sets. Not sure about other database vendors.

randomdata3y ago

zwkrt3y ago

eurasiantiger3y ago

Can you write an SQL query to return arbitrary JSON? Returning queried data in a nested form is a must nowadays.

ElectricalUnion3y ago

> Can you write an SQL query to return arbitrary JSON? Returning queried data in a nested form is a must nowadays.

Yes. https://www.sqlite.org/json1.html

TheaomBen3y ago

Also https://www.postgresql.org/docs/current/functions-json.html

1 more reply

ithrow3y ago

Make more than a single query and nest on the application layer.

robonerd3y ago

reidjs3y ago

pfarrell3y ago

j / k navigate · click thread line to collapse