All DBs have their warts, and while MySQL has an overabundance of them, they tend to be quite well documented. The warts Postgres has tend to be quite buried: its documentation is very good for syntax comprehension but rather light when it comes to deeper learning.
1. https://stackoverflow.com/questions/50364475/how-to-force-po...
One which bit me recently, and is still utterly baffling to me, is that a column defined as an array type will accept values of that array's type in any number of dimensions greater than that specified for the column. In other words, `{{{{{{text}}}}}}` can be inserted into columns of the following types:
- `TEXT[]`
- `TEXT[][]`
- `TEXT[][][]`
- `TEXT[][][][]`
- `TEXT[][][][][]`
- `TEXT[][][][][][]`
The inverse is true as well! A column specified `TEXT[][]` (and so on) will accept `{text}`. Of course, none of this (as far as I've been able to find) is documented.
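A quick sketch of the behavior (table and column names are my own invention, but this matches what I saw):

```sql
CREATE TABLE demo (vals TEXT[][]);

-- All of these succeed, despite the declared two dimensions:
INSERT INTO demo VALUES ('{a}');
INSERT INTO demo VALUES ('{{a,b},{c,d}}');
INSERT INTO demo VALUES ('{{{{{{a}}}}}}');
```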
But wait, there's more!
`UNNEST` does not allow you to specify depth, it always unnests to the deepest dimension. This, too, is undocumented. In fact, it's anti-documented. The documents provide an example function to unnest a two-dimensional array that is wholly unnecessary (and likely performs worse than the built-in `UNNEST`, but I'm just guessing). Said documentation would seem to imply that the depth of `UNNEST` is 1, but of course that's not the case.
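For instance (my own minimal example):

```sql
SELECT unnest('{{1,2},{3,4}}'::int[]);
-- four rows (1, 2, 3, 4), not two rows of one-dimensional arrays
```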
But wait, there's more still!
What if you want to get at a nested array? Idk, I'm sure it's possible, but if you thought `SELECT that_array[1]` is the way to do it, look under your seat because you're getting a `NULL`!
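A sketch of what I mean (the subscript and slice forms are standard Postgres syntax):

```sql
SELECT ('{{1,2},{3,4}}'::int[])[1];     -- NULL, not {1,2}
SELECT ('{{1,2},{3,4}}'::int[])[1][2];  -- 2: a full subscript works
SELECT ('{{1,2},{3,4}}'::int[])[1:1];   -- {{1,2}}: slicing keeps dimensions
```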
- - -
Postscript: I discovered the first part of this in a production system where a migration had incorrectly nested some data, and where that data was in turn causing certain requests to unexpectedly fail. Of course, given that this was in production, I didn't have a lot of time to research the issue. Found the problem, fixed it, moved on with my day. In the course of fixing it, I discovered the `UNNEST` issue, which... okay, fine, I fixed it in a slightly different way than I'd expected.
So in the course of verifying the particulars to write this comment, I played around with some things, and discovered the `NULL` issue.
At least when Postgres has wildly unexpected behavior, it's exceptionally unexpected behavior.
https://www.postgresql.org/docs/13/arrays.html
> The current implementation does not enforce the declared number of dimensions either. Arrays of a particular element type are all considered to be of the same type, regardless of size or number of dimensions. So, declaring the array size or number of dimensions in CREATE TABLE is simply documentation; it does not affect run-time behavior.
Similar text extends at least all the way back to the documentation for 7.1.
[1]: https://stackoverflow.com/questions/715432/why-is-sql-server...
Of course it is; the documentation is where TFAA got the information in the 4th paragraph of the story, out of 15 or so.
The range itself is what nerd-sniped the author and led them to try to find out why MySQL had such an odd yet specific range.
I'm going to have to disagree with you there. This issue is quite well documented in the "SQL Functions Returning Sets" section [0]. The relevant bit starts thusly:
> ...Set-returning functions can be nested in a select list, although that is not allowed in FROM-clause items. In such cases, each level of nesting is treated separately, as though it were a separate LATERAL ROWS FROM( ... ) item...
And there's even a note about the crazy behavior pre-PostgreSQL 10:
> Before PostgreSQL 10, putting more than one set-returning function in the same select list did not behave very sensibly unless they always produced equal numbers of rows. Otherwise, what you got was a number of output rows equal to the least common multiple of the numbers of rows produced by the set-returning functions. Also, nested set-returning functions did not work as described above; instead, a set-returning function could have at most one set-returning argument, and each nest of set-returning functions was run independently. Also, conditional execution (set-returning functions inside CASE etc) was previously allowed, complicating things even more. Use of the LATERAL syntax is recommended when writing queries that need to work in older PostgreSQL versions, because that will give consistent results across different versions.
I agree that allowing SRFs in the SELECT clause is a wart that should never have been permitted, but I think the PostgreSQL docs do a pretty great job describing both the old behavior and the new behavior that has to balance backwards compatibility with sensibility.
(And, indeed, the 9.6 docs have this to say on the behavior of SRFs in the SELECT list: "The key problem with using set-returning functions in the select list, rather than the FROM clause, is that putting more than one set-returning function in the same select list does not behave very sensibly.")
I do think one notable defect with the PostgreSQL docs is that they were designed in a time before modern search engines. They are better understood as a written manual in electronic form. Almost always the information you need is there, but possibly not in the chapter that Google will surface. But there are all sorts of tricks you can use if you update your mental model of how to read the PostgreSQL docs. For example, there's an old-style index! [1]
[0]: https://www.postgresql.org/docs/current/xfunc-sql.html#XFUNC... [1]: https://www.postgresql.org/docs/current/bookindex.html
I think of the Postgres docs as significantly better than most other documentation for this reason. The information is there and it's organized in a way that makes sense.
In terms of documentation quality, I agree that it's documented, but not obviously. You mentioned that it's in the "SQL Functions Returning Sets" section; however, that section isn't linked from the array functions and operators page. And while it's pretty easy to find if you already know you're dealing with an issue related to SRFs, trying to get at that information via Google won't turn up anything unless you specifically hone in on set-returning functions. I ended up finding that doc page after finding an SO answer that mentioned SRFs while searching for "postgres unnest cartesian product".
The information is in the documentation, but the documentation isn't always super good at linking to other relevant portions of the documentation and, honestly, reading the documentation about the feature you need is reasonable, but I don't expect most people are reading the full postgres docs before starting to play around with it. So I don't disagree that the information is there, but I do think it is mostly inaccessible due to the structure of the documentation.
Not necessarily an odd choice in the Olden Days; after all, BCD representation used to be pretty popular. By modern standards it's insane, but at a time when binary-to-decimal conversions could be a serious performance concern it might have made sense. For instance, if you had a time in "hours, minutes, seconds" and wanted to add or subtract one of these TIME values, you could do it without a single multiply or divide.
Now I was 8 when MySQL first released in 1995, so I can't really comment on whether that choice really made sense back then. 1995 does seem a bit late for BCD shenanigans, but maybe they based their design on existing applications and de-facto standards that could easily go back to the 80's.
Edit: plenty of things still stored dates as strings where the emphasis of the app was on displaying information. Int and float types carried the day whenever any kind of math was going to be used, or when you wanted to output the data in multiple formats.
Earlier than the 60s. Had to work on a system in the 90s to “transpile” IBM minicomputer code to C.
Useful for accounting since what you see is what you get as far as cents and other decimals go. 1/5, and thus 1/10, can only be approximated in floating point.
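To illustrate (a minimal Python sketch; the same point holds for binary floating point in any language):

```python
from decimal import Decimal

# 1/10 has no exact binary floating-point representation,
# so additions drift:
print(0.1 + 0.2)                          # 0.30000000000000004
# A decimal type keeps cents exact:
print(Decimal("0.10") + Decimal("0.20"))  # 0.30
```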
Not that I’m defending MySQL
The reason I reach for Postgres over MySQL isn't features or technical superiority, although those result from the reason, which is that the PG devs consistently have "taste": they have good style and they make good choices. The MySQL devs are not consistently strong in these areas. I'm guessing that MySQL is now so full of tech and design debt (like the OP issue) that they're simply stuck, with no choice.
And not in a desirable way.
"postgre" feels and sounds like a plate full of very wet, oily rice and the subsequent addition of clear vomit to the mix.
These three points have driven me raving mad when working with MySQL:
- The default 'latin1' character set is in fact cp1252, not ISO-8859-1, meaning it contains the extra characters in the Windows codepage. 'latin2', however, is ISO-8859-2.
- The 'utf8' character set is limited to Unicode characters that encode to 1-3 bytes in UTF-8. 'utf8mb4' was added in MySQL 5.5.3 and supports up to 4-byte encoded characters. UTF-8 has been defined to encode characters to up to 4 bytes since 2003.
- Neither the 'utf8' nor 'utf8mb4' character sets have any case-sensitive collation other than 'utf8_bin' and 'utf8mb4_bin', which sort characters by their numeric codepoint.
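The practical consequence of the 'utf8' limit is easy to see from the encoded lengths (a small Python check):

```python
# '€' encodes to 3 bytes, so it fits MySQL's 3-byte 'utf8';
# an emoji needs 4 bytes, which only 'utf8mb4' can store.
print(len("€".encode("utf-8")))   # 3
print(len("😀".encode("utf-8")))  # 4
```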
utf8 effectively being an alias of utf8mb3 has cost us so much work it's not even funny.
An extra warning about that mess: mysqldump in many configurations will silently convert utf8mb4 down to utf8mb3. So when you're testing your backups or migrations, do an extra check to make sure that emoji and rarer characters didn't get eaten!
Most weirdly, the fact that the default collation is SWEDISH.
It is a complete freak show. The users kinda got used to it, butchering our language (Portuguese) to use only characters valid in English, hoping MySQL won't barf spectacularly on them.
Oh you sweet summer child. No it isn't. It's somewhat like Windows CP1252, but it also defines 8 other extra characters that are not in cp1252.
Actually, it's generally saner to assume that people mean Windows-1252 when they say ISO-8859-1. Charset labeling is frequently incorrect, and C1 characters are so infrequently used that seeing one pop up probably means you actually wanted Windows-1252 instead.
So the correct query is "SELECT id, clmn FROM tbl WHERE clmn = '999999999999999999'"
This list is missing the WTF that cascaded deletes/updates don't cause triggers to fire on child tables:
The problem is the weird non-bit packing they did before.
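As I understand it, the old TIME format stored the value in 3 bytes as the decimal number HHMMSS (hours*10000 + minutes*100 + seconds), signed. A signed 24-bit field tops out at 2^23 - 1 = 8388607, and the largest well-formed HHMMSS below that is 8385959, i.e. 838:59:59. A sketch of that reconstruction:

```python
# Hypothetical reconstruction of the pre-5.6 TIME packing:
# the value is the decimal number HHMMSS in a signed 24-bit field.
max_signed_24bit = 2**23 - 1            # 8388607
packed = 838 * 10000 + 59 * 100 + 59    # 8385959
assert packed <= max_signed_24bit       # 838:59:59 fits...
assert 839 * 10000 > max_signed_24bit   # ...but 839:00:00 would not
hours, rest = divmod(packed, 10000)
minutes, seconds = divmod(rest, 100)
print(f"{hours}:{minutes}:{seconds}")   # 838:59:59
```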
DOS is different, partly because there are fields that fit worse, and partly because 32 bits is just barely enough to store seconds in the first place. If you did a DOS-style packing with 34 or 38 bits it would work fine. And it would be able to represent leap seconds, making it arguably better than unix timestamps!
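For comparison, the classic FAT/DOS timestamp packs a date and a time into two 16-bit words (the 5-bit seconds field holds 2-second units, which is why odd seconds are lost); a sketch:

```python
def dos_pack(year, month, day, hour, minute, second):
    """Pack a timestamp the way FAT/DOS does: two 16-bit words."""
    date = ((year - 1980) << 9) | (month << 5) | day
    time = (hour << 11) | (minute << 5) | (second // 2)
    return date, time

date, time = dos_pack(1995, 5, 23, 13, 45, 30)
print(hex(date), hex(time))
```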
https://people.cs.nctu.edu.tw/~tsaiwn/sisc/runtime_error_200...
Just saying - I have no horse in this race.
First time I ever saw a number where the leading sign bit has to be set to 1 to indicate non-negative.
(from https://dev.mysql.com/doc/refman/8.0/en/time.html):
> but also elapsed time or a time interval between two events (which may be much greater than 24 hours, or even negative).
For representing temporal intervals, the specification defines two kinds of INTERVAL types (year-month and day-time). Year-month intervals can represent intervals in terms of years, months, or a combination of the two. Similarly, day-time intervals can represent intervals in terms of days, hours, minutes, or seconds, or combinations of them (e.g., days+hours, days+hours+minutes, hours+minutes, etc.).
As a sidenote, the TIME and DATE types are related to the TIMESTAMP type in that TIMESTAMP can be thought of as combination of a DATE part (year, month, day) and a TIME (hour, minute, second) part.
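Both standard interval classes can be written as literals; in Postgres syntax, for example:

```sql
SELECT INTERVAL '2-6' YEAR TO MONTH;        -- 2 years 6 months
SELECT INTERVAL '3 12:30:00' DAY TO SECOND; -- 3 days 12:30:00
```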
```sql
select sum(y.a), count(y.a)
from (select distinct x.a
      from (select 1 as a
            union all select 2 as a
            union all select 1 as a) x) y
```

     sum | count
    -----+-------
       4 |     3
Sqlite3 returns the correct results of sum of 3 count of 2.
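That claim is easy to check (a small Python/sqlite3 sketch; DISTINCT collapses {1, 2, 1} to {1, 2} before the aggregates run):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute("""
    SELECT sum(y.a), count(y.a)
    FROM (SELECT DISTINCT x.a
          FROM (SELECT 1 AS a
                UNION ALL SELECT 2 AS a
                UNION ALL SELECT 1 AS a) x) y
""").fetchone()
print(row)  # (3, 2)
```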
To fix this, don't use subqueries.
https://dbfiddle.uk/?rdbms=postgres_12&fiddle=dc4ce40bd52695...
In Postgres, it just works out of the box.
I was using MSSQL prior to 2010 so I have no idea of MySQL unicode handling before that
Less than 10 years ago; MySQL 5.5 went GA in December 2010.
If only I had a penny for every time I heard this argument and we ended up breaking regression tests or something really obscure in the QA or customer setup.
And yes, I have done it so many times myself too
But, seriously? Who cares? It's not even close to an extra order of magnitude of range. The type is obviously meant to be used for time values that have a context of hours within a day, supporting a few days as headroom... so supporting 1,024 instead of 838 is pointless -- if you're getting anywhere even close to the max value, you probably shouldn't be using this type in the first place.
And yes, it's probably best not to change it for backwards compatibility. Can I imagine a case where it could break something? No, not off the top of my head. But it probably would break some application somewhere. And for such a widely deployed piece of critical foundational infrastructure, being conservative is the way to go.
Nothing about this seems WTF at all, except for the author's seeming opinion that elegant, power-of-two ranges ought to trump backwards compatibility with things that probably made sense at the time.
I think the workaround was to pass some parameter in the connection string.
So the answer is just it's for backwards compatibility with MySQL 3?
I was kind of hoping for more.
"I think this is stupid" is a really poor reason to break backwards compatibility, despite how many other software projects use this reasoning.
But of course, MySQL bad, PostgreSQL good.
True. But his argument seems to be in the direction of "this is highly unexpected behaviour", and I tend to agree. The number of applications broken by extending the range is probably dwarfed by the number of bugs avoided by not having the time span break at such a strange length.
The behavior in question is having an upper and lower bound on a time interval. This strikes me as highly expected behavior.
The maximum on the interval is lower than the author expected; and reading the docs quickly cleared up what the interval maximum is.
The entire rant boils down to "MySQL's choice to keep backwards compatibility is stupid, because I think this interval limit should be larger."
I was lucky and could simply redeploy my application, but I have never used Mysql since.
With Postgres, you can just drop the binary table data files in, restart the database and pg rebuilds the indexes and relationships.
From a quick google, it looks like MySQL is now more robust in this regard.