1) Performance was very slow.
2) There was an effective row size limit of 8K over the whole row which worked against people who wrote forum software (big posts) or were (ugh) storing binaries in the database.
3) Even today, PostgreSQL requires a bit more maintenance and configuration before it works right, whereas MySQL runs out of the box. Back then, VACUUM wasn't optional at all and still required an exclusive lock on the tables it was working on.
4) There was a misconception that each Postgres user needed to be a Unix system user (this might or might not have been true; probably wasn't, but I and others certainly believed it), which worked against the usual cheap virtual hosting.
5) The MySQL extension for PHP was much more advanced than the Postgres one, leading people to prefer MySQL, which in the end led to virtual hosts installing only MySQL, at which point Postgres' fate was sealed.
6) MySQL was positioned as a spiritual successor to mSQL (just look at the names), which was very popular at the time as it was the first SQL database for "normal people". You either did flat files or mSQL.
7) MySQL had Windows support from the beginning. Postgres95 still required Cygwin (if you could get it to work). Back then, many people used Windows as their development environment, and being able to run the whole stack on your development machine does have some advantages.
By now, aside from the maintenance issue, Postgres has shed all these drawbacks, but it's too late: people are using what they know works for them, so they just go with MySQL (or SQLite if they need more simplicity, which is interesting in itself, as MySQL for ages provided an embedded, linkable, serverless variant that never took off either).
#4 - whether it's a misconception or not, the majority of tutorials and even the default utilities make that assumption. Working around it is nowhere near as widely documented as the defaults.
Also, forum software is a big arena. Not being able to get accurate count() values back hurt(s) PostgreSQL. It hurts people who now rely on MySQL's InnoDB as well, but there you always have the option of MyISAM for the tables you want cheap count()s on for pagination. I've heard for years that "it's so easy to do this in PostgreSQL, just write some triggers and stored procedures..." yet if it's "so easy", why not bundle it in as default functionality (or even just a script that creates the triggers for you)?
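For what it's worth, the trigger-based counter cache the commenter alludes to is a standard pattern. A minimal sketch, using SQLite (which also supports triggers) via Python's sqlite3 as a runnable stand-in for PostgreSQL; the table and trigger names are made up for illustration:

```python
import sqlite3

# Counter-cache pattern: keep a running row count in a side table,
# maintained by triggers, so pagination never needs a full table scan.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT);
CREATE TABLE row_counts (tbl TEXT PRIMARY KEY, n INTEGER NOT NULL);
INSERT INTO row_counts VALUES ('posts', 0);

-- Keep the cached count in sync on every insert and delete.
CREATE TRIGGER posts_ins AFTER INSERT ON posts
BEGIN UPDATE row_counts SET n = n + 1 WHERE tbl = 'posts'; END;
CREATE TRIGGER posts_del AFTER DELETE ON posts
BEGIN UPDATE row_counts SET n = n - 1 WHERE tbl = 'posts'; END;
""")
conn.executemany("INSERT INTO posts (body) VALUES (?)", [("a",), ("b",), ("c",)])
conn.execute("DELETE FROM posts WHERE body = 'a'")
cached = conn.execute("SELECT n FROM row_counts WHERE tbl = 'posts'").fetchone()[0]
print(cached)  # 2 -- read this instead of running count(*) over the table
```

In PostgreSQL the same idea needs a PL/pgSQL trigger function, which is presumably the "stored procedures" part that people found awkward.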
I don't ever recall hearing about the "embedded linkable and serverless variant" of MySQL, and have been doing PHP since 1996. Perhaps it was a licensing issue? MySQL seemed to take off largely due to a rise in PHP, but sqlite seems to have taken off more because of embeddability and the public domain aspect of it first, well, before I ever saw PHP adoption of sqlite.
http://dev.mysql.com/doc/refman/5.1/en/libmysqld.html
Pretty much a licensing issue, as all of MySQL is released under the GPL or a proprietary license. That also includes the MySQL client library, by the way, but they made a license exception there that allowed linking against PHP (the mysql and mysqli extensions), and later there was a reimplementation of the on-the-wire protocol inside a PHP extension (mysqlnd) released under the PHP license.
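To make the "embedded, serverless" model concrete: the database engine is just a library linked into your process, with no daemon to install or connect to. A small sketch using SQLite through Python's sqlite3, standing in for what libmysqld offered:

```python
import sqlite3

# Embedded/serverless usage: no server process, no socket, no credentials.
# The "database" lives in-process (here in memory; a file path also works).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO kv VALUES ('greeting', 'hello')")
value = conn.execute("SELECT v FROM kv WHERE k = 'greeting'").fetchone()[0]
print(value)  # hello
```

SQLite could spread freely in this role precisely because its public-domain status imposes none of the GPL linking concerns discussed above.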
> Not being able to get accurate count() values back hurt(s) postgresql.
count() is totally accurate within the limits of MVCC. If you need it to be 100% accurate across statements, make your transaction SERIALIZABLE.
There's one thing about count(): under MySQL with MyISAM tables, count(*) is optimized so it doesn't actually have to count, and is thus much faster than, say, count(row), count(whatever) with a WHERE clause, or count(*) in any other database.
But the moment you use InnoDB or any other engine that supports transactions, count(whatever) unfortunately requires actual counting in all cases.
Multi-version concurrency control means that a single universal definition of count() basically doesn't make sense: each transaction sees its own snapshot, so there is no one "true" row count to cache.
Could you not just use the dedicated TEXT or BLOB type for that kind of stuff, and store them out-of-row? (yes, would probably have been even slower)
Of course this also means that you couldn't search in these fields or do anything else you'd do directly on the database.
TOAST tables were added in 7.1 AFAIK, which lifted that limit and allowed arbitrarily sized rows by moving the big fields out into their own storage.
The old large-object functionality is still there (http://www.postgresql.org/docs/9.0/interactive/largeobjects....), though I totally failed, and still fail, to see any practical use for it.
What did MySQL have during the last decade that made it better for certain applications than Postgres?
You didn't have to deal with schema/tablespaces, or the chance that someone used stored procs, or security (almost everyone just used the equivalent of a root account).
MySQL just worked.
For the most part, with zero monitoring, it stayed working until the server ran out of disk space or died.
MySQL was always available.
No matter which hosting company you looked at, no matter how little you wanted to spend, MySQL was pre-installed and ready for use.
I never thought it was the better choice, but I understood why it dominated. Once things like WordPress, vBulletin, phpBB and the rest were written only for MySQL, the choice was made for you.
I've long been a fan of applications that give you a choice of DB, but that's now unrealistic, as many applications have designed their schemas around the quirks and limitations of MySQL specifically. These would now be harder to migrate than simply changing a connection string, and most application devs don't invest in such things when what they have works well enough and is all anyone is asking for.
That it was faster (for certain applications), easier and more reliable does make sense. I understand, for example why PHP got popular even though there were other options available at the time that I believe were better from a software engineering[0] perspective. Things get popular by being better at the things the majority of users care about, even if they're awful at everything else.
[0] Did I really just use that term? I can't think of a better one to describe the issue I'm talking about.
But you are right, many of the other popular projects did not make this effort.
MySQL made it easier initially by omitting lots of the difficult stuff (data integrity, management, transactions), thus making it easier to get something working quickly. It also had a reputation for being faster than PostgreSQL, a reputation spread by fanboys (in truth, people bigging up the only database they had even a faint comprehension of).
Then it became ubiquitous. Now, like some nasty rash, we can't get rid of it.
It was indeed much faster when reading (probably still is on MyISAM) due to the lack of everything (especially transactions and any concept of data integrity).
Remind anyone of NoSQL? :)
It also has a single feature that I find incredibly cool: it supports automatically generated row IDs. It's been a while since I last tried Oracle and Postgres, but at the time you had to create a second table just for a sequence number and it was extremely awkward to handle.
Could the auto-increment feature in MySQL be better? Sure, for example it would be nice to have an option where the row is identified by a hash value instead of a continuous sequence. But even in its limited form today it beats having to wrestle with the glued-together equivalent as implemented in "real databases".
Must have been a while indeed. Postgres's SERIAL datatype has been available at least since 6.4. Postgres 6.4 was released in 1998.
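As the reply notes, auto-generated IDs have long been a one-liner in every mainstream database (AUTO_INCREMENT in MySQL, SERIAL in Postgres, and SQLite's INTEGER PRIMARY KEY). A runnable sketch of the feature using SQLite via Python, with hypothetical table/column names:

```python
import sqlite3

# Auto-generated row IDs: declare the column, insert without it,
# and read the assigned value back -- no manual sequence table needed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
cur = conn.execute("INSERT INTO users (name) VALUES ('alice')")
first_id = cur.lastrowid
cur = conn.execute("INSERT INTO users (name) VALUES ('bob')")
second_id = cur.lastrowid
print(first_id, second_id)  # 1 2
```

In Postgres the equivalent would be `id SERIAL PRIMARY KEY` with `INSERT ... RETURNING id`; in MySQL, `id INT AUTO_INCREMENT PRIMARY KEY` with `LAST_INSERT_ID()`.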
There are other factors - you could limit its use of disk space (so hosts liked it), but mostly it was a good key-value store with some database functionality.
People who wanted a "real" database used Postgres. People who wanted to write a web app used MySQL.
Then add a little support from ISPs and corporations and it's a no-brainer for most folks. All the OSS CMSes and blogs and everything support MySQL really, really well. All the ISPs pretty much use MySQL as the "database" solution.
From there forward, any time I was looking into a new host, they ALL had both PHP and MySQL. Any job I was bidding, any discussion I was having in IRC or on forums, anywhere - the two came as a matched set. I can't say that's WHY it became more popular, but I've always assumed their popularity was closely aligned.
Of course, they've now converged a lot, with PostgreSQL getting significant speed upgrades and MySQL gaining InnoDB, with a few 'minor details' like transactions.
GPL-licensed MySQL serves the opposite purpose: it is impossible to fork it into a commercial DB product, grow it as time goes by, and threaten Oracle. It is nearly impossible to create a commercial product for this market, because there's already a free product there: MySQL. MySQL is like a moat around the Oracle stronghold, a moat to keep competitors far away. The GPL is a perfect tool for such tricks.
In contrast to the GPL, which suppresses commercial development in its market segment, BSD encourages it.
For the same reason, Apple is slowly getting rid of GCC in favour of Clang. It is perfectly OK from a legal point of view to include a BSD-licensed compiler in a commercial closed-source IDE, unlike a GPL-licensed one.
Most opensource software that uses MySQL is riddled with MySQLisms.
The problem is that MySQL-based software doesn't do that and relies on the default "quirks" mode.
Here's a comparison in tabular format: http://database-management-systems.findthebest.com/compare/1...