I believe there were also updates to index seek/scan behavior in that timeframe.
I'll update this thread when I find out why.
begin;
alter table foos add answer int not null default 42;
alter table foos drop column plumbus;
update foos set name = upper(name);
create table bars (t serial);
drop table dingbats;
rollback; -- or, of course, commit
What's the benefit? Atomic migrations. You can create, alter, and drop tables, update data, etc. in a single transaction, and it will either commit completely if all the changes succeed, or roll back everything. This is not possible in MySQL, or almost any other database [1], including Oracle — DDL statements aren't usually transactional. (In MySQL, I believe a DDL statement implicitly commits the current transaction without warning, but I could be wrong.)
Beyond that, I'd mention: PostGIS, arrays, functional indexes, and window functions. You may not use these things today, but once you discover them, you're bound to.
[1] https://wiki.postgresql.org/wiki/Transactional_DDL_in_Postgr...
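SQLite happens to share this property with Postgres, so the transactional-DDL pattern above can be sketched with nothing but Python's stdlib `sqlite3` module. This is only an analogue of the Postgres example (the `foos`/`bars` names are borrowed from it), not Postgres itself:

```python
import sqlite3

# Autocommit mode (isolation_level=None) so we control transactions explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("create table foos (id integer primary key, name text)")

conn.execute("begin")
conn.execute("alter table foos add column answer integer not null default 42")
conn.execute("create table bars (t integer)")
conn.execute("rollback")  # or, of course, commit

# Both the new column and the new table are gone: the DDL rolled back atomically.
cols = [row[1] for row in conn.execute("pragma table_info(foos)")]
tables = [row[0] for row in
          conn.execute("select name from sqlite_master where type='table'")]
```

Run the same statements against MySQL and the `alter table` would have silently committed, leaving `bars` behind after the rollback.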
I don't know if it accomplishes anything truly new (other than ideas that aren't very useful in practice like being able to have multiple test runs going in parallel), but it's a pretty neat way to be able to do it and works well.
Might want to mention the downside of using MySQL as well. (Am also interested to know as a daily MySQL user.)
- JSON column (actually MySQL 5.6 supports it but I doubt if it's as good as Postgres)
- Window functions (available only in MySQL 8.x, while these have been available since Postgres 9.x)
- Materialized views: views that are stored physically like a table, and can be used to hold aggregated, pre-calculated data such as sums and counts
- Indexing on function expressions
- Better query plan explanation
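Of the features above, window functions are the easiest to show in a self-contained way. SQLite (3.25+, bundled with recent Pythons) also supports them, so here is a minimal sketch with a made-up `sales` table: rank rows within each group without collapsing them the way `GROUP BY` would:

```python
import sqlite3

# Requires SQLite >= 3.25 for window function support.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table sales (region text, amount integer);
    insert into sales values
        ('east', 100), ('east', 300), ('west', 200), ('west', 50);
""")
rows = conn.execute("""
    select region, amount,
           rank() over (partition by region order by amount desc) as rnk
    from sales
    order by region, rnk
""").fetchall()
# rows -> [('east', 300, 1), ('east', 100, 2), ('west', 200, 1), ('west', 50, 2)]
```

Doing the same without window functions means a self-join or a correlated subquery per row, which is both slower and harder to read.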
For indexing on function expressions in particular, the workaround we use is to add a generated column and index that.
MySQL 5.7 fully supports this. See https://dev.mysql.com/doc/refman/5.7/en/create-table-generat... and https://dev.mysql.com/doc/refman/5.7/en/create-table-seconda...
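For the expression-index side of this, here is a sketch using stdlib `sqlite3` (which also supports indexes on expressions); the `users`/`email` schema is invented for illustration. The point is that a query filtering on `lower(email)` can use the index instead of scanning:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table users (id integer primary key, email text)")
# Index on an expression, not a plain column.
conn.execute("create index users_email_lower on users (lower(email))")
conn.executemany("insert into users (email) values (?)",
                 [("Alice@Example.com",), ("Bob@Example.com",)])

# The WHERE clause matches the indexed expression exactly, so the planner
# can do an index search rather than a full table scan.
plan = conn.execute(
    "explain query plan select id from users where lower(email) = ?",
    ("alice@example.com",)).fetchall()
used_index = any("users_email_lower" in row[-1] for row in plan)
```

The generated-column workaround mentioned above gets you to the same place on MySQL 5.7: materialize `lower(email)` as a generated column and index that.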
> JSON column (actually MySQL 5.6 supports it but I doubt if it's as good as Postgres)
Actually MySQL 5.6 doesn't support this, but 5.7 does, quite well: https://dev.mysql.com/doc/refman/5.7/en/json.html
It's an excruciating process though.
- PLV8/PLPython/C functions/etc (with security!)
- TimescaleDB
- Better JSON query support
- Foreign Data Wrappers
- Better window function support
- A richer extension ecosystem (IMO)
Honestly, at this point I wouldn't use MySQL unless you only care about slightly better performance for very simple queries and simpler multi-master scaling/replication. Even saying that, if you don't need that simple multi-master scaling RIGHT NOW, improvements to the Postgres multi-master scaling story are not too far off on the roadmap, so I would still choose PG in that case.
Many of the largest tech companies rely on MySQL as their primary data store. They would not do so if it were unreliable at persisting data.
There are many valid reasons to choose Postgres over MySQL, or vice versa -- they have different strengths and weaknesses. But there are no major differences regarding data reliability today, nor have there been for many years now.
Where I work, we chose MySQL back in 2012 due to production quality async replication. I think (but am never sure) that that is now good in Postgres land.
PG has a lot of SQL features I'd love to use and can't. OTOH MySQL's query planner is predictably dumb, which means I can write queries and have a good idea about how well (or not) they'll execute.
EDIT: Apparently 11.1 is available in beta as of April 9th.
EDIT: I'll try again. Looks like it was added April 9th
Any complications or hiccups I need to worry about moving from 10 to 11?
Per Heroku Docs: By supporting at least 3 major versions, users are required to upgrade roughly once every three years. However, you can upgrade your database at any point to gain the benefits of the latest version.
How do you manage failover and replication? At my previous job this was done by a consultant. Is this doable on a self hosted setup?
Thank you in advance.
I know of BDR, but there hasn't been much news about it lately, especially with more recent versions of Pg.
We like Galera for our simple needs: we use keepalived to do health checks, and if they pass the node participates in the VRRP cluster. If one node goes down/bad, another takes over.
Queries done through pgbouncer just pause as if the query is really really slow when the db goes down, then when pglookout does the failover, the bash script switches pgbouncer's config and those pending queries are sent immediately.
For a complex web-app, would you suggest an ORM (looking at SQLAlchemy) or a custom module with hand written queries and custom methods for conversion to python objects?
My app has a lot of complex queries, joins, etc. and the data-model is most likely to change quite a bit as the app nears production. I feel using an ORM is an unnecessary layer of abstraction in the thinking process. I feel comfortable with direct SQL queries, and in some cases, want to directly get JSON results from PGSQL itself.
Would that be a good idea, and more importantly, scalable?
Note : My app will be solely developed by me, not expecting to have a team or even another developer work on it.
For advanced queries, you can write raw SQL.
The way I see it, an ORM has three useful features:
- A migration/seed mechanism (you will need it anyway)
- A schema definition for mapping tables to objects
- A query builder
If you feel that an ORM is too heavy, you can use just the query builder.
* For normal queries (select cols from table where id = …, etc.) we just used the plain Django ORM. Even for weird joins, the Django ORM makes it a lot easier than raw SQL.
When we needed raw speed, we wrote raw SQL and delegated to Django's SQL layer -- that way we leveraged everything the framework has along with raw SQL power.
It maps pretty much 1:1 to SQL and for me it beats the alternative (using text interpolation for composing queries).
If you're doing anything more complex than these basic sorts of queries and subqueries, or your developers are proficient in sql, using even a very good ORM like sqlalchemy is going to be a step down.
Since you're doing this all yourself, and SQL is probably the most ubiquitous programming language (in terms of the percentage of jobs requiring it, not total LOC), the learning opportunities there are more valuable, so I would go direct.
Since you're probably used to dealing with and migrating your tables manually, I would keep custom SQL for all your complex operations, and use SQLAlchemy for doing basic insert/update/select. Django also has an "unmanaged" mode where you can create a model and it will avoid trying to create a migration to create the table.
Of course, you have to manually update the model if you manually change your DDL.
Watch out for differences in how you serialize data from Django/SQLAlchemy models vs. raw dicts from psycopg2.
I like to organize my SQL by keeping each query in a separate .sql file and writing a little wrapper that fetches the files (+1 for caching it) and then executing it. I'm not a fan of lots of inline SQL mixed with Python.
Overall I think it's a great + powerful setup!
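A minimal sketch of that one-query-per-file setup, with `functools.lru_cache` doing the caching. Everything here is invented for illustration (the `active_users` query, the directory layout), and a temp dir stands in for a checked-in `sql/` directory just to keep the sketch self-contained:

```python
import functools
import tempfile
from pathlib import Path

# In a real project SQL_DIR would be a checked-in "sql/" directory next to the
# code; a temp dir is used here only so the sketch runs on its own.
SQL_DIR = Path(tempfile.mkdtemp())
(SQL_DIR / "active_users.sql").write_text(
    "select id, name from users where last_seen >= :since")

@functools.lru_cache(maxsize=None)
def load_query(name):
    """Fetch <SQL_DIR>/<name>.sql; repeat calls hit the cache, not the disk."""
    return (SQL_DIR / f"{name}.sql").read_text()

query = load_query("active_users")
# cur.execute(query, {"since": some_cutoff})  # then execute as usual
```

This keeps the SQL syntax-highlightable and diffable on its own, with none of it inlined in the Python.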
cur.execute(query, {'foo': bar})
Passing values directly into cur.execute is also the best way to prevent SQL injection, since the driver passes the parameters separately from the SQL string instead of interpolating them into it.
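A quick demonstration of why parameter passing is safe, using stdlib `sqlite3` (the same DB-API parameter style works with psycopg2 against Postgres); the `users` table and the payload are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("create table users (name text)")
cur.execute("insert into users values ('alice')")

hostile = "alice' or '1'='1"   # classic injection payload
# The driver treats the whole value as data, never as SQL, so the payload
# cannot change the shape of the query.
cur.execute("select count(*) from users where name = :name", {"name": hostile})
matched = cur.fetchone()[0]    # 0 -- no row is literally named that string
```

String-interpolating `hostile` into the query text instead would have matched every row.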
I'm developing a web application that uses SQLAlchemy. The ORM has been a huge boon for CRUD functionality. We also have some very complicated reporting features and use SQLAlchemy's query builder almost exclusively. I find that the query builder maps very cleanly to SQL, so I can still "think" in SQL while writing and reading it. And the query builder makes complex query composition easier to manage.
SQLAlchemy provides more than just the ORM... I actually wish the docs were structured differently to better emphasize that in search results, etc.
Hard to say, but don't forget about migration support, which is quite helpful.
With Aurora, the storage layer is swapped out entirely for a distributed storage engine, that I believe is based upon DynamoDB.
The wire protocol and server interface are much the same as regular Postgres, though there are some additional benefits as well as caveats, as you might expect.
https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...
This has really put us off using other AWS managed products and was a major factor in us deciding against using Amazon Elasticsearch Service.
I know of BDR, earlier versions of which are open source, but there hasn't been much movement with Pg 10 or 11 AFAICT.
We don't do anything complicated, but simply want two DBs (with perhaps a quorum system) that has a vIP that will fail-over in case one system goes down (scheduled or otherwise).
Galera provides this in a not-too-complicated fashion.
> This is the Git repo of the Docker "Official Image" for postgres (not to be confused with any official postgres image provided by postgres upstream)
Supporting five versions is no more than Microsoft does: currently SQL Server versions 2017, 2016sp2, 2016sp1, 2014sp3, 2014sp2, 2012sp4, 2008R2sp2 and 2008sp3. 2008sp3, 2008R2sp2, and 2016sp1 will hit their final EOL in a couple of months, taking SQL Server's supported list back down to 5 too.
I expect other significant DB maintainers have similar support life-time requirements for much the same reasons, though I'll leave researching who does[n't] as an exercise for the reader.
Note the recent versioning change: 9.4, 9.5, and 9.6 are the three oldest supported major versions, and the latest two are 10 and 11.
2) It's horrifically high risk because downgrading is usually not a thing
3) It usually requires downtime.
1. Test the upgrade: set up an additional secondary (9.3), break the replication link (promote it to a master). Test the upgrade on that. It was really fast, under 30 seconds to shut down the old DB, run the in-place upgrade, and start up the new DB.
2a. In production: set up an additional secondary (9.3). Make the primary read-only. Promote the new secondary to a master. Shut down, upgrade to 11.2, restart. Point applications at it.
2b. Backout plan: leave the applications pointing at the original database server, make it read-write.
There are other options, including with only seconds of downtime, but <1 minute with pg_upgrade was simple and very acceptable for us.
Because it mirrors and supports the reality of the business world.
Every organization, large or small, that manages its business makes a 'Grow/Invest', 'Maintain', or 'Disinvest' decision every year for each of its product/service lines.
It does not matter whether it's software or making kielbasa. Postgres is exceptional, and supports the first two.
There's legacy crap everywhere, in all languages, DBs, and versions, sometimes supported for 10+ years.
Security updates should push the upgrade path a little harder, but there are still cases where a database can be completely isolated from the network and that might not even matter.