I’ve used LISTEN/NOTIFY with triggers for a while with good success. I like the flexibility of my apps controlling what they want to listen for, instead of having that configuration live elsewhere.
The downside to LISTEN/NOTIFY is that if you aren’t listening, you aren’t getting changes. Seems like the same is true here if the webhook fails, since there isn’t a retry.
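For anyone who hasn’t used the pattern, a minimal trigger + NOTIFY setup looks roughly like this (table, channel, and function names are made up for illustration; use EXECUTE PROCEDURE instead of EXECUTE FUNCTION on Postgres < 11):

```sql
-- Emit a notification for every row change on a hypothetical "orders" table.
CREATE OR REPLACE FUNCTION notify_change() RETURNS trigger AS $$
BEGIN
  PERFORM pg_notify('table_changes',
                    json_build_object('table', TG_TABLE_NAME,
                                      'op',    TG_OP,
                                      'id',    NEW.id)::text);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_notify
AFTER INSERT OR UPDATE ON orders
FOR EACH ROW EXECUTE FUNCTION notify_change();

-- The application subscribes with:
LISTEN table_changes;
```

As noted above, any notification delivered while no session is listening is simply lost.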
If you want to check out another cool alternative, see Debezium [1]. It uses Postgres’s logical decoding feature to stream changes, and supports Protobuf and JSON output.
I did use Debezium in a Kubernetes cluster to stream Postgres changes into Kafka, and then through a custom application that sends them to clients via HTTP or WebSockets. But the entire system had multiple points of failure, so we decided to write our own thing that connects to the logical replication slot and sends changes over WebSockets. Working on open sourcing that too!
https://www.confluent.io/blog/bottled-water-real-time-integr...
I wrote a similar program that uses logical replication to stream changes to AWS’s Kinesis [1], using a mini library that parses the output of the default test_decoding plugin [2], in case anyone else is on RDS and has the same limitation.
[1] https://github.com/nickelser/pg_kinesis [2] https://github.com/nickelser/parselogical
I wrote a similar thing to notify on changes, but had to fire notifies containing only what changed (id, column names); a second step in the listening code then fetched the full details.
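This two-step pattern is also a way around the NOTIFY payload size limit (8000 bytes by default): send only the key, then re-read the row. A sketch, with hypothetical names:

```sql
-- Trigger side: payload carries only the table name and primary key.
CREATE OR REPLACE FUNCTION notify_minimal() RETURNS trigger AS $$
BEGIN
  PERFORM pg_notify('row_changed',
                    json_build_object('table', TG_TABLE_NAME,
                                      'id',    NEW.id)::text);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Listener side (second step): on each notification, fetch the full row, e.g.
-- SELECT * FROM <table> WHERE id = <id from payload>;
```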
(I work at Hasura)
What I’ve been thinking of lately is using triggers to write to a separate table with a timestamp, as a rolling log. The worker can LISTEN/NOTIFY on that, but it can also keep track of the last message it saw and catch up when it starts, so if it fails you don’t lose anything. All I really need is a type (table) and the relevant keys. I think I would prefer a way to configure which tables I want to listen on and which columns (keys) I care about.
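A rough sketch of that rolling-log idea (all table, column, and channel names here are invented, not from any existing tool):

```sql
-- Durable change log; the worker replays from its last seen id on startup.
CREATE TABLE change_log (
  id         bigserial   PRIMARY KEY,
  tbl        text        NOT NULL,
  row_key    jsonb       NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now()
);

CREATE OR REPLACE FUNCTION log_change() RETURNS trigger AS $$
DECLARE
  log_id bigint;
BEGIN
  INSERT INTO change_log (tbl, row_key)
  VALUES (TG_TABLE_NAME, jsonb_build_object('id', NEW.id))
  RETURNING id INTO log_id;
  -- Live listeners get a nudge; offline workers catch up from the table.
  PERFORM pg_notify('change_log', log_id::text);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Worker startup / recovery:
-- SELECT * FROM change_log WHERE id > :last_seen_id ORDER BY id;
```

Old rows can be pruned periodically once every worker has acknowledged them.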
Really wish there were a standard, Postgres-provided way to notify an external system of changes, though. It’s certainly not an easy problem.
You just have to be careful with failing queries and set timeouts low.
https://github.com/subzerocloud/pg-amqp-bridge
You can then push to websockets from rabbit with STOMP
Thanks for posting this. Can already see a couple of weekends burned trying to drive celery from pg.
Do you know if it is used in production somewhere? How battle-tested is it?
Seeing that this isn't hooked into replication is a bit worrying, but it seems they are going to implement it as a replication client which should make it much more robust.
Postgres’s per-statement triggers don’t let you capture the rows that are modified, and skor doesn’t do any sort of batching currently.
They do these days. Since Postgres 10 you can specify:

REFERENCING { { OLD | NEW } TABLE [ AS ] transition_relation_name } [ ... ]

in a FOR EACH STATEMENT ... AFTER trigger.

Is an approach like this best practice, or is it more of an intermediate hack?
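The transition-table syntax mentioned above can be combined with NOTIFY to batch per-statement, e.g. (hypothetical names; note that a large batch can still blow past the ~8000-byte NOTIFY payload limit):

```sql
-- Statement-level trigger (Postgres 10+): one notification per statement,
-- with all affected rows visible via the NEW transition table.
CREATE OR REPLACE FUNCTION notify_batch() RETURNS trigger AS $$
BEGIN
  PERFORM pg_notify('batch_changes',
                    (SELECT json_agg(json_build_object('id', n.id))::text
                     FROM new_rows n));
  RETURN NULL;  -- return value is ignored for statement-level triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER orders_batch_notify
AFTER UPDATE ON orders
REFERENCING NEW TABLE AS new_rows
FOR EACH STATEMENT EXECUTE FUNCTION notify_batch();
```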
(I work at Hasura)
In order to reason with certainty about your data, information should come from the source closest to the truth, and that is your database. You can’t really trust the application; it may have been written incorrectly.
Postgres’s LISTEN/NOTIFY commands, which are not part of the SQL standard, exist for these cases.
In larger applications, if you need a stored procedure to handle a complex modification, your application will be blind to it.
Plus, at the application level you have no way to enforce that the particular place in your code that modifies the data and then emits the change is the ONLY place that can modify that data. This becomes more pronounced over time, too.
The problem is amplified if you find something that needs to be done to inbound data that is better handled in another language for some reason. Then you have to remember to duplicate the logic or hook your new code into the application, forcing a new layer in front of your database.
There are some things that the database is better left handling. :)