It seems obvious to look at `add_new_payment` for the code that runs when adding a new payment, but the code isn't there, so you have to know, ask, or search in migrations, a fresh structure dump, or poke at the actual db (!).
I think they're great for other, well, effects when needed. PostgreSQL is a real powerhouse.
1. Just discovered this. 2. Implemented a bunch of them. 3. Hasn't maintained this solution for more than a couple of months.
This is a trade-off that only manifests itself after maybe a couple years when you’ve built a system and you have many hands maintaining it. Either there’s some performance problem that makes it worth it or you’ve just obfuscated half your code for no reason. In the latter case those many hands are going to wreak havoc on your system before you figure out how to make it maintainable.
This is basically the clickbait of the wider world affecting software development, even though it might not look like it. Boiled down to the essentials, we are telling each other the software version of "Here's How to Lose 10 Pounds in Time for Summer" way more often than "How to Balance Diet and Exercise to Remain Healthy Over a Decade".
But it's up to those of us who like more discussion to reverse these trends. In that vein -- do you have good sources that typically talk about trade-offs with technology rather than promoting a specific one? Kleppmann's Designing Data-Intensive Applications is one I have found in the past.
"Triggers should be used with caution since they can obscure critical logic and create an illusion of automatic processes. While this can be advantageous in certain scenarios, it can also pose a challenge in terms of debugging, testing, and monitoring since they are not readily visible to developers."
1. Migrations. Your schema needs to live in version control, and changes to your schema must be applied by an automated system. Django migrations are the gold standard here in my opinion, but you can stitch together a custom system if you need to, one that tracks which migrations have been run already and provides a mechanism to apply new ones.
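A minimal sketch of what such a custom tracking system might look like, assuming Postgres (the `schema_migrations` name and columns are illustrative, not from any particular tool):

```sql
-- Hypothetical bookkeeping table: one row per migration that has been applied.
CREATE TABLE IF NOT EXISTS schema_migrations (
    version     text PRIMARY KEY,          -- e.g. '0003_add_processed_at'
    applied_at  timestamptz NOT NULL DEFAULT now()
);

-- The runner applies each pending migration and records it in the same
-- transaction, so a failed migration leaves no misleading bookkeeping behind.
BEGIN;
ALTER TABLE payments ADD COLUMN processed_at timestamptz;
INSERT INTO schema_migrations (version) VALUES ('0003_add_processed_at');
COMMIT;
```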
2. Automated tests. Your triggers MUST be covered by automated tests - call them unit tests or integration tests depending on your preferences, but they need to be exercised. Don't fall into the trap of mocking your database in your tests!
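One way to exercise a trigger against a real database rather than a mock (table and column names here are hypothetical): run the triggering statement inside a transaction, assert on the side effect, and roll back.

```sql
BEGIN;
INSERT INTO payments (account_id, amount) VALUES (42, 100.00);
-- If the trigger maintains a running balance, it has fired by now;
-- the test asserts on the derived row:
SELECT balance FROM account_balances WHERE account_id = 42;
ROLLBACK;  -- leave the test database untouched
```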
With these two measures in place I'm 100% comfortable using triggers that split my business logic between my application code and my database code.
Even if they just created a generated file of the final schema, that sat in version control and errored the makemigrations check (just like a missing migration) if it was out of sync, that would be a significant improvement IMO. But I think the Django maintainers at least would say they want the ORM DSL to be that. (But it's way too incomplete, you'd be limited to a tiny subset of postgres, and even then you have to be on board with that being a reasonable description of your schema, not wanting the actual db schema anywhere.)
Intriguing. In JavaWorld it's really only Flyway and Liquibase, and the important self-imposed rule, "Thou shalt never remove a column from a table."
I’d highly recommend people avoid them, unless you feel that you really need them _and_ you have very robust development processes. As soon as you deploy your first table trigger, from that moment you have to check every DML statement for unintended side effects.
If that's the case, you have a documentation problem. It should be easy to decide if there are side effects or not just by looking at the DML and the metadata you need for it.
In fact, the trigger in the article is an incredibly bad one. It's not a natural consequence of the table or the database structure. It's probably not even always true to the business rules, which is the one property triggers must have no matter what.
But IMO, even migrations are a stopgap solution to this problem. In some ORMs, such as Rails' ActiveRecord, even the column names aren't present in the model by default; they live only in migration files.
The real solution is putting the definition of the triggers (and all other database things too) closer to the code that operates on them. In models, for example. And models should enforce that only the defined triggers are present in the database. Of course, since most ORM users aren't really big DB users, this isn't exactly common. Until we get features like this, I agree that triggers can be confusing.
But otherwise, this fulfills the condition of triggered events residing in the codebase. Of course, this means you have to be comfortable with using an ORM in the first place.
Databases exist to manage data, not just store and retrieve it.
Instead, adopt a solution for structuring your business logic in a sane way, such as using a workflow engine. Your code will become simpler and better organized that way, live all in one place, and avoid creating a tangled web of distributed rules.
Alternatively, write all the business logic in the database. This way you can better leverage the DB features and ensure that logic only needs to be written once.
It was a nightmare. Deployments were very difficult, there was little tooling, reasoning about the system was difficult, and of course running so much code in Oracle required very expensive licenses.
And it is much harder to hire hard core PL/SQL devs over Java, C#, Python or whatever.
You really don’t want significant code in the DB. It can be useful for some cases, like automating audit tables, but that is about it.
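An audit trigger of the kind mentioned above might look like this in Postgres (11+ syntax; the table and function names are illustrative):

```sql
CREATE TABLE payments_audit (
    payment_id  bigint,
    old_row     jsonb,
    new_row     jsonb,
    changed_by  text        NOT NULL DEFAULT current_user,
    changed_at  timestamptz NOT NULL DEFAULT now()
);

CREATE OR REPLACE FUNCTION payments_audit_fn() RETURNS trigger AS $$
BEGIN
    -- Capture before/after images of the row as jsonb.
    INSERT INTO payments_audit (payment_id, old_row, new_row)
    VALUES (NEW.id, to_jsonb(OLD), to_jsonb(NEW));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER payments_audit_trg
AFTER UPDATE ON payments
FOR EACH ROW EXECUTE FUNCTION payments_audit_fn();
```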
And for triggers - I feel for anyone maintaining business logic located in triggers. What a debugging hell that can be.
Rather, let's rejoice with the latest C#, Zig, Rust, PowerShell or whatever-beautiful-language-and-ecosystem-perfected-in-latest-decade we have, instead of horrible SQL.
I think this could be great for certain projects, but there is a lot of room to put yourself into a situation that's hard to maintain if you/your team doesn't possess the right amount of discipline around documentation, developer tooling, observability, etc.
what was the cause of pain?
Moral of the story: use it with caution. I know postgres is different, but when you start turning your database into a ball of mud, things get dangerously difficult to debug and fix.
Triggers to do something that's simple and always required, e.g. updating a primary key index for a new row (before autoincrement was available) can be OK, but use them sparingly.
I like putting business logic in the database, because you only write it once and not for each client application. Client applications and platforms and their development languages come and go a lot more frequently than databases. But I use stored procedures almost always, and rarely triggers.
Don't write business logic in the datastore.
Summarizes why, in my opinion, using triggers is rather risky and confusing. You introduce side effects to operations that one might expect to be CRUD-like. Your code is made non-atomic, in that you need knowledge of what happens elsewhere to guess why it's behaving a certain way. On small projects it's rather tempting, but small projects become large projects that then get given to someone else to maintain, and 4y later someone will spend a week trying to understand why the amount column gets updated to a value other than the one they're pushing.
The only use that I find safe is for database metadata, say if you're using triggers to keep track of write origins, or schema metadata. For everything that's business logic, I'd stay away from them
But every single time this comes up, people on the engineering teams I've been on all throw their hands up and accuse folks of overengineering or underengineering. You need Rabbit or Kafka. We should move to Mongo. Etc.
That's the part that's hard.
I think a lot of the problem in industry is simply lack of rigor. We talk a big game about being software engineers but nobody takes the time to talk about values, requirements, trade-offs or business context for decisions. It’s a shame because these topics are the actual engineering of the system.
To some this is the most obvious thing to do, but you'd be surprised that some people wouldn't do this (I wouldn't have before reading a few books) and how I even got pushback at first, despite it being a 10-20 minute job to wrap the logic in a class.
A good abstraction for things like this makes it really justifiable to take advantage of Postgres and Redis for things that aren't their forte for the time being, until you eventually need to swap them out for a more robust solution. My experience is at startups mostly, and that ability to make complete but small implementations to get going, and being able to make them more robust over time, is an essential skill.
Like everything, it depends on the application.
Now you'll never* need to shard your Postgres.
* Unless you work at a very rare company.
Luv Postgres
Simple as
A quick note for anyone thinking about triggers: if there's any case whatsoever that you're going to have more than one row insert into a table per transaction, please use statement level triggers -- especially if you're doing network calls inside the trigger. Triggers execute within the same transaction, they're synchronous, and will soak up resources.
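For comparison, a statement-level trigger with a transition table (Postgres 10+; the tables here are hypothetical) handles a multi-row insert with a single trigger invocation instead of one per row:

```sql
CREATE OR REPLACE FUNCTION refresh_totals() RETURNS trigger AS $$
BEGIN
    -- 'new_rows' is the transition table: every row inserted by the statement.
    UPDATE account_totals t
    SET total = t.total + s.delta
    FROM (SELECT account_id, sum(amount) AS delta
          FROM new_rows
          GROUP BY account_id) s
    WHERE t.account_id = s.account_id;
    RETURN NULL;  -- return value is ignored for AFTER statement triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER payments_totals_trg
AFTER INSERT ON payments
REFERENCING NEW TABLE AS new_rows
FOR EACH STATEMENT EXECUTE FUNCTION refresh_totals();
```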
Hell, if you're using network calls in your triggers... please don't. Use `LISTEN/NOTIFY` or queues to offload that to some other process (whether postgres itself or another process), so PG isn't left hanging around waiting for a network return.
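A sketch of that offloading pattern (names are made up): the trigger only publishes a notification, and a separate worker does the slow work outside the transaction.

```sql
CREATE OR REPLACE FUNCTION notify_payment() RETURNS trigger AS $$
BEGIN
    -- Notifications are delivered only when the transaction commits,
    -- so a rolled-back insert never reaches the worker.
    PERFORM pg_notify('payments', NEW.id::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER payments_notify_trg
AFTER INSERT ON payments
FOR EACH ROW EXECUTE FUNCTION notify_payment();

-- A separate worker process runs:  LISTEN payments;
-- and makes the network call for each notification it receives.
```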
What I would recommend instead is making use of CTEs (Common Table Expressions): DML (modifying queries) is allowed inside `WITH`, and by leveraging the `RETURNING` keyword in both `UPDATE` and `INSERT` we can execute multiple inter-dependent updates within a single query.
With such an approach we can trivially run multiple versions of an application in parallel (during deployment, for canary deployments, etc.) and we keep a similar performance advantage of a single roundtrip to the database. An additional advantage is that there is only one statement, which means our query sees a consistent database view (with the very common read committed isolation level it is easy to introduce race conditions unless optimistic locking is used carefully).
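A sketch of that pattern (the schema is made up for illustration): two inter-dependent writes in one statement, chained via `RETURNING`.

```sql
WITH new_payment AS (
    -- First write: record the payment, handing its columns to the next step.
    INSERT INTO payments (account_id, amount)
    VALUES (42, 100.00)
    RETURNING account_id, amount
)
-- Second write: apply it to the balance in the same statement.
UPDATE account_balances b
SET balance = b.balance + p.amount
FROM new_payment p
WHERE b.account_id = p.account_id
RETURNING b.account_id, b.balance;
```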
Personally, it’s much more valuable to have all business logic in one place, in a single language, available at a glance. The performance gain isn’t worth the increased complexity in the codebase.
They are coupled tightly with the database schema design. Making changes to them needs careful consideration and an extensive testbed environment, which cannot be mocked with a fraction of the data.
- Engineers being more comfortable expressing the required business logic in the other languages they were working in than PL/pgSQL
- Challenging to write tests for the triggers
- Harder to deploy variations for testing if needed
You can make the development of your backend simpler by shoving the complexity into the database, meaning your backend just does less. That in itself does not make your application any simpler.
How do we know who created the rule, or edited the rule? How can we reason about the sequence in which these rules are executed, based on larger use cases with complex interactions?
Seems like a fire waiting to happen.
You are presumably operating inside of a database, a place where the above concerns can be tracked in ~3 additional columns. More complex rule arrangements can be addressed with additional tables & relations. If you are starting from a blank schema, everything is possible. As noted by others here, you either go all-in, or all-out. The middle ground where half the logic is in the database and half is in GitHub is where things get yucky.
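Those "~3 additional columns" might be nothing more exotic than this (table name is hypothetical):

```sql
-- Provenance tracked alongside the rule itself, queryable like anything else.
ALTER TABLE business_rules
    ADD COLUMN created_by text        DEFAULT current_user,
    ADD COLUMN updated_by text,
    ADD COLUMN updated_at timestamptz DEFAULT now();
```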
Consider the simplification angle. There are some techniques that allow for running entire apps directly out of the database. You might not even need Node, .NET, Go, Rust, etc. Hypothetically, if 100% of the things are in a database, you can simply record the binary log to S3 and have a perfect log of everything over time. Imagine how easy it would be to set up a snapshot of a given environment on a developer machine. Inversely, you could directly ship a developer's machine to production. You can also do some crazy shit where you merge bin logs from different timelines via marker transactions.
The other major advantage includes being able to update production while it's live, even if production is running on a single box. I saw a particular PL/SQL install earlier in my career that was utilized for this exact property - production literally could not ever drop a single transaction or stop servicing them. Latency beyond 2 seconds could be catastrophic for system stability. Production could come down, but it had to be all or nothing. Think - shutting down a nuclear reactor and the amount of time you are locked out due to the subsequent safety and restart checklists. You absolutely need a way to safely & deterministically update in between live, serialized transactions or you can't run your business effectively.
The big pro is that you no longer need to remember to update tableB, which is derived from data in tableA due to performance, in your application code every time you update tableA.
The cons are that:
* You add more state to your DB
* You can't express the logic in your backend language
Thinking more about it, I don't think the pros outweigh the cons. I would prefer this trigger logic not to be part of the DBMS, so I can express the logic in my backend language and also avoid increasing the dependency of my application logic on DB state.
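For concreteness, the kind of trigger being weighed here might look like the following (tableA/tableB and the derivation are stand-ins, and ON CONFLICT assumes a unique constraint on `a_id`):

```sql
-- Keeps the derived table in sync so application code can forget about it.
CREATE OR REPLACE FUNCTION sync_tableB() RETURNS trigger AS $$
BEGIN
    INSERT INTO tableB (a_id, derived_value)
    VALUES (NEW.id, NEW.value * 2)               -- hypothetical derivation
    ON CONFLICT (a_id)
    DO UPDATE SET derived_value = EXCLUDED.derived_value;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tableA_sync_trg
AFTER INSERT OR UPDATE ON tableA
FOR EACH ROW EXECUTE FUNCTION sync_tableB();
```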
You have got to be kidding me.
Been first in all my teams since then to loudly voice my opposition when someone suggests it as a quick fix for something more complicated
They migrated away from it at some point, but some of the people who handled that migration were still around when I was there. Didn’t sound fun at all, sounded like a horrific nightmare.
Where Git isn't looking...
Now you have two problems.
Right?