His points are all valid: you definitely shouldn't release something to production that you haven't tested thoroughly in a separate, identical environment.
That doesn't mean you should never test with SQLite, though.
A good pattern I see all the time is to have a final stage of system tests that run slowly but very accurately in a production-equivalent environment, preceded by a test stage that uses SQLite instead. By using SQLite in memory it's much faster, easier to manage, and easier to parallelize, so you find out sooner if your code is fundamentally broken.
The goal of testing is to find out as quickly as possible (minimal context-switching, easy to check locally before commit) whether your application will work. That means checking the things that are most likely to break, as early as possible. It's typically not going to be a complex database incompatibility issue that makes your tests fail. It's going to be the SQL query you wrote 30 seconds ago, or the leftover query that doesn't work at all with the table structure you've just moved to, and so on. In a lot of cases these are things you can check perfectly effectively, and much more quickly, with SQLite and friends.
Quick test stages that catch all the obvious problems early; a final, definitive, production-equivalent test stage that makes sure it'll definitely work in real life. If you do it well, you can set it up so both use exactly the same test definitions and just swap out the DB config.
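One way to sketch that swap, assuming a `TEST_DB` environment variable and psycopg2 for the Postgres stage (both are assumptions, not something from the comment above):

```python
import os
import sqlite3
import unittest

def get_connection():
    # Pick the backend from an environment variable. "sqlite" (the default)
    # gives the fast in-memory stage; anything else is treated as a
    # PostgreSQL DSN for the production-equivalent stage.
    target = os.environ.get("TEST_DB", "sqlite")
    if target == "sqlite":
        return sqlite3.connect(":memory:")
    import psycopg2  # only needed for the slow, accurate stage
    return psycopg2.connect(target)

class UserTableTest(unittest.TestCase):
    # The test definition itself is identical for both stages; only the
    # connection factory changes.
    def setUp(self):
        self.conn = get_connection()
        self.cur = self.conn.cursor()
        self.cur.execute("CREATE TABLE users (id INTEGER, name TEXT)")

    def tearDown(self):
        self.conn.close()

    def test_insert_and_read_back(self):
        self.cur.execute("INSERT INTO users VALUES (1, 'alice')")
        self.cur.execute("SELECT name FROM users")
        self.assertEqual(self.cur.fetchall(), [("alice",)])
```

Because both drivers follow the DB-API (cursors, `execute`, `fetchall`), the same test class runs unchanged against either backend.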
You wouldn't test your Facebook-consuming API code against a Twitter endpoint, so why accept the equivalent substitution for your DBMS?
I suggest using synchronous_commit=off to speed up the test suite, and using Unix sockets (assuming you're not running Windows).
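For reference, a sketch of the corresponding postgresql.conf lines for a throwaway test instance (the fsync line is an extra, riskier option not mentioned above):

```ini
# postgresql.conf for a disposable test instance only -- never production
synchronous_commit = off   # commits return before the WAL is flushed to disk
fsync = off                # optional and riskier: skip disk syncs entirely
listen_addresses = ''      # no TCP; clients connect over the Unix socket
```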
The fact is, if you have a serious site there's a good chance SOME specific feature will start looking appealing (maybe array values, JSON, window functions, a pg extension...). These features are written because people use them; we aren't all writing dead-simple CRUD apps. Or your ORM will interact with one database slightly differently than the other. I don't expect to convince any of you doubters, but the first time you get a bug in prod that slipped through the testing process but could have been caught in dev, you'll ask yourself why you weren't using pg there too.
There's no sane reason to complicate your life by running two databases unless you either have a dead simple app, or are one of the jabbering idealists I see on here. Productive programmers simplify the problem and minimize their tech stacks.
we run a brutally different dev environment than production: Windows, 32-bit, HSQLDB, and a Windows codepage, against Linux, 64-bit, Postgres, and UTF-8
we also have a beta environment that's a perfect mirror of production down to the vm vendor and package version and an alpha environment that's on a cheaper vendor and uses a more updated version of production os/packages (and has experimental features of our software enabled)
this maybe slows down development a little, but it catches a whole class of interesting errors that normally stay hidden, waiting to happen until the next time you need to upgrade something in the production environment
then again, we do have a sane lib that hides the horrible differences between databases, so there's a whole class of bugs that's managed by that layer. (no, it's not an ORM; I hate those)
I believe neither maintaining code that's compatible with multiple databases, nor using a single one, is a mistake per se. Whatever floats your boat is the right attitude; one just needs to be aware of the consequences (both good and bad) of the chosen path.
But, yeah, the linked article's warning is completely valid - testing on a single platform, then deploying to another is likely to encounter some issues one day.
Write only pure, portable SQL. Using RDBMS extensions is evil. Enlightenment comes only to those who are pure(ly using SQL).
Last year I worked at a small startup focused on medical records. They used PostgreSQL in production, but SQLLite in development. The frontend was pure AngularJS. They had a massive Python code base for pulling data from the database, turning it into JSON, and sending it to the frontend.
But then things began to change.
PostgreSQL gained the ability to handle JSON natively. We began the preliminary process of switching over to use that ability. It would mean less work for the Python code.
Here is where it got strange: some of the team was so deeply committed to using SQLLite that they began hacking SQLLite to handle JSON natively. That is, any feature of PostgreSQL that we used, they wanted SQLLite to have.
On the one hand, the technology they developed was amazingly cool. On the other hand, it was a massive effort that had nothing to do with the goals of the startup.
I could not fathom why we didn't simply use PostgreSQL in both development and production.
Also to be super pedantic... it's SQLite, not SQLLite. ;)
On the mocking thing: I thought the point of that was to avoid DB calls entirely, for speed, when testing other logic layers, so even a fast in-memory DB isn't needed; anything that produces a result set in the right format (i.e. just some code that manufactures a response object) will do? In that case even an in-memory DB is less efficient than the alternative, so it's at best a "half optimisation". Am I missing a point here?
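A minimal sketch of that option in Python, where the "repository" is mocked outright and no DB (in-memory or otherwise) is involved; the `fetch_users` name and the dict-row shape are assumptions for illustration:

```python
from unittest.mock import Mock

# Layer under test: formats whatever rows the repository hands back.
def format_user_names(repo):
    return sorted(row["name"] for row in repo.fetch_users())

# No database at all -- the mock just manufactures a result set
# in the right shape (here, a list of dict-like rows).
repo = Mock()
repo.fetch_users.return_value = [{"name": "carol"}, {"name": "alice"}]

assert format_user_names(repo) == ["alice", "carol"]
repo.fetch_users.assert_called_once()
```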
.NET's Entity Framework has a problematic limitation: you can't check whether SQL queries will be built properly without a DB. One example is trying to use an unmapped .NET method in your SQL-targeting queries. That's a runtime error you can't (yet) test without a DB.
I'm cautious enough to believe other ORMs may have similar quirks, only testable with something to query on.
1. Local development is done with the simplest of everything (local memory cache, SQLite3 database, console-based email backend, local static file storage (vs. S3, etc.)). The result is that there is very little overhead and everything is easy to work on and test quickly. This also lets me quickly wipe things out locally (erase and remake migrations for trivial changes, delete the database file, clear the cache by just restarting the Django server, etc.).
2. Final testing takes place on a staging server (Heroku), which is free, and which can contain all the production-level backing services that production will have. This server will be treated like production (no in-place hot-fixes and database wipes / restarts). Separate dev accounts will of course be used for things like APIs, email, etc.
3. Production (Heroku).
This gives me the best of both worlds; the simplicity of local development with simple backing services, with the comprehensiveness of a staging server with better parity.
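A minimal Django settings sketch of pattern 1 versus 2/3 above; the `DJANGO_ENV` variable name and the exact split are assumptions, not something the comment specifies:

```python
# settings.py sketch: switch backing services on one environment variable.
import os

ENV = os.environ.get("DJANGO_ENV", "local")

if ENV == "local":
    # Simplest of everything: file-based SQLite, in-process cache,
    # emails printed to the console. Deleting db.sqlite3 wipes the DB.
    DATABASES = {"default": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": "db.sqlite3",
    }}
    CACHES = {"default": {
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
    }}
    EMAIL_BACKEND = "django.core.mail.backends.console.EmailBackend"
else:
    # Staging/production: real backing services, configured via env vars.
    DATABASES = {"default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": os.environ.get("DATABASE_NAME", "app"),
    }}
```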
Just run postgres on a ramdisk, or on an SSD with sync disabled and a scripted setUp/tearDown.
In my test database (SQLite, HSQLDB, Derby, ...) I create actual tables, containing test data, with the same definition as the views.
This allows me to have rather complex, database-specific queries and still have fast-running database queries in tests.
The views themselves I test separately. They are also easier to fix in production than code releases, and can be replaced by materialized views if necessary.
Inserts/updates are typically simple too.
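A small sketch of the idea using Python's built-in sqlite3 module; the `active_users` view/table and its columns are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# In production, `active_users` is a (possibly materialized) view with
# vendor-specific SQL behind it. In the test database it is a plain
# table with the same column definition, populated with fixture rows.
conn.execute("CREATE TABLE active_users (id INTEGER, name TEXT, last_seen TEXT)")
conn.executemany(
    "INSERT INTO active_users VALUES (?, ?, ?)",
    [(1, "alice", "2015-06-01"), (2, "bob", "2015-06-02")],
)

# Application queries written against the view run unchanged, and fast.
rows = conn.execute(
    "SELECT name FROM active_users ORDER BY last_seen DESC"
).fetchall()
assert rows == [("bob",), ("alice",)]
```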
"Test what you fly. Fly what you test."
- All the concerns listed in this article are addressed by ORMs.
- SQLite supports CTEs[1] (WITH-clause subqueries).
- It's safer to use the exact same setup in development and production, but it's slower for applications with many tests. It's a trade-off and that's all there is to say.
[1] http://stackoverflow.com/questions/21819183/how-to-use-ctes-...
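A quick check of CTE support via Python's built-in sqlite3 module (requires SQLite 3.8.3 or later, which added WITH / WITH RECURSIVE):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A recursive CTE counting from 1 to 5 -- runs fine in SQLite.
rows = conn.execute("""
    WITH RECURSIVE counter(n) AS (
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM counter WHERE n < 5
    )
    SELECT n FROM counter
""").fetchall()
assert [n for (n,) in rows] == [1, 2, 3, 4, 5]
```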
Both of these facts can bite you in the ass if you are not careful. It's great to run development tests on SQLite, but your CI environment, staging, or whatever you have before you push your changes to production should mimic production as closely as possible.
a) Also test with your real production database, or b) restrict yourself to the lowest common feature denominator of both databases (which is probably a pretty small set).
Still, different databases behave differently in many respects. And despite how cool SQLite is, PostgreSQL is so much more advanced (especially in SQL querying) that I don't see the point in losing all those features. As I have already mentioned on HN, check this amazing presentation: http://use-the-index-luke.com/blog/2015-02/modern-sql before deciding to restrict yourself to a subset of the SQL supported by PostgreSQL.
Given that it's easy to start PostgreSQL from a unit test, and how lightweight PostgreSQL is, I see no real point in using SQLite for testing. Use PostgreSQL!
That, and occasionally we find syntax differences that cause headaches during database setup/teardown. A single minor SQL difference forces us to maintain separate H2/Postgres Flyway configurations.
I think a better option than H2 or SQLite, that we're currently investigating, is using Docker to bring up a local postgres instance for testing.
(All of that assumes a certain dedication to using Postgres. If you want to be database agnostic, then you may in fact be better off not using Postgres in dev/test just to force yourself to remain compatible with other DBs.)
I simply cannot understand where the friction on this exists today. Even pre-Docker, it's fairly easy to set up any of the big DBMSs in their own little environments (I've run 5 versions of MySQL on the same server for a migration project).
https://gist.github.com/tobiasviehweger/cbfd9a1a55bff0862f9e
Using SQLite would actually be more trouble, and I've never seen a need to use it at all for development. Any time I've tried to use SQLite, it's been pointless, as some of my queries written to run on MySQL or Postgres fail due to lack of support for some feature or another. And I already have plenty of MySQL and pg running on my systems, so...
I can't recommend it enough. Maybe a local memory database could be faster... until your environment grows beyond just a db
This is all really simple to set up, too. Think about how a new developer gets up to speed and starts coding in your company. Does it involve downloading and installing Postgres.app[0]? If it does, it's no good. Starting a local environment should require a single command; `vagrant up` is one option if you use Vagrant. Local environments should ideally use the same method of provisioning as production servers. It can be simpler, though: one of our codebases has a bootstrap.sh file that sets everything up, and it works surprisingly well. No version conflicts, no weird bugs due to slightly varying versions between developers, and no fake service such as SQLite.
For the life of me I can't understand the test-speed issues people talk about. We have a pretty small codebase with some 2k tests (94% coverage). That takes about 6-7 minutes to run inside a Vagrant VM using all the same services as production. 6-7 minutes is a long time, but you shouldn't be running your whole test suite during development. During development, only run the one or two tests relevant to the feature you are building or the bug you are fixing. These are typically really fast even with a proper database. The whole test suite should be run before pushing, and by CI. If your database is still too slow, look at using a RAM disk or tweaking the configuration as people have suggested in the comments.
It also prevents you from doing things that only one RDBMS does (I was bitten by this because SQLite supports timestamps with timezone data and, at that time, the MySQL we used in production didn't) making your app more "robust" (you are using an ORM for a reason, right?).
Imagine you only test your app in PostgreSQL and, because of that, your app makes assumptions about the environment only PostgreSQL satisfies. You simply can't move to anything else without extensive rewriting. Now, when the workload changes, you can't just change your storage backend to one that suits your workload. You need to make PostgreSQL do whatever your new needs are. PostgreSQL is probably a good enough solution, but it's not the optimal solution for all problems.
(The common alternative in Java is to use H2 or Derby, which are similar in concept to SQLite)
1. https://github.com/airlift/testing-postgresql-server
Because it's so much easier, and it still lets me test 95% of my code. The alternative is not testing at all, because there is limited time for testing, and setting up a proper database for this is so much more trouble.
The choices are not good test vs bad test. They are test-with-issues vs. no test.
(Obviously you have to do SOME testing with the real DB but this article is talking about unit tests (or related))
Have your test bootstrapper run "createdb somethingunique" and then "export APP_DBCONN_STR='somethingunique'". Adding this to your bootstrapper will take minutes.
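A sketch of such a bootstrapper in Python. It only builds the commands; `createdb`/`dropdb` are the real PostgreSQL CLI tools, while the `test_` prefix and the UUID naming are assumptions of this sketch:

```python
import uuid

def bootstrap_commands():
    """Return the shell commands a test bootstrapper would run so this
    test run gets its own throwaway PostgreSQL database. The env var
    name APP_DBCONN_STR comes from the comment above."""
    dbname = "test_" + uuid.uuid4().hex[:12]  # unique per run
    return [
        f"createdb {dbname}",
        f"export APP_DBCONN_STR='{dbname}'",
        f"dropdb {dbname}  # teardown when the suite finishes",
    ]

for cmd in bootstrap_commands():
    print(cmd)
```

Because every run gets a fresh database name, parallel CI jobs on the same Postgres instance can't trample each other.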
If you're talking integration tests, then of course you should use an environment as close to production as possible.
In any case, using SQLite when doing TDD, and testing with postgres when you are done implementing is an acceptable trade-off for most use cases.
Attempting to separate out the two often means that you end up miscategorising business logic as data access logic (resulting in poor test coverage) or data access logic as business logic (resulting in poor performance and unnecessary complexity).
As soon as your code touches the database (or a mock of it), it loses most of its reusability. It suddenly becomes much harder to use it elsewhere. In most cases, the code is then tied to a particular framework (like Django or Flask), and you cannot move it out into a separate independent module. This becomes a problem in large projects, and for example in micro-service architectures.
It also mixes up the levels of abstraction. Getting and instantiating objects is mixed right (hard coupled) in with the code that uses them.
It doesn't really take a lot of discipline or work to switch over, and once you're entering beta candidate territory, that's when features and performance tuning start to take over and that's where something like Postgres starts to shine in comparison.
Even if you keep with the same schema and queries, just moving over to Postgres on a separate box, you'll probably start to see immediate performance improvement and you'll get better scaling performance almost immediately.
It allows you to create, populate, test, and teardown an entire relational DB in hundredths of a second, which makes it ideal for unit tests where you want to clean the slate between tests to ensure that you aren't accidentally creating silent dependencies on a database state created from a previous test.
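A minimal sketch of that pattern with Python's unittest and the built-in sqlite3 module, where setUp/tearDown give every test its own database:

```python
import sqlite3
import unittest

SCHEMA = "CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)"

class CleanSlateTest(unittest.TestCase):
    """Each test gets a brand-new in-memory database, so no test can
    silently depend on state left behind by a previous one."""

    def setUp(self):
        self.conn = sqlite3.connect(":memory:")  # fresh DB per test
        self.conn.execute(SCHEMA)

    def tearDown(self):
        self.conn.close()  # the in-memory DB vanishes with the connection

    def test_insert_is_isolated(self):
        self.conn.execute("INSERT INTO orders (total) VALUES (9.99)")
        count, = self.conn.execute("SELECT COUNT(*) FROM orders").fetchone()
        self.assertEqual(count, 1)

    def test_starts_empty(self):
        # Runs after the insert test, yet sees an empty table: proof
        # that no state leaks between tests.
        count, = self.conn.execute("SELECT COUNT(*) FROM orders").fetchone()
        self.assertEqual(count, 0)
```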
On the other hand, when you're done doing your Red-Green-Refactor cycle for a new feature, you want to immediately run integration tests, which will exercise your program against not only your production model database, but the other APIs you had mocked out for regular unit testing.
Mocking is good for iterative development, and SQLite is a great 80% tool for mocking.
And yes, you can do CTEs in SQLite, but that's beside the point.
If the options are
* test with SQLite
* test with Postgres
then using the same environment as production (Postgres) is of course better, but if the real options are
* test with SQLite
* don't test
SQLite is much better!
For example, this query is invalid in HQL:
SELECT p FROM person p INNER JOIN Invitation i ON i.email LIKE p.email
And that is one of MANY gotchas. Hibernate has its advantages though, especially when it comes to developer productivity. But JDBC is needed for some edge cases.
I often use Groovy SQL instead of raw JDBC; it comes with excellent transaction support and helps simplify your DB-specific code, so it makes sense to newcomers too.
An ORM may allow you to pass through queries or other directives that have DBMS-specific behaviours, so if you use any features like that, the ORM can't protect you at all. Furthermore, the ORM might change the way it talks to the underlying DB depending on what it is, to exploit efficiency-enhancing features that one DB has but others may not; again, you are not testing like-for-like in that case. The ORM itself may also have bugs that only appear when it's exposed to a given DB.
An ORM often protects you from needing to know the specifics of the storage engine underneath, but you still need to test against the same storage engine(s) as you expect to see in production.
I know this is a bit too much against the person, but here that is directly relevant to the topic at hand.
Not that he is totally wrong, but he is wrong about the universality of his argument. It depends on what Postgres-specific features the software uses and how communication with the database is handled. If it is handwritten SQL, it is not a good idea regardless. If it is an ORM that maps the two very well and there is nothing Postgres-specific about the software, it is unlikely to be a problem.
- It's a relatively simple app
- You're using an ORM
- You aren't using any advanced SQL features
- It's only to make local development easier
- There's still a full CI test run with your production database
That said, if your test suite is large enough that database performance is an issue during testing then either your app is too complex for the above to apply, or you are probably doing something else wrong.
(1) SQLite never claimed to be as "complete" as other databases -- it is, and will remain, a "lightweight" database.
(2) Everybody with even marginal knowledge of different databases should know that using different databases always puts you at risk and requires extra testing. You would also not recommend developing your application on Linux, using a cross-compiler, and shipping the product on Windows untested. With that said, most of the arguments can be deleted.
(3) When you are using an ORM, most of the arguments are obsolete, too.
At the cost of pretty much everything you get out of using something that isn't SQLite.
Thankfully, the only guarantee they needed was that provided by a key-value store, so a ConcurrentHashMap was used on dev machines.
Then the true database was used by the CI server, and commits only occurred when all of the tests passed with the true db.
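The commenter's stack is Java (hence ConcurrentHashMap); the same idea sketched in Python, where dev boxes get a plain dict behind a tiny key-value interface and CI wires in the real store (all names here are made up for illustration):

```python
class DictStore:
    """In-memory stand-in offering only the key-value guarantees the
    application actually relies on; CI swaps in the real database."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value

# Application code depends only on the get/put interface.
def remember_login(store, user, when):
    store.put(f"last_login:{user}", when)

store = DictStore()
remember_login(store, "alice", "2015-06-02T10:00:00Z")
assert store.get("last_login:alice") == "2015-06-02T10:00:00Z"
```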
All of the OP's reasons would (in many cases) pale in comparison to 'My tests take hours to run'.
We have a slow, unoptimized test suite (it does some really stupid things) which hits the database in most test cases and it still manages to complete 6166 tests in 4 minutes on my laptop (this includes setting up and tearing down the database).