Created a new table containing `user_id, created_at, phone_primary, phone_secondary` and inserted all 10,200,000 rows. Notably (I'll come back to this), because of how the rows were generated, the primary key (`user_id`) is an unsorted integer - it was _not_ populated from a serial or identity column.
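For reference, the table looked roughly like this (a sketch - the exact column types and the row-generation step aren't shown here, so `text` for the phone columns is an assumption):

```sql
-- Approximate schema. Types are assumptions; user_id values were
-- generated out of order, not assigned by a serial/identity column.
CREATE TABLE hn_phone_new (
    user_id         integer PRIMARY KEY,
    created_at      timestamp NOT NULL,
    phone_primary   text NOT NULL,
    phone_secondary text            -- NULL for most rows
);
```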
postgres=# CREATE INDEX sec_phone_created_at ON hn_phone_new (phone_secondary, created_at) WHERE phone_secondary IS NOT NULL;
I reset `shared_buffers` down to the same value as before - 263 MB - although this index is tiny (< 10 MB), so realistically I couldn't shrink the buffers below it anyway. I then did the same `sync`/drop-cache routine as before.
postgres=# SELECT * FROM hn_phone_new WHERE phone_secondary IS NOT NULL ORDER BY created_at LIMIT 10;
id | created_at | phone_primary | phone_secondary
--------+---------------------+--------------------+--------------------
58816 | 1995-05-23 03:22:02 | +49 030 522866-87 | +1 159-445-4810
49964 | 1995-05-23 03:23:00 | +61 02 7440 8606 | +254 20 925 892
171828 | 1995-05-23 05:06:47 | +380 32 393-35-89 | +49 030 429376-29
78333 | 1995-05-23 05:31:22 | +380 32 147-11-20 | +52 55 6409 5253
24264 | 1995-05-23 06:47:21 | +44 0131 6506 1823 | +49 030 610965-83
96662 | 1995-05-23 06:57:03 | +52 55 1473 0538 | +61 02 5414 8204
15023 | 1995-05-23 07:55:37 | +44 0131 7959 1581 | +44 0131 8491 6194
52029 | 1995-05-23 08:59:19 | +380 32 430-77-54 | +254 20 374 856
20518 | 1995-05-23 09:51:14 | +380 32 264-21-79 | +52 55 7787 0236
80273 | 1995-05-23 14:59:26 | +61 02 8863 4466 | +33 01 16 10 78 56
(10 rows)
Time: 2258.807 ms (00:02.259)
So yes, a significant improvement, as you'd expect. I then dropped the index and swapped the column order:
postgres=# DROP INDEX sec_phone_created_at;
postgres=# CREATE INDEX created_at_sec_phone ON hn_phone_new (created_at, phone_secondary) WHERE phone_secondary IS NOT NULL;
Reset everything as before, and re-ran the same query:
Time: 221.392 ms
Suspecting that, like MySQL, Postgres might have saved a portion of `shared_buffers` to disk and restored it on restart (honestly, I don't know whether Postgres does this), I tried to evict the cached pages by running a few `SELECT COUNT(*)` queries against other, larger tables, then re-ran the query.
Time: 365.961 ms
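Rather than guessing what's cached, the `pg_buffercache` contrib extension can show whether an index's pages are actually resident in `shared_buffers`. A sketch, assuming the extension is installed:

```sql
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Count how many of this index's pages are currently in shared_buffers
SELECT count(*) AS cached_pages
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE c.relname = 'created_at_sec_phone';
```

(This only covers Postgres's own buffer pool; pages can still be warm in the OS page cache, which is what the `sync`/drop-cache step targets.)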
This is what `EXPLAIN VERBOSE` looks like for the original index:
Limit  (cost=8.44..8.45 rows=1 width=61)
  Output: id, created_at, phone_primary, phone_secondary
  ->  Sort  (cost=8.44..8.45 rows=1 width=61)
        Output: id, created_at, phone_primary, phone_secondary
        Sort Key: hn_phone_new.created_at
        ->  Index Scan using sec_phone_created_at on public.hn_phone_new  (cost=0.42..8.43 rows=1 width=61)
              Output: id, created_at, phone_primary, phone_secondary
(7 rows)
And this is what it looks like for the second, with the columns swapped:
Limit  (cost=0.42..8.43 rows=1 width=61)
  Output: id, created_at, phone_primary, phone_secondary
  ->  Index Scan using created_at_sec_phone on public.hn_phone_new  (cost=0.42..8.43 rows=1 width=61)
        Output: id, created_at, phone_primary, phone_secondary
(4 rows)
So the column order actually needs to be reversed: with `created_at` leading, the index hands back rows already in `ORDER BY` order, and the planner can skip the separate sort step entirely.
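The general rule for a B-tree: a top-N query can read straight off the index only when the `ORDER BY` columns form a prefix of the index columns (after any equality conditions on the leading columns). A hedged sketch of both cases - the phone number below is just an illustrative literal:

```sql
-- Index (created_at, phone_secondary): entries are stored in
-- created_at order, so this stops after reading 10 index entries.
SELECT * FROM hn_phone_new
WHERE phone_secondary IS NOT NULL
ORDER BY created_at
LIMIT 10;

-- Index (phone_secondary, created_at) would instead be the right shape
-- for equality on the leading column plus ORDER BY on the second,
-- because entries for one phone_secondary value are already sorted
-- by created_at within the index:
SELECT * FROM hn_phone_new
WHERE phone_secondary = '+1 159-445-4810'   -- illustrative value
ORDER BY created_at
LIMIT 10;
```

Neither index is strictly "better" - it depends on which shape of query you need to serve.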