https://download.huihoo.com/google/gdgdevkit/DVD1/developers...
Also in Brett Slatkin's "Building Scalable Web Apps with App Engine" (2008)
> You can get higher concurrency by keeping more than one row and updating a random row... just choose a random slot and update it
https://www.oreilly.com/library/view/high-performance-mysql/...
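The sharded-counter trick from that quote can be sketched in Python with SQLite (the table layout and slot count here are illustrative, not from either book):

```python
import random
import sqlite3

# One logical counter split across N slot rows; concurrent writers
# rarely collide because each picks a random slot.
N_SLOTS = 8

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE counters (name TEXT, slot INTEGER, count INTEGER, "
    "PRIMARY KEY (name, slot))"
)
conn.executemany(
    "INSERT INTO counters VALUES (?, ?, 0)",
    [("page_views", s) for s in range(N_SLOTS)],
)

def increment(name: str) -> None:
    # Choose a random slot and update it, per the quote above.
    slot = random.randrange(N_SLOTS)
    conn.execute(
        "UPDATE counters SET count = count + 1 WHERE name = ? AND slot = ?",
        (name, slot),
    )

def read(name: str) -> int:
    # Reads pay the cost of aggregating across all slots.
    (total,) = conn.execute(
        "SELECT SUM(count) FROM counters WHERE name = ?", (name,)
    ).fetchone()
    return total

for _ in range(1000):
    increment("page_views")
print(read("page_views"))  # → 1000
```

The trade is classic: writes spread across rows, reads become a small aggregation.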
For example, my experience in payments has been that transactions debit many different accounts on one side but credit only a handful of accounts on the other, so contention concentrates on a few hot rows and group commit can't fully amortize the fsync cost.
Performance panaceas mostly come down to sharding, but sharding doesn't work well where you need strict updates to balances.
At work, we saw this play out several times in different systems, and decided to do something about it. We took the ledger of an open-source payments switch called Mojaloop, and extracted it as a distributed financial accounting database called TigerBeetle, designed to track financial transactions at scale.
The key performance insight was to dial up group commit. We batch up balance updates so that a single DB query can do on the order of 10k balance updates. We then fsync the batch with a single write before commit, moving the performance needle out from 1k-10k TPS to 1m TPS.
This is the advantage of a purpose-built database that's designed for counting at scale.
More information, including our design decisions, is in the repo here: https://github.com/coilhq/tigerbeetle
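This isn't TigerBeetle's actual implementation, but the batching idea is easy to illustrate with SQLite: ride many balance updates in a single transaction so the commit (and, on a durable database, its fsync) is paid once per batch rather than once per transfer:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE balances (account INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO balances VALUES (?, 0)", [(i,) for i in range(100)])

def apply_batch(transfers):
    # transfers is a list of (account, amount) pairs.
    # All updates ride in ONE transaction, so there is one commit
    # per batch instead of one per transfer.
    with conn:
        conn.executemany(
            "UPDATE balances SET amount = amount + ? WHERE account = ?",
            [(amount, account) for account, amount in transfers],
        )

# 10,000 single-unit transfers, committed once.
apply_batch([(i % 100, 1) for i in range(10_000)])
(total,) = conn.execute("SELECT SUM(amount) FROM balances").fetchone()
print(total)  # → 10000
```

An in-memory SQLite database has no real fsync, of course; the point is only the shape of the amortization.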
UPDATE counters SET count = count + 1
WHERE name = ? AND slot = (
  SELECT slot FROM counters WHERE name = ?
  FOR UPDATE SKIP LOCKED
  LIMIT 1
)

(Postgres doesn't support LIMIT on UPDATE, and the subquery needs to be scoped to the same counter, but that's the idea.)
SKIP LOCKED doesn’t seem to be designed for that purpose: https://www.enterprisedb.com/blog/what-skip-locked-postgresq...
The whole point of the article was that idle-in-transaction waits, caused by another transaction holding a lock on the counter row, create the bottleneck; skipping locked rows with SKIP LOCKED, given enough slots, eliminates this. Randomly selecting a slot can still pick the locked one, causing a wait.
One side benefit of this approach is that getting the final aggregate is cheap, where compacting an append-only log table might not be.
This makes some intuitive sense, though: general purpose databases are expected to be _pretty good_ at handling the case of "add new data" with no other specific conditions, e.g. on other rows or tables' existing data.
I also agree with your last point. Running count() all day on this wouldn't be great, and compaction would take real time. I assumed that most high throughput write scenarios for something like an event count or view count can be a few minutes (or hours, or days) out of date, at which point read caching would be my first stop before clever summation algorithms, which are still pretty cool.
I had the same issue, and I fixed it by inserting an associated 'count_table' row for each hit, then deleting each row once it had been folded into the final count. That actually fixed the issue. I later refactored it so each user or IP had its own 'count_table' row. The downside was that the final total lagged a bit, by 60 seconds or so, until the count_table rows had been summed and deleted, but that was totally acceptable.
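That staging-row approach can be sketched with SQLite (table and counter names are made up; in a real concurrent system the compaction job should delete exactly the rows it counted, e.g. with DELETE ... RETURNING, rather than counting and deleting separately):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE totals (name TEXT PRIMARY KEY, count INTEGER);
CREATE TABLE count_table (name TEXT);  -- one staging row per hit
INSERT INTO totals VALUES ('hits', 0);
""")

def record_hit():
    # Inserts never contend on the shared total row.
    conn.execute("INSERT INTO count_table VALUES ('hits')")

def compact():
    # Periodic job: fold staged rows into the total, then delete them.
    with conn:
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM count_table WHERE name = 'hits'"
        ).fetchone()
        conn.execute(
            "UPDATE totals SET count = count + ? WHERE name = 'hits'", (n,)
        )
        conn.execute("DELETE FROM count_table WHERE name = 'hits'")

for _ in range(500):
    record_hit()
compact()
(total,) = conn.execute("SELECT count FROM totals WHERE name = 'hits'").fetchone()
print(total)  # → 500
```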
I wish I'd thought of this, it's much better and simpler I think haha.
or just buffer the counters in redis and then flush them out
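A sketch of that buffer-and-flush pattern; a plain in-memory Counter stands in for Redis here (a real setup would use redis-py's `incr` on the hot path and drain the keys atomically on flush):

```python
import sqlite3
from collections import Counter

# Stand-in for Redis: increments go to memory, not the database.
buffer = Counter()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counters (name TEXT PRIMARY KEY, count INTEGER)")

def hit(name):
    buffer[name] += 1  # cheap in-memory increment on the hot path

def flush():
    # Periodic job: move buffered increments to the database in one transaction.
    with conn:
        for name, n in buffer.items():
            conn.execute(
                "INSERT INTO counters VALUES (?, ?) "
                "ON CONFLICT(name) DO UPDATE SET count = count + excluded.count",
                (name, n),
            )
    buffer.clear()

for _ in range(300):
    hit("page_views")
flush()
(total,) = conn.execute(
    "SELECT count FROM counters WHERE name = 'page_views'"
).fetchone()
print(total)  # → 300
```

The trade-off is durability: increments sitting in the buffer are lost if the process (or Redis, without persistence) dies before a flush.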
I have a counters API that does precisely this
Docs are here: cmd+f: 'Counters API now live'
For one use-case it might be ok, although if we expect more counters in the future, it's easier to make it now than refactor later.