First, a key limitation that every architect should pay attention to. Redis reaches the limits of what you can do in well-written single-threaded C. One of those limits is that you really, really, *really* don't want to go outside of RAM. Think about what is stored, and be sure not to waste space. (It is surprisingly easy to leak memory.)
Second, another use case. Replication in Redis is cheap. If your data is small and latency is a concern (e.g. it happened to me with an ad server), then you can locate read-only Redis replicas everywhere. The speed of querying off of your local machine is not to be underestimated.
And third, it is worth spending time mastering Redis data structures. For example suppose you have a dynamic leaderboard for an active game. A Redis sorted set will happily let you instantly display any page of that leaderboard, live, with 10 million players and tens of thousands of updates per second. There are a lot of features like that which will be just perfect for the right scenario.
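As a sketch of what that looks like (key and member names here are made up), the whole leaderboard is a handful of sorted-set commands, each logarithmic in the number of players:

```
> ZADD leaderboard 1500 player:42        # add or update a player's score
(integer) 1
> ZINCRBY leaderboard 25 player:42       # apply a score delta after a match
"1525"
> ZREVRANGE leaderboard 0 9 WITHSCORES   # top-10 page, highest score first
> ZREVRANK leaderboard player:42         # the player's live rank (0-based)
```

Paging deeper into the board is just a different start/stop range on ZREVRANGE.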
You can have massive amounts of RAM these days. You’re sooner to hit big-O limits from bad architectural decisions than run out of memory. If you do get to that point you likely have enough value in your usage to justify scaling out further and sharding.
> And third, it is worth spending time mastering Redis data structures.
Bingo. The true secret to properly using Redis: understanding the big-O complexity of each operation (…and ensuring that none of your interactions are more than logarithmic).
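The complexity of every command is documented, and the difference is stark once keys get big (key names below are illustrative):

```
> ZRANGEBYSCORE scores 100 200   # O(log N + M): fine on a hot path
> LRANGE queue 0 -1              # O(N): risky on a large list
> KEYS user:*                    # O(N) over the entire keyspace: never in production
```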
Absolute disagreement.
It is very easy to accidentally leak a few hundred MB per week in a busy Redis system. The code will look and work fine... at first. It is correspondingly hard to track down and clean up the leak a few months later. (Particularly if there are multiple such leaks to track down.) Yes, you can go for years just buying larger and larger EC2 instances. But that will also come with a shocking price tag.
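One common mitigation is to put a TTL on every transient key at write time, so forgotten keys age out instead of accumulating (key name and TTL here are hypothetical):

```
> SET session:abc123 "{...}" EX 86400   # key expires automatically after 24h
OK
> TTL session:abc123                    # check the remaining lifetime
(integer) 86400
```

It won't catch every leak, but it turns "grows forever" into "converges to the working set".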
I know of a number of organizations that this happened to. And pretty much every bad Redis story I hear about had this as a root cause. That is why I brought it up as an important consideration.
True, but I am finding that balancing CPU and RAM can be tricky. Slapping 128GB on a 1-core machine means you quickly have CPU limitations.
This is a good idea, maybe a prompt for another post.
Do you face any consistency issues with doing this?
This is said to have a 10μs latency in the chart. But I'm fairly sure that is a calculation of bandwidth, based on transferring 1 KB over a ~1 Gbit/s link (about 8μs).
10μs of light travel is about 3 km (in a vacuum; less in fiber), so at most a 1.5 km round trip.
For a chart labelled latency, I'm surprised to see bandwidth calculations included. Any network hop would actually have far greater latency, if nothing else because communication typically involves more than a single round-trip for acknowledgement, etc.
It might be worth making it clear some of the numbers are about bandwidth not latency.
The simplest scenario in the article is a single Redis instance residing on the same machine as the application. What's the benefit to this versus just storing data directly within the application?
I personally first reached for Redis when I needed to asynchronously process a bunch of JSON uploaded by clients via POST. I initially just stuck them in a ConcurrentQueue in memory, but no matter how much I fiddled with HostedServices and BackgroundWorkers and whatever the MS documentation recommended, the ASP.NET Core app would occasionally 'lose' that queue before it could be consumed (or the consuming loop would get stuck, with the same result).
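Moving that queue out of the process and into Redis is a couple of commands; a minimal sketch (key name and payload are made up):

```
> LPUSH jobs:json '{"client":1,"payload":"..."}'   # producer: enqueue the upload
(integer) 1
> BRPOP jobs:json 5                                # consumer: block up to 5s for a job
1) "jobs:json"
2) "{\"client\":1,\"payload\":\"...\"}"
```

Because the queue lives outside the app process, a recycled or restarted worker no longer loses pending jobs.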
You are also probably running your app in a pretty high-level language, with bytecode and reflection and all that nice stuff - if not an interpreted language - while Redis is raw C code and will outperform your homebrew doubly linked list or hash set.
So if you can fetch some cached data from a Redis key, even if on the same machine, it will cost you significantly less than querying a relational database.
I'm not a Redis user, but that's based on what I've read
What tool do you use for your diagramming, is it all hand-drawn?
HN automatically combines submissions so that subsequent submissions count as upvotes for the first submission.
If a popular source posts a new article, users will "rush" to post it to HN to reap that sweet karma and the winner will "catch" the upvotes of the others.
It's the HN algorithm which is probably due to the fact that other posts from his domain have done relatively well, plus the actual poster here has quite a bit of karma.
It also saves you from Google performing maintenance on the machine and deleting all your Lua scripts.
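If you do rely on server-side Lua, treat the script cache as volatile: keep the source in your application and reload it whenever the server reports NOSCRIPT. A sketch (the SHA-1 shown is a placeholder):

```
> SCRIPT LOAD "return redis.call('GET', KEYS[1])"
"<sha1>"                    # the server returns the script's SHA-1 digest
> EVALSHA <sha1> 1 mykey    # runs the cached script; errors with NOSCRIPT if the cache was wiped
```

On NOSCRIPT, the client simply re-runs SCRIPT LOAD and retries.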
KeyDB is becoming increasingly popular though.
The biggest problem with Redis, at least in C++ land, is the client libraries. hiredis doesn’t support Redis Cluster, and other 3rd party clients that do are of unknown quality.
Want to move more of my app's datastore to Redis now that I've learned more about sorted sets etc.
I see some data types on the right. It surprises me that Redis doesn't have a numeric data type. I understand that at its heart it is just a key-value store and never needs to do range-based lookups, but it still surprises me.
One consequence of "everything is a string" I've run into (although probably a sign I'm "doing it wrong"), is serialisation overhead in the client.
If redis is expecting strings then it's left to the client to choose an appropriate serialisation which can have either performance or other pitfalls.
```
> BITFIELD player:1:stats SET u32 #0 1000
1) (integer) 0
> BITFIELD player:1:stats INCRBY u32 #0 -900
1) (integer) 100
> BITFIELD player:1:stats GET u32 #0
1) (integer) 100
```
That said, all the keys themselves are still strings and therefore you can't have a SET of numbers or bitfields.
* you're building a web-ish application and need to store session data
* you don't want to go through the overhead of building a strongly typed relational table
* you know minimal operations stuff
* just use Redis: it's easy to deploy, easy to code for, and available on all major cloud platforms as a managed service
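In the simplest form that's one command per request; a sketch (key, TTL, and payload are illustrative):

```
> SETEX session:9f3a 1800 '{"user_id":42,"cart":[]}'   # store the session with a 30-minute TTL
OK
> GET session:9f3a                                      # fetch it on the next request
```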
---
The problem is there are tradeoffs and session storage becomes a fundamental architectural decision once your application matures. So something you added as a once-off so you can get back to feature development is now a foundational pillar.
However, a simple Redis instance in front of the database, serving as a read cache, changes the rules of the game significantly. Depending on the complexity of your calculation and your end result, subsequent "page loads" (or whatever you are doing) can be tens of thousands of times more efficient, or more; and if you are on an expensive database or a cloud database, this can help you a lot.
Eventually the hard part is that you may have bugs keeping the state of Redis and your database in sync. Look to existing implementations for your stack instead of reinventing the wheel.
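The cache-aside pattern being described is roughly this (key name and TTL are hypothetical):

```
> GET page:home:rendered                        # 1. try the cache first
(nil)
                                                # 2. on a miss, compute from the database, then:
> SET page:home:rendered "<html>..." EX 300     # 3. cache the result for 5 minutes
OK
```

The TTL bounds how stale a cached result can get, which sidesteps most of the invalidation bugs.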
One thing that threw me off is that it says for an SSD a random read is 150μs, but a 1MB sequential read is 1ms. Shouldn't sequential reads be faster, or are two different read sizes being compared? If so, the ambiguity may confuse some people into thinking random reads are faster.
AFAIK, on SSDs there is no concept of (or guarantee about) blocks being adjacent, so a sequential read is just a bunch of random reads.
I'm building a new website and am using sidekiq for background job processing which relies on redis behind the scenes to store all the job data. I configured a high availability redis instance with `maxmemory-policy noeviction` to ensure no data is lost.
The website is still in its infancy so not thinking about scale for the next little while but curious if you have any tips or gotchas to keep an eye out for. Thanks!
Is there anything inherently wrong with this? Gotchas? A mockup I've done works great so far.
This page from AWS about Redis streams goes exactly to my use case: Redis Streams and Message Queues: https://aws.amazon.com/redis/Redis_Streams_MQ/
If you want something quick and easy and dirty, go with Redis. But switch to Rabbit when you start having to write a lot of handling and other code.
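For reference, the Streams-as-queue shape is only a few commands; a minimal sketch (stream, group, and field names are made up):

```
> XGROUP CREATE jobs workers $ MKSTREAM            # one-time: create the stream and consumer group
OK
> XADD jobs * type email to "user@example.com"     # producer: append an entry
> XREADGROUP GROUP workers w1 COUNT 10 BLOCK 5000 STREAMS jobs >   # consumer: read new entries
> XACK jobs workers <entry-id>                     # acknowledge once processed
```

The ack/pending-entries machinery is exactly the "handling code" that tends to grow until RabbitMQ starts looking attractive.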
This is leading me to think that using Redis as the sole database is very tempting, but the RAM requirement is making me think twice.
Isn't there a database like Redis that only keeps the latest data in memory and stores the rest in an AOF file?
Great article though!
1. .toc-wrap covers the image on desktop
2. the image is way too busy; there's too much going on
I am not affiliated, just a happy user.