17 January 2013
Recently I decided to get myself somewhat up to speed on some of the many NoSQL databases out there. Redis is a tool I have heard people raving about for several years now, so I decided it would be a great place to start.
Redis is an in-memory key-value store, but with a difference. Many key-value stores allow a string value to be stored against a key, and that is it. Redis supports this, but it also supports other simple types too: lists, sets, sorted sets and hashes.
Plenty has been written about these types in other places, and I am not going to say much more about them. Read up in the Redis Manual and have a scan over all the available commands.
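To give a flavour of the data model, here is a sketch in plain Python that imitates the semantics of a few of these types. The command names are Redis's, but the code is just an in-process illustration using dictionaries - nothing here talks to a real server:

```python
# A toy, in-process imitation of three Redis types, just to show the semantics.
# A real client would send these commands over the network to redis-server.

store = {}

def lpush(key, value):
    # Lists: push onto the left (head) of the list
    store.setdefault(key, []).insert(0, value)

def sadd(key, member):
    # Sets: unordered, duplicates are ignored
    store.setdefault(key, set()).add(member)

def zadd(key, score, member):
    # Sorted sets: each member has a numeric score
    store.setdefault(key, {})[member] = score

def zrange(key, start, stop):
    # Members come back ordered by ascending score; stop is inclusive,
    # and -1 means "to the end", as in Redis
    ordered = sorted(store.get(key, {}), key=lambda m: store[key][m])
    return ordered[start:stop + 1 if stop != -1 else None]

lpush("recent", "a"); lpush("recent", "b")   # list is now ["b", "a"]
sadd("tags", "redis"); sadd("tags", "redis") # set holds a single "redis"
zadd("scores", 10, "alice"); zadd("scores", 5, "bob")
```

Calling `zrange("scores", 0, -1)` on the above returns `["bob", "alice"]` - members ordered by score, which is the property that makes sorted sets useful for leaderboards and the like.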
With Redis, all of your data needs to fit in memory. It has a couple of persistence options to ensure your data is not lost if Redis crashes, but this data is never read except at startup time.
In Redis 2.2 an experimental feature called Virtual Memory was added, but it was removed again in Redis 2.6. The Redis team decided that they wanted to do one thing well - serve data from memory - and not be concerned with reading data from disk.
One interesting thing about Redis is that it is very simple. It is a single-threaded server, and as of version 2.2 was only around 20k lines of code. Due to the single-threaded design, Redis will use at most one CPU core (except while background saving, when it may use another), and this helps keep the code simple - Redis doesn't need to worry about latches and locks to stop concurrent threads stomping over each other.
It can only run a single command at a time - in other words, each command is atomic and blocks the entire server while it is being processed. When you first hear this it sounds like a terrible, unscalable design - until you realize just how fast typical Redis commands are. 50K - 100K requests per second is typical on commodity hardware, on a single CPU.
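The payoff of every command being atomic is easiest to see with a counter. Here is a sketch in plain Python - not Redis itself - deterministically interleaving the steps of two "clients" to show the lost update that atomic commands like INCR rule out:

```python
# Non-atomic increment: each client GETs the value, then SETs value+1.
# If both read before either writes, one increment is lost.
counter = {"hits": 0}

a = counter["hits"]        # client A: GET hits -> 0
b = counter["hits"]        # client B: GET hits -> 0
counter["hits"] = a + 1    # client A: SET hits 1
counter["hits"] = b + 1    # client B: SET hits 1 (A's update is lost)
lost_update_result = counter["hits"]   # 1, not 2

# Redis-style atomic increment: the whole read-modify-write is one command
# (INCR), and the single-threaded server runs commands one at a time, so
# the interleaving above simply cannot happen.
counter["hits"] = 0

def incr(key):
    counter[key] += 1      # runs to completion before the next command starts

incr("hits")               # client A: INCR hits
incr("hits")               # client B: INCR hits
atomic_result = counter["hits"]        # 2
```

This is why single-threaded-but-atomic is a feature rather than a limitation: application code gets correct counters, queues and sets without any client-side locking.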
Right now, Redis is a single node database. If your data is too large to fit on a single machine, then sharding it across multiple machines is a job for the application. Apparently Redis Cluster is coming and should provide sharding capabilities.
One Redis node can be replicated to another very easily, so having a fail-over instance that is identical to the master is pretty easy right now.
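Setting up that replica amounts to a single directive in the replica's redis.conf (the same thing can be done at runtime with the SLAVEOF command). The host and port below are placeholders:

```
# In the replica's redis.conf - point it at the master
slaveof 192.168.1.10 6379
```

The replica connects to the master, pulls a snapshot of the dataset, and then streams subsequent writes.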
Getting started with Redis is so simple, it is literally as easy as:
$ make
$ cd src
$ ./redis-server
Unlike with many databases, you can be up and running in about 5 minutes. Setting up replication is just about as easy too.
You can read the entire Redis manual and understand it all in under a day, which is one of the reasons I decided to investigate it.
Redis is the first non-relational database I have ever experimented with, and I was sceptical about being able to hit the performance numbers that were being suggested. Luckily Redis comes with a handy benchmarking tool, which allows you to see how it performs on your hardware.
Simply running:
$ ./redis-benchmark
will run a bunch of tests you can use to compare your setup with others. The default tests are not terribly realistic in my opinion, as the value stored for each key is always just 2 bytes, but this can be changed with the -d switch.
Running the benchmark on my hardware, with a payload size of 200 bytes, gives the following results for set and get operations:
====== SET ======
10000 requests completed in 0.07 seconds
50 parallel clients
200 bytes payload
keep alive: 1
100.00% <= 0 milliseconds
151515.16 requests per second
====== GET ======
10000 requests completed in 0.08 seconds
50 parallel clients
200 bytes payload
keep alive: 1
99.51% <= 8 milliseconds
99.97% <= 9 milliseconds
100.00% <= 9 milliseconds
131578.95 requests per second
At 150K sets per second and 131K gets per second, the single-threaded nature of Redis doesn't seem so bad any more.
The benchmark tool also lets you run any Redis command you like. For instance, to test the cost of pushing 1 million items onto a sorted set (a more complex operation than a simple key-value set), try the following command:
$ ./redis-benchmark -r 1000000 -n 1000000 zadd sortedset 10 random_value_that_is_a_little_long_:rand:000000000000
====== zadd sortedset 10 random_value_that_is_a_little_long_:rand:000000000000 ======
1000000 requests completed in 10.70 seconds
50 parallel clients
3 bytes payload
keep alive: 1
99.96% <= 1 milliseconds
99.99% <= 2 milliseconds
100.00% <= 9 milliseconds
100.00% <= 10 milliseconds
100.00% <= 10 milliseconds
93466.68 requests per second
93K requests per second - not bad at all.
Redis has plenty of potential use cases - I cannot possibly think of them all. At its simplest it can be used as a cache for a web application, but it also has applications in queuing and distributed object stores, and with some thought the lists, sets and sorted sets have a lot of potential uses too.
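For example, the list type maps naturally onto a work queue: producers LPUSH jobs onto the head, workers RPOP (or the blocking BRPOP) them off the tail. A sketch of that pattern in plain Python, with a deque standing in for the Redis list - no server involved:

```python
from collections import deque

# A Redis list used as a FIFO queue: LPUSH at the head, RPOP from the tail.
queue = deque()

def lpush(job):
    queue.appendleft(job)   # producer adds work at the head

def rpop():
    return queue.pop()      # worker takes the oldest job from the tail

lpush("job-1")
lpush("job-2")
first = rpop()    # "job-1" - first in, first out
second = rpop()   # "job-2"
```

Because RPOP is atomic, many workers can pop from the same list concurrently and each job will be handed to exactly one of them.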