At Houzz, we use Redis as the de-facto in-memory data store for our applications, including the web servers, mobile API servers and batch jobs. In order to support the growing demands of our applications, we migrated from an ad hoc collection of single Redis servers to Redis Cluster during the first half of the year.
To date, we have gained the following benefits from the migration:
– Ability to scale up without the need to modify applications.
– No additional proxies between clients and servers.
– Lower capacity requirement and lower operational cost.
– Built-in master/slave replication.
– Greater resilience to single point of failures.
– Functional parity to single Redis servers, including support for multi-key queries under certain circumstances.
Redis Cluster also has limitations. It does not support environments where IP addresses or TCP ports are remapped. Although it has built-in replication, as we ultimately discovered, few client libraries, if any, have support for it. For certain operations such as new connection creation and multi-key operations, Redis Cluster has longer latencies than single servers.
In this post, we will share our experiences with the migration, the lessons we learned, the hurdles we encountered, and the solutions we proposed.
Some of our applications use Redis as a permanent data store, while others use it as a cache. In a typical setting, there is a Redis master that processes write requests and propagates the changes to a number of slaves. The slaves serve only read requests. One of the slaves is configured to dump the memory to disk periodically. The dumps are backed up into the cloud. We use Redis Sentinel to do automatic failover from failed masters to slaves. Our applications access Redis through HAProxy.
Historically we scaled up the Redis servers by “functional” sharding. We started with a single shard. When we were about to run out of capacity, we added another shard and moved a subset of the keys from the existing shard to the new one. The new shard is typically dedicated to keys for a specific application or feature, e.g., ads or user data. Code that accessed the moved keys was modified to access the new servers after the move. For example, the marketplace application would access shards that store data about the products and merchants in the marketplace, while the consumer-oriented applications would access shards that store user data such as activities and followed topics. The same process was repeated for several years and the number of servers grew to several dozens. The process remained mostly manual due to the need to modify the applications.
The Redis servers ran on high-end hosts that had a large memory capacity. Such hosts would typically also have a large number of processors. Since each Redis server is a single process, only a small fraction of processors are utilized on each host. In addition, there is imbalance in memory and CPU usages across the shards due to the manual partitioning. Some shards have a large memory footprint and/or serve a high requests per second.
The large memory footprints are problematic to operations such as restart and master-slave synchronization. It can take more than 30 minutes for a large shard to restart or to do a full master-slave sync. Since all our client requests depend on Redis accesses, it poses a risk for a severe site-wide outage should all replicas of the large shard go down.
In the beginning of the year, we evaluated options to scale up the Redis servers with fewer manual processes and a shorter time to production.
Redis Cluster vs. Twemproxy
One option we considered was Redis Cluster. It was released by the Redis community on April 1, 2015. It automatically shards data across multiple servers based on hashes of keys. The server selection for each query is done in the client libraries. If the contacted server does not have the queried shard, the client will be redirected to the right server.
There are several advantages with Redis Cluster. It is well documented and well integrated with Redis core. It does not require an additional server between clients and Redis servers, hence has a lower capacity requirement and a lower operational cost. It does not have a single point of failure. It has the ability to continue read/write operations when a subset of the servers are down. It supports multi-key queries as long as all the keys are served by the same server. Multiple keys can be forced to the same shard with “hash tags”, i.e., sub-key hashing. It has built-in master-slave replication.
As mentioned above, Redis Cluster does not support NAT’ed environments and in general environments where IP addresses or TCP ports are remapped. This limitation makes it incompatible with our existing settings, in which we use Redis Sentinel to do automatic failover, and the clients access Redis through HAProxy. HAProxy provides two functions in this case: It does health checks on the Redis servers so that the client will not access unresponsive or otherwise faulty servers. It also detects the failover that is triggered by Redis Sentinel, so that write requests will be routed to the latest masters. Although Redis Cluster has built-in replication, as we discovered later, few client libraries, if any, have support for it. The open source client libraries we use, e.g., Predis and Jedis, would ignore the slaves in the cluster and send all requests to the masters.
The other option we evaluated was Twemproxy. Twitter developed and launched Twemproxy before Redis Cluster was available. Like Redis Cluster, Twemproxy automatically shards data across multiple servers based on hashes of keys. The clients send queries to the proxy as if it is a single Redis server that owns all the data. The proxy then relays the query to the Redis server that has the shard, and relays the response back to the client.
Like Redis Cluster, there is no single point of failure in Twemproxy if multiple proxies are running for redundancy. Twemproxy also has an option to enable/disable server ejection, which can mask individual server failures when Redis is used as a cache vs. a data store.
One disadvantage of Twemproxy is that it adds an extra hop between clients and Redis servers, which may add up to 20% latency according to prior studies. It also has extra capacity requirement and operational cost for monitoring the proxies. It does not support multi-key queries. It may not be well integrated with Redis Sentinel.
Based on the above comparison, we decided to use Redis Cluster as the scale-up method going forward.
Building the cluster
Before we could migrate to Redis Clust