Configuring Spring Cache Manager with AWS ElastiCache Redis (cluster mode disabled) and Lettuce


We have a Spring Boot 2 application that uses Redis as the cache manager. We deploy our application on Amazon AWS, where we use the AWS ElastiCache Redis service with cluster mode disabled. Our setup includes a Redis master with two Redis slaves. The default Java client for Redis with the spring-boot-starter-data-redis dependency is lettuce-core. When you are working with a single Redis node and no slaves, using AWS ElastiCache Redis is as simple as setting the spring.redis.url property to the URL of the AWS ElastiCache Redis instance. This was the setup we were using until a month ago. As the load on the system increased, we decided to use ElastiCache Redis in a replicated setup to scale our reads. In AWS, Redis implements replication in two ways:

  1. With a single shard that contains all of the cluster’s data in each node – Redis (cluster mode disabled)
  2. With data partitioned across up to 15 shards – Redis (cluster mode enabled)

In our case, cached data is less than 1 GB, so it fits in the RAM of a single node. This made us choose the cluster mode disabled setup.
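For reference, the single-node setup mentioned earlier needs nothing more than one property in application.properties. A sketch with a placeholder endpoint:

```properties
# Placeholder endpoint for a single ElastiCache Redis node (cluster mode disabled)
spring.redis.url=redis://redis-123.zzzz.0001.aps1.cache.amazonaws.com:6379
```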

In the previous sprint, we provisioned Redis with one master and two slave nodes. Our initial thought was that we just needed to provide the URL of the master node to the application, and the Lettuce client would automatically discover all the slave nodes. However, we noticed that data was being read only from the master node; there were zero cache hits on the two slaves. We were expecting our Redis Java client to automatically detect the slaves and distribute the load across replicas, so we started looking for the problem. We looked at the Lettuce documentation on the wiki and found the following note:

Static Master/Slave with predefined node addresses

In some cases, topology discovery shouldn’t be enabled, or the discovered Redis addresses are not suited for connections. AWS ElastiCache falls into this category. lettuce allows to specify one or more Redis addresses as List and predefine the node topology. Master/Slave URIs will be treated in this case as static topology, and no additional hosts are discovered in such case. Redis Standalone Master/Slave will discover the roles of the supplied RedisURIs and issue commands to the appropriate node.

Our assumption that slaves would be automatically discovered by Lettuce was wrong. As the note mentions, we have to provide the node URLs ourselves in our code.

Creating a Custom LettuceConnectionFactory Bean

Now, the next question was how to do that with Spring Boot. So far we had been relying on Spring Boot auto-configuration. After spending some time on the Spring Data Redis issue tracker, I found an issue, DATAREDIS-580, that talks about support for static master/slave configuration. These changes are not yet part of a released version, so we had to use milestone versions of the spring-data-redis and lettuce-core libraries. If you are using Gradle, update the dependencies section with the following dependencies:

compile 'io.lettuce:lettuce-core:5.1.0.M1'
compile 'org.springframework.data:spring-data-redis:2.1.0.M3'

Once you make the above change, create a new configuration class to provide your custom LettuceConnectionFactory bean.

import io.lettuce.core.ReadFrom;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisStaticMasterSlaveConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceClientConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;

@Configuration
public class ElastiCacheConfig {

    @Bean
    public LettuceConnectionFactory connectionFactory() {
        // List every node statically; Lettuce discovers each node's role (master or slave)
        RedisStaticMasterSlaveConfiguration elastiCache =
                new RedisStaticMasterSlaveConfiguration("redis-123-001.zzzz.0001.aps1.cache.amazonaws.com");
        elastiCache.addNode("redis-123-002.zzzz.0001.aps1.cache.amazonaws.com", 6379);
        elastiCache.addNode("redis-123-003.zzzz.0001.aps1.cache.amazonaws.com", 6379);
        // Prefer reads from slaves, falling back to the master if no slave is available
        LettuceClientConfiguration clientConfig = LettuceClientConfiguration
                .builder()
                .readFrom(ReadFrom.SLAVE_PREFERRED)
                .build();
        return new LettuceConnectionFactory(elastiCache, clientConfig);
    }
}

Ideally, you will read the Redis node URLs from a configuration file. As you can see above, we first statically passed all our node URLs to RedisStaticMasterSlaveConfiguration, and then in the LettuceClientConfiguration we specified that reads should go to slaves first, falling back to the master if no slave is available. There are five valid values for ReadFrom: MASTER, MASTER_PREFERRED, SLAVE_PREFERRED, SLAVE, and NEAREST. The default value is MASTER.
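One way to read the node URLs from a configuration file is to keep them in a single comma-separated property (the property name, `cache.redis.nodes`, and the helper class below are hypothetical, not part of Spring Data Redis) and parse them into host/port pairs before calling addNode. A minimal sketch:

```java
import java.util.Arrays;
import java.util.List;
import static java.util.stream.Collectors.toList;

// Hypothetical helper: parses a comma-separated "host:port" list, e.g. the value
// of a property like cache.redis.nodes, into host/port pairs for addNode(...)
public class RedisNodeParser {

    public static class Node {
        public final String host;
        public final int port;

        Node(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    public static List<Node> parse(String commaSeparated) {
        return Arrays.stream(commaSeparated.split(","))
                .map(String::trim)
                .map(entry -> {
                    int colon = entry.lastIndexOf(':');
                    // Default to the standard Redis port when none is given
                    String host = colon < 0 ? entry : entry.substring(0, colon);
                    int port = colon < 0 ? 6379 : Integer.parseInt(entry.substring(colon + 1));
                    return new Node(host, port);
                })
                .collect(toList());
    }
}
```

The first parsed entry would go to the RedisStaticMasterSlaveConfiguration constructor and the rest to addNode.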

After making the above change, we were able to successfully read from the Redis slaves. But a similar problem remained: all the data was being read from a single slave.

Reading from all slaves in a round-robin manner

In a few special scenarios, our application makes thousands of read calls to Redis in a single request. Our problem is parallelizable, so we can process multiple records in parallel. My understanding is that since Redis is single-threaded and our processing is parallel, we can get better performance if we distribute our load across multiple slaves. I was expecting Lettuce to have some form of load balancing built in, but as it turned out, it has not been implemented yet. There is an issue in the Lettuce issue tracker that discusses the same problem. The author of the library has for now decided not to implement a round-robin strategy; it is the responsibility of the application to implement a custom ReadFrom strategy.

So, as every programmer would, I decided to write a custom ReadFrom strategy.

import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;
import java.util.stream.Stream;

import io.lettuce.core.ReadFrom;
import io.lettuce.core.models.role.RedisNodeDescription;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisStaticMasterSlaveConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceClientConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;

import static java.util.stream.Collectors.toList;

@Configuration
public class ElastiCacheConfig {

    private Logger logger = LoggerFactory.getLogger(ElastiCacheConfig.class);

    private final AtomicInteger index = new AtomicInteger(-1);

    @Bean
    public LettuceConnectionFactory connectionFactory() {
        RedisStaticMasterSlaveConfiguration elastiCache =
                new RedisStaticMasterSlaveConfiguration("redis-123-001.zzzz.0001.aps1.cache.amazonaws.com");
        elastiCache.addNode("redis-123-002.zzzz.0001.aps1.cache.amazonaws.com", 6379);
        elastiCache.addNode("redis-123-003.zzzz.0001.aps1.cache.amazonaws.com", 6379);
        LettuceClientConfiguration clientConfig = LettuceClientConfiguration
                .builder()
                .readFrom(new ReadFrom() {
                    @Override
                    public List<RedisNodeDescription> select(Nodes nodes) {
                        List<RedisNodeDescription> allNodes = nodes.getNodes();
                        // Rotate through the nodes; Math.abs guards against counter overflow
                        int ind = Math.abs(index.incrementAndGet() % allNodes.size());
                        RedisNodeDescription selected = allNodes.get(ind);
                        logger.info("Selected node {} with uri {}", ind, selected.getUri());
                        // Put the selected node first; the rest remain as fallbacks
                        List<RedisNodeDescription> remaining = IntStream.range(0, allNodes.size())
                                                                        .filter(i -> i != ind)
                                                                        .mapToObj(allNodes::get)
                                                                        .collect(toList());
                        return Stream.concat(
                                Stream.of(selected),
                                remaining.stream()
                        ).collect(toList());
                    }
                })
                .build();
        return new LettuceConnectionFactory(elastiCache, clientConfig);
    }
}

We pick the next node in round-robin order and put it at the front of the list. After making this change, I saw that the load was distributed across all the nodes. The performance gain was not much, so I decided not to use it for now. I am documenting it for the future; I hope it helps someone.
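The rotation itself can be sanity-checked without a Redis connection. A minimal sketch of the same index arithmetic over three hypothetical node names:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Demonstrates the round-robin selection used in the custom ReadFrom above:
// each call advances a shared counter and picks the next node in the list.
public class RoundRobinDemo {

    private final AtomicInteger index = new AtomicInteger(-1);
    private final List<String> nodes;

    public RoundRobinDemo(List<String> nodes) {
        this.nodes = nodes;
    }

    public String next() {
        // Math.abs guards against the counter wrapping past Integer.MAX_VALUE
        int ind = Math.abs(index.incrementAndGet() % nodes.size());
        return nodes.get(ind);
    }
}
```

Successive calls against a three-node list cycle through the nodes in order and then start over, which is exactly the distribution we saw across the replicas.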
