Spring Boot Caching: Multi-Level Cache with Caffeine + Redis

Caching reduces database load and response latency. Spring Boot’s cache abstraction lets you add caching with annotations, then swap the implementation (Caffeine, Redis, multi-level) without changing your business code.

This guide covers Caffeine for in-JVM caching, Redis for distributed caching, and a multi-level cache that combines both.


Spring Cache Abstraction

Spring’s cache abstraction uses three annotations:

  • @Cacheable: cache the return value. On subsequent calls, the cached value is returned without executing the method.
  • @CachePut: always execute the method AND update the cache with the result.
  • @CacheEvict: remove one or all entries from the cache.

@Service
public class ProductService {

    private final ProductRepository productRepository;

    public ProductService(ProductRepository productRepository) {
        this.productRepository = productRepository;
    }

    @Cacheable(value = "products", key = "#id")
    public Product findById(Long id) {
        // Only executed on cache miss
        return productRepository.findById(id).orElseThrow();
    }

    @CachePut(value = "products", key = "#product.id")
    public Product update(Product product) {
        // Always executed; updates the cache with the new value
        return productRepository.save(product);
    }

    @CacheEvict(value = "products", key = "#id")
    public void delete(Long id) {
        // Removes the entry from cache after method executes
        productRepository.deleteById(id);
    }

    @CacheEvict(value = "products", allEntries = true)
    @Scheduled(cron = "0 0 * * * *")  // every hour
    public void evictAllProducts() {
        // Scheduled full cache eviction
    }
}

Enable caching by adding @EnableCaching to your configuration:

@SpringBootApplication
@EnableCaching
public class Application {}

Level 1: Caffeine (In-JVM Cache)

Caffeine is the fastest cache — sub-microsecond reads, no network. Ideal for data that’s accessed frequently and doesn’t change often.

Setup

<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-cache</artifactId>
</dependency>

Configuration

spring:
  cache:
    type: caffeine
    caffeine:
      spec: maximumSize=1000,expireAfterWrite=10m

Or per-cache configuration:

@Configuration
@EnableCaching
public class CacheConfig {

    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager manager = new CaffeineCacheManager();

        // Default spec for all caches
        manager.setCaffeine(Caffeine.newBuilder()
            .maximumSize(1000)
            .expireAfterWrite(10, TimeUnit.MINUTES)
            .recordStats());  // enables cache hit/miss metrics

        // Override specific caches with their own size/TTL
        manager.registerCustomCache("products", buildCache(500, 30, TimeUnit.MINUTES));    // large, long TTL
        manager.registerCustomCache("sessions", buildCache(10_000, 30, TimeUnit.MINUTES)); // many entries
        manager.registerCustomCache("config", buildCache(100, 1, TimeUnit.HOURS));         // rarely changes

        return manager;
    }

    // Returns the native Caffeine cache (com.github.benmanes.caffeine.cache.Cache),
    // which is what registerCustomCache expects
    private com.github.benmanes.caffeine.cache.Cache<Object, Object> buildCache(
            long maxSize, long ttl, TimeUnit unit) {
        return Caffeine.newBuilder()
            .maximumSize(maxSize)
            .expireAfterWrite(ttl, unit)
            .recordStats()
            .build();
    }
}

Caffeine eviction policies

Caffeine.newBuilder()
    .maximumSize(1000)              // size-based eviction (W-TinyLFU, near-LRU) once over 1000 entries
    .expireAfterWrite(10, MINUTES)  // TTL counted from write time
    .expireAfterAccess(5, MINUTES)  // TTL counted from last access (sliding window)
    .weakValues()                   // entries become GC-collectable once nothing else references them
    // .softValues()                // alternative: GC evicts only under memory pressure;
                                    // weakValues and softValues cannot be combined

When to use each:

  • expireAfterWrite: data has a known freshness window (price data, config)
  • expireAfterAccess: session-like data that should expire if unused
  • weakValues: entries that should disappear once no strong references remain
  • softValues: large objects where memory pressure should trigger eviction (the two can't be combined)
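To make the write-TTL vs. sliding-window distinction concrete, here is a toy timeline in plain Java. This is not Caffeine's internals, just the deadline arithmetic the two policies apply:

```java
import java.time.Duration;

public class ExpiryDemo {

    // expireAfterWrite: the deadline is fixed at write time
    static long writeDeadline(long writeMillis, Duration ttl) {
        return writeMillis + ttl.toMillis();
    }

    // expireAfterAccess: every read pushes the deadline forward (sliding window)
    static long accessDeadline(long lastAccessMillis, Duration ttl) {
        return lastAccessMillis + ttl.toMillis();
    }

    public static void main(String[] args) {
        Duration ttl = Duration.ofMinutes(10);
        long written = 0;                                    // entry written at t=0
        long read = Duration.ofMinutes(8).toMillis();        // entry read at t=8m

        // With expireAfterWrite, the read changes nothing: the entry dies at t=10m
        System.out.println(writeDeadline(written, ttl));     // 600000

        // With expireAfterAccess, the read resets the clock: the entry now dies at t=18m
        System.out.println(accessDeadline(read, ttl));       // 1080000
    }
}
```

The practical consequence: combining both (as above) means an entry lives at most 10 minutes after its write, but can die earlier if it goes 5 minutes without being read.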

Level 2: Redis (Distributed Cache)

Redis cache is shared across all JVM instances — multiple pods see the same cached data. Reads take 0.5–2ms (network round trip) but you get cross-instance consistency.

Setup

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>

spring:
  cache:
    type: redis
  data:
    redis:
      host: ${REDIS_HOST:localhost}
      port: 6379
      password: ${REDIS_PASSWORD:}

Per-cache TTL configuration

@Configuration
@EnableCaching
public class RedisCacheConfig {

    @Bean
    public RedisCacheManager cacheManager(RedisConnectionFactory connectionFactory) {
        RedisCacheConfiguration defaults = RedisCacheConfiguration.defaultCacheConfig()
            .entryTtl(Duration.ofMinutes(10))
            .disableCachingNullValues()
            .serializeValuesWith(RedisSerializationContext.SerializationPair
                .fromSerializer(new GenericJackson2JsonRedisSerializer()));

        Map<String, RedisCacheConfiguration> cacheConfigs = new HashMap<>();
        cacheConfigs.put("products",
            defaults.entryTtl(Duration.ofMinutes(30)));
        cacheConfigs.put("config",
            defaults.entryTtl(Duration.ofHours(1)));
        cacheConfigs.put("sessions",
            defaults.entryTtl(Duration.ofMinutes(30)));

        return RedisCacheManager.builder(connectionFactory)
            .cacheDefaults(defaults)
            .withInitialCacheConfigurations(cacheConfigs)
            .build();
    }
}

Serialization: use JSON, not Java serialization

The default Redis serializer uses Java serialization — fragile, large, version-sensitive. Always use JSON:

// In RedisCacheConfiguration above:
.serializeValuesWith(RedisSerializationContext.SerializationPair
    .fromSerializer(new GenericJackson2JsonRedisSerializer()))

Ensure cached objects are serializable to JSON (record types work perfectly, avoid complex JPA entities with lazy collections).
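A common pattern is to cache a flat projection instead of the entity itself, so the serializer never touches a lazy collection. A sketch (the Product and ProductView names and fields here are made up for illustration):

```java
import java.math.BigDecimal;
import java.util.List;

public class DtoDemo {

    // Stand-in for a JPA entity with a lazy collection (hypothetical)
    static class Product {
        Long id; String name; BigDecimal price;
        List<Object> reviews;  // lazy in a real entity; must never reach the serializer
        Product(Long id, String name, BigDecimal price) {
            this.id = id; this.name = name; this.price = price;
        }
    }

    // Flat, JSON-friendly projection; this is what goes into Redis
    record ProductView(Long id, String name, BigDecimal price) {
        static ProductView from(Product p) {
            return new ProductView(p.id, p.name, p.price);
        }
    }

    public static void main(String[] args) {
        Product entity = new Product(42L, "Keyboard", new BigDecimal("59.99"));
        ProductView view = ProductView.from(entity);
        System.out.println(view);  // ProductView[id=42, name=Keyboard, price=59.99]
    }
}
```

Records carry no proxies or hidden state, so Jackson serializes them predictably in both directions.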


Multi-Level Cache: L1 Caffeine + L2 Redis

For maximum performance: check Caffeine first (sub-microsecond), fall back to Redis (1–2ms), fall back to DB (5–20ms).

Request
   │
   ▼
L1 Caffeine hit? → Return immediately (< 1ms)
   │ miss
   ▼
L2 Redis hit? → Return + populate L1 (1-2ms)
   │ miss
   ▼
Database → Return + populate L1 + L2 (5-20ms)

Implementation

@Configuration
@EnableCaching
public class MultiLevelCacheConfig {

    @Bean
    @Primary
    public CacheManager cacheManager(
            RedisConnectionFactory redisConnectionFactory) {

        // L1: Caffeine
        CaffeineCacheManager l1 = new CaffeineCacheManager();
        l1.setCaffeine(Caffeine.newBuilder()
            .maximumSize(500)
            .expireAfterWrite(5, TimeUnit.MINUTES)
            .recordStats());

        // L2: Redis
        RedisCacheManager l2 = RedisCacheManager.builder(redisConnectionFactory)
            .cacheDefaults(RedisCacheConfiguration.defaultCacheConfig()
                .entryTtl(Duration.ofMinutes(30))
                .serializeValuesWith(RedisSerializationContext.SerializationPair
                    .fromSerializer(new GenericJackson2JsonRedisSerializer())))
            .build();

        return new LayeredCacheManager(l1, l2);
    }
}

public class LayeredCacheManager implements CacheManager {

    private final CacheManager l1;
    private final CacheManager l2;
    private final Map<String, Cache> caches = new ConcurrentHashMap<>();

    public LayeredCacheManager(CacheManager l1, CacheManager l2) {
        this.l1 = l1;
        this.l2 = l2;
    }

    @Override
    public Cache getCache(String name) {
        return caches.computeIfAbsent(name, n -> new LayeredCache(n,
            l1.getCache(n),
            l2.getCache(n)));
    }

    @Override
    public Collection<String> getCacheNames() {
        Set<String> names = new HashSet<>();
        names.addAll(l1.getCacheNames());
        names.addAll(l2.getCacheNames());
        return names;
    }
}

public class LayeredCache implements Cache {

    private final String name;
    private final Cache l1;
    private final Cache l2;

    public LayeredCache(String name, Cache l1, Cache l2) {
        this.name = name;
        this.l1 = l1;
        this.l2 = l2;
    }

    @Override
    public ValueWrapper get(Object key) {
        // Try L1
        ValueWrapper l1Value = l1.get(key);
        if (l1Value != null) {
            return l1Value;  // L1 hit
        }

        // Try L2
        ValueWrapper l2Value = l2.get(key);
        if (l2Value != null) {
            // Populate L1 from L2
            l1.put(key, l2Value.get());
            return l2Value;  // L2 hit
        }

        return null;  // Cache miss: caller will load from the DB
    }

    @Override
    @SuppressWarnings("unchecked")
    public <T> T get(Object key, Class<T> type) {
        ValueWrapper wrapper = get(key);
        return wrapper == null ? null : (T) wrapper.get();
    }

    @Override
    @SuppressWarnings("unchecked")
    public <T> T get(Object key, Callable<T> valueLoader) {
        ValueWrapper wrapper = get(key);
        if (wrapper != null) {
            return (T) wrapper.get();
        }
        try {
            T value = valueLoader.call();
            put(key, value);  // populate both levels
            return value;
        } catch (Exception e) {
            throw new ValueRetrievalException(key, valueLoader, e);
        }
    }

    @Override
    public void put(Object key, Object value) {
        l1.put(key, value);
        l2.put(key, value);
    }

    @Override
    public void evict(Object key) {
        l1.evict(key);
        l2.evict(key);
    }

    @Override
    public void clear() {
        l1.clear();
        l2.clear();
    }

    @Override
    public String getName() { return name; }

    @Override
    public Object getNativeCache() { return this; }
}
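The read path of LayeredCache is easiest to see in isolation. Here is a stripped-down sketch with two maps standing in for Caffeine and Redis (no Spring types involved, and the key format is made up):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LayeredLookupDemo {

    static final Map<String, String> l1 = new ConcurrentHashMap<>();  // stands in for Caffeine
    static final Map<String, String> l2 = new ConcurrentHashMap<>();  // stands in for Redis

    // Same shape as LayeredCache.get(): L1, then L2 (promoting to L1), then miss
    static String get(String key) {
        String v = l1.get(key);
        if (v != null) return v;          // L1 hit
        v = l2.get(key);
        if (v != null) {
            l1.put(key, v);               // promote to L1 for the next read
            return v;                     // L2 hit
        }
        return null;                      // miss: caller loads from the DB
    }

    public static void main(String[] args) {
        l2.put("product:42", "Keyboard"); // only L2 is warm (e.g. another pod cached it)

        System.out.println(get("product:42"));    // Keyboard (served from L2...)
        System.out.println(l1.get("product:42")); // Keyboard (...and now promoted to L1)
    }
}
```

The promotion step is why a freshly started pod gets fast local reads after its first request: the shared Redis layer refills the local cache on demand.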

Cache Metrics

With recordStats() enabled on Caffeine and Spring Boot Actuator plus a Prometheus registry on the classpath, Micrometer binds the cache's hit/miss counters automatically:

# Cache hit rate
sum(rate(cache_gets_total{name="products", result="hit"}[5m])) /
    sum(rate(cache_gets_total{name="products"}[5m]))

# Miss rate
sum(rate(cache_gets_total{name="products", result="miss"}[5m])) /
    sum(rate(cache_gets_total{name="products"}[5m]))

Target hit rate > 90%. If it’s lower, either TTL is too short or the cache is too small.
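The queries above reduce to simple ratio arithmetic. A sanity check with hypothetical counts:

```java
public class HitRateDemo {

    // hit rate = hits / (hits + misses); this is what the PromQL division computes
    static double hitRate(long hits, long misses) {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        // e.g. 940 hits and 60 misses over the window: 94%, above the 90% target
        System.out.println(hitRate(940, 60));  // 0.94
    }
}
```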


Cache Warming on Startup

Avoid a cold cache thundering herd after deployment:

@Component
public class CacheWarmer implements ApplicationRunner {

    private static final Logger log = LoggerFactory.getLogger(CacheWarmer.class);

    private final ProductService productService;
    private final ProductRepository productRepository;

    public CacheWarmer(ProductService productService, ProductRepository productRepository) {
        this.productService = productService;
        this.productRepository = productRepository;
    }

    @Override
    public void run(ApplicationArguments args) {
        log.info("Warming product cache...");
        // Load the 1000 most-viewed products into cache
        productRepository.findTop1000ByOrderByViewCountDesc()
            .forEach(product -> productService.findById(product.getId()));
        log.info("Cache warm-up complete");
    }
}

Run this on pod startup. It’s especially important in Kubernetes, where new pods start with cold caches and can overload the database if several fresh replicas all hit it at once.


Conditional Caching

Skip the cache for certain conditions:

@Cacheable(value = "products", key = "#id",
           condition = "#id > 0",           // only cache when ID > 0
           unless = "#result.price == 0")   // don't cache free products
public Product findById(Long id) {
    return productRepository.findById(id).orElseThrow();
}

Quick Reference

// Enable caching
@SpringBootApplication @EnableCaching

// Basic annotations
@Cacheable(value = "products", key = "#id")
@CachePut(value = "products", key = "#product.id")
@CacheEvict(value = "products", key = "#id")
@CacheEvict(value = "products", allEntries = true)

// Caffeine config (application.yaml)
spring.cache.type: caffeine
spring.cache.caffeine.spec: maximumSize=1000,expireAfterWrite=10m

// Redis config (application.yaml)
spring.cache.type: redis
spring.data.redis.host: redis-host

# Hit rate (Micrometer + Prometheus)
rate(cache_gets_total{result="hit"}[5m]) / rate(cache_gets_total[5m])

Summary

Spring Boot’s cache abstraction lets you add caching with three annotations and swap implementations without code changes. Caffeine is the fastest option for in-JVM caching: use it for hot data with a predictable size. Redis is for distributed caching across multiple pods: use it when cache consistency across instances matters. The multi-level cache (Caffeine L1 + Redis L2) gives you sub-millisecond reads for hot data with Redis as a fallback and consistency layer; keep the L1 TTL short, since an update on one instance leaves other instances' L1 entries stale until they expire. Warm the cache on startup to avoid a thundering herd after deployment.


Abhay Pratap Singh

DevOps Engineer passionate about automation, cloud infrastructure, and self-hosted tools. I write about Kubernetes, Terraform, DNS, and everything in between.