Spring Boot Caching: Multi-Level Cache with Caffeine + Redis
Caching reduces database load and response latency. Spring Boot’s cache abstraction lets you add caching with annotations, then swap the implementation (Caffeine, Redis, multi-level) without changing your business code.
This guide covers Caffeine for in-JVM caching, Redis for distributed caching, and a multi-level cache that combines both.
Spring Cache Abstraction
Spring’s cache abstraction uses three annotations:
| Annotation | Behaviour |
|---|---|
| `@Cacheable` | Cache the return value. On subsequent calls, return from cache without executing the method. |
| `@CachePut` | Always execute the method AND update the cache with the result. |
| `@CacheEvict` | Remove one or all entries from the cache. |
```java
@Service
public class ProductService {

    private final ProductRepository productRepository;

    public ProductService(ProductRepository productRepository) {
        this.productRepository = productRepository;
    }

    @Cacheable(value = "products", key = "#id")
    public Product findById(Long id) {
        // Only executed on cache miss
        return productRepository.findById(id).orElseThrow();
    }

    @CachePut(value = "products", key = "#product.id")
    public Product update(Product product) {
        // Always executed; updates the cache with the new value
        return productRepository.save(product);
    }

    @CacheEvict(value = "products", key = "#id")
    public void delete(Long id) {
        // Removes the entry from the cache after the method executes
        productRepository.deleteById(id);
    }

    @CacheEvict(value = "products", allEntries = true)
    @Scheduled(cron = "0 0 * * * *") // top of every hour (Spring cron has six fields)
    public void evictAllProducts() {
        // Scheduled full cache eviction
    }
}
```
Enable caching by adding @EnableCaching to your configuration:
```java
@SpringBootApplication
@EnableCaching
public class Application {}
```
Level 1: Caffeine (In-JVM Cache)
Caffeine is the fastest cache — sub-microsecond reads, no network. Ideal for data that’s accessed frequently and doesn’t change often.
Setup
```xml
<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-cache</artifactId>
</dependency>
```
Configuration
```yaml
spring:
  cache:
    type: caffeine
    caffeine:
      spec: maximumSize=1000,expireAfterWrite=10m
```
Or per-cache configuration:
```java
@Configuration
@EnableCaching
public class CacheConfig {

    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager manager = new CaffeineCacheManager();
        // Default spec for all caches
        manager.setCaffeine(Caffeine.newBuilder()
                .maximumSize(1000)
                .expireAfterWrite(10, TimeUnit.MINUTES)
                .recordStats()); // enables cache hit/miss metrics

        // Override specific caches with their own size and TTL
        manager.registerCustomCache("products", buildCache(500, 30, TimeUnit.MINUTES));    // large, long TTL
        manager.registerCustomCache("sessions", buildCache(10_000, 30, TimeUnit.MINUTES)); // many entries
        manager.registerCustomCache("config", buildCache(100, 1, TimeUnit.HOURS));         // rarely changes
        return manager;
    }

    private com.github.benmanes.caffeine.cache.Cache<Object, Object> buildCache(
            long maxSize, long ttl, TimeUnit unit) {
        return Caffeine.newBuilder()
                .maximumSize(maxSize)
                .expireAfterWrite(ttl, unit)
                .recordStats()
                .build();
    }
}
```
Caffeine eviction policies
```java
Caffeine.newBuilder()
    .maximumSize(1000)             // size-based eviction (Window TinyLFU, roughly LRU-like)
    .expireAfterWrite(10, MINUTES) // TTL counted from write time
    .expireAfterAccess(5, MINUTES) // TTL counted from last read or write (sliding window)
    .weakValues()                  // values collectable once nothing else references them
    .softValues()                  // values collected only under memory pressure
// Note: weakValues() and softValues() are mutually exclusive; they are
// shown together here only for reference.
```
When to use each:

- `expireAfterWrite`: data has a known freshness window (price data, config)
- `expireAfterAccess`: session-like data that should expire if unused
- `weakValues`/`softValues`: large objects where memory pressure should trigger eviction
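To make size-based eviction concrete, here is a JDK-only sketch of a size-bounded, access-ordered map. Caffeine itself uses the smarter Window TinyLFU policy rather than plain LRU, and the `BoundedCache` name is made up for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// JDK-only illustration of size-bounded eviction. Caffeine's real policy
// (Window TinyLFU) makes better keep/evict decisions than plain LRU.
public class BoundedCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxSize;

    public BoundedCache(int maxSize) {
        super(16, 0.75f, true); // accessOrder = true: least-recently-used entry is eldest
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize; // evict the LRU entry once we exceed maxSize
    }
}
```

Reads count as "use" here, just as they do for `expireAfterAccess`: touching an entry moves it to the back of the eviction queue.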
Level 2: Redis (Distributed Cache)
Redis cache is shared across all JVM instances — multiple pods see the same cached data. Reads take 0.5–2ms (network round trip) but you get cross-instance consistency.
Setup
```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>
```
```yaml
spring:
  cache:
    type: redis
  data:
    redis:
      host: ${REDIS_HOST:localhost}
      port: 6379
      password: ${REDIS_PASSWORD:}
```
Per-cache TTL configuration
```java
@Configuration
@EnableCaching
public class RedisCacheConfig {

    @Bean
    public RedisCacheManager cacheManager(RedisConnectionFactory connectionFactory) {
        RedisCacheConfiguration defaults = RedisCacheConfiguration.defaultCacheConfig()
                .entryTtl(Duration.ofMinutes(10))
                .disableCachingNullValues()
                .serializeValuesWith(RedisSerializationContext.SerializationPair
                        .fromSerializer(new GenericJackson2JsonRedisSerializer()));

        // RedisCacheConfiguration is immutable: entryTtl() returns a new copy
        Map<String, RedisCacheConfiguration> cacheConfigs = new HashMap<>();
        cacheConfigs.put("products", defaults.entryTtl(Duration.ofMinutes(30)));
        cacheConfigs.put("config", defaults.entryTtl(Duration.ofHours(1)));
        cacheConfigs.put("sessions", defaults.entryTtl(Duration.ofMinutes(30)));

        return RedisCacheManager.builder(connectionFactory)
                .cacheDefaults(defaults)
                .withInitialCacheConfigurations(cacheConfigs)
                .build();
    }
}
```
Serialization: use JSON, not Java serialization
Out of the box, Spring's Redis cache uses JDK serialization (`JdkSerializationRedisSerializer`): fragile, verbose, and version-sensitive. Always use JSON:

```java
// In the RedisCacheConfiguration above:
.serializeValuesWith(RedisSerializationContext.SerializationPair
        .fromSerializer(new GenericJackson2JsonRedisSerializer()))
```
Make sure cached objects serialize cleanly to JSON: records and simple DTOs work well, while JPA entities with lazy collections either fail to serialize or drag the whole object graph into the cache.
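As a sketch, a flat record makes a good cache value. The `ProductView` name and its fields are hypothetical; the point is to map your entity to something like this before caching:

```java
// Hypothetical cache-friendly DTO: flat fields only, no lazy JPA relations,
// so Jackson can round-trip it to JSON without surprises.
public record ProductView(Long id, String name, long priceCents) {}
```

Mapping entity to record at the service boundary also decouples your cache payload from your database schema, so entity refactors don't invalidate (or corrupt) existing Redis entries.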
Multi-Level Cache: L1 Caffeine + L2 Redis
For maximum performance: check Caffeine first (sub-microsecond), fall back to Redis (1–2ms), fall back to DB (5–20ms).
```
Request
   │
   ▼
L1 Caffeine hit? → Return immediately (sub-microsecond)
   │ miss
   ▼
L2 Redis hit?    → Return + populate L1 (1–2 ms)
   │ miss
   ▼
Database         → Return + populate L1 + L2 (5–20 ms)
```
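The read path above can be sketched in plain JDK terms, with `ConcurrentHashMap`s standing in for Caffeine and Redis. Class and method names here are illustrative only, not Spring API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// JDK-only sketch of the two-level read path; maps stand in for
// Caffeine (l1) and Redis (l2).
public class TwoLevelLookup {

    final Map<String, String> l1 = new ConcurrentHashMap<>();
    final Map<String, String> l2 = new ConcurrentHashMap<>();

    public String get(String key, Function<String, String> dbLoader) {
        String value = l1.get(key);
        if (value != null) {
            return value;                // L1 hit: no network, no DB
        }
        value = l2.get(key);
        if (value != null) {
            l1.put(key, value);          // L2 hit: promote into L1
            return value;
        }
        value = dbLoader.apply(key);     // full miss: load from the database
        l1.put(key, value);              // populate both levels
        l2.put(key, value);
        return value;
    }
}
```

The promotion step (L2 hit repopulates L1) is what keeps hot keys sub-microsecond after an L1 expiry, without another database round trip.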
Implementation
```java
@Configuration
@EnableCaching
public class MultiLevelCacheConfig {

    @Bean
    @Primary
    public CacheManager cacheManager(RedisConnectionFactory redisConnectionFactory) {
        // L1: Caffeine
        CaffeineCacheManager l1 = new CaffeineCacheManager();
        l1.setCaffeine(Caffeine.newBuilder()
                .maximumSize(500)
                .expireAfterWrite(5, TimeUnit.MINUTES)
                .recordStats());

        // L2: Redis
        RedisCacheManager l2 = RedisCacheManager.builder(redisConnectionFactory)
                .cacheDefaults(RedisCacheConfiguration.defaultCacheConfig()
                        .entryTtl(Duration.ofMinutes(30))
                        .serializeValuesWith(RedisSerializationContext.SerializationPair
                                .fromSerializer(new GenericJackson2JsonRedisSerializer())))
                .build();

        return new LayeredCacheManager(l1, l2);
    }
}
```
```java
public class LayeredCacheManager implements CacheManager {

    private final CacheManager l1;
    private final CacheManager l2;
    private final Map<String, Cache> caches = new ConcurrentHashMap<>();

    public LayeredCacheManager(CacheManager l1, CacheManager l2) {
        this.l1 = l1;
        this.l2 = l2;
    }

    @Override
    public Cache getCache(String name) {
        return caches.computeIfAbsent(name, n -> new LayeredCache(n,
                l1.getCache(n),
                l2.getCache(n)));
    }

    @Override
    public Collection<String> getCacheNames() {
        Set<String> names = new HashSet<>();
        names.addAll(l1.getCacheNames());
        names.addAll(l2.getCacheNames());
        return names;
    }
}
```
```java
public class LayeredCache implements Cache {

    private final String name;
    private final Cache l1;
    private final Cache l2;

    public LayeredCache(String name, Cache l1, Cache l2) {
        this.name = name;
        this.l1 = l1;
        this.l2 = l2;
    }

    @Override
    public ValueWrapper get(Object key) {
        // Try L1
        ValueWrapper l1Value = l1.get(key);
        if (l1Value != null) {
            return l1Value; // L1 hit
        }
        // Try L2
        ValueWrapper l2Value = l2.get(key);
        if (l2Value != null) {
            // Populate L1 from L2
            l1.put(key, l2Value.get());
            return l2Value; // L2 hit
        }
        return null; // Cache miss — caller will load from DB
    }

    @Override
    @SuppressWarnings("unchecked")
    public <T> T get(Object key, Class<T> type) {
        ValueWrapper wrapper = get(key);
        return wrapper == null ? null : (T) wrapper.get();
    }

    @Override
    @SuppressWarnings("unchecked")
    public <T> T get(Object key, Callable<T> valueLoader) {
        ValueWrapper wrapper = get(key);
        if (wrapper != null) {
            return (T) wrapper.get();
        }
        try {
            T value = valueLoader.call();
            put(key, value);
            return value;
        } catch (Exception e) {
            throw new ValueRetrievalException(key, valueLoader, e);
        }
    }

    @Override
    public void put(Object key, Object value) {
        l1.put(key, value);
        l2.put(key, value);
    }

    @Override
    public void evict(Object key) {
        l1.evict(key);
        l2.evict(key);
    }

    @Override
    public void clear() {
        l1.clear();
        l2.clear();
    }

    @Override
    public String getName() { return name; }

    @Override
    public Object getNativeCache() { return this; }
}
```
Cache Metrics
With recordStats() enabled on Caffeine (and Spring Boot Actuator plus the Prometheus Micrometer registry on the classpath), cache metrics are exposed automatically:

```promql
# Cache hit rate
sum(rate(cache_gets_total{name="products", result="hit"}[5m]))
  /
sum(rate(cache_gets_total{name="products"}[5m]))

# Miss rate
sum(rate(cache_gets_total{name="products", result="miss"}[5m]))
  /
sum(rate(cache_gets_total{name="products"}[5m]))
```
Target hit rate > 90%. If it’s lower, either TTL is too short or the cache is too small.
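The ratio itself is trivial. With `recordStats()` enabled, Caffeine reports it directly via `cache.stats().hitRate()`; the helper below (a made-up name) just restates the formula:

```java
// Hit rate = hits / (hits + misses), defined as 1.0 when there were no
// requests, matching the convention Caffeine's CacheStats uses.
public class HitRate {
    public static double of(long hits, long misses) {
        long total = hits + misses;
        return total == 0 ? 1.0 : (double) hits / total;
    }
}
```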
Cache Warming on Startup
Avoid a cold cache thundering herd after deployment:
```java
@Component
public class CacheWarmer implements ApplicationRunner {

    private static final Logger log = LoggerFactory.getLogger(CacheWarmer.class);

    private final ProductService productService;
    private final ProductRepository productRepository;

    public CacheWarmer(ProductService productService, ProductRepository productRepository) {
        this.productService = productService;
        this.productRepository = productRepository;
    }

    @Override
    public void run(ApplicationArguments args) {
        log.info("Warming product cache...");
        // Load the 1000 most-viewed products into the cache; calling through
        // the injected (proxied) service populates it via @Cacheable
        productRepository.findTop1000ByOrderByViewCountDesc()
                .forEach(product -> productService.findById(product.getId()));
        log.info("Cache warm-up complete");
    }
}
```
Run this on pod startup. It's especially important in Kubernetes, where new pods start with cold caches: if every pod in a fresh rollout hits the database simultaneously, the combined load can overwhelm it.
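One refinement worth considering: bound the warm-up's own parallelism so the warming pass itself doesn't hammer the database. A JDK-only sketch (the `ThrottledWarmer` name and API are made up for illustration):

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper: warm a list of IDs with a fixed, small number of
// concurrent loads instead of 1000 sequential (slow) or unbounded
// parallel (DB-crushing) calls.
public class ThrottledWarmer {

    public static void warm(List<Long> ids, Consumer<Long> loader, int parallelism) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            ids.forEach(id -> pool.submit(() -> loader.accept(id)));
        } finally {
            pool.shutdown();
        }
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES); // best-effort wait for completion
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In the `CacheWarmer` above, `loader` would be `id -> productService.findById(id)`, with parallelism tuned to what the database can absorb alongside live traffic.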
Conditional Caching
Skip the cache for certain conditions:
```java
@Cacheable(value = "products", key = "#id",
        condition = "#id > 0",          // only cache when ID > 0
        unless = "#result.price == 0")  // don't cache free products
public Product findById(Long id) {
    return productRepository.findById(id).orElseThrow();
}
```
Quick Reference
```
// Enable caching
@SpringBootApplication @EnableCaching

// Basic annotations
@Cacheable(value = "products", key = "#id")
@CachePut(value = "products", key = "#product.id")
@CacheEvict(value = "products", key = "#id")
@CacheEvict(value = "products", allEntries = true)

# Caffeine config (application.properties style)
spring.cache.type=caffeine
spring.cache.caffeine.spec=maximumSize=1000,expireAfterWrite=10m

# Redis config (application.properties style)
spring.cache.type=redis
spring.data.redis.host=redis-host

# Hit rate (Micrometer + Prometheus)
sum(rate(cache_gets_total{result="hit"}[5m])) / sum(rate(cache_gets_total[5m]))
```
Summary
Spring Boot’s cache abstraction lets you add caching with three annotations and swap implementations without code changes. Caffeine is the fastest option for in-JVM caching — use it for hot data with predictable size. Redis is for distributed caching across multiple pods — use it when cache consistency across instances matters. The multi-level cache (Caffeine L1 + Redis L2) gives you sub-millisecond reads for hot data with Redis as a fallback and consistency layer. Warm the cache on startup to avoid thundering herd after deployment.
