Error Handling Basics: DefaultErrorHandler and CommonErrorHandler
What Happens When a Listener Throws?
Without an error handler, an uncaught exception from @KafkaListener causes the container to log the error and retry the same record on the next poll — indefinitely. One bad record can block an entire partition forever.
DefaultErrorHandler fixes this: it retries a configurable number of times with backoff, then calls a ConsumerRecordRecoverer (e.g. send to a dead-letter topic) and moves on.
DefaultErrorHandler — The Modern API
Spring Kafka 2.8+ replaced SeekToCurrentErrorHandler with CommonErrorHandler (interface) and DefaultErrorHandler (primary implementation). If you see SeekToCurrentErrorHandler in your codebase, replace it — it’s deprecated.
```mermaid
flowchart TD
    Listener["@KafkaListener throws Exception"]
    DEH["DefaultErrorHandler"]
    Retry{"Retries\nexhausted?"}
    Recoverer["ConsumerRecordRecoverer\n(log, DLT, skip)"]
    Commit["Commit offset\nMove to next record"]
    RetryRecord["Retry same record\n(with BackOff delay)"]
    Listener --> DEH
    DEH --> Retry
    Retry -->|No| RetryRecord
    Retry -->|Yes| Recoverer
    Recoverer --> Commit
```
Basic Setup
```java
@Bean
public DefaultErrorHandler errorHandler() {
    // default: FixedBackOff(0L, 9L) — 9 retries with no delay, then log and skip
    return new DefaultErrorHandler();
}

@Bean
public ConcurrentKafkaListenerContainerFactory<String, OrderPlacedEvent>
        kafkaListenerContainerFactory(
            ConsumerFactory<String, OrderPlacedEvent> consumerFactory,
            DefaultErrorHandler errorHandler) {
    var factory = new ConcurrentKafkaListenerContainerFactory<String, OrderPlacedEvent>();
    factory.setConsumerFactory(consumerFactory);
    factory.setCommonErrorHandler(errorHandler);
    return factory;
}
```
By default, DefaultErrorHandler retries 9 times with no delay (10 delivery attempts in total), then logs the error and commits the offset, skipping the record. This is rarely what you want in production — configure backoff and a recoverer.
FixedBackOff
Retry a fixed number of times with a fixed delay between attempts:
```java
@Bean
public DefaultErrorHandler errorHandler() {
    // 3 retries, 2 seconds between each
    FixedBackOff backOff = new FixedBackOff(2000L, 3L);
    return new DefaultErrorHandler(backOff);
}
```
```mermaid
sequenceDiagram
    participant Container
    participant Listener
    participant Handler
    Container->>Listener: process(record) [attempt 1]
    Listener-->>Handler: OrderProcessingException
    Note over Handler: wait 2s
    Container->>Listener: process(record) [attempt 2]
    Listener-->>Handler: OrderProcessingException
    Note over Handler: wait 2s
    Container->>Listener: process(record) [attempt 3]
    Listener-->>Handler: OrderProcessingException
    Note over Handler: wait 2s
    Container->>Listener: process(record) [attempt 4 — final]
    Listener-->>Handler: OrderProcessingException
    Handler->>Handler: retries exhausted → skip record, commit offset
```
FixedBackOff(interval, maxAttempts) — maxAttempts is the number of retries after the first failure, so total attempts = maxAttempts + 1.
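The counting above can be checked with a small plain-Java sketch (no Spring classes; the class and method names here are hypothetical, chosen for illustration) that simulates `FixedBackOff(2000L, 3L)` against a listener that always throws:

```java
public class FixedBackOffTimeline {
    // Simulate FixedBackOff(intervalMs, maxRetries) against a listener that
    // always throws. Returns {deliveryAttempts, totalBackOffDelayMs} accrued
    // before the recoverer would run — mirroring the sequence diagram above.
    static long[] simulate(long intervalMs, long maxRetries) {
        long attempts = 0, totalDelay = 0;
        for (long retriesLeft = maxRetries; ; retriesLeft--) {
            attempts++;                  // deliver record; listener throws
            if (retriesLeft == 0) break; // retries exhausted -> recoverer
            totalDelay += intervalMs;    // BackOff delay before redelivery
        }
        return new long[] { attempts, totalDelay };
    }

    public static void main(String[] args) {
        long[] r = simulate(2000L, 3L); // FixedBackOff(2000L, 3L)
        System.out.println(r[0] + " attempts, " + r[1] + " ms of backoff delay");
        // 4 attempts, 6000 ms of backoff delay
    }
}
```

Four delivery attempts and three 2-second waits — exactly the timeline in the diagram.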
ExponentialBackOff
Retries with increasing delays — essential for transient failures hitting external services:
```java
@Bean
public DefaultErrorHandler errorHandler() {
    ExponentialBackOff backOff = new ExponentialBackOff(1000L, 2.0);
    backOff.setMaxInterval(30_000L);     // cap at 30 seconds
    backOff.setMaxElapsedTime(120_000L); // give up after 2 minutes total
    return new DefaultErrorHandler(backOff);
}
```
With initialInterval=1000ms, multiplier=2.0:
| Attempt | Delay |
|---|---|
| 1 | 1s |
| 2 | 2s |
| 3 | 4s |
| 4 | 8s |
| 5 | 16s |
| 6+ | 30s (capped) |
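The schedule in the table reduces to a few lines of arithmetic. Here is a plain-Java sketch of that calculation (independent of Spring's `ExponentialBackOff` class; names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class ExponentialDelays {
    // Compute the first n retry delays for a given initial interval and
    // multiplier, capping each delay at maxInterval — mirrors the table above.
    static List<Long> delays(long initialMs, double multiplier, long maxIntervalMs, int n) {
        List<Long> out = new ArrayList<>();
        double d = initialMs;
        for (int i = 0; i < n; i++) {
            out.add(Math.min((long) d, maxIntervalMs)); // apply the cap
            d *= multiplier;                            // grow for next attempt
        }
        return out;
    }

    public static void main(String[] args) {
        // initialInterval=1000ms, multiplier=2.0, maxInterval=30000ms
        System.out.println(delays(1000L, 2.0, 30_000L, 6));
        // [1000, 2000, 4000, 8000, 16000, 30000]
    }
}
```

The sixth delay would be 32s uncapped, which is why the table shows it pinned at 30s.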
Adding a Recoverer
When retries are exhausted, call a recoverer instead of silently dropping the record:
```java
@Bean
public DefaultErrorHandler errorHandler(KafkaTemplate<String, Object> kafkaTemplate) {
    // Send failed records to a dead-letter topic
    DeadLetterPublishingRecoverer recoverer =
        new DeadLetterPublishingRecoverer(kafkaTemplate);
    ExponentialBackOff backOff = new ExponentialBackOff(1000L, 2.0);
    backOff.setMaxElapsedTime(60_000L);
    return new DefaultErrorHandler(recoverer, backOff);
}
```
DeadLetterPublishingRecoverer sends the failed record to {originalTopic}.DLT by default. See Dead Letter Topics for full configuration.
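The default naming convention is simple enough to sketch in plain Java (the helper and topic name below are hypothetical, for illustration only):

```java
public class DltNaming {
    // Default destination used by DeadLetterPublishingRecoverer:
    // the original topic name with a ".DLT" suffix appended.
    static String dltTopic(String originalTopic) {
        return originalTopic + ".DLT";
    }

    public static void main(String[] args) {
        System.out.println(dltTopic("orders.placed")); // orders.placed.DLT
    }
}
```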
Logging Recoverer (Simplest)
Log and skip without a DLT — acceptable for non-critical analytics or metric events:
```java
@Bean
public DefaultErrorHandler errorHandler() {
    // assumes an SLF4J `log` field (e.g. Lombok @Slf4j) on the config class
    ConsumerRecordRecoverer loggingRecoverer = (record, exception) ->
        log.error("Failed to process record at topic={} partition={} offset={}: {}",
            record.topic(), record.partition(), record.offset(), exception.getMessage());
    return new DefaultErrorHandler(loggingRecoverer, new FixedBackOff(1000L, 2L));
}
```
Exception Handling Flow
```mermaid
flowchart TD
    Ex["Exception thrown in listener"]
    NR{"Non-retryable\nexception?"}
    Ret{"Retries\nexhausted?"}
    Skip["Call recoverer immediately\n(no retries)"]
    Retry["Retry with BackOff"]
    Recover["Call recoverer\n(DLT or log)"]
    Commit["Commit offset, continue"]
    Ex --> NR
    NR -->|Yes| Skip --> Commit
    NR -->|No| Ret
    Ret -->|No| Retry --> Ex
    Ret -->|Yes| Recover --> Commit
```
Non-retryable exceptions bypass the BackOff entirely — the recoverer is called immediately. See Retryable vs Non-Retryable Exceptions for configuration.
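The decision logic in the flowchart can be sketched in a few lines of plain Java. This is not Spring's implementation — the class, method, and parameter names are hypothetical — but it captures the ordering: classification is checked before any retry budget.

```java
import java.util.Set;

public class ErrorFlowSketch {
    // Sketch of the flow above: non-retryable exceptions go straight to the
    // recoverer; retryable ones are retried up to maxRetries times first.
    static String handle(RuntimeException ex, Set<Class<?>> nonRetryable,
                         int maxRetries, int retriesSoFar) {
        if (nonRetryable.contains(ex.getClass())) {
            return "recover";   // bypass backoff entirely
        }
        if (retriesSoFar >= maxRetries) {
            return "recover";   // retries exhausted
        }
        return "retry";         // apply BackOff delay, redeliver
    }

    public static void main(String[] args) {
        Set<Class<?>> nonRetryable = Set.of(IllegalArgumentException.class);
        System.out.println(handle(new IllegalArgumentException("bad payload"),
                nonRetryable, 3, 0)); // recover — no point retrying bad input
        System.out.println(handle(new RuntimeException("timeout"),
                nonRetryable, 3, 0)); // retry — transient failure
    }
}
```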
Error Handler Configuration Reference
```java
DefaultErrorHandler handler = new DefaultErrorHandler(recoverer, backOff);

// Level at which the handler logs a failed delivery (default: ERROR)
handler.setLogLevel(KafkaException.Level.WARN);

// Commit the offset of a recovered record (default: true)
handler.setCommitRecovered(true);
```
application.properties Backoff (Limited)
For simple cases without a recoverer:
```properties
# Max attempts (including first) — maps to FixedBackOff
spring.kafka.listener.max-failures=4
```
This is far less flexible than @Bean configuration — use @Bean for any production setup.
Common Mistakes
- Retrying non-transient exceptions — retrying a NullPointerException or a validation error wastes time. Always classify exceptions and mark non-transient ones as non-retryable.
- No backoff delay — retrying immediately on failure hammers the failing downstream service. Always set a backoff interval.
- No recoverer — the default error handler skips the record after exhausting retries with only a log message. Use DeadLetterPublishingRecoverer so failed records can be inspected and reprocessed.
- Setting max attempts too high with long delays — with 10 retries at 30s each, one bad record blocks the partition for 5 minutes. Size retries and delays based on your SLA.
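That last mistake is easy to quantify: the worst case for how long a bad record blocks its partition is roughly the sum of the retry delays. A plain-Java sketch of the arithmetic (helper name is illustrative; listener processing time is ignored):

```java
public class BlockingTime {
    // Worst-case partition blocking for FixedBackOff(intervalMs, maxRetries):
    // maxRetries waits of intervalMs each before the recoverer runs.
    static long worstCaseMillis(long intervalMs, long maxRetries) {
        return intervalMs * maxRetries;
    }

    public static void main(String[] args) {
        // 10 retries at 30s each -> 300000 ms = 5 minutes, as noted above
        System.out.println(worstCaseMillis(30_000L, 10));
    }
}
```

Run this against your own interval and retry count before shipping an error handler, and compare the result to your consumer-lag SLA.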
Key Takeaways
- `DefaultErrorHandler` replaces the deprecated `SeekToCurrentErrorHandler` — update existing code
- Wire it into the listener container factory via `setCommonErrorHandler(...)`
- `FixedBackOff` for simple retry/delay; `ExponentialBackOff` for transient failures hitting external services
- Always provide a `ConsumerRecordRecoverer` — logging and dead-letter publishing are the two main options
- Non-retryable exceptions should bypass backoff entirely — configure them explicitly (covered in the next article)
- One bad record blocks the whole partition until retries exhaust — don't set max retries + delay so high that it causes SLA breaches
Next: Retryable vs Non-Retryable Exceptions: Custom Exception Classification — tell the error handler exactly which exceptions should retry and which should go straight to the recoverer.