Retryable vs Non-Retryable Exceptions: Custom Exception Classification
Transient vs Permanent Failures
Not every exception is worth retrying. Retrying a NullPointerException or a schema validation error wastes time and delays other records. Retrying a database timeout or a downstream HTTP 503 is exactly right — the error is temporary and will likely resolve.
flowchart TD
Ex["Exception in listener"]
Q{"Transient?\n(DB timeout, HTTP 503,\nnetwork blip)"}
Ex --> Q
Q -->|Yes| Retry["Retry with BackOff"]
Q -->|No| Skip["Call recoverer immediately\n(no retries wasted)"]
Retry -->|"still failing after\nmax retries"| Skip
Skip --> DLT["Dead-letter topic\nor log"]
Spring Kafka’s DefaultErrorHandler lets you declare exactly which exception types are retryable and which are not.
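The flow in the diagram can be tried out without Kafka. The sketch below is a plain-Java model, not Spring Kafka code: `RetryFlowSketch`, `handle`, and the `maxRetries` parameter are hypothetical names introduced for illustration, and `UncheckedIOException` merely stands in for a transient failure.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

class RetryFlowSketch {
    /** Runs the listener for one record and returns how many times it was invoked. */
    static <T> int handle(T record, Consumer<T> listener, Predicate<Exception> isTransient,
                          int maxRetries, Consumer<T> recoverer) {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                listener.accept(record);
                return attempts; // processed successfully
            } catch (RuntimeException ex) {
                boolean retriesLeft = attempts <= maxRetries;
                if (!isTransient.test(ex) || !retriesLeft) {
                    recoverer.accept(record); // dead-letter topic or log
                    return attempts;
                }
                // a real handler would wait for the BackOff interval here
            }
        }
    }

    public static void main(String[] args) {
        List<String> deadLetters = new ArrayList<>();
        Predicate<Exception> isTransient = ex -> ex instanceof UncheckedIOException;

        // Permanent failure: one attempt, then straight to the recoverer
        int permanent = handle("order-1",
                r -> { throw new IllegalArgumentException("bad payload"); },
                isTransient, 3, deadLetters::add);

        // Transient failure that never clears: 1 initial attempt + 3 retries
        int exhausted = handle("order-2",
                r -> { throw new UncheckedIOException(new IOException("timeout")); },
                isTransient, 3, deadLetters::add);

        System.out.println(permanent);   // prints 1
        System.out.println(exhausted);   // prints 4
        System.out.println(deadLetters); // prints [order-1, order-2]
    }
}
```

The point of the model is the asymmetry: a permanent failure costs one listener invocation, while a transient one may cost the full retry budget before it reaches the recoverer.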
Default Classification
By default, DefaultErrorHandler retries all exceptions except those that Spring Kafka considers fatal:
- DeserializationException: bad bytes can’t be fixed by retrying
- MessageConversionException: type mismatch can’t be fixed by retrying
- MethodArgumentResolutionException: bad listener method signature
- NoSuchMethodException
- ClassCastException
Everything else is retried until the configured BackOff is exhausted.
Adding Non-Retryable Exceptions
Tell the error handler to skip retries for specific exception types:
@Bean
public DefaultErrorHandler errorHandler(KafkaTemplate<String, Object> kafkaTemplate) {
    DeadLetterPublishingRecoverer recoverer =
            new DeadLetterPublishingRecoverer(kafkaTemplate);
    ExponentialBackOff backOff = new ExponentialBackOff(1000L, 2.0);
    backOff.setMaxElapsedTime(30_000L);
    DefaultErrorHandler handler = new DefaultErrorHandler(recoverer, backOff);
    // These exceptions go straight to the recoverer, with no retries
    handler.addNotRetryableExceptions(
            OrderValidationException.class,
            IllegalArgumentException.class,
            InvalidOrderStateException.class
    );
    return handler;
}
When OrderValidationException is thrown, the record goes directly to the DLT without any BackOff delay.
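For the exceptions that do retry, it helps to know what the BackOff configured above amounts to. The following is a simplified plain-Java model, not the Spring implementation: it assumes each delay is multiplied by the multiplier, capped at a maximum interval (30 seconds is assumed here), and that retrying stops once the accumulated delay reaches the maximum elapsed time. `BackOffScheduleSketch` and `delays` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class BackOffScheduleSketch {
    /** Lists the delays (in ms) an exponential back-off would produce under the assumptions above. */
    static List<Long> delays(long initialMs, double multiplier, long maxIntervalMs, long maxElapsedMs) {
        List<Long> delays = new ArrayList<>();
        long elapsed = 0;
        long interval = initialMs;
        while (elapsed < maxElapsedMs) {
            delays.add(interval);
            elapsed += interval;
            // grow the next delay, but never beyond the cap
            interval = Math.min((long) (interval * multiplier), maxIntervalMs);
        }
        return delays;
    }

    public static void main(String[] args) {
        // initial 1000 ms, multiplier 2.0, assumed 30 s interval cap, 30 s max elapsed time
        System.out.println(delays(1000L, 2.0, 30_000L, 30_000L));
        // prints [1000, 2000, 4000, 8000, 16000]
    }
}
```

Under these assumptions a retryable failure holds up its partition for roughly 31 seconds before the record is handed to the recoverer, which is the cost a non-retryable classification avoids.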
Adding Retryable Exceptions (Allowlist Mode)
By default, all non-fatal exceptions retry. You can flip to allowlist mode — only exceptions you explicitly add will retry:
DefaultErrorHandler handler = new DefaultErrorHandler(recoverer, backOff);
// Switch to allowlist: only these exceptions retry
handler.setClassifications(Map.of(
        TransientDataAccessException.class, true,   // retry
        HttpServerErrorException.class, true,       // retry
        OrderValidationException.class, false,      // no retry
        IllegalArgumentException.class, false       // no retry
), false); // false = default is NOT retryable (allowlist mode)
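Allowlist mode is the safer choice for production: an exception type you did not anticipate goes straight to the recoverer instead of being retried by accident. Note that setClassifications replaces the built-in list of fatal exceptions; with a default of false that is harmless, because anything not in the map is not retried.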
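The difference between the two modes comes down to the default applied to unlisted exceptions. Here is a minimal plain-Java sketch of that idea. It is deliberately simplified (exact-type lookup only, whereas the real handler also considers superclasses), and `ClassificationModes` and `retryable` are hypothetical names; JDK exceptions stand in for the application ones.

```java
import java.net.SocketTimeoutException;
import java.util.Map;

class ClassificationModes {
    /** Looks up the exception's exact type; falls back to the mode's default when it is not listed. */
    static boolean retryable(Map<Class<? extends Throwable>, Boolean> rules,
                             boolean defaultRetryable, Throwable ex) {
        return rules.getOrDefault(ex.getClass(), defaultRetryable);
    }

    public static void main(String[] args) {
        Map<Class<? extends Throwable>, Boolean> rules = Map.of(
                SocketTimeoutException.class, true,     // transient: retry
                IllegalArgumentException.class, false); // permanent: no retry

        Throwable unlisted = new IllegalStateException("nobody planned for this");

        // Default true: unlisted exceptions are retried
        System.out.println(retryable(rules, true, unlisted));  // prints true
        // Default false (allowlist mode): unlisted exceptions skip retries
        System.out.println(retryable(rules, false, unlisted)); // prints false
    }
}
```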
Custom ExceptionClassifier
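For rules that go beyond the exception type (retry HTTP 503 but not 400, or check the exception message), type-based classification is not enough. One option is setBackOffFunction: it receives the record and the exception and returns the BackOff to use, so returning a zero-retry FixedBackOff sends the record straight to the recoverer, while returning null falls back to the handler's default BackOff:
@Bean
public DefaultErrorHandler errorHandler(KafkaTemplate<String, Object> kafkaTemplate) {
    DeadLetterPublishingRecoverer recoverer =
            new DeadLetterPublishingRecoverer(kafkaTemplate);
    DefaultErrorHandler handler = new DefaultErrorHandler(recoverer,
            new ExponentialBackOff(1000L, 2.0));
    BackOff noRetries = new FixedBackOff(0L, 0L); // zero interval, zero attempts
    handler.setBackOffFunction((record, ex) -> {
        // The listener's exception usually arrives wrapped in ListenerExecutionFailedException
        Throwable cause = (ex instanceof ListenerExecutionFailedException && ex.getCause() != null)
                ? ex.getCause() : ex;
        if (cause instanceof HttpClientErrorException) {
            return noRetries; // 4xx errors are permanent: skip to recoverer
        }
        if (cause instanceof OrderValidationException) {
            return noRetries; // permanent: skip to recoverer
        }
        // InventoryUnavailableException, HTTP 5xx, and everything else are treated as transient
        return null; // retry with the handler's default BackOff
    });
    return handler;
}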
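The "retry 503 but not 400" rule is easy to get wrong, so it is worth isolating in a small predicate that can be unit-tested on its own. The sketch below is plain Java with no Spring dependency; `HttpStatusException` is a hypothetical stand-in for an HTTP client exception that carries the response status, and treating every 5xx as transient is an assumption you may want to narrow.

```java
import java.util.function.Predicate;

class StatusBasedClassification {
    /** Hypothetical stand-in for an HTTP client exception carrying the response status. */
    static class HttpStatusException extends RuntimeException {
        final int status;

        HttpStatusException(int status) {
            super("HTTP " + status);
            this.status = status;
        }
    }

    /** true = worth retrying, false = send to the recoverer. */
    static final Predicate<Throwable> RETRYABLE = ex -> {
        if (ex instanceof HttpStatusException) {
            int status = ((HttpStatusException) ex).status;
            return status >= 500; // 5xx: server-side, assumed transient; 4xx: permanent
        }
        return true; // anything else keeps the retry-by-default behaviour
    };

    public static void main(String[] args) {
        System.out.println(RETRYABLE.test(new HttpStatusException(503))); // prints true
        System.out.println(RETRYABLE.test(new HttpStatusException(400))); // prints false
    }
}
```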
Exception Hierarchy Matching
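Classification checks the thrown exception's class first and then walks up its superclasses, so the nearest registered type wins and the order in which you declare types does not matter. Registering a specific class leaves its siblings untouched: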
handler.addNotRetryableExceptions(OrderValidationException.class);
// OrderValidationException extends BusinessException extends RuntimeException
// → only OrderValidationException is non-retryable; other BusinessExceptions still retry
To make an entire hierarchy non-retryable:
handler.addNotRetryableExceptions(BusinessException.class);
// All subclasses of BusinessException become non-retryable
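To make the hierarchy rule concrete, here is a plain-Java sketch of a lookup in which the nearest registered ancestor decides. It models the behaviour rather than reproducing Spring Kafka's classifier; `NearestAncestorDemo` and `PaymentPendingException` are hypothetical names, and unlisted exceptions are assumed to be retryable.

```java
import java.util.Map;

class NearestAncestorDemo {
    static class BusinessException extends RuntimeException {}
    static class OrderValidationException extends BusinessException {}
    static class PaymentPendingException extends BusinessException {} // hypothetical transient case

    /** Walks from the exception's class up to Throwable; the first registered type decides. */
    static boolean retryable(Map<Class<?>, Boolean> rules, Throwable ex) {
        for (Class<?> c = ex.getClass(); c != null; c = c.getSuperclass()) {
            Boolean decision = rules.get(c);
            if (decision != null) {
                return decision;
            }
        }
        return true; // nothing registered: retry by default
    }

    public static void main(String[] args) {
        // The whole hierarchy is non-retryable, with one subclass carved back out
        Map<Class<?>, Boolean> rules = Map.of(
                BusinessException.class, false,
                PaymentPendingException.class, true);

        System.out.println(retryable(rules, new OrderValidationException())); // prints false
        System.out.println(retryable(rules, new PaymentPendingException()));  // prints true
        System.out.println(retryable(rules, new IllegalStateException()));    // prints true
    }
}
```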
Practical Exception Taxonomy for an Order Service
@Bean
public DefaultErrorHandler errorHandler(KafkaTemplate<String, Object> kafkaTemplate) {
    ExponentialBackOff backOff = new ExponentialBackOff(500L, 2.0);
    backOff.setMaxElapsedTime(60_000L);
    DefaultErrorHandler handler = new DefaultErrorHandler(
            new DeadLetterPublishingRecoverer(kafkaTemplate), backOff);
    // Permanent failures: go straight to DLT
    handler.addNotRetryableExceptions(
            OrderValidationException.class,   // bad business data
            DuplicateOrderException.class,    // idempotency check failed
            InsufficientStockException.class, // no stock available (not transient)
            JsonProcessingException.class,    // malformed JSON payload
            IllegalArgumentException.class
    );
    // Everything else (DB timeouts, HTTP 503, etc.) retries with ExponentialBackOff
    return handler;
}
Logging Non-Retryable Exceptions at Different Levels
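To log retried failures at WARN and records that reach the recoverer at ERROR, set the handler's log level and wrap the recoverer:
DeadLetterPublishingRecoverer dltRecoverer = new DeadLetterPublishingRecoverer(kafkaTemplate);
ConsumerRecordRecoverer loggingRecoverer = (record, ex) -> {
    // Reached for non-retryable exceptions and for records whose retries are exhausted
    log.error("Sending record to DLT: topic={}, partition={}, offset={}",
            record.topic(), record.partition(), record.offset(), ex);
    dltRecoverer.accept(record, ex);
};
DefaultErrorHandler handler = new DefaultErrorHandler(loggingRecoverer, backOff);
// Level at which the container logs the exception the handler throws
// when a record is going to be retried (the default is ERROR)
handler.setLogLevel(KafkaException.Level.WARN);
// Default is true: commit the offset once the handler has dealt with the record without rethrowing
handler.setAckAfterHandle(true);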
Exception Classification Decision Flow
flowchart TD
E["Exception thrown"]
Fatal{"Spring Kafka fatal exception?\nDeserializationEx, MessageConversionEx..."}
Custom{"Declared in\naddNotRetryableExceptions?"}
AllowList{"Using allowlist\nmode — setClassifications?"}
AllowListed{"Exception in\nallowlist?"}
Default{"Default:\nretryable?"}
DLT["Recoverer immediately"]
Retry["Retry with BackOff"]
E --> Fatal
Fatal -->|Yes| DLT
Fatal -->|No| Custom
Custom -->|Yes| DLT
Custom -->|No| AllowList
AllowList -->|Yes| AllowListed
AllowListed -->|Yes| Retry
AllowListed -->|No| DLT
AllowList -->|No| Default
Default -->|Yes - default| Retry
Default -->|No| DLT
Key Takeaways
- By default, DefaultErrorHandler retries all exceptions except built-in fatal ones (deserialization, type conversion)
- addNotRetryableExceptions(...) sends specific exceptions straight to the recoverer without retrying
- Allowlist mode (setClassifications(..., false)) is safer: only explicitly declared exception types will retry
- Custom logic keyed on more than the exception type (HTTP status code, message, etc.) can decide per exception whether to retry
- Exception class hierarchy matters: declaring a superclass affects all subclasses
- Design an explicit exception taxonomy: transient (DB timeout, HTTP 503) vs permanent (validation, duplicate key)
Next: Dead Letter Topics: Routing Failed Messages with DeadLetterPublishingRecoverer — send unrecoverable records to a DLT with full metadata headers, and set up a DLT consumer to inspect and reprocess them.