Retryable vs Non-Retryable Exceptions: Custom Exception Classification

Transient vs Permanent Failures

Not every exception is worth retrying. Retrying a NullPointerException or a schema validation error wastes time and delays other records. Retrying a database timeout or a downstream HTTP 503 is exactly right — the error is temporary and will likely resolve.

flowchart TD
    Ex["Exception in listener"]
    Q{"Transient?\n(DB timeout, HTTP 503,\nnetwork blip)"}
    Ex --> Q
    Q -->|Yes| Retry["Retry with BackOff"]
    Q -->|No| Skip["Call recoverer immediately\n(no retries wasted)"]

    Retry -->|"still failing after\nmax retries"| Skip
    Skip --> DLT["Dead-letter topic\nor log"]

Spring Kafka’s DefaultErrorHandler lets you declare exactly which exception types are retryable and which are not.
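Before turning to the DefaultErrorHandler API, the decision flow above can be sketched in plain Java with no Spring dependency. This is only an illustration of the idea, not how Spring Kafka implements it: TransientFailure, RetryFlowSketch and process are hypothetical names, and the sketch retries immediately where a real handler would wait for the BackOff interval.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a transient failure (DB timeout, HTTP 503).
class TransientFailure extends RuntimeException {
    TransientFailure(String message) { super(message); }
}

class RetryFlowSketch {
    // Runs the listener; retries transient failures up to maxRetries, then recovers.
    // Returns the number of times the listener was invoked.
    static <T> int process(T record, Consumer<T> listener, int maxRetries, Consumer<T> recoverer) {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                listener.accept(record);
                return attempts;
            } catch (TransientFailure e) {
                if (attempts > maxRetries) {   // retries exhausted
                    recoverer.accept(record);
                    return attempts;
                }
                // a real handler would wait for the BackOff interval here
            } catch (RuntimeException e) {
                recoverer.accept(record);      // permanent failure: no retries
                return attempts;
            }
        }
    }

    public static void main(String[] args) {
        List<String> deadLetters = new ArrayList<>();
        int permanent = process("order-1",
            r -> { throw new IllegalArgumentException("bad payload"); }, 3, deadLetters::add);
        int transientFailure = process("order-2",
            r -> { throw new TransientFailure("timeout"); }, 3, deadLetters::add);
        System.out.println(permanent + " vs " + transientFailure + " attempts, recovered: " + deadLetters);
    }
}
```

The permanent failure costs a single listener invocation, while the transient one uses the initial attempt plus every allowed retry before the record reaches the recoverer.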


Default Classification

By default, DefaultErrorHandler retries all exceptions except those that Spring Kafka considers fatal:

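  • DeserializationException: bad bytes can’t be fixed by retrying
  • MessageConversionException: type mismatch can’t be fixed by retrying
  • ConversionException: same reasoning as MessageConversionException
  • MethodArgumentResolutionException: bad listener method signature
  • NoSuchMethodException: a programming or configuration error
  • ClassCastException: a programming or configuration error

Everything else is retried until the configured BackOff is exhausted, then handed to the recoverer.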


Adding Non-Retryable Exceptions

Tell the error handler to skip retries for specific exception types:

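import org.springframework.context.annotation.Bean;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.listener.DeadLetterPublishingRecoverer;
import org.springframework.kafka.listener.DefaultErrorHandler;
import org.springframework.util.backoff.ExponentialBackOff;

@Bean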
public DefaultErrorHandler errorHandler(KafkaTemplate<String, Object> kafkaTemplate) {
    DeadLetterPublishingRecoverer recoverer =
        new DeadLetterPublishingRecoverer(kafkaTemplate);
    ExponentialBackOff backOff = new ExponentialBackOff(1000L, 2.0);
    backOff.setMaxElapsedTime(30_000L);

    DefaultErrorHandler handler = new DefaultErrorHandler(recoverer, backOff);

    // These exceptions go straight to the recoverer — no retries
    handler.addNotRetryableExceptions(
        OrderValidationException.class,
        IllegalArgumentException.class,
        InvalidOrderStateException.class
    );

    return handler;
}

When OrderValidationException is thrown, the record goes directly to the DLT without any BackOff delay.
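For the exceptions that are retried, it helps to know how many attempts the ExponentialBackOff above actually allows. The sketch below works through the arithmetic in plain Java. It rests on assumptions about ExponentialBackOff that are worth checking against your Spring version: each interval is the previous one times the multiplier, intervals are capped at a maximum interval (30 seconds by default), and the back-off stops once the sum of the intervals already handed out reaches the max elapsed time. BackOffArithmetic and delays are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class BackOffArithmetic {
    // Hypothetical helper: lists the delays (ms) handed out before the back-off stops.
    static List<Long> delays(long initialMs, double multiplier, long maxIntervalMs, long maxElapsedMs) {
        List<Long> result = new ArrayList<>();
        long interval = initialMs;
        long elapsed = 0;
        while (elapsed < maxElapsedMs) {
            result.add(interval);
            elapsed += interval;
            // grow the interval, but never beyond the cap
            interval = Math.min((long) (interval * multiplier), maxIntervalMs);
        }
        return result;
    }

    public static void main(String[] args) {
        // initial 1000 ms, multiplier 2.0, assumed default cap of 30 s, max elapsed 30 s
        System.out.println(delays(1000L, 2.0, 30_000L, 30_000L));
        // prints [1000, 2000, 4000, 8000, 16000]
    }
}
```

Under these assumptions the configuration above allows five retries after the initial attempt, and the last wait overshoots the 30-second budget slightly because the limit is checked before each interval is handed out.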


Adding Retryable Exceptions (Allowlist Mode)

By default, all non-fatal exceptions retry. You can flip to allowlist mode — only exceptions you explicitly add will retry:

DefaultErrorHandler handler = new DefaultErrorHandler(recoverer, backOff);

// Switch to allowlist: only these exceptions retry
handler.setClassifications(Map.of(
    TransientDataAccessException.class, true,    // retry
    HttpServerErrorException.class, true,         // retry
    OrderValidationException.class, false,        // no retry
    IllegalArgumentException.class, false         // no retry
), false);  // false = default is NOT retryable (allowlist mode)

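Allowlist mode is the more conservative choice for production: an exception type nobody anticipated goes to the recoverer instead of being retried by default. With a default of false, the explicit false entries above are redundant and serve only as documentation. Note that setClassifications replaces the built-in classifications rather than adding to them.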


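Custom Classification Logic

Type-based classification cannot express rules such as "retry HTTP 503 but not 400" or "check the exception message". One way to get there is setBackOffFunction, which chooses a BackOff per failure: a BackOff with zero attempts sends the record straight to the recoverer, and returning null keeps the handler's default BackOff.

import org.springframework.core.NestedExceptionUtils;
import org.springframework.util.backoff.BackOff;
import org.springframework.util.backoff.FixedBackOff;
import org.springframework.web.client.HttpStatusCodeException;

@Bean
public DefaultErrorHandler errorHandler(KafkaTemplate<String, Object> kafkaTemplate) {
    DeadLetterPublishingRecoverer recoverer =
        new DeadLetterPublishingRecoverer(kafkaTemplate);

    ExponentialBackOff backOff = new ExponentialBackOff(1000L, 2.0);
    backOff.setMaxElapsedTime(30_000L);  // without a limit, retries would continue indefinitely
    DefaultErrorHandler handler = new DefaultErrorHandler(recoverer, backOff);

    BackOff noRetry = new FixedBackOff(0L, 0L);  // zero attempts: recover immediately

    handler.setBackOffFunction((record, ex) -> {
        // The listener's exception usually arrives wrapped, so inspect the root cause
        Throwable cause = NestedExceptionUtils.getMostSpecificCause(ex);
        if (cause instanceof HttpStatusCodeException httpEx) {
            // 4xx errors are permanent; 5xx errors such as 503 are retried
            return httpEx.getStatusCode().is4xxClientError() ? noRetry : null;
        }
        if (cause instanceof OrderValidationException) {
            return noRetry;  // permanent: skip to recoverer
        }
        // InventoryUnavailableException and everything else: retry with the default BackOff
        return null;
    });

    return handler;
}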
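Rules that inspect the exception itself have to account for wrapping: an exception thrown by a listener method typically reaches the error handler as the cause of a ListenerExecutionFailedException. The plain-Java sketch below shows one way to look through the cause chain before applying a status-based rule. HttpFailure, CauseChainSketch, findCause and isRetryable are hypothetical names used only for illustration.

```java
import java.util.Optional;

// Hypothetical stand-in for an HTTP client exception that carries a status code.
class HttpFailure extends RuntimeException {
    final int status;
    HttpFailure(int status) { super("HTTP " + status); this.status = status; }
}

class CauseChainSketch {
    // Walks the cause chain and returns the first exception of the requested type.
    static <T extends Throwable> Optional<T> findCause(Throwable ex, Class<T> type) {
        for (Throwable t = ex; t != null; t = t.getCause()) {
            if (type.isInstance(t)) {
                return Optional.of(type.cast(t));
            }
        }
        return Optional.empty();
    }

    // 4xx responses are treated as permanent; everything else is retried.
    static boolean isRetryable(Throwable ex) {
        return findCause(ex, HttpFailure.class)
            .map(http -> http.status < 400 || http.status >= 500)
            .orElse(true);
    }

    public static void main(String[] args) {
        Throwable wrapped503 = new RuntimeException("listener failed", new HttpFailure(503));
        Throwable wrapped400 = new RuntimeException("listener failed", new HttpFailure(400));
        System.out.println(isRetryable(wrapped503)); // true
        System.out.println(isRetryable(wrapped400)); // false
    }
}
```

Checking only the outermost exception would miss both cases here, because the HTTP failure sits one level down in the chain.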

Exception Hierarchy Matching

A classification applies to the declared class and to all of its subclasses. When more than one entry matches, the closest ancestor in the hierarchy wins, so the order of declaration does not matter:

handler.addNotRetryableExceptions(OrderValidationException.class);
// OrderValidationException extends BusinessException extends RuntimeException
// → only OrderValidationException is non-retryable; other BusinessExceptions still retry

To make an entire hierarchy non-retryable:

handler.addNotRetryableExceptions(BusinessException.class);
// All subclasses of BusinessException become non-retryable
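The matching behaviour can be modelled with a small stand-alone classifier: look up the exception's own class, then each superclass in turn, and fall back to a default when nothing matches. This is a simplified model for illustration, not Spring Kafka's implementation. BusinessException and OrderValidationException come from the example above; PaymentDeclinedException and HierarchyClassifierSketch are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class BusinessException extends RuntimeException {}
class OrderValidationException extends BusinessException {}
class PaymentDeclinedException extends BusinessException {}  // hypothetical sibling

class HierarchyClassifierSketch {
    private final Map<Class<? extends Throwable>, Boolean> classified = new HashMap<>();
    private final boolean defaultRetryable;

    HierarchyClassifierSketch(boolean defaultRetryable) {
        this.defaultRetryable = defaultRetryable;
    }

    HierarchyClassifierSketch classify(Class<? extends Throwable> type, boolean retryable) {
        classified.put(type, retryable);
        return this;
    }

    // Nearest classified ancestor wins; unmatched types use the default.
    boolean isRetryable(Throwable ex) {
        for (Class<?> c = ex.getClass(); c != null; c = c.getSuperclass()) {
            Boolean retryable = classified.get(c);
            if (retryable != null) {
                return retryable;
            }
        }
        return defaultRetryable;
    }

    public static void main(String[] args) {
        HierarchyClassifierSketch onlyOne = new HierarchyClassifierSketch(true)
            .classify(OrderValidationException.class, false);
        System.out.println(onlyOne.isRetryable(new OrderValidationException())); // false
        System.out.println(onlyOne.isRetryable(new PaymentDeclinedException())); // true

        HierarchyClassifierSketch wholeTree = new HierarchyClassifierSketch(true)
            .classify(BusinessException.class, false);
        System.out.println(wholeTree.isRetryable(new PaymentDeclinedException())); // false
    }
}
```

Passing false to the constructor turns the same model into an allowlist: anything without a classified ancestor is then not retried.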

Practical Exception Taxonomy for an Order Service

@Bean
public DefaultErrorHandler errorHandler(KafkaTemplate<String, Object> kafkaTemplate) {
    ExponentialBackOff backOff = new ExponentialBackOff(500L, 2.0);
    backOff.setMaxElapsedTime(60_000L);

    DefaultErrorHandler handler = new DefaultErrorHandler(
        new DeadLetterPublishingRecoverer(kafkaTemplate), backOff);

    // Permanent failures — go straight to DLT
    handler.addNotRetryableExceptions(
        OrderValidationException.class,       // bad business data
        DuplicateOrderException.class,        // idempotency check failed
        InsufficientStockException.class,     // no stock available (not transient)
        JsonProcessingException.class,        // malformed JSON payload
        IllegalArgumentException.class
    );

    // Everything else (DB timeouts, HTTP 503, etc.) retries with ExponentialBackOff

    return handler;
}

Logging Non-Retryable Exceptions at Different Levels

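handler.setLogLevel(KafkaException.Level.WARN);    // level used when the handler logs a failure (default is ERROR)
handler.setAckAfterHandle(true);   // default: commit the offset once the handler has dealt with the record

setLogLevel applies to every exception the handler rethrows, not to individual exception types. To log records that reach the recoverer at ERROR while retries stay at WARN, one option is to wrap the recoverer when constructing the handler:

DeadLetterPublishingRecoverer dlt = new DeadLetterPublishingRecoverer(kafkaTemplate);
ConsumerRecordRecoverer loggingRecoverer = (record, ex) -> {
    // Reached for non-retryable exceptions and for records whose retries are exhausted
    log.error("Sending record to DLT: topic={}, offset={}", record.topic(), record.offset(), ex);
    dlt.accept(record, ex);
};
DefaultErrorHandler handler = new DefaultErrorHandler(loggingRecoverer, backOff);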

Exception Classification Decision Flow

flowchart TD
    E["Exception thrown"]
    Fatal{"Spring Kafka fatal exception?\nDeserializationEx, MessageConversionEx..."}
    Custom{"Declared in\naddNotRetryableExceptions?"}
    AllowList{"Using allowlist\nmode — setClassifications?"}
    AllowListed{"Exception in\nallowlist?"}
    Default{"Default:\nretryable?"}
    DLT["Recoverer immediately"]
    Retry["Retry with BackOff"]

    E --> Fatal
    Fatal -->|Yes| DLT
    Fatal -->|No| Custom
    Custom -->|Yes| DLT
    Custom -->|No| AllowList
    AllowList -->|Yes| AllowListed
    AllowListed -->|Yes| Retry
    AllowListed -->|No| DLT
    AllowList -->|No| Default
    Default -->|Yes - default| Retry
    Default -->|No| DLT

Key Takeaways

  • By default, DefaultErrorHandler retries all exceptions except built-in fatal ones (deserialization, type conversion)
  • addNotRetryableExceptions(...) sends specific exceptions straight to the recoverer without retrying
  • Allowlist mode (setClassifications(..., false)) is safer — only explicitly declared exception types will retry
  • setBackOffFunction((record, ex) -> backOff) enables arbitrary per-failure logic (check HTTP status code, message, etc.); a BackOff with zero attempts skips retries
  • Exception class hierarchy matters: declaring a superclass affects all subclasses
  • Design an explicit exception taxonomy: transient (DB timeout, HTTP 503) vs permanent (validation, duplicate key)

Next: Dead Letter Topics: Routing Failed Messages with DeadLetterPublishingRecoverer — send unrecoverable records to a DLT with full metadata headers, and set up a DLT consumer to inspect and reprocess them.