Best Practices Reference: The Complete Spring Batch Checklist

Introduction

This is the final article in the Spring Batch Tutorial series. It consolidates every hard-won lesson into actionable checklists — use it as a review before deploying a new batch job to production, or as a reference when debugging an existing one.


Part 1: Job Design

Always design for restartability

  • Every step that writes to a database uses idempotent SQL (ON DUPLICATE KEY UPDATE or WHERE NOT EXISTS)
  • Every ItemReader has a unique .name(...) set — required for position persistence in ExecutionContext
  • Identifying JobParameters represent the logical data key (e.g., runDate) so the same data always maps to the same JobInstance
  • Non-identifying parameters (file paths, log levels, timestamps) use identifying=false
  • Prep/cleanup tasklets use allowStartIfComplete(true) so they re-run on restart
  • Steps that must not be retried indefinitely use .startLimit(N)
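A minimal sketch tying several of these items together, assuming Spring Batch 5's builder API and a hypothetical Order domain class (the step and bean names are illustrative):

@Bean
public Step cleanupStep(JobRepository jobRepository, PlatformTransactionManager txManager) {
    return new StepBuilder("cleanupStep", jobRepository)
            .tasklet((contribution, chunkContext) -> {
                // truncate the staging table, delete temp files, etc.
                return RepeatStatus.FINISHED;
            }, txManager)
            .allowStartIfComplete(true)   // re-runs on restart even after completing
            .build();
}

@Bean
public Step importStep(JobRepository jobRepository, PlatformTransactionManager txManager,
                       ItemReader<Order> reader, ItemWriter<Order> writer) {
    return new StepBuilder("importStep", jobRepository)
            .<Order, Order>chunk(200, txManager)
            .reader(reader)   // the reader bean has .name(...) set for position persistence
            .writer(writer)   // the writer uses idempotent upsert SQL
            .startLimit(3)    // stop retrying this step after three attempts
            .build();
}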

Keep jobs focused

  • Each job does one logical thing — import, transform, report, or notify
  • Steps are single-responsibility — validate, load, enrich, aggregate, export are separate steps
  • Notification/cleanup steps are at the end of the flow and run regardless of success/failure (conditional routing)
  • Long-running jobs are broken into checkpointed steps — a crash at step 3 of 5 restarts from step 3, not step 1
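A sketch of the conditional-routing idea above, with hypothetical loadStep and notifyStep beans:

@Bean
public Job importJob(JobRepository jobRepository, Step loadStep, Step notifyStep) {
    return new JobBuilder("importJob", jobRepository)
            .start(loadStep)
            .on("FAILED").to(notifyStep)             // notify on failure...
            .from(loadStep).on("*").to(notifyStep)   // ...and on every other outcome
            .end()
            .build();
}

One caveat: routing a failed step into a notification step can mask the job-level FAILED status, so verify the final ExitStatus in an integration test.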

JobParameters discipline

  • Never pass secrets (passwords, API keys) as job parameters — they are persisted in plain text in BATCH_JOB_EXECUTION_PARAMS
  • @StepScope / @JobScope is applied to every bean that injects jobParameters or executionContext via SpEL
  • RunIdIncrementer is used for jobs that must always create a new JobInstance (reports, exports)
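For example, identifying versus non-identifying parameters at launch time (parameter names are illustrative):

JobParameters params = new JobParametersBuilder()
        .addString("runDate", "2024-06-01")                 // identifying: defines the JobInstance
        .addString("inputFile", "/data/orders.csv", false)  // non-identifying: may change between restarts
        .toJobParameters();
jobLauncher.run(importOrdersJob, params);

// For always-new instances (reports, exports), attach an incrementer instead:
// new JobBuilder("exportJob", jobRepository).incrementer(new RunIdIncrementer())...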

Part 2: Readers

General

  • Reader has a .name(...) set (required for saveState to work)
  • .saveState(false) is set for readers used in multi-threaded steps
  • Encoding is explicitly set (.encoding("UTF-8")) — never rely on system default

FlatFileItemReader

  • linesToSkip is set for files with headers
  • .comments("#") skips comment/blank lines where applicable
  • A custom FieldSetMapper or ConversionService is used for LocalDate, BigDecimal, and enum fields — BeanWrapperFieldSetMapper alone cannot handle these
  • MultiResourceItemReader resources are sorted — unsorted resources break restart
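A builder sketch covering these points, assuming a hypothetical Order class and a custom OrderFieldSetMapper for the typed fields:

@Bean
@StepScope
public FlatFileItemReader<Order> orderReader(
        @Value("#{jobParameters['inputFile']}") Resource inputFile) {
    return new FlatFileItemReaderBuilder<Order>()
            .name("orderReader")        // required for restart position persistence
            .resource(inputFile)
            .encoding("UTF-8")          // never rely on the platform default
            .linesToSkip(1)             // skip the header row
            .comments("#")              // ignore comment lines
            .delimited()
            .names("orderId", "amount", "orderDate")
            .fieldSetMapper(new OrderFieldSetMapper())  // handles BigDecimal and LocalDate
            .build();
}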

JdbcCursorItemReader

  • fetchSize = Integer.MIN_VALUE is set for MySQL streaming (otherwise the full ResultSet is buffered in memory)
  • Used only in single-threaded steps — ResultSet is not thread-safe
  • The query has an ORDER BY clause on the primary key
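A sketch of a streaming cursor reader against a hypothetical orders table:

@Bean
public JdbcCursorItemReader<Order> orderCursorReader(DataSource dataSource) {
    return new JdbcCursorItemReaderBuilder<Order>()
            .name("orderCursorReader")
            .dataSource(dataSource)
            // a deterministic ORDER BY on the primary key is required for correct restart
            .sql("SELECT id, amount, status FROM orders WHERE status = 'NEW' ORDER BY id")
            .fetchSize(Integer.MIN_VALUE)   // MySQL: stream rows instead of buffering the ResultSet
            .rowMapper(new BeanPropertyRowMapper<>(Order.class))
            .build();
}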

JdbcPagingItemReader

  • sortKeys is set with a stable unique column (primary key) — required for correct pagination
  • pageSize equals chunk size as a starting point
  • Used when multi-threaded steps are needed — it is thread-safe
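The paging equivalent might look like this (table and column names are illustrative):

@Bean
public JdbcPagingItemReader<Order> orderPagingReader(DataSource dataSource) {
    return new JdbcPagingItemReaderBuilder<Order>()
            .name("orderPagingReader")
            .dataSource(dataSource)
            .selectClause("SELECT id, amount, status")
            .fromClause("FROM orders")
            .whereClause("WHERE status = 'NEW'")
            // sort on the primary key; fully qualified here to avoid clashing with a domain Order type
            .sortKeys(Map.of("id", org.springframework.batch.item.database.Order.ASCENDING))
            .pageSize(200)   // match the chunk size as a starting point
            .rowMapper(new BeanPropertyRowMapper<>(Order.class))
            .build();
}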

JpaPagingItemReader

  • JOIN FETCH or @EntityGraph is used when the processor accesses lazy associations — prevents N+1
  • Second-level cache is disabled for batch contexts (hibernate.cache.use_second_level_cache=false)
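A sketch assuming a hypothetical OrderEntity with a lazy to-one customer association (fetching a to-one association is safe with pagination; paginating a JOIN FETCH on a collection forces in-memory pagination in Hibernate):

@Bean
public JpaPagingItemReader<OrderEntity> orderJpaReader(EntityManagerFactory entityManagerFactory) {
    return new JpaPagingItemReaderBuilder<OrderEntity>()
            .name("orderJpaReader")
            .entityManagerFactory(entityManagerFactory)
            // JOIN FETCH pulls the association in the same query, avoiding N+1 in the processor
            .queryString("SELECT o FROM OrderEntity o JOIN FETCH o.customer " +
                         "WHERE o.status = 'NEW' ORDER BY o.id")
            .pageSize(200)
            .build();
}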

Part 3: Writers

JdbcBatchItemWriter

  • rewriteBatchedStatements=true is in the JDBC URL — the single biggest throughput improvement
  • useServerPrepStmts=false is paired with rewriteBatchedStatements=true
  • assertUpdates(false) is set for upserts and conditional updates
  • beanMapped() or a custom ItemSqlParameterSourceProvider is configured — not raw ? positional parameters
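Putting these together, a writer sketch against a hypothetical orders table (the MySQL URL flags go in the datasource configuration):

// Datasource URL: jdbc:mysql://host:3306/db?rewriteBatchedStatements=true&useServerPrepStmts=false
@Bean
public JdbcBatchItemWriter<Order> orderWriter(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<Order>()
            .dataSource(dataSource)
            .sql("INSERT INTO orders (order_id, amount, status) " +
                 "VALUES (:orderId, :amount, :status) " +
                 "ON DUPLICATE KEY UPDATE amount = VALUES(amount), status = VALUES(status)")
            .beanMapped()          // named parameters bound from Order getters
            .assertUpdates(false)  // an upsert can legitimately report zero updated rows
            .build();
}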

FlatFileItemWriter

  • Writer has .name(...) set — required for file byte-offset restart
  • shouldDeleteIfEmpty(true) cleans up empty output files
  • headerCallback writes the header once — not in the LineAggregator
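A builder sketch covering these items (file path and field names are illustrative):

@Bean
public FlatFileItemWriter<Order> reportWriter() {
    return new FlatFileItemWriterBuilder<Order>()
            .name("reportWriter")        // required for byte-offset restart
            .resource(new FileSystemResource("output/report.csv"))
            .encoding("UTF-8")
            .headerCallback(writer -> writer.write("order_id,amount,status"))  // written once, on open
            .shouldDeleteIfEmpty(true)   // no rows written means no empty file left behind
            .delimited()
            .names("orderId", "amount", "status")
            .build();
}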

CompositeItemWriter

  • The most critical writer (the database writer) is first in the delegate list — delegates run in order inside the chunk transaction, so a failure in any delegate rolls the database writes back
  • FlatFileItemWriter delegates are registered as .stream() on the step so they get open()/close() lifecycle calls
  • ClassifierCompositeItemWriter delegates are all registered as .stream() on the step
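A sketch of a step wiring a composite writer, with the file delegate registered as a stream (bean names are illustrative):

@Bean
public Step exportStep(JobRepository jobRepository, PlatformTransactionManager txManager,
                       ItemReader<Order> reader,
                       JdbcBatchItemWriter<Order> dbWriter,
                       FlatFileItemWriter<Order> fileWriter) {
    CompositeItemWriter<Order> composite = new CompositeItemWriterBuilder<Order>()
            .delegates(dbWriter, fileWriter)   // database writer first
            .build();
    return new StepBuilder("exportStep", jobRepository)
            .<Order, Order>chunk(200, txManager)
            .reader(reader)
            .writer(composite)
            .stream(fileWriter)   // ensure the file writer gets open()/update()/close() calls
            .build();
}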

Part 4: ItemProcessor

  • Processors are stateless where possible — thread-safe by default
  • Shared caches use ConcurrentHashMap or a bounded Caffeine cache with a TTL
  • Caches are cleared in @AfterStep to prevent stale data on restart
  • null return (filter) is used for deliberate exclusions — not exceptions
  • Exceptions from processors propagate up to the step’s retry/skip framework — never swallowed silently
  • CompositeItemProcessor is used to chain multiple single-responsibility processors
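A sketch of a filtering processor plus a composed pipeline (ValidationProcessor and EnrichmentProcessor are hypothetical single-responsibility processors):

public class CancelledOrderFilter implements ItemProcessor<Order, Order> {

    @Override
    public Order process(Order order) {
        // Returning null filters the item: it counts toward filterCount, not skipCount
        return order.isCancelled() ? null : order;
    }
}

// Chaining single-responsibility processors:
CompositeItemProcessor<Order, Order> pipeline = new CompositeItemProcessorBuilder<Order, Order>()
        .delegates(new ValidationProcessor(), new EnrichmentProcessor(), new CancelledOrderFilter())
        .build();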

Part 5: Error Handling

Retry

  • Only transient exceptions are in the .retry() list (deadlocks, timeouts, HTTP 5xx)
  • Fatal exceptions (FlatFileParseException, DataIntegrityViolationException) are explicitly excluded
  • ExponentialRandomBackOffPolicy is used for multi-threaded or distributed jobs
  • A RetryListener is registered in production to log retry storms
  • retryLimit is set — never unlimited
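A retry configuration sketch following this checklist (the exception choices are examples, not a complete list):

@Bean
public Step syncStep(JobRepository jobRepository, PlatformTransactionManager txManager,
                     ItemReader<Order> reader, ItemWriter<Order> writer) {
    ExponentialRandomBackOffPolicy backOff = new ExponentialRandomBackOffPolicy();
    backOff.setInitialInterval(500);   // ms
    backOff.setMultiplier(2.0);
    backOff.setMaxInterval(10_000);    // cap the wait at 10 s

    return new StepBuilder("syncStep", jobRepository)
            .<Order, Order>chunk(100, txManager)
            .reader(reader)
            .writer(writer)
            .faultTolerant()
            .retry(DeadlockLoserDataAccessException.class)    // transient: worth retrying
            .noRetry(DataIntegrityViolationException.class)   // fatal: retrying cannot help
            .retryLimit(3)                                    // always bounded
            .backOffPolicy(backOff)
            .build();
}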

Skip

  • skipLimit is set — never unlimited (Integer.MAX_VALUE is acceptable only with a SkipListener)
  • A SkipListener writes every skipped item to a dead-letter table with phase, payload, and error details
  • noRollback is applied to read-time exceptions (e.g., FlatFileParseException) where no write occurred
  • skipPolicy is used when different exception types need different limits
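A skip configuration sketch, assuming a deadLetterSkipListener bean that persists skipped items:

@Bean
public Step importStep(JobRepository jobRepository, PlatformTransactionManager txManager,
                       FlatFileItemReader<Order> reader, JdbcBatchItemWriter<Order> writer,
                       SkipListener<Order, Order> deadLetterSkipListener) {
    return new StepBuilder("importStep", jobRepository)
            .<Order, Order>chunk(200, txManager)
            .reader(reader)
            .writer(writer)
            .faultTolerant()
            .skip(FlatFileParseException.class)        // one bad line must not kill the job
            .skipLimit(100)                            // bounded: a systemic problem should still fail
            .noRollback(FlatFileParseException.class)  // read-time error: nothing written, no rollback needed
            .listener(deadLetterSkipListener)          // writes each skipped item to the dead-letter table
            .build();
}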

Job failure

  • Failure notifications (Slack, email, PagerDuty) are in the job’s afterJob() listener, not inside step logic
  • Stale-execution detection runs periodically to mark STARTED-but-dead executions as ABANDONED
  • A runbook exists for each job explaining how to diagnose and restart failures
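A sketch of a failure-notification listener; AlertNotifier is a hypothetical Slack/PagerDuty client, not a Spring Batch type:

public class FailureNotificationListener implements JobExecutionListener {

    private final AlertNotifier notifier;   // hypothetical notification client

    public FailureNotificationListener(AlertNotifier notifier) {
        this.notifier = notifier;
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        if (jobExecution.getStatus() == BatchStatus.FAILED) {
            notifier.send("Job %s failed: %s".formatted(
                    jobExecution.getJobInstance().getJobName(),
                    jobExecution.getAllFailureExceptions()));
        }
    }
}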

Part 6: Performance

  • rewriteBatchedStatements=true — set this first, before any other tuning
  • Chunk size is benchmarked (not guessed) — start at 200, double until throughput plateaus
  • HikariCP maximum-pool-size = thread count × connections_per_thread + metadata connections
  • hibernate.jdbc.batch_size is set for JPA writers (default 1 = no batching)
  • Composite indexes exist on frequently filtered columns (e.g., status, order_id)
  • Slow query log is checked during initial benchmarking
  • Memory-intensive jobs use G1GC with a bounded heap (e.g., -Xmx2g to -Xmx4g) and bounded processor caches
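As a starting point, the corresponding Spring Boot properties might look like this (pool size and batch size are illustrative, not prescriptions; order_inserts groups inserts by entity so JDBC batching actually applies):

# application.properties (illustrative starting values)
spring.datasource.url=jdbc:mysql://localhost:3306/batch_db?rewriteBatchedStatements=true&useServerPrepStmts=false
# thread count × connections per thread, plus headroom for JobRepository metadata
spring.datasource.hikari.maximum-pool-size=10
spring.jpa.properties.hibernate.jdbc.batch_size=50
spring.jpa.properties.hibernate.order_inserts=true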

Part 7: Metadata and Observability

  • spring.batch.jdbc.initialize-schema=never in production — schema is managed by Flyway/Liquibase
  • A JobExecutionListener publishes job duration and status to Micrometer/Prometheus after every run
  • A StepExecutionListener publishes per-step counters (read, write, skip, rollback) as metrics
  • Dashboard shows: job status, duration, rows processed, skip rate, retry rate per step
  • Alerting fires on: STATUS = FAILED, skip rate > threshold, job duration > SLA
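A sketch of such a listener, assuming Spring Batch 5 (LocalDateTime timestamps) and an injected Micrometer MeterRegistry; the metric name batch.job.duration is an assumption, not a convention:

public class JobMetricsListener implements JobExecutionListener {

    private final MeterRegistry registry;

    public JobMetricsListener(MeterRegistry registry) {
        this.registry = registry;
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        // endTime may not be populated yet inside afterJob, so measure against now()
        Duration duration = Duration.between(jobExecution.getStartTime(), LocalDateTime.now());
        registry.timer("batch.job.duration",
                        "job", jobExecution.getJobInstance().getJobName(),
                        "status", jobExecution.getStatus().name())
                .record(duration);
    }
}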

Dead-letter monitoring

-- Check for pending dead-letter items daily
SELECT job_name, step_name, phase, COUNT(*) AS pending_count
FROM batch_dead_letter
WHERE status = 'PENDING'
  AND created_at > DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY job_name, step_name, phase;

Part 8: Testing

  • Every processor has a unit test with plain JUnit — no Spring context
  • Every FieldSetMapper has unit tests for valid lines and malformed lines
  • Every job has an integration test with @SpringBatchTest + JobLauncherTestUtils
  • Tests cover three paths: golden path, skip path, skipLimit-exceeded path
  • At least one test uses Testcontainers with real MySQL for database-specific SQL
  • jobRepositoryTestUtils.removeJobExecutions() is called in @AfterEach
  • Test CSV files in src/test/resources/test-data/ are maintained alongside the job
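A golden-path integration test sketch using the utilities above (the job, parameter names, and orders-valid.csv fixture are hypothetical):

@SpringBatchTest
@SpringBootTest
class ImportOrdersJobIntegrationTest {

    @Autowired
    private JobLauncherTestUtils jobLauncherTestUtils;

    @Autowired
    private JobRepositoryTestUtils jobRepositoryTestUtils;

    @AfterEach
    void cleanUp() {
        jobRepositoryTestUtils.removeJobExecutions();  // keep metadata isolated between tests
    }

    @Test
    void goldenPath_completesSuccessfully() throws Exception {
        JobParameters params = new JobParametersBuilder()
                .addString("runDate", "2024-06-01")
                .addString("inputFile", "classpath:test-data/orders-valid.csv", false)
                .toJobParameters();

        JobExecution execution = jobLauncherTestUtils.launchJob(params);

        assertEquals(BatchStatus.COMPLETED, execution.getStatus());
    }
}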

Part 9: Production Readiness

  • spring.batch.job.enabled=false — jobs are triggered explicitly, not on startup
  • Scheduling uses Quartz with isClustered=true and JDBC job store for HA deployments
  • A REST endpoint (secured) exists for manual job trigger and stop
  • Sensitive job parameters are never logged (mask them in JobExecutionListener.beforeJob())
  • JVM heap and GC flags are set in the container/pod spec, not hardcoded in code
  • Health endpoint includes batch job status (/actuator/health)
  • Graceful shutdown is configured — Spring Batch completes the current chunk on SIGTERM before exiting

Graceful shutdown

@SpringBootApplication
public class BatchApplication {

    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(BatchApplication.class);
        app.setRegisterShutdownHook(true);  // default — ensures @PreDestroy and context close
        app.run(args);
    }
}

# application.properties
server.shutdown=graceful
# Give the JVM 60 seconds to finish the current chunk on SIGTERM
spring.lifecycle.timeout-per-shutdown-phase=60s

Quick Diagnostic SQL

Is anything running right now?

SELECT ji.JOB_NAME, je.JOB_EXECUTION_ID, je.START_TIME,
       TIMESTAMPDIFF(MINUTE, je.START_TIME, NOW()) AS running_minutes
FROM BATCH_JOB_EXECUTION je
JOIN BATCH_JOB_INSTANCE ji ON je.JOB_INSTANCE_ID = ji.JOB_INSTANCE_ID
WHERE je.STATUS = 'STARTED';

What failed in the last 24 hours?

SELECT ji.JOB_NAME, je.JOB_EXECUTION_ID, je.END_TIME, LEFT(je.EXIT_MESSAGE, 300)
FROM BATCH_JOB_EXECUTION je
JOIN BATCH_JOB_INSTANCE ji ON je.JOB_INSTANCE_ID = ji.JOB_INSTANCE_ID
WHERE je.STATUS = 'FAILED'
  AND je.END_TIME > DATE_SUB(NOW(), INTERVAL 24 HOUR)
ORDER BY je.END_TIME DESC;

What are the step counters for the latest run?

SELECT se.STEP_NAME, se.STATUS, se.READ_COUNT, se.WRITE_COUNT,
       se.SKIP_COUNT, se.ROLLBACK_COUNT,
       TIMESTAMPDIFF(SECOND, se.START_TIME, se.END_TIME) AS seconds
FROM BATCH_STEP_EXECUTION se
WHERE se.JOB_EXECUTION_ID = (
    SELECT MAX(JOB_EXECUTION_ID) FROM BATCH_JOB_EXECUTION
    WHERE JOB_INSTANCE_ID = (
        SELECT MAX(JOB_INSTANCE_ID) FROM BATCH_JOB_INSTANCE WHERE JOB_NAME = 'importOrdersJob'
    )
)
ORDER BY se.STEP_EXECUTION_ID;

Common Mistakes Reference

Mistake | Symptom | Fix
No .name() on reader | Reader position lost on restart — restarts from the beginning | Add .name("uniqueName") to all readers
Missing @StepScope | jobParameters SpEL evaluates to null | Add @StepScope or @JobScope
FlatFileItemReader in a multi-threaded step | Data corruption / ConcurrentModificationException | Use SynchronizedItemStreamReader or switch to JdbcPagingItemReader
Missing rewriteBatchedStatements | 10–50x slower inserts than possible | Add it to the JDBC URL
No identifying parameters | Every run restarts the same failed JobInstance | Use a date, run ID, or RunIdIncrementer
skipLimit not set | One bad record fails the entire job | Add .faultTolerant().skip(...).skipLimit(N)
No SkipListener | Skipped items silently disappear | Add a dead-letter table writer
JPA writer without batch_size | One SQL statement per entity — no batching | Set hibernate.jdbc.batch_size=50
saveState(true) in a multi-threaded step | Incorrect restart position stored | Set saveState(false)
initialize-schema=always in production | Metadata tables recreated on restart → history lost | Use never; manage the schema with migrations

Congratulations — You’ve Completed the Series!

You have covered the full Spring Batch landscape — from the first job to production-grade partitioned processing. Here is a summary of the journey:

Part | Articles | What you learned
Foundations | 1–4 | Architecture, chunk processing, project setup, metadata tables
Readers | 5–8 | CSV, MySQL, JPA, REST/S3, multi-source reading
Writers | 9–10 | File, JDBC, JPA, composite, classifier, custom writers
Processors | 11–12 | Validation, enrichment, filtering, chaining, async processing
Job Configuration | 13–15 | Flows, decisions, parameters, context, tasklets
Listeners | 16 | Full lifecycle observability
Error Handling | 17–18 | Retry, skip, dead-letter, restart strategies
Testing | 19 | Unit, integration, Testcontainers
Scaling | 20–22 | Multi-threaded, partitioning, remote/Kafka
Production | 23–25 | Scheduling, performance tuning, best practices

The full series is available at /tutorials/spring-batch/.