Part 25 of 25
Best Practices Reference: The Complete Spring Batch Checklist
Introduction
This is the final article in the Spring Batch Tutorial series. It consolidates every hard-won lesson into actionable checklists — use it as a review before deploying a new batch job to production, or as a reference when debugging an existing one.
Part 1: Job Design
Always design for restartability
- Every step that writes to a database uses idempotent SQL (`ON DUPLICATE KEY UPDATE` or `WHERE NOT EXISTS`)
- Every `ItemReader` has a unique `.name(...)` set — required for position persistence in the `ExecutionContext`
- Identifying `JobParameters` represent the logical data key (e.g., `runDate`) so the same data always maps to the same `JobInstance`
- Non-identifying parameters (file paths, log levels, timestamps) use `identifying=false`
- Prep/cleanup tasklets use `allowStartIfComplete(true)` so they re-run on restart
- Steps that must not be retried indefinitely use `.startLimit(N)`
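A minimal sketch of how several of these items combine in one step definition (the `Order` type and bean names are illustrative; the Spring Batch 5 `StepBuilder` API is assumed):

```java
@Bean
public Step importStep(JobRepository jobRepository,
                       PlatformTransactionManager txManager,
                       FlatFileItemReader<Order> orderReader,
                       JdbcBatchItemWriter<Order> orderWriter) {
    return new StepBuilder("importStep", jobRepository)
            .<Order, Order>chunk(200, txManager)
            .reader(orderReader)   // reader bean sets .name("orderReader") for restart state
            .writer(orderWriter)   // writer uses idempotent upsert SQL
            .startLimit(3)         // refuse a 4th start after repeated failures
            .build();
}
```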
Keep jobs focused
- Each job does one logical thing — import, transform, report, or notify
- Steps are single-responsibility — validate, load, enrich, aggregate, export are separate steps
- Notification/cleanup steps are at the end of the flow and run regardless of success/failure (conditional routing)
- Long-running jobs are broken into checkpointed steps — a crash at step 3 of 5 restarts from step 3, not step 1
JobParameters discipline
- Never pass secrets (passwords, API keys) as job parameters — they are persisted in plain text in `BATCH_JOB_EXECUTION_PARAMS`
- `@StepScope`/`@JobScope` is applied to every bean that injects `jobParameters` or `executionContext` via SpEL
- `RunIdIncrementer` is used for jobs that must always create a new `JobInstance` (reports, exports)
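Two of these rules in code, as a sketch (the `inputFile` parameter and `exportJob` are illustrative):

```java
@Bean
@StepScope  // without this, the SpEL below is evaluated at context startup and yields null
public Tasklet archiveTasklet(@Value("#{jobParameters['inputFile']}") String inputFile) {
    return (contribution, chunkContext) -> {
        Files.move(Path.of(inputFile), Path.of(inputFile + ".done"));
        return RepeatStatus.FINISHED;
    };
}

@Bean
public Job exportJob(JobRepository jobRepository, Step exportStep) {
    return new JobBuilder("exportJob", jobRepository)
            .incrementer(new RunIdIncrementer())  // fresh JobInstance on every launch
            .start(exportStep)
            .build();
}
```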
Part 2: Readers
General
- Reader has a `.name(...)` set (required for `saveState` to work)
- `.saveState(false)` is set for readers used in multi-threaded steps
- Encoding is explicitly set (`.encoding("UTF-8")`) — never rely on the system default
FlatFileItemReader
- `linesToSkip` is set for files with headers
- `.comments("#")` skips comment lines where applicable
- A custom `FieldSetMapper` or `ConversionService` is used for `LocalDate`, `BigDecimal`, and enum fields — `BeanWrapperFieldSetMapper` alone cannot handle these
- `MultiResourceItemReader` resources are sorted — unsorted resources break restart
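A reader sketch combining these items, assuming a headered CSV with an ISO-formatted date column (`Order` and the column names are illustrative):

```java
@Bean
public FlatFileItemReader<Order> orderReader() {
    // BeanWrapperFieldSetMapper cannot parse LocalDate on its own, so a
    // ConversionService supplies the String -> LocalDate conversion
    DefaultConversionService conversionService = new DefaultConversionService();
    conversionService.addConverter(String.class, LocalDate.class,
            text -> LocalDate.parse(text, DateTimeFormatter.ISO_LOCAL_DATE));

    BeanWrapperFieldSetMapper<Order> mapper = new BeanWrapperFieldSetMapper<>();
    mapper.setTargetType(Order.class);
    mapper.setConversionService(conversionService);

    return new FlatFileItemReaderBuilder<Order>()
            .name("orderReader")               // required for restart state
            .resource(new FileSystemResource("data/orders.csv"))
            .encoding("UTF-8")                 // never the platform default
            .linesToSkip(1)                    // header row
            .comments("#")                     // skip comment lines
            .delimited()
            .names("id", "amount", "orderDate")
            .fieldSetMapper(mapper)
            .build();
}
```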
JdbcCursorItemReader
- `fetchSize = Integer.MIN_VALUE` is set for MySQL streaming (otherwise the full `ResultSet` is buffered in memory)
- Used only in single-threaded steps — `ResultSet` is not thread-safe
- The query has an `ORDER BY` clause on the primary key
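A streaming cursor reader sketch for MySQL (the `orders` table and `Order` type are illustrative):

```java
@Bean
public JdbcCursorItemReader<Order> orderCursorReader(DataSource dataSource) {
    return new JdbcCursorItemReaderBuilder<Order>()
            .name("orderCursorReader")
            .dataSource(dataSource)
            .sql("SELECT id, amount, status FROM orders ORDER BY id")  // stable order
            .fetchSize(Integer.MIN_VALUE)  // MySQL streams rows instead of buffering the ResultSet
            .rowMapper(new DataClassRowMapper<>(Order.class))
            .build();
}
```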
JdbcPagingItemReader
- `sortKeys` is set with a stable unique column (primary key) — required for correct pagination
- `pageSize` equals chunk size as a starting point
- Used when multi-threaded steps are needed — it is thread-safe
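A paging reader sketch with an explicit sort key (table and type names are illustrative; the sort-direction enum is fully qualified because it shares its name with the domain class):

```java
@Bean
public JdbcPagingItemReader<Order> orderPagingReader(DataSource dataSource) {
    return new JdbcPagingItemReaderBuilder<Order>()
            .name("orderPagingReader")
            .dataSource(dataSource)
            .selectClause("SELECT id, amount, status")
            .fromClause("FROM orders")
            .whereClause("WHERE status = 'NEW'")
            // stable unique sort key; without it pages can overlap or skip rows
            .sortKeys(Map.of("id", org.springframework.batch.item.database.Order.ASCENDING))
            .pageSize(200)   // match the chunk size as a starting point
            .rowMapper(new DataClassRowMapper<>(Order.class))
            .build();
}
```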
JpaPagingItemReader
- `JOIN FETCH` or `@EntityGraph` is used when the processor accesses lazy associations — prevents N+1 queries
- Second-level cache is disabled for batch contexts (`hibernate.cache.use_second_level_cache=false`)
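A JPA reader sketch that fetches a to-one association eagerly (entity names are illustrative):

```java
@Bean
public JpaPagingItemReader<Order> orderJpaReader(EntityManagerFactory entityManagerFactory) {
    return new JpaPagingItemReaderBuilder<Order>()
            .name("orderJpaReader")
            .entityManagerFactory(entityManagerFactory)
            // JOIN FETCH loads the customer in the same query, so the
            // processor's order.getCustomer() never triggers a per-item SELECT
            .queryString("SELECT o FROM Order o JOIN FETCH o.customer")
            .pageSize(200)
            .build();
}
```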
Part 3: Writers
JdbcBatchItemWriter
- `rewriteBatchedStatements=true` is in the JDBC URL — the single biggest throughput improvement
- `useServerPrepStmts=false` is paired with `rewriteBatchedStatements=true`
- `assertUpdates(false)` is set for upserts and conditional updates
- `beanMapped()` or a custom `ItemSqlParameterSourceProvider` is configured — not raw `?` positional parameters
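An upsert writer sketch matching these items (table and column names are illustrative; the `rewriteBatchedStatements` flag belongs in the JDBC URL, shown under Part 6):

```java
@Bean
public JdbcBatchItemWriter<Order> orderWriter(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<Order>()
            .dataSource(dataSource)
            .sql("INSERT INTO orders (id, amount, status) VALUES (:id, :amount, :status) "
                    + "ON DUPLICATE KEY UPDATE amount = VALUES(amount), status = VALUES(status)")
            .beanMapped()          // named parameters bound from Order getters
            .assertUpdates(false)  // a no-op upsert can report 0 affected rows
            .build();
}
```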
FlatFileItemWriter
- Writer has `.name(...)` — required for file byte-offset restart
- `shouldDeleteIfEmpty(true)` cleans up empty output files
- `headerCallback` writes the header once — not in the `LineAggregator`
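A file writer sketch (the output path and field names are illustrative):

```java
@Bean
public FlatFileItemWriter<Order> orderFileWriter() {
    return new FlatFileItemWriterBuilder<Order>()
            .name("orderFileWriter")          // required for byte-offset restart
            .resource(new FileSystemResource("out/orders.csv"))
            .headerCallback(writer -> writer.write("id,amount,status"))  // written once
            .shouldDeleteIfEmpty(true)        // no empty output files left behind
            .delimited()
            .names("id", "amount", "status")
            .build();
}
```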
CompositeItemWriter
- The most critical writer (database) is first in the delegate list — if it fails, later delegates never run and the transaction rolls back
- `FlatFileItemWriter` delegates are registered via `.stream()` on the step so they get `open()`/`close()` lifecycle calls
- `ClassifierCompositeItemWriter` delegates are all registered via `.stream()` on the step
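A sketch of the stream registration (bean names are illustrative):

```java
@Bean
public Step exportStep(JobRepository jobRepository,
                       PlatformTransactionManager txManager,
                       ItemReader<Order> reader,
                       JdbcBatchItemWriter<Order> dbWriter,
                       FlatFileItemWriter<Order> fileWriter) {
    CompositeItemWriter<Order> composite = new CompositeItemWriter<>();
    composite.setDelegates(List.of(dbWriter, fileWriter));  // database first

    return new StepBuilder("exportStep", jobRepository)
            .<Order, Order>chunk(200, txManager)
            .reader(reader)
            .writer(composite)
            .stream(fileWriter)  // guarantees open()/close() reach the file writer
            .build();
}
```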
Part 4: ItemProcessor
- Processors are stateless where possible — thread-safe by default
- Shared caches use `ConcurrentHashMap` or a bounded `Caffeine` cache with a TTL
- Caches are cleared in `@AfterStep` to prevent stale data on restart
- `null` return (filter) is used for deliberate exclusions — not exceptions
- Exceptions from processors propagate up to the step’s retry/skip framework — never swallowed silently
- `CompositeItemProcessor` is used to chain multiple single-responsibility processors
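A filtering processor sketch (`Order` and `isCancelled()` are illustrative):

```java
public class ActiveOrderProcessor implements ItemProcessor<Order, Order> {

    @Override
    public Order process(Order order) {
        // Returning null filters the item out deliberately: it increments
        // FILTER_COUNT and is never treated as an error or a skip.
        if (order.isCancelled()) {
            return null;
        }
        return order;
    }
}
```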
Part 5: Error Handling
Retry
- Only transient exceptions are in the `.retry()` list (deadlocks, timeouts, HTTP 5xx)
- Fatal exceptions (`FlatFileParseException`, `DataIntegrityViolationException`) are explicitly excluded
- `ExponentialRandomBackOffPolicy` is used for multi-threaded or distributed jobs
- A `RetryListener` is registered in production to log retry storms
- `retryLimit` is set — never unlimited
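A retry configuration sketch under these rules (assumes `jobRepository`, `txManager`, `reader`, and `writer` are in scope):

```java
ExponentialRandomBackOffPolicy backOff = new ExponentialRandomBackOffPolicy();
backOff.setInitialInterval(500);   // ms
backOff.setMultiplier(2.0);
backOff.setMaxInterval(10_000);    // ms

Step loadStep = new StepBuilder("loadStep", jobRepository)
        .<Order, Order>chunk(200, txManager)
        .reader(reader)
        .writer(writer)
        .faultTolerant()
        .retry(DeadlockLoserDataAccessException.class)     // transient
        .retry(TransientDataAccessResourceException.class)
        .noRetry(DataIntegrityViolationException.class)    // fatal, fail fast
        .retryLimit(3)                                     // never unlimited
        .backOffPolicy(backOff)
        .build();
```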
Skip
- `skipLimit` is set — never unlimited (`Integer.MAX_VALUE` is acceptable only with a `SkipListener`)
- A `SkipListener` writes every skipped item to a dead-letter table with phase, payload, and error details
- `noRollback` is applied to read-time exceptions (e.g., `FlatFileParseException`) where no write occurred
- `skipPolicy` is used when different exception types need different limits
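A dead-letter `SkipListener` sketch, assuming a `batch_dead_letter` table like the one queried in Part 7 plus `payload` and `error` columns:

```java
public class DeadLetterSkipListener implements SkipListener<Order, Order> {

    private final JdbcTemplate jdbcTemplate;

    public DeadLetterSkipListener(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @Override
    public void onSkipInRead(Throwable t) {
        insert("READ", null, t);  // for parse errors, the raw line is inside the exception
    }

    @Override
    public void onSkipInProcess(Order item, Throwable t) {
        insert("PROCESS", item.toString(), t);
    }

    @Override
    public void onSkipInWrite(Order item, Throwable t) {
        insert("WRITE", item.toString(), t);
    }

    private void insert(String phase, String payload, Throwable t) {
        jdbcTemplate.update(
                "INSERT INTO batch_dead_letter (phase, payload, error, status) "
                        + "VALUES (?, ?, ?, 'PENDING')",
                phase, payload, t.getMessage());
    }
}
```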
Job failure
- Failure notifications (Slack, email, PagerDuty) are in the job’s `afterJob()` listener, not inside step logic
- Stale-execution detection runs periodically to mark STARTED-but-dead executions as ABANDONED
- A runbook exists for each job explaining how to diagnose and restart failures
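A notification listener sketch (the logging call stands in for your Slack/email/PagerDuty client):

```java
public class FailureNotificationListener implements JobExecutionListener {

    private static final Logger log =
            LoggerFactory.getLogger(FailureNotificationListener.class);

    @Override
    public void afterJob(JobExecution jobExecution) {
        if (jobExecution.getStatus() == BatchStatus.FAILED) {
            String causes = jobExecution.getAllFailureExceptions().stream()
                    .map(Throwable::getMessage)
                    .collect(Collectors.joining("; "));
            // replace with a real Slack/email/PagerDuty call
            log.error("Job {} failed: {}",
                    jobExecution.getJobInstance().getJobName(), causes);
        }
    }
}
```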
Part 6: Performance
- `rewriteBatchedStatements=true` — set this first, before any other tuning
- Chunk size is benchmarked (not guessed) — start at 200, double until throughput plateaus
- HikariCP `maximum-pool-size` = thread count × connections per thread + metadata connections (worked example below)
- `hibernate.jdbc.batch_size` is set for JPA writers (batching is off by default)
- Composite indexes exist on frequently filtered columns (`status`, `order_id`, etc.)
- Slow query log is checked during initial benchmarking
- Memory-intensive jobs use G1GC with `-Xmx` in the 2–4 GB range and bounded processor caches
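A worked example of the pool-size formula, assuming a 4-thread step holding one connection each plus roughly two connections for `JobRepository` metadata updates:

```java
HikariConfig config = new HikariConfig();
config.setJdbcUrl("jdbc:mysql://db:3306/batch"
        + "?rewriteBatchedStatements=true&useServerPrepStmts=false");
// 4 worker threads x 1 connection each + ~2 metadata connections = 6
config.setMaximumPoolSize(6);
HikariDataSource dataSource = new HikariDataSource(config);
```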
Part 7: Metadata and Observability
- `spring.batch.jdbc.initialize-schema=never` in production — schema is managed by Flyway/Liquibase
- A `JobExecutionListener` publishes job duration and status to Micrometer/Prometheus after every run
- A `StepExecutionListener` publishes per-step counters (read, write, skip, rollback) as metrics
- Dashboard shows: job status, duration, rows processed, skip rate, retry rate per step
- Alerting fires on: `STATUS = FAILED`, skip rate > threshold, job duration > SLA
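A metrics listener sketch (Spring Batch 5's `LocalDateTime` timestamps are assumed; the metric name is illustrative):

```java
public class MetricsJobListener implements JobExecutionListener {

    private final MeterRegistry registry;

    public MetricsJobListener(MeterRegistry registry) {
        this.registry = registry;
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        Duration duration = Duration.between(
                jobExecution.getStartTime(), jobExecution.getEndTime());
        registry.timer("batch.job.duration",
                        "job", jobExecution.getJobInstance().getJobName(),
                        "status", jobExecution.getStatus().name())
                .record(duration);
    }
}
```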
Dead-letter monitoring
```sql
-- Check for pending dead-letter items daily
SELECT job_name, step_name, phase, COUNT(*) AS pending_count
FROM batch_dead_letter
WHERE status = 'PENDING'
  AND created_at > DATE_SUB(NOW(), INTERVAL 24 HOUR)
GROUP BY job_name, step_name, phase;
```
Part 8: Testing
- Every processor has a unit test with plain JUnit — no Spring context
- Every `FieldSetMapper` has unit tests for valid lines and malformed lines
- Every job has an integration test with `@SpringBatchTest` + `JobLauncherTestUtils`
- Tests cover three paths: golden path, skip path, `skipLimit`-exceeded path
- At least one test uses Testcontainers with real MySQL for database-specific SQL
- `jobRepositoryTestUtils.removeJobExecutions()` is called in `@AfterEach`
- Test CSV files in `src/test/resources/test-data/` are maintained alongside the job
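A golden-path integration test sketch for the `importOrdersJob` used in the diagnostics below (the test file name is illustrative):

```java
@SpringBatchTest
@SpringBootTest
class ImportOrdersJobTest {

    @Autowired
    private JobLauncherTestUtils jobLauncherTestUtils;

    @Autowired
    private JobRepositoryTestUtils jobRepositoryTestUtils;

    @AfterEach
    void cleanUp() {
        jobRepositoryTestUtils.removeJobExecutions();
    }

    @Test
    void goldenPathCompletes() throws Exception {
        JobParameters params = new JobParametersBuilder()
                .addString("inputFile", "src/test/resources/test-data/orders-valid.csv")
                .addLong("run.id", System.currentTimeMillis())
                .toJobParameters();

        JobExecution execution = jobLauncherTestUtils.launchJob(params);

        assertEquals(BatchStatus.COMPLETED, execution.getStatus());
    }
}
```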
Part 9: Production Readiness
- `spring.batch.job.enabled=false` — jobs are triggered explicitly, not on startup
- Scheduling uses Quartz with `isClustered=true` and a JDBC job store for HA deployments
- A REST endpoint (secured) exists for manual job trigger and stop — see the sketch below
- Sensitive job parameters are never logged (mask them in `JobExecutionListener.beforeJob()`)
- JVM heap and GC flags are set in the container/pod spec, not hardcoded in code
- Health endpoint includes batch job status (`/actuator/health`)
- Graceful shutdown is configured — Spring Batch completes the current chunk on SIGTERM before exiting
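A minimal trigger endpoint sketch (secure it with your auth layer; the path and bean names are illustrative):

```java
@RestController
@RequestMapping("/admin/jobs")
public class JobTriggerController {

    private final JobLauncher jobLauncher;
    private final Job importOrdersJob;

    public JobTriggerController(JobLauncher jobLauncher, Job importOrdersJob) {
        this.jobLauncher = jobLauncher;
        this.importOrdersJob = importOrdersJob;
    }

    @PostMapping("/import-orders")
    public ResponseEntity<Long> trigger(@RequestParam String runDate) throws Exception {
        JobParameters params = new JobParametersBuilder()
                .addString("runDate", runDate)  // identifying parameter
                .toJobParameters();
        JobExecution execution = jobLauncher.run(importOrdersJob, params);
        return ResponseEntity.accepted().body(execution.getId());
    }
}
```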
Graceful shutdown
```java
@SpringBootApplication
public class BatchApplication {

    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(BatchApplication.class);
        app.setRegisterShutdownHook(true); // default — ensures @PreDestroy and context close
        app.run(args);
    }
}
```

```properties
# Give the JVM 60 seconds to finish the current chunk on SIGTERM
spring.lifecycle.timeout-per-shutdown-phase=60s
server.shutdown=graceful
```
Quick Diagnostic SQL
Is anything running right now?
```sql
SELECT ji.JOB_NAME, je.JOB_EXECUTION_ID, je.START_TIME,
       TIMESTAMPDIFF(MINUTE, je.START_TIME, NOW()) AS running_minutes
FROM BATCH_JOB_EXECUTION je
JOIN BATCH_JOB_INSTANCE ji ON je.JOB_INSTANCE_ID = ji.JOB_INSTANCE_ID
WHERE je.STATUS = 'STARTED';
```
What failed in the last 24 hours?
```sql
SELECT ji.JOB_NAME, je.JOB_EXECUTION_ID, je.END_TIME, LEFT(je.EXIT_MESSAGE, 300)
FROM BATCH_JOB_EXECUTION je
JOIN BATCH_JOB_INSTANCE ji ON je.JOB_INSTANCE_ID = ji.JOB_INSTANCE_ID
WHERE je.STATUS = 'FAILED'
  AND je.END_TIME > DATE_SUB(NOW(), INTERVAL 24 HOUR)
ORDER BY je.END_TIME DESC;
```
What are the step counters for the latest run?
```sql
SELECT se.STEP_NAME, se.STATUS, se.READ_COUNT, se.WRITE_COUNT,
       se.READ_SKIP_COUNT + se.PROCESS_SKIP_COUNT + se.WRITE_SKIP_COUNT AS skip_count,
       se.ROLLBACK_COUNT,
       TIMESTAMPDIFF(SECOND, se.START_TIME, se.END_TIME) AS seconds
FROM BATCH_STEP_EXECUTION se
WHERE se.JOB_EXECUTION_ID = (
    SELECT MAX(JOB_EXECUTION_ID) FROM BATCH_JOB_EXECUTION
    WHERE JOB_INSTANCE_ID = (
        SELECT MAX(JOB_INSTANCE_ID) FROM BATCH_JOB_INSTANCE WHERE JOB_NAME = 'importOrdersJob'
    )
)
ORDER BY se.STEP_EXECUTION_ID;
```
Common Mistakes Reference
| Mistake | Symptom | Fix |
|---|---|---|
| No `.name()` on reader | Reader position lost on restart — restarts from beginning | Add `.name("uniqueName")` to all readers |
| Missing `@StepScope` | `jobParameters` SpEL evaluates to `null` | Add `@StepScope` or `@JobScope` |
| `FlatFileItemReader` in multi-threaded step | Data corruption / `ConcurrentModificationException` | Use `SynchronizedItemStreamReader` or switch to `JdbcPagingItemReader` |
| Missing `rewriteBatchedStatements` | 10–50x slower inserts than possible | Add to JDBC URL |
| No identifying parameters | Every run restarts the same failed `JobInstance` | Use a date, run ID, or `RunIdIncrementer` |
| `skipLimit` not set | One bad record fails entire job | Add `.faultTolerant().skip(...).skipLimit(N)` |
| No `SkipListener` | Skipped items silently disappear | Add dead-letter table writer |
| JPA writer without `batch_size` | One SQL per entity — no batching | Set `hibernate.jdbc.batch_size=50` |
| `saveState(true)` in multi-threaded step | Incorrect restart position stored | Set `saveState(false)` |
| `initialize-schema=always` in production | Metadata tables recreated on restart → history lost | Use `never`, manage schema with migrations |
Congratulations — You’ve Completed the Series!
You have covered the full Spring Batch landscape — from the first job to production-grade partitioned processing. Here is a summary of the journey:
| Part | Articles | What you learned |
|---|---|---|
| Foundations | 1–4 | Architecture, chunk processing, project setup, metadata tables |
| Readers | 5–8 | CSV, MySQL, JPA, REST/S3, multi-source reading |
| Writers | 9–10 | File, JDBC, JPA, composite, classifier, custom writers |
| Processors | 11–12 | Validation, enrichment, filtering, chaining, async processing |
| Job Configuration | 13–15 | Flows, decisions, parameters, context, tasklets |
| Listeners | 16 | Full lifecycle observability |
| Error Handling | 17–18 | Retry, skip, dead-letter, restart strategies |
| Testing | 19 | Unit, integration, Testcontainers |
| Scaling | 20–22 | Multi-threaded, partitioning, remote/Kafka |
| Production | 23–25 | Scheduling, performance tuning, best practices |
The full series is available at /tutorials/spring-batch/.