JobRepository and Batch Metadata: How Spring Batch Tracks Everything
Introduction
Every time Spring Batch runs a job it records the run’s history in a set of relational tables. This metadata is not optional — it is what makes Spring Batch reliable. Without it there would be no restart capability, no duplicate-run prevention, and no audit trail. Understanding the metadata layer is essential for debugging failures, building monitoring dashboards, and designing restartable jobs.
In this article you will learn:
- The role of JobRepository and JobExplorer in the Spring Batch architecture
- The six metadata tables — their schema, purpose, and relationships
- How Spring Batch uses these tables to enable restartability
- How to query batch history directly in MySQL
- How to access metadata programmatically with JobExplorer
- Key changes in Spring Batch 5 (removed MapJobRepositoryFactoryBean, new testing approach)
All examples use Spring Boot 3.3+, Spring Batch 5.2+, and MySQL 8.x — the same stack established in Article 3.
The Two Persistence APIs
Spring Batch provides two beans for working with job metadata:
| Bean | Access | Purpose |
|---|---|---|
| JobRepository | Read + Write | Framework-internal CRUD for job/step state |
| JobExplorer | Read only | Application code queries for history and status |
JobRepository is used by the framework itself — the job launcher (TaskExecutorJobLauncher in Spring Batch 5, which replaces the deprecated SimpleJobLauncher), Job, and Step implementations call it to persist state. You rarely call it directly. JobExplorer is the API you call in your own code to inspect what happened.
Both are auto-configured by Spring Boot when spring-boot-starter-batch is on the classpath and a DataSource bean exists.
The Six Metadata Tables
Spring Batch creates six tables in your database. Their names are prefixed with BATCH_ by default. Here is the entity-relationship overview:
BATCH_JOB_INSTANCE
├── BATCH_JOB_EXECUTION (one instance → many executions)
│ ├── BATCH_JOB_EXECUTION_PARAMS
│ ├── BATCH_JOB_EXECUTION_CONTEXT
│ └── BATCH_STEP_EXECUTION (one execution → many step executions)
│ └── BATCH_STEP_EXECUTION_CONTEXT
BATCH_JOB_INSTANCE
Represents a unique combination of job name + identifying job parameters. Think of it as “the job run for date 2026-05-03”.
CREATE TABLE BATCH_JOB_INSTANCE (
JOB_INSTANCE_ID BIGINT NOT NULL PRIMARY KEY,
VERSION BIGINT,
JOB_NAME VARCHAR(100) NOT NULL,
JOB_KEY VARCHAR(32) NOT NULL,
UNIQUE (JOB_NAME, JOB_KEY)
);
JOB_KEY is an MD5 hash of the identifying parameters. The unique constraint on (JOB_NAME, JOB_KEY) is the mechanism that prevents you from accidentally launching the same logical job run twice.
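The hashing idea can be sketched in plain Java. This sketch is illustrative only: Spring Batch's real key comes from its DefaultJobKeyGenerator, which canonicalizes the identifying parameters its own way before hashing, so the exact bytes differ. The properties are the same, though: identical identifying parameters always produce the same 32-character key, and any change produces a different one.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.SortedMap;
import java.util.TreeMap;

public class JobKeyDemo {

    // Illustrative stand-in for JOB_KEY generation: MD5 over a canonical
    // "name=value;" string built from the sorted identifying parameters.
    static String jobKey(SortedMap<String, String> identifyingParams) {
        StringBuilder canonical = new StringBuilder();
        identifyingParams.forEach((k, v) -> canonical.append(k).append('=').append(v).append(';'));
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(canonical.toString().getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest)); // 32 hex chars, fits VARCHAR(32)
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always available on the JVM
        }
    }

    public static void main(String[] args) {
        SortedMap<String, String> may3 = new TreeMap<>();
        may3.put("runDate", "2026-05-03");
        SortedMap<String, String> may4 = new TreeMap<>();
        may4.put("runDate", "2026-05-04");
        System.out.println(jobKey(may3));                       // same input, same key, every time
        System.out.println(jobKey(may3).equals(jobKey(may4)));  // false: different JobInstance
    }
}
```

Because the unique constraint is on (JOB_NAME, JOB_KEY), the database itself rejects a second insert for the same logical run, even under concurrent launches.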
BATCH_JOB_EXECUTION
Represents a single execution attempt against a JobInstance. If a job fails and you restart it, the same JobInstance gets a new JobExecution.
CREATE TABLE BATCH_JOB_EXECUTION (
JOB_EXECUTION_ID BIGINT NOT NULL PRIMARY KEY,
VERSION BIGINT,
JOB_INSTANCE_ID BIGINT NOT NULL,
CREATE_TIME DATETIME(6) NOT NULL,
START_TIME DATETIME(6) DEFAULT NULL,
END_TIME DATETIME(6) DEFAULT NULL,
STATUS VARCHAR(10),
EXIT_CODE VARCHAR(2500),
EXIT_MESSAGE VARCHAR(2500),
LAST_UPDATED DATETIME(6),
CONSTRAINT JOB_INST_EXEC_FK FOREIGN KEY (JOB_INSTANCE_ID)
REFERENCES BATCH_JOB_INSTANCE(JOB_INSTANCE_ID)
);
STATUS is a BatchStatus value such as STARTED, COMPLETED, FAILED, STOPPED, or ABANDONED. EXIT_CODE is an ExitStatus string — it can be the same as STATUS or a custom code you define.
BATCH_JOB_EXECUTION_PARAMS
Every job parameter passed at launch time is stored here, one row per parameter.
CREATE TABLE BATCH_JOB_EXECUTION_PARAMS (
JOB_EXECUTION_ID BIGINT NOT NULL,
PARAMETER_NAME VARCHAR(100) NOT NULL,
PARAMETER_TYPE VARCHAR(100) NOT NULL,
PARAMETER_VALUE VARCHAR(2500) DEFAULT NULL,
IDENTIFYING CHAR(1) NOT NULL,
CONSTRAINT JOB_EXEC_PARAMS_FK FOREIGN KEY (JOB_EXECUTION_ID)
REFERENCES BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
);
The IDENTIFYING column (Y/N) marks whether this parameter contributed to the JOB_KEY hash. Non-identifying parameters (like a log level or notification email) are stored for auditing but do not affect JobInstance uniqueness.
BATCH_JOB_EXECUTION_CONTEXT
Stores the serialized ExecutionContext for each job execution. This is the state bag that enables a job to share data between steps and to resume after failure.
CREATE TABLE BATCH_JOB_EXECUTION_CONTEXT (
JOB_EXECUTION_ID BIGINT NOT NULL PRIMARY KEY,
SHORT_CONTEXT VARCHAR(2500) NOT NULL,
SERIALIZED_CONTEXT TEXT,
CONSTRAINT JOB_EXEC_CTX_FK FOREIGN KEY (JOB_EXECUTION_ID)
REFERENCES BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
);
SHORT_CONTEXT holds a compact JSON summary; SERIALIZED_CONTEXT holds the full serialized form used during restart.
BATCH_STEP_EXECUTION
One row per step per job execution. Contains item counters that are committed transactionally so they survive a crash.
CREATE TABLE BATCH_STEP_EXECUTION (
STEP_EXECUTION_ID BIGINT NOT NULL PRIMARY KEY,
VERSION BIGINT NOT NULL,
STEP_NAME VARCHAR(100) NOT NULL,
JOB_EXECUTION_ID BIGINT NOT NULL,
CREATE_TIME DATETIME(6) NOT NULL,
START_TIME DATETIME(6) DEFAULT NULL,
END_TIME DATETIME(6) DEFAULT NULL,
STATUS VARCHAR(10),
COMMIT_COUNT BIGINT,
READ_COUNT BIGINT,
FILTER_COUNT BIGINT,
WRITE_COUNT BIGINT,
READ_SKIP_COUNT BIGINT,
WRITE_SKIP_COUNT BIGINT,
PROCESS_SKIP_COUNT BIGINT,
ROLLBACK_COUNT BIGINT,
EXIT_CODE VARCHAR(2500),
EXIT_MESSAGE VARCHAR(2500),
LAST_UPDATED DATETIME(6),
CONSTRAINT JOB_EXEC_STEP_FK FOREIGN KEY (JOB_EXECUTION_ID)
REFERENCES BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
);
These counters are invaluable: after a failed run you can see exactly how many items were read, written, and skipped before the crash.
BATCH_STEP_EXECUTION_CONTEXT
Stores the serialized ExecutionContext for each step execution. This is the key to step-level restartability — it typically contains the reader’s position (line number in a file, last primary key read from a database).
CREATE TABLE BATCH_STEP_EXECUTION_CONTEXT (
STEP_EXECUTION_ID BIGINT NOT NULL PRIMARY KEY,
SHORT_CONTEXT VARCHAR(2500) NOT NULL,
SERIALIZED_CONTEXT TEXT,
CONSTRAINT STEP_EXEC_CTX_FK FOREIGN KEY (STEP_EXECUTION_ID)
REFERENCES BATCH_STEP_EXECUTION(STEP_EXECUTION_ID)
);
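The restart mechanics behind this table follow Spring Batch's ItemStream contract: open() restores saved state at step start, and update() is called at each chunk commit so the latest position is persisted transactionally with the chunk. Here is a framework-free sketch of that contract, with a plain Map standing in for the ExecutionContext (class and key names here are illustrative, not Spring Batch APIs):

```java
import java.util.List;
import java.util.Map;

// Pure-Java sketch of the open/update contract behind BATCH_STEP_EXECUTION_CONTEXT.
// The Map stands in for Spring Batch's ExecutionContext; the real framework
// serializes it to the table at every chunk commit and hands it back on restart.
public class PositionTrackingReader {
    private static final String KEY = "reader.position";
    private final List<String> lines;
    private int position;

    public PositionTrackingReader(List<String> lines) { this.lines = lines; }

    public void open(Map<String, Integer> context) {     // called once at step start
        position = context.getOrDefault(KEY, 0);         // 0 on first run, saved offset on restart
    }

    public String read() {                               // one item per call, null = end of data
        return position < lines.size() ? lines.get(position++) : null;
    }

    public void update(Map<String, Integer> context) {   // called at each chunk commit
        context.put(KEY, position);
    }
}
```

To simulate a crash and restart: read a few items, call update() (the "commit"), discard the reader, then open() a fresh reader with the same context — it resumes from the committed position, not from zero.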
JobInstance vs JobExecution
This distinction trips up almost every Spring Batch newcomer.
JobInstance is the logical job run. It is identified by the job name and its identifying parameters.
importOrdersJob + {runDate=2026-05-03} → one JobInstance
importOrdersJob + {runDate=2026-05-04} → different JobInstance
JobExecution is a single attempt to run a JobInstance. If the job fails, you restart it — same JobInstance, new JobExecution.
JobInstance(importOrdersJob, 2026-05-03)
├── JobExecution 1 → FAILED (crashed at step 2)
└── JobExecution 2 → COMPLETED (restart, resumed from step 2)
A JobInstance is considered complete only when one of its JobExecutions finishes with COMPLETED. After that, Spring Batch will refuse to run the same JobInstance again (the launcher throws JobInstanceAlreadyCompleteException) — this is the duplicate-run prevention mechanism.
BatchStatus vs ExitStatus
| | BatchStatus | ExitStatus |
|---|---|---|
| Type | Enum | String |
| Values | COMPLETED, STARTED, FAILED, STOPPED, ABANDONED, UNKNOWN | COMPLETED, FAILED, STOPPED, or any custom string |
| Set by | Framework | Framework (default) or your code |
| Used for | Internal state tracking | Conditional step flow control |
BatchStatus is the framework’s internal state machine. ExitStatus is what drives the on("FAILED").to(step2) conditional transitions in your job definition. You can set a custom ExitStatus in a StepExecutionListener to create multi-branch job flows.
@Component
public class ValidationListener implements StepExecutionListener {
@Override
public ExitStatus afterStep(StepExecution stepExecution) {
long skipCount = stepExecution.getSkipCount();
if (skipCount > 1000) {
// Too many skips — treat as partial failure
return new ExitStatus("PARTIAL_FAILURE");
}
return null; // null = keep default ExitStatus
}
}
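A custom ExitStatus is only useful if the job definition branches on it. The following is a sketch of the wiring, assuming hypothetical step beans processStep, alertStep, and archiveStep (these names are not from the series' codebase):

```java
// Sketch: route the listener's custom ExitStatus to an alerting step
@Bean
public Job importOrdersJob(JobRepository jobRepository,
                           Step processStep, Step alertStep, Step archiveStep) {
    return new JobBuilder("importOrdersJob", jobRepository)
            .start(processStep)
                .on("PARTIAL_FAILURE").to(alertStep)   // custom code set in afterStep()
            .from(processStep)
                .on("*").to(archiveStep)               // everything else continues normally
            .end()
            .build();
}
```

The string passed to on() is matched against the ExitStatus code, not the BatchStatus enum, which is exactly why the two concepts are kept separate.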
Schema Initialization
Spring Boot auto-creates the tables based on the spring.batch.jdbc.initialize-schema property:
# always = create on every startup (dev/test only)
# embedded = create only for H2/HSQL/Derby (default)
# never = never create — manage schema yourself (production default)
spring.batch.jdbc.initialize-schema=always
For production, use never and apply the DDL through your migration tool (Flyway or Liquibase). The official MySQL DDL script is bundled in the Spring Batch jar:
org/springframework/batch/core/schema-mysql.sql
Extract it and add it to your migration scripts:
# Extract from the jar in your Maven local repository
jar xf ~/.m2/repository/org/springframework/batch/spring-batch-core/5.2.x/spring-batch-core-5.2.x.jar \
org/springframework/batch/core/schema-mysql.sql
Querying Batch History in MySQL
These SQL queries are useful for monitoring dashboards, alerting, and debugging.
All executions for a job, most recent first
SELECT
je.JOB_EXECUTION_ID,
je.STATUS,
je.EXIT_CODE,
je.START_TIME,
je.END_TIME,
TIMESTAMPDIFF(SECOND, je.START_TIME, je.END_TIME) AS duration_seconds
FROM BATCH_JOB_EXECUTION je
JOIN BATCH_JOB_INSTANCE ji ON je.JOB_INSTANCE_ID = ji.JOB_INSTANCE_ID
WHERE ji.JOB_NAME = 'importOrdersJob'
ORDER BY je.START_TIME DESC
LIMIT 20;
Currently running jobs
SELECT
ji.JOB_NAME,
je.JOB_EXECUTION_ID,
je.START_TIME,
TIMESTAMPDIFF(SECOND, je.START_TIME, NOW()) AS running_seconds
FROM BATCH_JOB_EXECUTION je
JOIN BATCH_JOB_INSTANCE ji ON je.JOB_INSTANCE_ID = ji.JOB_INSTANCE_ID
WHERE je.STATUS = 'STARTED';
Failed jobs in the last 24 hours
SELECT
ji.JOB_NAME,
je.JOB_EXECUTION_ID,
je.START_TIME,
je.END_TIME,
LEFT(je.EXIT_MESSAGE, 500) AS error_summary
FROM BATCH_JOB_EXECUTION je
JOIN BATCH_JOB_INSTANCE ji ON je.JOB_INSTANCE_ID = ji.JOB_INSTANCE_ID
WHERE je.STATUS = 'FAILED'
AND je.END_TIME > DATE_SUB(NOW(), INTERVAL 24 HOUR)
ORDER BY je.END_TIME DESC;
Step metrics for a specific execution
SELECT
se.STEP_NAME,
se.STATUS,
se.READ_COUNT,
se.WRITE_COUNT,
se.FILTER_COUNT,
se.READ_SKIP_COUNT + se.WRITE_SKIP_COUNT + se.PROCESS_SKIP_COUNT AS total_skips,
se.ROLLBACK_COUNT,
TIMESTAMPDIFF(SECOND, se.START_TIME, se.END_TIME) AS duration_seconds
FROM BATCH_STEP_EXECUTION se
WHERE se.JOB_EXECUTION_ID = 42
ORDER BY se.STEP_EXECUTION_ID;
Job parameters for a specific execution
SELECT
PARAMETER_NAME,
PARAMETER_TYPE,
PARAMETER_VALUE,
IF(IDENTIFYING = 'Y', 'identifying', 'non-identifying') AS param_role
FROM BATCH_JOB_EXECUTION_PARAMS
WHERE JOB_EXECUTION_ID = 42;
Programmatic Access with JobExplorer
JobExplorer gives you read-only access to the metadata from your application code. Spring Boot auto-wires it.
Finding job instances and executions
@Service
@RequiredArgsConstructor
public class BatchMonitorService {
private final JobExplorer jobExplorer;
public List<JobInstance> getRecentInstances(String jobName, int count) {
return jobExplorer.findJobInstancesByJobName(jobName, 0, count);
}
public JobExecution getLatestExecution(String jobName) {
List<JobInstance> instances = jobExplorer.findJobInstancesByJobName(jobName, 0, 1);
if (instances.isEmpty()) return null;
List<JobExecution> executions = jobExplorer.getJobExecutions(instances.get(0));
return executions.isEmpty() ? null : executions.get(0);
}
public Set<JobExecution> getRunningJobs(String jobName) {
return jobExplorer.findRunningJobExecutions(jobName);
}
public Collection<StepExecution> getStepExecutions(long jobExecutionId) {
JobExecution execution = jobExplorer.getJobExecution(jobExecutionId);
return execution != null ? execution.getStepExecutions() : Collections.emptyList();
}
}
Exposing batch status via a REST endpoint
@RestController
@RequestMapping("/api/batch")
@RequiredArgsConstructor
public class BatchStatusController {
private final JobExplorer jobExplorer;
@GetMapping("/jobs/{jobName}/latest")
public ResponseEntity<Map<String, Object>> getLatestExecution(@PathVariable String jobName) {
List<JobInstance> instances = jobExplorer.findJobInstancesByJobName(jobName, 0, 1);
if (instances.isEmpty()) {
return ResponseEntity.notFound().build();
}
List<JobExecution> executions = jobExplorer.getJobExecutions(instances.get(0));
if (executions.isEmpty()) {
return ResponseEntity.notFound().build();
}
JobExecution latest = executions.get(0);
Map<String, Object> result = new LinkedHashMap<>();
result.put("jobName", jobName);
result.put("executionId", latest.getId());
result.put("status", latest.getStatus());
result.put("exitCode", latest.getExitStatus().getExitCode());
result.put("startTime", latest.getStartTime());
result.put("endTime", latest.getEndTime());
result.put("steps", latest.getStepExecutions().stream()
.map(se -> Map.of(
"name", se.getStepName(),
"status", se.getStatus(),
"readCount", se.getReadCount(),
"writeCount", se.getWriteCount()
))
.collect(Collectors.toList())
);
return ResponseEntity.ok(result);
}
@GetMapping("/jobs/{jobName}/running")
public ResponseEntity<List<Long>> getRunningExecutions(@PathVariable String jobName) {
Set<JobExecution> running = jobExplorer.findRunningJobExecutions(jobName);
List<Long> ids = running.stream()
.map(JobExecution::getId)
.collect(Collectors.toList());
return ResponseEntity.ok(ids);
}
}
How Metadata Enables Restartability
When a job fails at step 3 of 5, here is what happens on restart:
- You launch the job with the same identifying parameters as the failed run.
- Spring Batch queries BATCH_JOB_INSTANCE to find the existing JobInstance.
- Spring Batch checks BATCH_JOB_EXECUTION — the last execution has STATUS = FAILED.
- A new JobExecution row is inserted for the same JobInstance.
- For each step, Spring Batch checks BATCH_STEP_EXECUTION:
  - Steps with STATUS = COMPLETED are skipped — already done.
  - The failed step gets a new StepExecution. Its context is loaded from BATCH_STEP_EXECUTION_CONTEXT, restoring the reader’s position (e.g., row offset, file line number).
- Processing resumes from the exact point of failure.
This is why the VERSION column exists on several tables — it is an optimistic locking counter that prevents two concurrent processes from updating the same row simultaneously.
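Conceptually, every metadata update is guarded by the version the writer last read. The SQL below is a paraphrased sketch of the pattern, not the framework's literal statement:

```sql
-- Optimistic locking sketch (paraphrased, not Spring Batch's exact SQL)
UPDATE BATCH_JOB_EXECUTION
SET STATUS = 'COMPLETED',
    LAST_UPDATED = NOW(),
    VERSION = VERSION + 1
WHERE JOB_EXECUTION_ID = 17
  AND VERSION = 3;
-- 0 rows updated means another process incremented VERSION first;
-- Spring Batch raises an OptimisticLockingFailureException in that case.
```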
// Make a job restartable (default is true)
@Bean
public Job importOrdersJob(JobRepository jobRepository, Step importStep) {
return new JobBuilder("importOrdersJob", jobRepository)
.start(importStep)
.build();
// restartable = true by default
}
// Explicitly prevent restart (always create a new JobInstance)
@Bean
public Job dailyReportJob(JobRepository jobRepository, Step reportStep) {
return new JobBuilder("dailyReportJob", jobRepository)
.start(reportStep)
.preventRestart() // fails if the same parameters are resubmitted
.build();
}
Customizing the Table Prefix
If you need to share a database with other applications or use a custom schema, you can change the default BATCH_ prefix:
# application.properties
spring.batch.jdbc.table-prefix=ECOM_BATCH_
Spring Boot passes this through to both JobRepository and JobExplorer automatically.
For custom datasource wiring:
@Configuration
public class BatchInfraConfig {
@Bean
public JobRepository jobRepository(DataSource dataSource,
PlatformTransactionManager tm) throws Exception {
JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
factory.setDataSource(dataSource);
factory.setTransactionManager(tm);
factory.setTablePrefix("ECOM_BATCH_");
factory.setDatabaseType("mysql");
factory.afterPropertiesSet();
return factory.getObject();
}
@Bean
public JobExplorer jobExplorer(DataSource dataSource,
PlatformTransactionManager tm) throws Exception {
JobExplorerFactoryBean factory = new JobExplorerFactoryBean();
factory.setDataSource(dataSource);
factory.setTransactionManager(tm); // required since Spring Batch 5
factory.setTablePrefix("ECOM_BATCH_"); // must match JobRepository
factory.afterPropertiesSet();
return factory.getObject();
}
}
In-Memory Repository for Testing (Spring Batch 5)
Breaking change: MapJobRepositoryFactoryBean was removed in Spring Batch 5. There is no map-based in-memory implementation.
Option 1: H2 with embedded schema (recommended)
@TestConfiguration
public class BatchTestConfig {
@Bean
@Primary
public DataSource testDataSource() {
return new EmbeddedDatabaseBuilder()
.setType(EmbeddedDatabaseType.H2)
.addScript("classpath:org/springframework/batch/core/schema-h2.sql")
.build();
}
@Bean
public PlatformTransactionManager transactionManager(DataSource dataSource) {
return new DataSourceTransactionManager(dataSource);
}
}
@SpringBatchTest
@SpringBootTest
@Import(BatchTestConfig.class)
class ImportOrdersJobTest {
@Autowired
private JobLauncherTestUtils jobLauncherTestUtils;
@Autowired
private JobRepositoryTestUtils jobRepositoryTestUtils;
@AfterEach
void cleanUp() {
jobRepositoryTestUtils.removeJobExecutions();
}
@Test
void jobCompletesSuccessfully() throws Exception {
JobExecution execution = jobLauncherTestUtils.launchJob(
new JobParametersBuilder()
.addString("runDate", "2026-05-03")
.toJobParameters()
);
assertThat(execution.getStatus()).isEqualTo(BatchStatus.COMPLETED);
}
}
Option 2: ResourcelessJobRepository (Spring Batch 5.2+)
Use when you want to test reader/processor/writer logic without any database at all:
@TestConfiguration
public class NoDbBatchConfig {
@Bean
public JobRepository jobRepository() {
return new ResourcelessJobRepository();
}
@Bean
public PlatformTransactionManager transactionManager() {
return new ResourcelessTransactionManager();
}
}
Limitations: no restart, no history, single-threaded only. Good for unit-testing individual steps.
Worked Example: Inspecting a Failed Job
Suppose your importOrdersJob failed. Here is a complete debugging workflow.
Step 1: Find the failed execution
SELECT je.JOB_EXECUTION_ID, je.STATUS, je.EXIT_CODE, je.START_TIME
FROM BATCH_JOB_EXECUTION je
JOIN BATCH_JOB_INSTANCE ji ON je.JOB_INSTANCE_ID = ji.JOB_INSTANCE_ID
WHERE ji.JOB_NAME = 'importOrdersJob' AND je.STATUS = 'FAILED'
ORDER BY je.START_TIME DESC LIMIT 1;
-- Result: JOB_EXECUTION_ID = 17
Step 2: See which step failed
SELECT STEP_NAME, STATUS, EXIT_CODE, READ_COUNT, WRITE_COUNT, ROLLBACK_COUNT, LEFT(EXIT_MESSAGE, 300)
FROM BATCH_STEP_EXECUTION
WHERE JOB_EXECUTION_ID = 17;
-- processOrdersStep FAILED with "DataIntegrityViolationException: Duplicate entry..."
Step 3: Check the step context to see reader position
SELECT SHORT_CONTEXT
FROM BATCH_STEP_EXECUTION_CONTEXT sec
JOIN BATCH_STEP_EXECUTION se ON sec.STEP_EXECUTION_ID = se.STEP_EXECUTION_ID
WHERE se.JOB_EXECUTION_ID = 17 AND se.STEP_NAME = 'processOrdersStep';
-- {"batch.taskletType":"...","batch.stepType":"ChunkOrientedTasklet","FlatFileItemReader.read.count":4823}
The context shows the reader stopped at line 4823. When you fix the duplicate-key issue and relaunch with the same parameters, Spring Batch restores this context and resumes from line 4824.
Step 4: Relaunch
// In your service or controller
jobLauncher.run(importOrdersJob,
new JobParametersBuilder()
.addString("runDate", "2026-05-03", true) // true = identifying
.toJobParameters()
);
// Spring Batch detects the existing FAILED JobInstance and restarts it
Key Takeaways
- Spring Batch persists all job state in six relational tables — this is non-optional and is what makes batch processing reliable.
- A JobInstance is the logical run (job + parameters). A JobExecution is one attempt. The same JobInstance can have multiple JobExecutions after failures and restarts.
- BatchStatus is the framework’s internal enum. ExitStatus is the string code that drives conditional flow and can be customized.
- JobExplorer gives read-only programmatic access to batch history — use it to build monitoring endpoints.
- MapJobRepositoryFactoryBean is gone in Spring Batch 5. For testing, use H2 with the embedded schema script, or ResourcelessJobRepository for simple unit tests.
- Set spring.batch.jdbc.initialize-schema=never in production and manage the schema with Flyway or Liquibase.
What’s Next
Article 5 starts Part 2 of this series — getting data in. We will cover FlatFileItemReader in depth: reading CSV files, fixed-width files, multi-line records, handling headers and footers, and dealing with encoding issues. All examples will feed real data into the e-commerce orders MySQL schema.