JobRepository and Batch Metadata: How Spring Batch Tracks Everything

Introduction

Every time Spring Batch runs a job, it records the run’s history in a set of relational tables. This metadata is not optional — it is what makes Spring Batch reliable. Without it there would be no restart capability, no duplicate-run prevention, and no audit trail. Understanding the metadata layer is essential for debugging failures, building monitoring dashboards, and designing restartable jobs.

In this article you will learn:

  • The role of JobRepository and JobExplorer in the Spring Batch architecture
  • The six metadata tables — their schema, purpose, and relationships
  • How Spring Batch uses these tables to enable restartability
  • How to query batch history directly in MySQL
  • How to access metadata programmatically with JobExplorer
  • Key changes in Spring Batch 5 (removed MapJobRepositoryFactoryBean, new testing approach)

All examples use Spring Boot 3.3+, Spring Batch 5.2+, and MySQL 8.x — the same stack established in Article 3.


The Two Persistence APIs

Spring Batch provides two beans for working with job metadata:

Bean            Access        Purpose
JobRepository   Read + write  Framework-internal CRUD for job/step state
JobExplorer     Read only     Application-code queries for history and status

JobRepository is used by the framework itself — TaskExecutorJobLauncher (the replacement for the deprecated SimpleJobLauncher in Spring Batch 5), Job, and Step implementations call it to persist state. You rarely call it directly. JobExplorer is the API you call in your own code to inspect what happened.

Both are auto-configured by Spring Boot when spring-boot-starter-batch is on the classpath and a DataSource bean exists.
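
As a minimal sketch (the service name and method are made up for illustration), both beans can be injected like any other Spring bean; in practice you will almost always inject JobExplorer rather than JobRepository:

```java
@Service
public class JobAuditService {

    private final JobExplorer jobExplorer;   // read-only: safe for application code

    public JobAuditService(JobExplorer jobExplorer) {
        this.jobExplorer = jobExplorer;      // auto-configured by Spring Boot
    }

    public boolean hasEverRun(String jobName) {
        // Any JobInstance at all means the job has been launched at least once
        return !jobExplorer.findJobInstancesByJobName(jobName, 0, 1).isEmpty();
    }
}
```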


The Six Metadata Tables

Spring Batch creates six tables in your database. Their names are prefixed with BATCH_ by default. Here is the entity-relationship overview:

BATCH_JOB_INSTANCE
    └── BATCH_JOB_EXECUTION (one instance → many executions)
            ├── BATCH_JOB_EXECUTION_PARAMS
            ├── BATCH_JOB_EXECUTION_CONTEXT
            └── BATCH_STEP_EXECUTION (one execution → many step executions)
                    └── BATCH_STEP_EXECUTION_CONTEXT

BATCH_JOB_INSTANCE

Represents a unique combination of job name + identifying job parameters. Think of it as “the job run for date 2026-05-03”.

CREATE TABLE BATCH_JOB_INSTANCE (
    JOB_INSTANCE_ID BIGINT       NOT NULL PRIMARY KEY,
    VERSION         BIGINT,
    JOB_NAME        VARCHAR(100) NOT NULL,
    JOB_KEY         VARCHAR(32)  NOT NULL,
    UNIQUE (JOB_NAME, JOB_KEY)
);

JOB_KEY is an MD5 hash of the identifying parameters. The unique constraint on (JOB_NAME, JOB_KEY) is the mechanism that prevents you from accidentally launching the same logical job run twice.
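
The principle can be shown with a few lines of plain Java. This is a simplified illustration, not the exact DefaultJobKeyGenerator algorithm (which has its own formatting rules) — it only demonstrates why the same identifying parameters always map to the same 32-character key:

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.TreeMap;

public class JobKeyDemo {

    // Hash a sorted "name=value;" rendering of the identifying parameters.
    // MD5 yields 16 bytes = 32 hex characters, which fits JOB_KEY VARCHAR(32).
    public static String jobKey(Map<String, String> identifyingParams) throws Exception {
        StringBuilder sb = new StringBuilder();
        new TreeMap<>(identifyingParams)   // sort for a stable, order-independent key
            .forEach((k, v) -> sb.append(k).append('=').append(v).append(';'));
        byte[] digest = MessageDigest.getInstance("MD5")
            .digest(sb.toString().getBytes(StandardCharsets.UTF_8));
        return String.format("%032x", new BigInteger(1, digest));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(jobKey(Map.of("runDate", "2026-05-03")).length());  // 32
        System.out.println(jobKey(Map.of("runDate", "2026-05-03"))
            .equals(jobKey(Map.of("runDate", "2026-05-04"))));                 // false
    }
}
```

Same parameters, same key — which is what the unique constraint then turns into duplicate-run prevention.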

BATCH_JOB_EXECUTION

Represents a single execution attempt against a JobInstance. If a job fails and you restart it, the same JobInstance gets a new JobExecution.

CREATE TABLE BATCH_JOB_EXECUTION (
    JOB_EXECUTION_ID BIGINT       NOT NULL PRIMARY KEY,
    VERSION          BIGINT,
    JOB_INSTANCE_ID  BIGINT       NOT NULL,
    CREATE_TIME      DATETIME(6)  NOT NULL,
    START_TIME       DATETIME(6)  DEFAULT NULL,
    END_TIME         DATETIME(6)  DEFAULT NULL,
    STATUS           VARCHAR(10),
    EXIT_CODE        VARCHAR(2500),
    EXIT_MESSAGE     VARCHAR(2500),
    LAST_UPDATED     DATETIME(6),
    CONSTRAINT JOB_INST_EXEC_FK FOREIGN KEY (JOB_INSTANCE_ID)
        REFERENCES BATCH_JOB_INSTANCE(JOB_INSTANCE_ID)
);

STATUS is a BatchStatus enum value: STARTING, STARTED, STOPPING, STOPPED, COMPLETED, FAILED, ABANDONED, or UNKNOWN. EXIT_CODE is an ExitStatus string — it can mirror STATUS or be a custom code you define.

BATCH_JOB_EXECUTION_PARAMS

Every job parameter passed at launch time is stored here, one row per parameter.

CREATE TABLE BATCH_JOB_EXECUTION_PARAMS (
    JOB_EXECUTION_ID BIGINT        NOT NULL,
    PARAMETER_NAME   VARCHAR(100)  NOT NULL,
    PARAMETER_TYPE   VARCHAR(100)  NOT NULL,
    PARAMETER_VALUE  VARCHAR(2500) DEFAULT NULL,
    IDENTIFYING      CHAR(1)       NOT NULL,
    CONSTRAINT JOB_EXEC_PARAMS_FK FOREIGN KEY (JOB_EXECUTION_ID)
        REFERENCES BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
);

The IDENTIFYING column (Y/N) marks whether this parameter contributed to the JOB_KEY hash. Non-identifying parameters (like a log level or notification email) are stored for auditing but do not affect JobInstance uniqueness.
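
A sketch of how the flag is set at launch time (the notifyEmail parameter is a made-up example):

```java
JobParameters params = new JobParametersBuilder()
    .addString("runDate", "2026-05-03", true)            // identifying: feeds JOB_KEY
    .addString("notifyEmail", "ops@example.com", false)  // non-identifying: audit only
    .toJobParameters();
```

Relaunching with a different notifyEmail but the same runDate targets the same JobInstance, because only identifying parameters enter the hash.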

BATCH_JOB_EXECUTION_CONTEXT

Stores the serialized ExecutionContext for each job execution. This is the state bag that enables a job to share data between steps and to resume after failure.

CREATE TABLE BATCH_JOB_EXECUTION_CONTEXT (
    JOB_EXECUTION_ID   BIGINT        NOT NULL PRIMARY KEY,
    SHORT_CONTEXT      VARCHAR(2500) NOT NULL,
    SERIALIZED_CONTEXT TEXT,
    CONSTRAINT JOB_EXEC_CTX_FK FOREIGN KEY (JOB_EXECUTION_ID)
        REFERENCES BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
);

SHORT_CONTEXT holds a compact JSON summary; SERIALIZED_CONTEXT holds the full serialized form used during restart.
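
For example, a tasklet in one step can stash a value in the job-level ExecutionContext for a later step to read; whatever ends up in that context is what gets serialized into this table. A fragment (the processedFile key is illustrative):

```java
// Inside a Tasklet's execute(StepContribution, ChunkContext) method
chunkContext.getStepContext().getStepExecution()
    .getJobExecution().getExecutionContext()
    .put("processedFile", "orders-2026-05-03.csv");
return RepeatStatus.FINISHED;
```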

BATCH_STEP_EXECUTION

One row per step per job execution. Contains item counters that are committed transactionally so they survive a crash.

CREATE TABLE BATCH_STEP_EXECUTION (
    STEP_EXECUTION_ID  BIGINT        NOT NULL PRIMARY KEY,
    VERSION            BIGINT        NOT NULL,
    STEP_NAME          VARCHAR(100)  NOT NULL,
    JOB_EXECUTION_ID   BIGINT        NOT NULL,
    CREATE_TIME        DATETIME(6)   NOT NULL,
    START_TIME         DATETIME(6)   DEFAULT NULL,
    END_TIME           DATETIME(6)   DEFAULT NULL,
    STATUS             VARCHAR(10),
    COMMIT_COUNT       BIGINT,
    READ_COUNT         BIGINT,
    FILTER_COUNT       BIGINT,
    WRITE_COUNT        BIGINT,
    READ_SKIP_COUNT    BIGINT,
    WRITE_SKIP_COUNT   BIGINT,
    PROCESS_SKIP_COUNT BIGINT,
    ROLLBACK_COUNT     BIGINT,
    EXIT_CODE          VARCHAR(2500),
    EXIT_MESSAGE       VARCHAR(2500),
    LAST_UPDATED       DATETIME(6),
    CONSTRAINT JOB_EXEC_STEP_FK FOREIGN KEY (JOB_EXECUTION_ID)
        REFERENCES BATCH_JOB_EXECUTION(JOB_EXECUTION_ID)
);

These counters are invaluable: after a failed run you can see exactly how many items were read, written, and skipped before the crash.

BATCH_STEP_EXECUTION_CONTEXT

Stores the serialized ExecutionContext for each step execution. This is the key to step-level restartability — it typically contains the reader’s position (line number in a file, last primary key read from a database).

CREATE TABLE BATCH_STEP_EXECUTION_CONTEXT (
    STEP_EXECUTION_ID  BIGINT        NOT NULL PRIMARY KEY,
    SHORT_CONTEXT      VARCHAR(2500) NOT NULL,
    SERIALIZED_CONTEXT TEXT,
    CONSTRAINT STEP_EXEC_CTX_FK FOREIGN KEY (STEP_EXECUTION_ID)
        REFERENCES BATCH_STEP_EXECUTION(STEP_EXECUTION_ID)
);

JobInstance vs JobExecution

This distinction trips up almost every Spring Batch newcomer.

JobInstance is the logical job run. It is identified by the job name and its identifying parameters.

importOrdersJob + {runDate=2026-05-03} → one JobInstance
importOrdersJob + {runDate=2026-05-04} → different JobInstance

JobExecution is a single attempt to run a JobInstance. If the job fails, you restart it — same JobInstance, new JobExecution.

JobInstance(importOrdersJob, 2026-05-03)
  ├── JobExecution 1  →  FAILED   (crashed at step 2)
  └── JobExecution 2  →  COMPLETED (restart, resumed from step 2)

A JobInstance is considered complete only when one of its JobExecutions finishes with COMPLETED. After that, Spring Batch will refuse to run the same JobInstance again — this is the duplicate-run prevention mechanism.
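
In code, that refusal surfaces as an exception you can catch. A hedged sketch (jobLauncher, the parameters, and the logger are assumed to exist in scope):

```java
try {
    jobLauncher.run(importOrdersJob, sameIdentifyingParams);
} catch (JobInstanceAlreadyCompleteException e) {
    // Every identifying parameter matched an instance that already COMPLETED
    log.warn("This logical run already finished: {}", e.getMessage());
}
```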


BatchStatus vs ExitStatus

            BatchStatus                          ExitStatus
Type        Enum                                 String
Values      COMPLETED, STARTED, FAILED,          COMPLETED, FAILED, STOPPED,
            STOPPED, ABANDONED, UNKNOWN          or any custom string
Set by      Framework                            Framework (default) or your code
Used for    Internal state tracking              Conditional step flow control

BatchStatus is the framework’s internal state machine. ExitStatus is what drives the on("FAILED").to(step2) conditional transitions in your job definition. You can set a custom ExitStatus in a StepExecutionListener to create multi-branch job flows.

@Component
public class ValidationListener implements StepExecutionListener {

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        long skipCount = stepExecution.getSkipCount();
        if (skipCount > 1000) {
            // Too many skips — treat as partial failure
            return new ExitStatus("PARTIAL_FAILURE");
        }
        return null; // null = keep default ExitStatus
    }
}
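
A job definition can then branch on the custom code (the step beans here are placeholders, not steps defined in this article):

```java
@Bean
public Job validatedImportJob(JobRepository jobRepository, Step validateStep,
                              Step loadStep, Step alertStep) {
    return new JobBuilder("validatedImportJob", jobRepository)
        .start(validateStep)
            .on("PARTIAL_FAILURE").to(alertStep)  // custom ExitStatus from the listener
        .from(validateStep)
            .on("*").to(loadStep)                 // all other exit codes continue normally
        .end()
        .build();
}
```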

Schema Initialization

Spring Boot auto-creates the tables based on the spring.batch.jdbc.initialize-schema property:

# always    = create on every startup (dev/test only)
# embedded  = create only for H2/HSQL/Derby (default)
# never     = never create — manage schema yourself (production default)
spring.batch.jdbc.initialize-schema=always

For production, use never and apply the DDL through your migration tool (Flyway or Liquibase). The official MySQL DDL script is bundled in the Spring Batch jar:

org/springframework/batch/core/schema-mysql.sql

Extract it and add it to your migration scripts:

# Extract from the jar in your Maven local repository
jar xf ~/.m2/repository/org/springframework/batch/spring-batch-core/5.2.x/spring-batch-core-5.2.x.jar \
    org/springframework/batch/core/schema-mysql.sql

Querying Batch History in MySQL

These SQL queries are useful for monitoring dashboards, alerting, and debugging.

All executions for a job, most recent first

SELECT
    je.JOB_EXECUTION_ID,
    je.STATUS,
    je.EXIT_CODE,
    je.START_TIME,
    je.END_TIME,
    TIMESTAMPDIFF(SECOND, je.START_TIME, je.END_TIME) AS duration_seconds
FROM BATCH_JOB_EXECUTION je
JOIN BATCH_JOB_INSTANCE ji ON je.JOB_INSTANCE_ID = ji.JOB_INSTANCE_ID
WHERE ji.JOB_NAME = 'importOrdersJob'
ORDER BY je.START_TIME DESC
LIMIT 20;

Currently running jobs

SELECT
    ji.JOB_NAME,
    je.JOB_EXECUTION_ID,
    je.START_TIME,
    TIMESTAMPDIFF(SECOND, je.START_TIME, NOW()) AS running_seconds
FROM BATCH_JOB_EXECUTION je
JOIN BATCH_JOB_INSTANCE ji ON je.JOB_INSTANCE_ID = ji.JOB_INSTANCE_ID
WHERE je.STATUS = 'STARTED';

Failed jobs in the last 24 hours

SELECT
    ji.JOB_NAME,
    je.JOB_EXECUTION_ID,
    je.START_TIME,
    je.END_TIME,
    LEFT(je.EXIT_MESSAGE, 500) AS error_summary
FROM BATCH_JOB_EXECUTION je
JOIN BATCH_JOB_INSTANCE ji ON je.JOB_INSTANCE_ID = ji.JOB_INSTANCE_ID
WHERE je.STATUS = 'FAILED'
  AND je.END_TIME > DATE_SUB(NOW(), INTERVAL 24 HOUR)
ORDER BY je.END_TIME DESC;

Step metrics for a specific execution

SELECT
    se.STEP_NAME,
    se.STATUS,
    se.READ_COUNT,
    se.WRITE_COUNT,
    se.FILTER_COUNT,
    se.READ_SKIP_COUNT + se.WRITE_SKIP_COUNT + se.PROCESS_SKIP_COUNT AS total_skips,
    se.ROLLBACK_COUNT,
    TIMESTAMPDIFF(SECOND, se.START_TIME, se.END_TIME) AS duration_seconds
FROM BATCH_STEP_EXECUTION se
WHERE se.JOB_EXECUTION_ID = 42
ORDER BY se.STEP_EXECUTION_ID;

Job parameters for a specific execution

SELECT
    PARAMETER_NAME,
    PARAMETER_TYPE,
    PARAMETER_VALUE,
    IF(IDENTIFYING = 'Y', 'identifying', 'non-identifying') AS param_role
FROM BATCH_JOB_EXECUTION_PARAMS
WHERE JOB_EXECUTION_ID = 42;

Programmatic Access with JobExplorer

JobExplorer gives you read-only access to the metadata from your application code. Spring Boot auto-wires it.

Finding job instances and executions

@Service
@RequiredArgsConstructor
public class BatchMonitorService {

    private final JobExplorer jobExplorer;

    public List<JobInstance> getRecentInstances(String jobName, int count) {
        return jobExplorer.findJobInstancesByJobName(jobName, 0, count);
    }

    public JobExecution getLatestExecution(String jobName) {
        List<JobInstance> instances = jobExplorer.findJobInstancesByJobName(jobName, 0, 1);
        if (instances.isEmpty()) return null;

        List<JobExecution> executions = jobExplorer.getJobExecutions(instances.get(0));
        return executions.isEmpty() ? null : executions.get(0);
    }

    public Set<JobExecution> getRunningJobs(String jobName) {
        return jobExplorer.findRunningJobExecutions(jobName);
    }

    public Collection<StepExecution> getStepExecutions(long jobExecutionId) {
        JobExecution execution = jobExplorer.getJobExecution(jobExecutionId);
        return execution != null ? execution.getStepExecutions() : Collections.emptyList();
    }
}

Exposing batch status via a REST endpoint

@RestController
@RequestMapping("/api/batch")
@RequiredArgsConstructor
public class BatchStatusController {

    private final JobExplorer jobExplorer;

    @GetMapping("/jobs/{jobName}/latest")
    public ResponseEntity<Map<String, Object>> getLatestExecution(@PathVariable String jobName) {
        List<JobInstance> instances = jobExplorer.findJobInstancesByJobName(jobName, 0, 1);
        if (instances.isEmpty()) {
            return ResponseEntity.notFound().build();
        }

        List<JobExecution> executions = jobExplorer.getJobExecutions(instances.get(0));
        if (executions.isEmpty()) {
            return ResponseEntity.notFound().build();
        }

        JobExecution latest = executions.get(0);
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("jobName", jobName);
        result.put("executionId", latest.getId());
        result.put("status", latest.getStatus());
        result.put("exitCode", latest.getExitStatus().getExitCode());
        result.put("startTime", latest.getStartTime());
        result.put("endTime", latest.getEndTime());
        result.put("steps", latest.getStepExecutions().stream()
            .map(se -> Map.of(
                "name", se.getStepName(),
                "status", se.getStatus(),
                "readCount", se.getReadCount(),
                "writeCount", se.getWriteCount()
            ))
            .collect(Collectors.toList())
        );
        return ResponseEntity.ok(result);
    }

    @GetMapping("/jobs/{jobName}/running")
    public ResponseEntity<List<Long>> getRunningExecutions(@PathVariable String jobName) {
        Set<JobExecution> running = jobExplorer.findRunningJobExecutions(jobName);
        List<Long> ids = running.stream()
            .map(JobExecution::getId)
            .collect(Collectors.toList());
        return ResponseEntity.ok(ids);
    }
}

How Metadata Enables Restartability

When a job fails at step 3 of 5, here is what happens on restart:

  1. You launch the job with the same identifying parameters as the failed run.
  2. Spring Batch queries BATCH_JOB_INSTANCE to find the existing JobInstance.
  3. Spring Batch checks BATCH_JOB_EXECUTION — the last execution has STATUS = FAILED.
  4. A new JobExecution row is inserted for the same JobInstance.
  5. For each step, Spring Batch checks BATCH_STEP_EXECUTION:
    • Steps with STATUS = COMPLETED are skipped — already done.
    • The failed step gets a new StepExecution. Its context is loaded from BATCH_STEP_EXECUTION_CONTEXT, restoring the reader’s position (e.g., row offset, file line number).
  6. Processing resumes from the exact point of failure.
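
The context round-trip in step 5 is exactly what the ItemStream contract provides. A hand-rolled reader sketch (the key name and the in-memory list are illustrative — built-in readers such as FlatFileItemReader implement this for you):

```java
public class PositionTrackingReader implements ItemReader<String>, ItemStream {

    private static final String POSITION_KEY = "position.tracking.reader.index";

    private final List<String> items;
    private int position;

    public PositionTrackingReader(List<String> items) {
        this.items = items;
    }

    @Override
    public void open(ExecutionContext ctx) {
        // On restart, ctx is deserialized from BATCH_STEP_EXECUTION_CONTEXT
        position = ctx.containsKey(POSITION_KEY) ? ctx.getInt(POSITION_KEY) : 0;
    }

    @Override
    public void update(ExecutionContext ctx) {
        // Called at every chunk commit, in the same transaction as the data
        ctx.putInt(POSITION_KEY, position);
    }

    @Override
    public String read() {
        return position < items.size() ? items.get(position++) : null;
    }

    @Override
    public void close() {
        // Nothing to release for an in-memory list
    }
}
```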

This is why the VERSION column exists on several tables — it is an optimistic locking counter that prevents two concurrent processes from updating the same row simultaneously.

// Make a job restartable (default is true)
@Bean
public Job importOrdersJob(JobRepository jobRepository, Step importStep) {
    return new JobBuilder("importOrdersJob", jobRepository)
        .start(importStep)
        .build();
    // restartable = true by default
}

// Explicitly prevent restart (always create a new JobInstance)
@Bean
public Job dailyReportJob(JobRepository jobRepository, Step reportStep) {
    return new JobBuilder("dailyReportJob", jobRepository)
        .start(reportStep)
        .preventRestart()   // fails if the same parameters are resubmitted
        .build();
}

Customizing the Table Prefix

If you need to share a database with other applications or use a custom schema, you can change the default BATCH_ prefix:

# application.properties
spring.batch.jdbc.table-prefix=ECOM_BATCH_

Spring Boot passes this through to both JobRepository and JobExplorer automatically.

For custom datasource wiring:

@Configuration
public class BatchInfraConfig {

    @Bean
    public JobRepository jobRepository(DataSource dataSource,
                                       PlatformTransactionManager tm) throws Exception {
        JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
        factory.setDataSource(dataSource);
        factory.setTransactionManager(tm);
        factory.setTablePrefix("ECOM_BATCH_");
        factory.setDatabaseType("mysql");
        factory.afterPropertiesSet();
        return factory.getObject();
    }

    @Bean
    public JobExplorer jobExplorer(DataSource dataSource,
                                   PlatformTransactionManager tm) throws Exception {
        JobExplorerFactoryBean factory = new JobExplorerFactoryBean();
        factory.setDataSource(dataSource);
        factory.setTransactionManager(tm);      // required in Spring Batch 5
        factory.setTablePrefix("ECOM_BATCH_");  // must match JobRepository
        factory.afterPropertiesSet();
        return factory.getObject();
    }
}

In-Memory Repository for Testing (Spring Batch 5)

Breaking change: MapJobRepositoryFactoryBean was removed in Spring Batch 5. There is no map-based in-memory implementation anymore, so tests need either an embedded database or (since 5.2) a resourceless repository.

Option 1: Embedded H2 database

@TestConfiguration
public class BatchTestConfig {

    @Bean
    @Primary
    public DataSource testDataSource() {
        return new EmbeddedDatabaseBuilder()
            .setType(EmbeddedDatabaseType.H2)
            .addScript("classpath:org/springframework/batch/core/schema-h2.sql")
            .build();
    }

    @Bean
    public PlatformTransactionManager transactionManager(DataSource dataSource) {
        return new DataSourceTransactionManager(dataSource);
    }
}

@SpringBatchTest
@SpringBootTest
@Import(BatchTestConfig.class)
class ImportOrdersJobTest {

    @Autowired
    private JobLauncherTestUtils jobLauncherTestUtils;

    @Autowired
    private JobRepositoryTestUtils jobRepositoryTestUtils;

    @AfterEach
    void cleanUp() {
        jobRepositoryTestUtils.removeJobExecutions();
    }

    @Test
    void jobCompletesSuccessfully() throws Exception {
        JobExecution execution = jobLauncherTestUtils.launchJob(
            new JobParametersBuilder()
                .addString("runDate", "2026-05-03")
                .toJobParameters()
        );
        assertThat(execution.getStatus()).isEqualTo(BatchStatus.COMPLETED);
    }
}

Option 2: ResourcelessJobRepository (Spring Batch 5.2+)

Use when you want to test reader/processor/writer logic without any database at all:

@TestConfiguration
public class NoDbBatchConfig {

    @Bean
    public JobRepository jobRepository() {
        return new ResourcelessJobRepository();
    }

    @Bean
    public PlatformTransactionManager transactionManager() {
        return new ResourcelessTransactionManager();
    }
}

Limitations: no restart, no history, single-threaded only. Good for unit-testing individual steps.


Worked Example: Inspecting a Failed Job

Suppose your importOrdersJob failed. Here is a complete debugging workflow.

Step 1: Find the failed execution

SELECT je.JOB_EXECUTION_ID, je.STATUS, je.EXIT_CODE, je.START_TIME
FROM BATCH_JOB_EXECUTION je
JOIN BATCH_JOB_INSTANCE ji ON je.JOB_INSTANCE_ID = ji.JOB_INSTANCE_ID
WHERE ji.JOB_NAME = 'importOrdersJob' AND je.STATUS = 'FAILED'
ORDER BY je.START_TIME DESC LIMIT 1;
-- Result: JOB_EXECUTION_ID = 17

Step 2: See which step failed

SELECT STEP_NAME, STATUS, EXIT_CODE, READ_COUNT, WRITE_COUNT, ROLLBACK_COUNT, LEFT(EXIT_MESSAGE, 300)
FROM BATCH_STEP_EXECUTION
WHERE JOB_EXECUTION_ID = 17;
-- processOrdersStep FAILED with "DataIntegrityViolationException: Duplicate entry..."

Step 3: Check the step context to see reader position

SELECT SHORT_CONTEXT
FROM BATCH_STEP_EXECUTION_CONTEXT sec
JOIN BATCH_STEP_EXECUTION se ON sec.STEP_EXECUTION_ID = se.STEP_EXECUTION_ID
WHERE se.JOB_EXECUTION_ID = 17 AND se.STEP_NAME = 'processOrdersStep';
-- {"batch.taskletType":"...","batch.stepType":"ChunkOrientedTasklet","FlatFileItemReader.read.count":4823}

The context shows the reader stopped at line 4823. When you fix the duplicate-key issue and relaunch with the same parameters, Spring Batch restores this context and resumes from line 4824.

Step 4: Relaunch

// In your service or controller
jobLauncher.run(importOrdersJob,
    new JobParametersBuilder()
        .addString("runDate", "2026-05-03", true)  // true = identifying
        .toJobParameters()
);
// Spring Batch detects the existing FAILED JobInstance and restarts it

Key Takeaways

  • Spring Batch persists all job state in six relational tables — this is non-optional and is what makes batch processing reliable.
  • A JobInstance is the logical run (job + parameters). A JobExecution is one attempt. The same JobInstance can have multiple JobExecutions after failures and restarts.
  • BatchStatus is the framework’s internal enum. ExitStatus is the string code that drives conditional flow and can be customized.
  • JobExplorer gives read-only programmatic access to batch history — use it to build monitoring endpoints.
  • MapJobRepositoryFactoryBean is gone in Spring Batch 5. For testing, use H2 with the embedded schema script, or ResourcelessJobRepository for simple unit tests.
  • Set spring.batch.jdbc.initialize-schema=never in production and manage the schema with Flyway or Liquibase.

What’s Next

Article 5 starts Part 2 of this series — getting data in. We will cover FlatFileItemReader in depth: reading CSV files, fixed-width files, multi-line records, handling headers and footers, and dealing with encoding issues. All examples will feed real data into the e-commerce orders MySQL schema.