Reading from MySQL: JdbcCursorItemReader and JdbcPagingItemReader

Introduction

Most real batch jobs read from a database, not a file. Spring Batch provides two JDBC readers for this:

Reader                 Strategy                                   Thread-safe   Use when
JdbcCursorItemReader   Opens a server-side cursor, streams rows   No            Single-threaded steps, huge result sets
JdbcPagingItemReader   Runs paged LIMIT queries in a loop         Yes           Multi-threaded steps, sorted data

Both handle restartability automatically and both work with any DataSource — including MySQL.


JdbcCursorItemReader

How it works

The reader opens a single JDBC ResultSet when the step starts (via the reader's ItemStream.open()) and streams rows one at a time. The cursor stays open for the life of the step, so the database connection is held the entire time. This is memory-efficient even for huge result sets, but it means:

  • You need a connection that can stay open for minutes or hours.
  • You cannot use this reader in a multi-threaded step — ResultSet is not thread-safe.
  • On restart, Spring Batch re-executes the query and skips the rows already processed (using the read.count stored in ExecutionContext).

Basic example — read unprocessed orders

@Bean
public JdbcCursorItemReader<Order> pendingOrderCursorReader(DataSource dataSource) {
    return new JdbcCursorItemReaderBuilder<Order>()
            .name("pendingOrderCursorReader")
            .dataSource(dataSource)
            .sql("SELECT order_id, customer_id, amount, order_date, status " +
                 "FROM orders WHERE status = 'PENDING' ORDER BY order_id")
            .rowMapper(new BeanPropertyRowMapper<>(Order.class))
            .build();
}

BeanPropertyRowMapper maps column names to POJO fields using camel-case conversion (order_id → orderId). For complex mappings, implement RowMapper<T> yourself.
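The naming convention can be sketched in plain Java. This is only an illustration of the snake_case-to-camelCase rule, not Spring's actual implementation:

```java
public class ColumnNameDemo {

    // Mimics the convention BeanPropertyRowMapper applies to column names
    static String toCamelCase(String column) {
        StringBuilder sb = new StringBuilder();
        boolean upperNext = false;
        for (char c : column.toLowerCase().toCharArray()) {
            if (c == '_') {
                upperNext = true;            // next letter starts a new word
            } else {
                sb.append(upperNext ? Character.toUpperCase(c) : c);
                upperNext = false;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCamelCase("order_id"));     // orderId
        System.out.println(toCamelCase("customer_id"));  // customerId
    }
}
```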

Custom RowMapper

public class OrderRowMapper implements RowMapper<Order> {

    @Override
    public Order mapRow(ResultSet rs, int rowNum) throws SQLException {
        Order o = new Order();
        o.setOrderId(rs.getLong("order_id"));
        o.setCustomerId(rs.getLong("customer_id"));
        o.setAmount(rs.getBigDecimal("amount"));
        o.setOrderDate(rs.getDate("order_date").toLocalDate());
        o.setStatus(rs.getString("status"));
        return o;
    }
}
Wire the custom mapper into the builder:

.rowMapper(new OrderRowMapper())

Parameterized query

Pass runtime values with PreparedStatementSetter:

@Bean
@StepScope  // required: jobParameters can only be injected into step-scoped beans
public JdbcCursorItemReader<Order> ordersForDateReader(
        DataSource dataSource,
        @Value("#{jobParameters['runDate']}") String runDate) {

    return new JdbcCursorItemReaderBuilder<Order>()
            .name("ordersForDateReader")
            .dataSource(dataSource)
            .sql("SELECT order_id, customer_id, amount, order_date, status " +
                 "FROM orders WHERE order_date = ? ORDER BY order_id")
            .preparedStatementSetter(ps -> ps.setString(1, runDate))
            .rowMapper(new OrderRowMapper())
            .build();
}

Tuning fetch size

MySQL’s JDBC driver by default loads the entire ResultSet into memory before your code reads the first row. To stream properly you must set fetchSize to Integer.MIN_VALUE:

return new JdbcCursorItemReaderBuilder<Order>()
        .name("streamingOrderReader")
        .dataSource(dataSource)
        .sql("SELECT * FROM orders ORDER BY order_id")
        .rowMapper(new OrderRowMapper())
        .fetchSize(Integer.MIN_VALUE)   // MySQL streaming ResultSet
        .build();

With fetchSize = Integer.MIN_VALUE, MySQL streams rows one at a time. The connection cannot be used for other queries while streaming is in progress — keep this in mind if your ItemProcessor makes additional database calls.

Connection timeout considerations

For very large tables the cursor stays open for a long time. Configure MySQL and HikariCP accordingly:

spring.datasource.hikari.connection-timeout=30000
spring.datasource.hikari.maximum-pool-size=10
# Ping idle pooled connections so MySQL's wait_timeout does not reap them
spring.datasource.hikari.keepalive-time=60000

On the MySQL side, increase net_read_timeout and net_write_timeout if steps run for tens of minutes.
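An alternative to Integer.MIN_VALUE streaming is Connector/J's cursor-based fetch: adding useCursorFetch=true to the JDBC URL lets a positive fetchSize pull rows from the server in batches. A sketch (host and schema below are placeholders):

```properties
# Placeholder URL — adjust host/schema to your environment
spring.datasource.url=jdbc:mysql://localhost:3306/shop?useCursorFetch=true
```

With this flag set, .fetchSize(1000) on the reader fetches 1,000 rows per round trip instead of choosing between "everything" and "one at a time".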


JdbcPagingItemReader

How it works

The paging reader executes a series of paged SELECT queries instead of holding one cursor open. With the SQL paging providers, the first page is a plain SELECT ... LIMIT N; subsequent pages add a condition on the sort key (WHERE sort_key > last_value ... LIMIT N) rather than a growing OFFSET, so deep pages stay fast. Each call to read() returns the next item from the current page; when the page is exhausted, the query for the next page is executed.

Because each page is a separate query, the reader is thread-safe and can be used in multi-threaded steps. It also does not hold a long-lived connection.

Requirement: The query must have a stable sort key (typically the primary key). Without a consistent sort order, rows can shift between pages as data changes, causing duplicates or misses.
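The need for a stable order can be shown with a toy simulation in plain Java (not Spring Batch code): naive OFFSET-style paging over a "table" that changes between page queries silently skips a row.

```java
import java.util.ArrayList;
import java.util.List;

public class OffsetDriftDemo {

    // Naive OFFSET/LIMIT paging over a live, snapshot-free "table"
    static List<String> page(List<String> table, int offset, int limit) {
        int from = Math.min(offset, table.size());
        int to = Math.min(offset + limit, table.size());
        return new ArrayList<>(table.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> table = new ArrayList<>(List.of("r1", "r2", "r3", "r4", "r5", "r6"));

        List<String> seen = new ArrayList<>(page(table, 0, 2)); // reads r1, r2
        table.remove("r1");                                     // another process deletes a row
        seen.addAll(page(table, 2, 2));                         // reads r4, r5 — r3 is skipped!
        seen.addAll(page(table, 4, 2));                         // reads r6

        System.out.println(seen);  // [r1, r2, r4, r5, r6] — r3 was never read
    }
}
```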

Basic example

@Bean
public JdbcPagingItemReader<Order> pendingOrderPagingReader(DataSource dataSource) {
    MySqlPagingQueryProvider queryProvider = new MySqlPagingQueryProvider();
    queryProvider.setSelectClause("SELECT order_id, customer_id, amount, order_date, status");
    queryProvider.setFromClause("FROM orders");
    queryProvider.setWhereClause("WHERE status = 'PENDING'");
    queryProvider.setSortKeys(Map.of("order_id",
            org.springframework.batch.item.database.Order.ASCENDING)); // Batch's Order enum, not the domain class

    return new JdbcPagingItemReaderBuilder<Order>()
            .name("pendingOrderPagingReader")
            .dataSource(dataSource)
            .queryProvider(queryProvider)
            .pageSize(100)
            .rowMapper(new OrderRowMapper())
            .build();
}

MySqlPagingQueryProvider generates the correct LIMIT-based paging SQL for MySQL. Spring Batch ships providers for all major databases, and SqlPagingQueryProviderFactoryBean can pick the right one automatically from the DataSource.

Parameterized paging reader

@Bean
@StepScope  // required: jobParameters can only be injected into step-scoped beans
public JdbcPagingItemReader<Order> ordersForDatePagingReader(
        DataSource dataSource,
        @Value("#{jobParameters['runDate']}") String runDate) {

    MySqlPagingQueryProvider queryProvider = new MySqlPagingQueryProvider();
    queryProvider.setSelectClause(
            "SELECT order_id, customer_id, amount, order_date, status");
    queryProvider.setFromClause("FROM orders");
    queryProvider.setWhereClause("WHERE order_date = :runDate");
    queryProvider.setSortKeys(Map.of("order_id",
            org.springframework.batch.item.database.Order.ASCENDING)); // Batch's Order enum, not the domain class

    return new JdbcPagingItemReaderBuilder<Order>()
            .name("ordersForDatePagingReader")
            .dataSource(dataSource)
            .queryProvider(queryProvider)
            .parameterValues(Map.of("runDate", runDate))
            .pageSize(500)
            .rowMapper(new OrderRowMapper())
            .build();
}

Page size vs chunk size

These are two independent settings:

  • Page size — how many rows one SQL query fetches (LIMIT N). Lives on the reader.
  • Chunk size — how many items are processed per transaction. Lives on the step.

Setting chunk size larger than page size is not an error — it simply forces multiple page queries within a single chunk, which is slightly less efficient. A good starting point: set page size equal to chunk size.

int chunkSize = 500;

Step step = new StepBuilder("processOrdersStep", jobRepository)
        .<Order, Order>chunk(chunkSize, tx)
        .reader(pagingReader)   // pageSize = 500
        .writer(writer)
        .build();
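As a back-of-envelope check, the number of page queries issued while filling one chunk is just the ratio of the two settings (assuming full pages and aligned boundaries):

```java
public class PageChunkMath {

    // Page queries needed to supply one chunk of items (full pages assumed)
    static int pageQueriesPerChunk(int chunkSize, int pageSize) {
        return (int) Math.ceil((double) chunkSize / pageSize);
    }

    public static void main(String[] args) {
        System.out.println(pageQueriesPerChunk(500, 500)); // 1 — aligned, one query per transaction
        System.out.println(pageQueriesPerChunk(500, 100)); // 5 — five queries inside each transaction
    }
}
```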

Cursor vs Paging: When to Use Which

Concern               Cursor                                 Paging
Memory                Low — streams one row at a time        Low — one page in memory
Connection held       For entire step duration               Released between pages
Thread-safe           No                                     Yes
Multi-threaded step   Not supported                          Supported
Restart behaviour     Re-executes query, skips N rows        Resumes from stored sort key
MySQL streaming       Needs fetchSize = Integer.MIN_VALUE    Standard LIMIT paging
Data consistency      Snapshot at query open                 Pages may see changes between queries

Rule of thumb: Use JdbcCursorItemReader when you have a single-threaded step reading a stable snapshot. Use JdbcPagingItemReader when you need parallel processing or your step takes a long time and you cannot hold a connection open.


Complete Job: Reprocess Pending Orders

This example reads PENDING orders from MySQL, sets their status to PROCESSING, and writes back to the database.

@Configuration
@RequiredArgsConstructor
public class ReprocessOrdersJobConfig {

    private final DataSource dataSource;
    private final JobRepository jobRepository;
    private final PlatformTransactionManager tx;

    // --- Reader ---

    @Bean
    public JdbcPagingItemReader<Order> pendingOrderReader() {
        MySqlPagingQueryProvider qp = new MySqlPagingQueryProvider();
        qp.setSelectClause("SELECT order_id, customer_id, amount, order_date, status");
        qp.setFromClause("FROM orders");
        qp.setWhereClause("WHERE status = 'PENDING'");
        qp.setSortKeys(Map.of("order_id",
                org.springframework.batch.item.database.Order.ASCENDING)); // Batch's Order enum, not the domain class

        return new JdbcPagingItemReaderBuilder<Order>()
                .name("pendingOrderReader")
                .dataSource(dataSource)
                .queryProvider(qp)
                .pageSize(200)
                .rowMapper(new OrderRowMapper())
                .build();
    }

    // --- Processor ---

    @Bean
    public ItemProcessor<Order, Order> markProcessingProcessor() {
        return order -> {
            order.setStatus("PROCESSING");
            return order;
        };
    }

    // --- Writer ---

    @Bean
    public JdbcBatchItemWriter<Order> orderStatusWriter() {
        return new JdbcBatchItemWriterBuilder<Order>()
                .dataSource(dataSource)
                .sql("UPDATE orders SET status = :status WHERE order_id = :orderId")
                .beanMapped()
                .build();
    }

    // --- Step ---

    @Bean
    public Step reprocessOrdersStep() {
        return new StepBuilder("reprocessOrdersStep", jobRepository)
                .<Order, Order>chunk(200, tx)
                .reader(pendingOrderReader())
                .processor(markProcessingProcessor())
                .writer(orderStatusWriter())
                .build();
    }

    // --- Job ---

    @Bean
    public Job reprocessOrdersJob() {
        return new JobBuilder("reprocessOrdersJob", jobRepository)
                .start(reprocessOrdersStep())
                .build();
    }
}

Restart Behaviour

JdbcCursorItemReader on restart

The cursor reader stores its read.count in BATCH_STEP_EXECUTION_CONTEXT. On restart it re-executes the original SQL and calls ResultSet.next() N times to skip already-processed rows. This is safe but slow for large offsets — if 2 million rows were processed before the crash, 2 million rows will be skipped on restart.

JdbcPagingItemReader on restart

The paging reader stores the last page’s sort key value in ExecutionContext. On restart it resumes from that key using a WHERE clause like WHERE order_id > :_lastId. This is efficient regardless of how many rows were already processed.

This is one advantage of JdbcPagingItemReader for large tables: restarts are fast.
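The difference can be sketched with a plain-Java simulation (toy code, not the actual reader internals): skip-based restart still touches every already-read row, while a sort-key restart jumps straight past them.

```java
import java.util.List;
import java.util.stream.IntStream;

public class RestartDemo {

    // Cursor-style restart: re-iterate and discard the first readCount rows
    static int skipBasedRestart(List<Integer> rows, int readCount) {
        int touched = 0;
        for (int row : rows) {
            touched++;                 // every skipped row is still fetched
            if (touched > readCount) break;
        }
        return touched;                // rows touched before real work resumes
    }

    // Paging-style restart: WHERE id > lastKey — skipped rows are never fetched
    static int keysetRestart(List<Integer> rows, int lastKey) {
        return (int) rows.stream().filter(id -> id > lastKey).count();
    }

    public static void main(String[] args) {
        List<Integer> rows = IntStream.rangeClosed(1, 1_000).boxed().toList();
        System.out.println(skipBasedRestart(rows, 900)); // 901 rows touched just to resume
        System.out.println(keysetRestart(rows, 900));    // 100 rows left to actually read
    }
}
```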


Reading with a Named Query

For complex queries, define them as constants or load from a file:

private static final String PENDING_ORDERS_SQL = """
        SELECT o.order_id,
               o.customer_id,
               o.amount,
               o.order_date,
               o.status
        FROM   orders o
        JOIN   customers c ON c.customer_id = o.customer_id
        WHERE  o.status = 'PENDING'
          AND  c.active = 1
        ORDER  BY o.order_id
        """;

@Bean
public JdbcCursorItemReader<Order> complexOrderReader(DataSource dataSource) {
    return new JdbcCursorItemReaderBuilder<Order>()
            .name("complexOrderReader")
            .dataSource(dataSource)
            .sql(PENDING_ORDERS_SQL)
            .rowMapper(new OrderRowMapper())
            .fetchSize(Integer.MIN_VALUE)
            .build();
}

Key Takeaways

  • JdbcCursorItemReader streams rows via a server-side cursor. It is not thread-safe. Set fetchSize = Integer.MIN_VALUE on MySQL to actually stream instead of buffering.
  • JdbcPagingItemReader runs paged LIMIT queries (keyed on the sort column after the first page). It is thread-safe and suitable for multi-threaded steps. It requires a stable sort key.
  • Page size and chunk size are independent. Start with them equal, then tune.
  • Both readers are restartable. The paging reader’s restart is more efficient for large tables.
  • Use MySqlPagingQueryProvider for MySQL-specific LIMIT pagination syntax.

What’s Next

Article 7 covers JpaPagingItemReader — reading via JPA entities instead of raw JDBC, when it is appropriate, and how to avoid the N+1 problem in batch contexts.