Writing to Files and Databases: FlatFileItemWriter and JdbcBatchItemWriter

Introduction

After reading and processing data, your batch job needs to write results somewhere. Spring Batch provides two essential writers:

  • FlatFileItemWriter — writes items to CSV or any delimited/formatted flat file
  • JdbcBatchItemWriter — writes items to a database using JDBC batch inserts/updates

Both are transactional and restartable. This article covers both in depth, plus transaction semantics you must understand to avoid duplicates and partial writes.


How Writers Work in Spring Batch

Writers receive a Chunk<O> — a list of all items processed in the current transaction. The write() method is called once per chunk, not once per item. This is the key to high throughput: instead of executing one INSERT per item, JdbcBatchItemWriter sends all items in a single JDBC batch.

public interface ItemWriter<T> {
    void write(Chunk<? extends T> chunk) throws Exception;
}

If write() throws, the entire chunk transaction is rolled back. If the step is configured as fault-tolerant with skippable exceptions, Spring Batch then re-writes the chunk one item at a time to isolate the bad item.
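That item-by-item scan only happens on a fault-tolerant step. A minimal sketch, assuming reader and writer beans for Order exist elsewhere (the bean names, exception type, and chunk size here are illustrative):

@Bean
public Step importOrdersStep(JobRepository jobRepository,
                             PlatformTransactionManager transactionManager,
                             ItemReader<Order> orderReader,
                             ItemWriter<Order> orderWriter) {
    return new StepBuilder("importOrdersStep", jobRepository)
            .<Order, Order>chunk(100, transactionManager)
            .reader(orderReader)
            .writer(orderWriter)
            .faultTolerant()
            .skip(DataIntegrityViolationException.class)  // e.g. one row violates a constraint
            .skipLimit(10)
            .build();
}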


FlatFileItemWriter

Basic CSV writer

@Bean
public FlatFileItemWriter<Order> orderCsvWriter(
        @Value("file:/output/orders-report.csv") Resource outputFile) {

    BeanWrapperFieldExtractor<Order> extractor = new BeanWrapperFieldExtractor<>();
    extractor.setNames(new String[]{"orderId", "customerId", "amount", "orderDate", "status"});

    DelimitedLineAggregator<Order> aggregator = new DelimitedLineAggregator<>();
    aggregator.setDelimiter(",");
    aggregator.setFieldExtractor(extractor);

    return new FlatFileItemWriterBuilder<Order>()
            .name("orderCsvWriter")
            .resource(outputFile)
            .lineAggregator(aggregator)
            .headerCallback(writer -> writer.write("orderId,customerId,amount,orderDate,status"))
            .footerCallback(writer -> writer.write("--- End of Report ---"))
            .build();
}

  • BeanWrapperFieldExtractor reads POJO properties in the specified order.
  • DelimitedLineAggregator joins fields with the delimiter.
  • headerCallback and footerCallback write a single header/footer line — they are called once per file, not per chunk.

Appending to an existing file

By default, FlatFileItemWriter overwrites the file on each run. To append:

return new FlatFileItemWriterBuilder<Order>()
        .name("orderCsvWriter")
        .resource(outputFile)
        .lineAggregator(aggregator)
        .append(true)                 // append to existing file
        .shouldDeleteIfEmpty(true)    // delete output file if no items were written
        .build();

Writing fixed-width output

FormatterLineAggregator<Order> aggregator = new FormatterLineAggregator<>();
aggregator.setFormat("%-10d%-15d%10.2f%-10s%-12s");
aggregator.setFieldExtractor(extractor);

setFormat uses String.format syntax. The - flag left-aligns a field within its column width (here the two IDs, the date, and the status), while %10.2f right-aligns the amount in a 10-character column with two decimal places.
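Under the hood the aggregator extracts the configured fields and passes them to a single String.format call, roughly equivalent to this (the field values are hypothetical):

String line = String.format("%-10d%-15d%10.2f%-10s%-12s",
        1001L, 42L, new BigDecimal("99.50"), LocalDate.of(2025, 1, 1), "SHIPPED");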

Custom LineAggregator

For complex output formats (JSON lines, XML fragments, custom formats):

public class OrderJsonLineAggregator implements LineAggregator<Order> {

    private final ObjectMapper mapper = new ObjectMapper()
            .registerModule(new JavaTimeModule());

    @Override
    public String aggregate(Order order) {
        try {
            return mapper.writeValueAsString(order);
        } catch (JsonProcessingException e) {
            throw new RuntimeException("Failed to serialize order: " + order.getOrderId(), e);
        }
    }
}

return new FlatFileItemWriterBuilder<Order>()
        .name("orderJsonWriter")
        .resource(outputFile)
        .lineAggregator(new OrderJsonLineAggregator())
        .build();

Restart safety

FlatFileItemWriter is restartable by default. It stores the byte offset of the last successfully written chunk in ExecutionContext. On restart it truncates the file back to that offset and resumes writing — this prevents duplicate lines in the output.

This requires the writer to have a unique .name(...), just like readers.
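If the job always regenerates the file from scratch and you never restart failed executions, the offset tracking can be switched off. A minimal sketch, reusing the aggregator from the basic example above:

return new FlatFileItemWriterBuilder<Order>()
        .name("orderCsvWriter")
        .resource(outputFile)
        .lineAggregator(aggregator)
        .saveState(false)   // do not store the write offset in the ExecutionContext
        .build();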


JdbcBatchItemWriter

Basic insert

@Bean
public JdbcBatchItemWriter<Order> orderInsertWriter(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<Order>()
            .dataSource(dataSource)
            .sql("INSERT INTO orders (customer_id, amount, order_date, status, created_at) " +
                 "VALUES (:customerId, :amount, :orderDate, :status, NOW())")
            .beanMapped()       // maps :paramName to order.getCustomerId() etc.
            .build();
}

beanMapped() uses BeanPropertyItemSqlParameterSourceProvider under the hood — it maps :paramName placeholders to POJO getter methods.
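The Order class itself is not shown in these snippets; beanMapped() assumes a POJO with matching getters, roughly like this (field types are assumptions based on the SQL above, with Lombok generating the accessors):

@Data
public class Order {
    private Long orderId;
    private Long customerId;
    private BigDecimal amount;
    private LocalDate orderDate;
    private String status;
}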

Using ItemSqlParameterSourceProvider for custom mapping

When your SQL parameter names don’t match your POJO field names:

@Bean
public JdbcBatchItemWriter<Order> orderWriter(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<Order>()
            .dataSource(dataSource)
            .sql("INSERT INTO orders (cust_id, total, dt, state) " +
                 "VALUES (:cid, :total, :date, :state)")
            .itemSqlParameterSourceProvider(order -> {
                MapSqlParameterSource params = new MapSqlParameterSource();
                params.addValue("cid",   order.getCustomerId());
                params.addValue("total", order.getAmount());
                params.addValue("date",  order.getOrderDate());
                params.addValue("state", order.getStatus());
                return params;
            })
            .build();
}

Upsert (INSERT ON DUPLICATE KEY UPDATE)

MySQL supports upserts natively:

@Bean
public JdbcBatchItemWriter<Order> orderUpsertWriter(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<Order>()
            .dataSource(dataSource)
            .sql("""
                INSERT INTO orders (order_id, customer_id, amount, order_date, status)
                VALUES (:orderId, :customerId, :amount, :orderDate, :status)
                ON DUPLICATE KEY UPDATE
                    amount  = VALUES(amount),
                    status  = VALUES(status)
                """)
            .beanMapped()
            .assertUpdates(false)    // required for upserts — affected rows can be 0, 1, or 2
            .build();
}

assertUpdates(false) disables the check that every SQL statement affected at least one row. You must set this for upserts or conditional updates.

Batch update

@Bean
public JdbcBatchItemWriter<Order> markCompletedWriter(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<Order>()
            .dataSource(dataSource)
            .sql("UPDATE orders SET status = :status, updated_at = NOW() " +
                 "WHERE order_id = :orderId")
            .beanMapped()
            .build();
}

JDBC Batch Size Tuning

JdbcBatchItemWriter hands the whole chunk to NamedParameterJdbcTemplate.batchUpdate(sql, SqlParameterSource[]), which adds each item with addBatch() and sends them all in a single executeBatch() call. Note that, by default, the MySQL driver still transmits each batched statement individually over the wire; the rewriteBatchedStatements setting below changes that.

The effective batch size is your chunk size. All items in one chunk are batched in a single executeBatch() call, so larger chunks mean fewer round trips and higher throughput. For example, inserting one million rows with a chunk size of 1,000 takes 1,000 batch round trips instead of one million single-row INSERTs.

Tune the MySQL JDBC URL:

spring.datasource.url=jdbc:mysql://localhost:3306/batch_db\
  ?rewriteBatchedStatements=true\
  &useServerPrepStmts=false

rewriteBatchedStatements=true is the most important MySQL JDBC tuning parameter for batch jobs. It rewrites the individual INSERTs in a batch into multi-row INSERTs (INSERT INTO orders (...) VALUES (...), (...), (...)), which can give a 10–50x throughput improvement.


Transaction Semantics

Understanding the write transaction is critical for correct batch design.

Read chunk (100 items)
    ↓
Process 100 items
    ↓
BEGIN TRANSACTION
    Write 100 items (one executeBatch call)
    Update BATCH_STEP_EXECUTION (read_count, write_count, etc.)
COMMIT

If the write fails, the entire chunk rolls back. The BATCH_STEP_EXECUTION counters are not updated. On retry/restart, Spring Batch re-reads the same chunk and tries again.

Important: your ItemWriter must be idempotent or use upserts if you cannot guarantee exactly-once delivery. Network failures can cause a write to succeed on the database but the client never sees the commit — Spring Batch will retry the chunk.
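When write failures are transient (deadlocks, lock wait timeouts), you can also let Spring Batch retry the chunk within the same execution before the step fails. A minimal sketch, assuming the reader and writer beans exist and the exception is genuinely safe to retry (the transaction is rolled back and the whole chunk is written again on each attempt):

@Bean
public Step resilientWriteStep(JobRepository jobRepository,
                               PlatformTransactionManager transactionManager,
                               ItemReader<Order> reader,
                               JdbcBatchItemWriter<Order> writer) {
    return new StepBuilder("resilientWriteStep", jobRepository)
            .<Order, Order>chunk(500, transactionManager)
            .reader(reader)
            .writer(writer)
            .faultTolerant()
            .retry(TransientDataAccessException.class)  // deadlocks, lock wait timeouts
            .retryLimit(3)
            .build();
}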


Complete Example: Orders Report Job

Read completed orders from MySQL, process them into a report DTO, write CSV output.

@Data
@AllArgsConstructor
public class OrderReportLine {
    private Long   orderId;
    private Long   customerId;
    private BigDecimal amount;
    private String orderDate;
    private String status;
}

@Configuration
@RequiredArgsConstructor
public class OrderReportJobConfig {

    private final DataSource dataSource;
    private final JobRepository jobRepository;
    private final PlatformTransactionManager tx;

    @Bean
    public JdbcPagingItemReader<Order> completedOrderReader() {
        MySqlPagingQueryProvider qp = new MySqlPagingQueryProvider();
        qp.setSelectClause("SELECT order_id, customer_id, amount, order_date, status");
        qp.setFromClause("FROM orders");
        qp.setWhereClause("WHERE status = 'COMPLETED'");
        // fully qualified to avoid a clash with the Order domain class
        qp.setSortKeys(Map.of("order_id", org.springframework.batch.item.database.Order.ASCENDING));

        return new JdbcPagingItemReaderBuilder<Order>()
                .name("completedOrderReader")
                .dataSource(dataSource)
                .queryProvider(qp)
                .pageSize(500)
                .rowMapper(new BeanPropertyRowMapper<>(Order.class))
                .build();
    }

    @Bean
    public ItemProcessor<Order, OrderReportLine> orderToReportLineProcessor() {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("dd-MMM-yyyy");
        return order -> new OrderReportLine(
                order.getOrderId(),
                order.getCustomerId(),
                order.getAmount(),
                order.getOrderDate().format(fmt),
                order.getStatus()
        );
    }

    @Bean
    public FlatFileItemWriter<OrderReportLine> reportCsvWriter(
            @Value("file:/reports/orders-${date:report}.csv") Resource output) {

        BeanWrapperFieldExtractor<OrderReportLine> extractor = new BeanWrapperFieldExtractor<>();
        extractor.setNames(new String[]{"orderId", "customerId", "amount", "orderDate", "status"});

        DelimitedLineAggregator<OrderReportLine> aggregator = new DelimitedLineAggregator<>();
        aggregator.setDelimiter(",");
        aggregator.setFieldExtractor(extractor);

        return new FlatFileItemWriterBuilder<OrderReportLine>()
                .name("reportCsvWriter")
                .resource(output)
                .lineAggregator(aggregator)
                .headerCallback(w -> w.write("Order ID,Customer ID,Amount,Date,Status"))
                .shouldDeleteIfEmpty(true)
                .build();
    }

    @Bean
    public Step generateReportStep() {
        return new StepBuilder("generateReportStep", jobRepository)
                .<Order, OrderReportLine>chunk(500, tx)
                .reader(completedOrderReader())
                .processor(orderToReportLineProcessor())
                .writer(reportCsvWriter(null))   // the @Configuration proxy ignores the null and returns the managed bean
                .build();
    }

    @Bean
    public Job generateOrderReportJob() {
        return new JobBuilder("generateOrderReportJob", jobRepository)
                .start(generateReportStep())
                .build();
    }
}

Key Takeaways

  • FlatFileItemWriter writes items as delimited or formatted lines. Use headerCallback / footerCallback for single-line header/footer. Enable .append(true) for accumulating output across runs.
  • JdbcBatchItemWriter sends all chunk items as a single JDBC batch. Use beanMapped() for automatic property-to-parameter mapping.
  • Set rewriteBatchedStatements=true in the MySQL JDBC URL — this is the single biggest throughput improvement for insert-heavy jobs.
  • Set assertUpdates(false) for upserts and conditional updates.
  • Both writers are restartable and transactional. Writers see a chunk at a time, not individual items.

What’s Next

Article 10 covers advanced writers: JpaItemWriter, CompositeItemWriter (fan-out to multiple destinations), ClassifierCompositeItemWriter (routing items to different writers), and building fully custom writers.