Reading Flat Files: CSV, Fixed-Width, and Delimited with FlatFileItemReader

Introduction

Flat files — CSV exports, fixed-width mainframe feeds, pipe-delimited data dumps — are the most common input source for batch jobs. Spring Batch’s FlatFileItemReader handles all of them. It is restartable out of the box: it persists its line-number position in the ExecutionContext so that a restarted job resumes exactly where it crashed.

In this article you will build a complete order-import job that reads a CSV file and inserts rows into a MySQL orders table. Along the way you will learn every important configuration option: delimiters, fixed-width ranges, header skipping, comment lines, multi-file reading, custom mappers, and error handling.


How FlatFileItemReader Works

Every FlatFileItemReader needs two things:

  1. A Resource — the file to read (classpath, filesystem, S3, etc.)
  2. A LineMapper — converts a raw String line into a domain object

The default DefaultLineMapper splits the work into two phases:

Raw line  →  LineTokenizer  →  FieldSet  →  FieldSetMapper  →  Domain object

LineTokenizer parses the line into named fields. FieldSetMapper converts those fields into your POJO.
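Stripped of the Spring types, the two-phase flow can be sketched in plain Java. This is only an illustration of the division of labor (the names `tokenize`, `mapFieldSet`, and `ParsedOrder` are invented for the demo, not Spring Batch API):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TwoPhaseDemo {

    // Phase 1 (LineTokenizer's job): split the raw line into named fields,
    // producing a simplified stand-in for a FieldSet
    static Map<String, String> tokenize(String line, String... names) {
        String[] tokens = line.split(",");
        Map<String, String> fieldSet = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            fieldSet.put(names[i], tokens[i]);
        }
        return fieldSet;
    }

    // Phase 2 (FieldSetMapper's job): convert named fields into a domain object
    record ParsedOrder(long customerId, String status) {}

    static ParsedOrder mapFieldSet(Map<String, String> fs) {
        return new ParsedOrder(Long.parseLong(fs.get("customerId")), fs.get("status"));
    }

    public static void main(String[] args) {
        Map<String, String> fs = tokenize("101,COMPLETED", "customerId", "status");
        ParsedOrder o = mapFieldSet(fs);
        System.out.println(o.customerId() + " " + o.status());
    }
}
```

Keeping the phases separate is what lets you swap a DelimitedLineTokenizer for a FixedLengthTokenizer later without touching the mapping code.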


Project Setup

Add these dependencies to your pom.xml (if not already present from Article 3):

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-batch</artifactId>
</dependency>
<dependency>
    <groupId>com.mysql</groupId>
    <artifactId>mysql-connector-j</artifactId>
    <scope>runtime</scope>
</dependency>

Create the orders table in MySQL:

CREATE TABLE orders (
    order_id    BIGINT AUTO_INCREMENT PRIMARY KEY,
    customer_id BIGINT         NOT NULL,
    amount      DECIMAL(19, 2) NOT NULL,
    order_date  DATE           NOT NULL,
    status      VARCHAR(20)    NOT NULL,
    created_at  TIMESTAMP      DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_customer (customer_id),
    INDEX idx_status   (status)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

Reading a CSV File

The domain object

@Data
@NoArgsConstructor
@AllArgsConstructor
public class Order {
    private Long    customerId;
    private BigDecimal amount;
    private LocalDate  orderDate;
    private String     status;
}

Sample CSV (src/main/resources/data/orders.csv)

customerId,amount,orderDate,status
101,99.99,2026-05-01,COMPLETED
102,49.99,2026-05-02,PENDING
# test order — skip me
103,199.99,2026-05-03,SHIPPED

Reader configuration

@Bean
public FlatFileItemReader<Order> orderCsvReader(
        @Value("classpath:data/orders.csv") Resource resource) {

    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
    tokenizer.setDelimiter(",");
    tokenizer.setNames("customerId", "amount", "orderDate", "status");

    BeanWrapperFieldSetMapper<Order> mapper = new BeanWrapperFieldSetMapper<>();
    mapper.setTargetType(Order.class);

    DefaultLineMapper<Order> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(tokenizer);
    lineMapper.setFieldSetMapper(mapper);

    return new FlatFileItemReaderBuilder<Order>()
            .name("orderCsvReader")         // required for restartability
            .resource(resource)
            .lineMapper(lineMapper)
            .linesToSkip(1)                 // skip header row
            .encoding("UTF-8")
            .comments("#")                  // skip comment lines
            .saveState(true)                // persist line number on each commit
            .build();
}

Key points:

  • .name(...) is required — this string becomes the key under which the line number is saved in ExecutionContext.
  • .linesToSkip(1) skips the CSV header. Use .skippedLinesCallback(line -> log.info(...)) to validate or log the header.
  • .comments("#") silently skips any line starting with #.
  • BeanWrapperFieldSetMapper does fuzzy property matching: "customerId" in the tokenizer maps to setCustomerId() on the POJO. It handles simple type conversions (String → Long, String → BigDecimal) automatically.

Handling dates and custom types

BeanWrapperFieldSetMapper cannot parse LocalDate by default. Two options:

Option A — custom ConversionService (simple, no extra class):

@Bean
public BeanWrapperFieldSetMapper<Order> orderFieldSetMapper() {
    BeanWrapperFieldSetMapper<Order> mapper = new BeanWrapperFieldSetMapper<>();
    mapper.setTargetType(Order.class);

    DefaultConversionService cs = new DefaultConversionService();
    cs.addConverter(String.class, LocalDate.class,
            s -> LocalDate.parse(s, DateTimeFormatter.ofPattern("yyyy-MM-dd")));
    mapper.setConversionService(cs);
    return mapper;
}

Option B — implement FieldSetMapper yourself (more control):

public class OrderFieldSetMapper implements FieldSetMapper<Order> {

    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd");

    @Override
    public Order mapFieldSet(FieldSet fs) throws BindException {
        Order o = new Order();
        o.setCustomerId(fs.readLong("customerId"));
        o.setAmount(new BigDecimal(fs.readString("amount")));
        o.setOrderDate(LocalDate.parse(fs.readString("orderDate"), FMT));
        o.setStatus(fs.readString("status").toUpperCase());
        return o;
    }
}

Use it in the reader:

lineMapper.setFieldSetMapper(new OrderFieldSetMapper());

Reading a Fixed-Width File

Fixed-width files are common in legacy integrations (mainframes, banks, government systems). Each field occupies a fixed character range.

Sample file (trades.dat)

UK21341EAH41211 1.11customer1 
UK21341EAH42221 2.22customer2 

Fields: ISIN (1–12), quantity (13–15), price (16–20), customer (21–30).
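The Trade class itself is not shown in this article; a minimal shape compatible with BeanWrapperFieldSetMapper (which needs a no-arg constructor and setters matching the tokenizer's field names) might look like the following. The field types are assumptions; with Lombok on the classpath, @Data would replace the accessors just as it does for Order:

```java
import java.math.BigDecimal;

// Assumed domain object for the fixed-width example; property names must
// match the names passed to tokenizer.setNames(...)
public class Trade {
    private String isin;
    private long quantity;
    private BigDecimal price;
    private String customer;

    public String getIsin() { return isin; }
    public void setIsin(String isin) { this.isin = isin; }
    public long getQuantity() { return quantity; }
    public void setQuantity(long quantity) { this.quantity = quantity; }
    public BigDecimal getPrice() { return price; }
    public void setPrice(BigDecimal price) { this.price = price; }
    public String getCustomer() { return customer; }
    public void setCustomer(String customer) { this.customer = customer; }
}
```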

Reader configuration

@Bean
public FlatFileItemReader<Trade> tradeFixedWidthReader(
        @Value("file:/data/trades.dat") Resource resource) {

    FixedLengthTokenizer tokenizer = new FixedLengthTokenizer();
    tokenizer.setNames("isin", "quantity", "price", "customer");
    tokenizer.setColumns(
            new Range(1, 12),   // ISIN
            new Range(13, 15),  // quantity
            new Range(16, 20),  // price
            new Range(21, 30)   // customer
    );
    tokenizer.setStrict(false); // allow lines shorter than the last range end

    BeanWrapperFieldSetMapper<Trade> mapper = new BeanWrapperFieldSetMapper<>();
    mapper.setTargetType(Trade.class);

    DefaultLineMapper<Trade> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(tokenizer);
    lineMapper.setFieldSetMapper(mapper);

    return new FlatFileItemReaderBuilder<Trade>()
            .name("tradeFixedWidthReader")
            .resource(resource)
            .lineMapper(lineMapper)
            .build();
}

Range uses 1-based inclusive positions. setStrict(false) prevents IncorrectLineLengthException when the last field is optional or trailing spaces are trimmed.
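The 1-based inclusive convention maps onto Java's 0-based, end-exclusive substring as shown below. This is a simplified plain-Java sketch of the extraction (the real FixedLengthTokenizer builds a FieldSet and, in strict mode, throws IncorrectLineLengthException instead of clamping); the sample line is padded to match the declared ranges:

```java
public class RangeDemo {

    // Extract a 1-based, inclusive [min, max] column range from a line,
    // clamping at the line end (roughly what strict=false allows)
    static String cut(String line, int min, int max) {
        int end = Math.min(max, line.length());
        return line.substring(min - 1, end).trim(); // FieldSet trims padding on read
    }

    public static void main(String[] args) {
        String line = "UK21341EAH41211 1.11customer1 ";
        System.out.println(cut(line, 1, 12));   // ISIN
        System.out.println(cut(line, 13, 15));  // quantity
        System.out.println(cut(line, 16, 20));  // price
        System.out.println(cut(line, 21, 30));  // customer
    }
}
```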


Custom Delimiters

Switch delimiter for pipe-separated or tab-separated files:

tokenizer.setDelimiter("|");   // pipe-separated
tokenizer.setDelimiter("\t");  // tab-separated
tokenizer.setDelimiter(";");   // semicolon-separated

DelimitedLineTokenizer also handles quoted fields, so a delimiter inside quotes is treated as data rather than as a separator:

tokenizer.setDelimiter(DelimitedLineTokenizer.DELIMITER_COMMA); // constant for ","
tokenizer.setQuoteCharacter('"');  // keep embedded commas inside quoted fields
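What the quote character buys you can be shown with a minimal hand-rolled splitter. This is a sketch of the behavior, not the actual DelimitedLineTokenizer implementation (which also handles escaped quotes and configurable quote characters):

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedSplitDemo {

    // Split on commas, but keep commas that appear inside double-quoted fields
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : line.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;          // toggle quoted state, drop the quote
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString()); // field boundary
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());         // last field
        return fields;
    }

    public static void main(String[] args) {
        // "Smith, John" survives as a single field
        System.out.println(split("101,\"Smith, John\",PENDING"));
    }
}
```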

Skipping Headers with a Callback

Use a callback to validate or log the header line rather than silently discarding it:

return new FlatFileItemReaderBuilder<Order>()
        .name("orderCsvReader")
        .resource(resource)
        .lineMapper(lineMapper)
        .linesToSkip(1)
        .skippedLinesCallback(line -> {
            String[] cols = line.split(",");
            if (!cols[0].equalsIgnoreCase("customerId")) {
                throw new IllegalStateException("Unexpected header: " + line);
            }
        })
        .build();

Multi-Line Records

Some formats spread a single logical record across multiple physical lines. Use a RecordSeparatorPolicy to tell the reader when a record ends.

Records terminated by a semicolon

public class SemicolonRecordSeparatorPolicy implements RecordSeparatorPolicy {

    @Override
    public boolean isEndOfRecord(String line) {
        return line.trim().endsWith(";");
    }

    @Override
    public String postProcess(String record) {
        // strip the trailing semicolon before handing to LineMapper
        return record.endsWith(";")
                ? record.substring(0, record.length() - 1)
                : record;
    }

    @Override
    public String preProcess(String line) {
        return line;
    }
}

Wire the policy into the reader:

return new FlatFileItemReaderBuilder<Order>()
        .name("multiLineReader")
        .resource(resource)
        .lineMapper(lineMapper)
        .recordSeparatorPolicy(new SemicolonRecordSeparatorPolicy())
        .build();
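The accumulation behavior can be seen in isolation below. This plain-Java sketch mirrors (in simplified form) how the reader keeps appending physical lines until isEndOfRecord returns true, then hands the post-processed record to the LineMapper:

```java
import java.util.ArrayList;
import java.util.List;

public class RecordAccumulatorDemo {

    static boolean isEndOfRecord(String line) {
        return line.trim().endsWith(";");
    }

    static String postProcess(String record) {
        // strip the trailing semicolon, as in SemicolonRecordSeparatorPolicy
        return record.endsWith(";") ? record.substring(0, record.length() - 1) : record;
    }

    // Combine physical lines into logical records
    static List<String> readRecords(List<String> physicalLines) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : physicalLines) {
            current.append(line);                       // keep accumulating...
            if (isEndOfRecord(current.toString())) {    // ...until the record ends
                records.add(postProcess(current.toString()));
                current.setLength(0);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> records = readRecords(List.of(
                "101,99.99,", "2026-05-01,COMPLETED;",
                "102,49.99,2026-05-02,PENDING;"));
        records.forEach(System.out::println);
    }
}
```

The first logical record spans two physical lines; the second fits on one.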

Reading Multiple Files with MultiResourceItemReader

When your input arrives as a set of daily or partitioned files, MultiResourceItemReader wraps a delegate FlatFileItemReader and reads them sequentially.

@Bean
public FlatFileItemReader<Order> delegateReader() {
    // same CSV reader as above, but NO resource set here
    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
    tokenizer.setNames("customerId", "amount", "orderDate", "status");

    DefaultLineMapper<Order> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(tokenizer);
    lineMapper.setFieldSetMapper(new OrderFieldSetMapper());

    return new FlatFileItemReaderBuilder<Order>()
            .name("delegateOrderReader")
            .lineMapper(lineMapper)
            .linesToSkip(1)
            .encoding("UTF-8")
            .comments("#")
            .build();
}

@Bean
public MultiResourceItemReader<Order> multiFileOrderReader(
        FlatFileItemReader<Order> delegateReader,
        ResourcePatternResolver resolver) throws IOException {

    Resource[] files = resolver.getResources("file:/data/orders/orders_*.csv");
    // sort for deterministic ordering (critical for restart)
    Arrays.sort(files, Comparator.comparing(Resource::getFilename));

    return new MultiResourceItemReaderBuilder<Order>()
            .name("multiFileOrderReader")
            .delegate(delegateReader)
            .resources(files)
            .build();
}

MultiResourceItemReader persists its current resource index and the delegate’s line number in ExecutionContext. On restart it resumes at the exact file + line where the job failed.


Complete Order Import Job

Putting it all together: CSV reader → JDBC writer → MySQL (no processor is needed for a straight import).

Writer

@Bean
public JdbcBatchItemWriter<Order> orderWriter(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<Order>()
            .dataSource(dataSource)
            .sql("INSERT INTO orders (customer_id, amount, order_date, status) " +
                 "VALUES (:customerId, :amount, :orderDate, :status)")
            .beanMapped()
            .build();
}

Step and Job

@Bean
public Step importOrdersStep(JobRepository jobRepository,
                              PlatformTransactionManager tx,
                              FlatFileItemReader<Order> orderCsvReader,
                              JdbcBatchItemWriter<Order> orderWriter) {
    return new StepBuilder("importOrdersStep", jobRepository)
            .<Order, Order>chunk(100, tx)
            .reader(orderCsvReader)
            .writer(orderWriter)
            .faultTolerant()
            .skip(FlatFileParseException.class)
            .skipLimit(50)
            .listener(new OrderSkipListener())
            .build();
}

@Bean
public Job importOrdersJob(JobRepository jobRepository, Step importOrdersStep) {
    return new JobBuilder("importOrdersJob", jobRepository)
            .start(importOrdersStep)
            .build();
}

Skip listener — log bad lines

public class OrderSkipListener implements SkipListener<Order, Order> {

    private static final Logger log = LoggerFactory.getLogger(OrderSkipListener.class);

    @Override
    public void onSkipInRead(Throwable t) {
        if (t instanceof FlatFileParseException ex) {
            log.warn("Skipped malformed line {}: [{}]", ex.getLineNumber(), ex.getInput());
        }
    }

    @Override public void onSkipInWrite(Order item, Throwable t) {}
    @Override public void onSkipInProcess(Order item, Throwable t) {}
}

application.properties

spring.datasource.url=jdbc:mysql://localhost:3306/batch_db
spring.datasource.username=root
spring.datasource.password=secret
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver

spring.batch.jdbc.initialize-schema=always
spring.batch.job.enabled=false

Running

# Spring Boot's job launcher runs the named job at startup; since
# application.properties disables it, re-enable it on the command line
java -jar target/app.jar \
  --spring.batch.job.enabled=true \
  --spring.batch.job.name=importOrdersJob \
  runDate=2026-05-03

Restart in Action

When the job fails at line 4,823 of a large CSV:

  1. FlatFileItemReader stores {orderCsvReader.read.count: 4823} in BATCH_STEP_EXECUTION_CONTEXT.
  2. You fix the bad data and relaunch with the same runDate parameter.
  3. Spring Batch finds the existing FAILED JobInstance, creates a new JobExecution, and restores the step context.
  4. FlatFileItemReader.open() reads the persisted count and seeks to line 4,824.
  5. Processing resumes — lines 1–4,823 are not reprocessed.

This works automatically as long as:

  • The reader has a unique .name(...).
  • .saveState(true) (the default).
  • The JobInstance uses the same identifying parameters on relaunch.
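Conceptually, the seek on restart is just read-and-discard: the reader's open() re-reads the file up to the persisted count (internally via jumpToItem()) before handing back the first unprocessed line. A plain-Java sketch of that idea:

```java
import java.io.BufferedReader;
import java.io.StringReader;

public class RestartSeekDemo {

    // Skip the first `readCount` lines, as the reader effectively
    // does when restoring its position from the ExecutionContext
    static String resumeAt(BufferedReader reader, int readCount) throws Exception {
        for (int i = 0; i < readCount; i++) {
            reader.readLine();    // already-processed line: discard
        }
        return reader.readLine(); // first unprocessed line
    }

    public static void main(String[] args) throws Exception {
        BufferedReader r = new BufferedReader(new StringReader("a\nb\nc\nd\n"));
        System.out.println(resumeAt(r, 2)); // resumes at the third line
    }
}
```

Because the reader must replay skipped lines, restarting very deep into a huge file still costs I/O, even though no items are reprocessed.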

Summary

Scenario            | Class / API                                          | Key config
--------------------|------------------------------------------------------|--------------------------------------------
CSV file            | FlatFileItemReader + DelimitedLineTokenizer          | .delimiter(",")
Fixed-width         | FlatFileItemReader + FixedLengthTokenizer            | .setColumns(new Range(...))
Skip header         | .linesToSkip(1)                                      | Optional callback with skippedLinesCallback
Comment lines       | .comments("#")                                       | Multiple prefixes allowed
Multi-line records  | Custom RecordSeparatorPolicy                         | Override isEndOfRecord
Multiple files      | MultiResourceItemReader                              | Sort resources for restart safety
Custom type mapping | FieldSetMapper implementation                        | Override mapFieldSet
Skip bad lines      | .faultTolerant().skip(FlatFileParseException.class)  | Log in SkipListener

What’s Next

Article 6 covers reading from MySQL using JdbcCursorItemReader and JdbcPagingItemReader — including when to choose cursor vs. pagination, thread-safety considerations, and how to use JdbcPagingItemReader in multi-threaded steps.