Reading Flat Files: CSV, Fixed-Width, and Delimited with FlatFileItemReader
Introduction
Flat files — CSV exports, fixed-width mainframe feeds, pipe-delimited data dumps — are the most common input source for batch jobs. Spring Batch’s FlatFileItemReader handles all of them. It is restartable out of the box: it persists its line-number position in the ExecutionContext so that a restarted job resumes exactly where it crashed.
In this article you will build a complete order-import job that reads a CSV file and inserts rows into a MySQL orders table. Along the way you will learn every important configuration option: delimiters, fixed-width ranges, header skipping, comment lines, multi-file reading, custom mappers, and error handling.
How FlatFileItemReader Works
Every FlatFileItemReader needs two things:
- A Resource — the file to read (classpath, filesystem, S3, etc.)
- A LineMapper — converts a raw String line into a domain object
The default DefaultLineMapper splits the work into two phases:
Raw line → LineTokenizer → FieldSet → FieldSetMapper → Domain object
LineTokenizer parses the line into named fields. FieldSetMapper converts those fields into your POJO.
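A minimal sketch of the two phases in isolation (the field names and the input line are made up for illustration):
DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
tokenizer.setNames("customerId", "amount");
// Phase 1: LineTokenizer turns the raw line into a named FieldSet
FieldSet fs = tokenizer.tokenize("101,99.99");
// Phase 2: typed reads against the field names
long customerId = fs.readLong("customerId");      // 101
BigDecimal amount = fs.readBigDecimal("amount");  // 99.99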
Project Setup
Add these dependencies to your pom.xml (if not already present from Article 3):
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-batch</artifactId>
</dependency>
<dependency>
<groupId>com.mysql</groupId>
<artifactId>mysql-connector-j</artifactId>
<scope>runtime</scope>
</dependency>
Create the orders table in MySQL:
CREATE TABLE orders (
order_id BIGINT AUTO_INCREMENT PRIMARY KEY,
customer_id BIGINT NOT NULL,
amount DECIMAL(19, 2) NOT NULL,
order_date DATE NOT NULL,
status VARCHAR(20) NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
INDEX idx_customer (customer_id),
INDEX idx_status (status)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
Reading a CSV File
The domain object
@Data
@NoArgsConstructor
@AllArgsConstructor
public class Order {
private Long customerId;
private BigDecimal amount;
private LocalDate orderDate;
private String status;
}
Sample CSV (src/main/resources/data/orders.csv)
customerId,amount,orderDate,status
101,99.99,2026-05-01,COMPLETED
102,49.99,2026-05-02,PENDING
# test order — skip me
103,199.99,2026-05-03,SHIPPED
Reader configuration
@Bean
public FlatFileItemReader<Order> orderCsvReader(
@Value("classpath:data/orders.csv") Resource resource) {
DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
tokenizer.setDelimiter(",");
tokenizer.setNames("customerId", "amount", "orderDate", "status");
BeanWrapperFieldSetMapper<Order> mapper = new BeanWrapperFieldSetMapper<>();
mapper.setTargetType(Order.class);
DefaultLineMapper<Order> lineMapper = new DefaultLineMapper<>();
lineMapper.setLineTokenizer(tokenizer);
lineMapper.setFieldSetMapper(mapper);
return new FlatFileItemReaderBuilder<Order>()
.name("orderCsvReader") // required for restartability
.resource(resource)
.lineMapper(lineMapper)
.linesToSkip(1) // skip header row
.encoding("UTF-8")
.comments("#") // skip comment lines
.saveState(true) // persist line number on each commit
.build();
}
Key points:
- .name(...) is required — this string becomes the key under which the line number is saved in ExecutionContext.
- .linesToSkip(1) skips the CSV header. Use .skippedLinesCallback(line -> log.info(...)) to validate or log the header.
- .comments("#") silently skips any line starting with # (multiple prefixes are allowed; see the snippet below).
- BeanWrapperFieldSetMapper does fuzzy property matching: "customerId" in the tokenizer maps to setCustomerId() on the POJO. It handles simple type conversions (String → Long, String → BigDecimal) automatically.
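The builder's comments(...) setting is a varargs method, so several prefixes can be registered in one call:
.comments("#", "//") // skip lines starting with either prefix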
Handling dates and custom types
BeanWrapperFieldSetMapper cannot parse LocalDate by default. Two options:
Option A — custom ConversionService (simple, no extra class):
@Bean
public BeanWrapperFieldSetMapper<Order> orderFieldSetMapper() {
BeanWrapperFieldSetMapper<Order> mapper = new BeanWrapperFieldSetMapper<>();
mapper.setTargetType(Order.class);
DefaultConversionService cs = new DefaultConversionService();
cs.addConverter(String.class, LocalDate.class,
s -> LocalDate.parse(s, DateTimeFormatter.ofPattern("yyyy-MM-dd")));
mapper.setConversionService(cs);
return mapper;
}
Option B — implement FieldSetMapper yourself (more control):
public class OrderFieldSetMapper implements FieldSetMapper<Order> {
private static final DateTimeFormatter FMT =
DateTimeFormatter.ofPattern("yyyy-MM-dd");
@Override
public Order mapFieldSet(FieldSet fs) throws BindException {
Order o = new Order();
o.setCustomerId(fs.readLong("customerId"));
o.setAmount(new BigDecimal(fs.readString("amount")));
o.setOrderDate(LocalDate.parse(fs.readString("orderDate"), FMT));
o.setStatus(fs.readString("status").toUpperCase());
return o;
}
}
Use it in the reader:
lineMapper.setFieldSetMapper(new OrderFieldSetMapper());
Reading a Fixed-Width File
Fixed-width files are common in legacy integrations (mainframes, banks, government systems). Each field occupies a fixed character range.
Sample file (trades.dat)
UK21341EAH41211 1.11customer1
UK21341EAH42221 2.22customer2
Fields: ISIN (1–12), quantity (13–15), price (16–20), customer (21–30).
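The reader below binds these fields onto a Trade POJO. A minimal version, mirroring the Order class above (the field types are assumptions; adjust them to your schema):
@Data
@NoArgsConstructor
@AllArgsConstructor
public class Trade {
    private String isin;
    private long quantity;
    private BigDecimal price;   // assumed type, like Order.amount
    private String customer;
}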
Reader configuration
@Bean
public FlatFileItemReader<Trade> tradeFixedWidthReader(
@Value("file:/data/trades.dat") Resource resource) {
FixedLengthTokenizer tokenizer = new FixedLengthTokenizer();
tokenizer.setNames("isin", "quantity", "price", "customer");
tokenizer.setColumns(
new Range(1, 12), // ISIN
new Range(13, 15), // quantity
new Range(16, 20), // price
new Range(21, 30) // customer
);
tokenizer.setStrict(false); // allow lines shorter than the last range end
BeanWrapperFieldSetMapper<Trade> mapper = new BeanWrapperFieldSetMapper<>();
mapper.setTargetType(Trade.class);
DefaultLineMapper<Trade> lineMapper = new DefaultLineMapper<>();
lineMapper.setLineTokenizer(tokenizer);
lineMapper.setFieldSetMapper(mapper);
return new FlatFileItemReaderBuilder<Trade>()
.name("tradeFixedWidthReader")
.resource(resource)
.lineMapper(lineMapper)
.build();
}
Range uses 1-based inclusive positions. setStrict(false) prevents IncorrectLineLengthException when the last field is optional or trailing spaces are trimmed.
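Range also has a single-argument constructor for an open-ended final column, which is handy when the last field is variable-length:
tokenizer.setColumns(
    new Range(1, 12),
    new Range(13, 15),
    new Range(16, 20),
    new Range(21)    // from column 21 to the end of the line
);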
Custom Delimiters
Switch delimiter for pipe-separated or tab-separated files:
tokenizer.setDelimiter("|"); // pipe-separated
tokenizer.setDelimiter("\t"); // tab-separated
tokenizer.setDelimiter(";"); // semicolon-separated
DelimitedLineTokenizer also exposes constants for common delimiters, plus a quote character for fields that contain the delimiter itself:
tokenizer.setDelimiter(DelimitedLineTokenizer.DELIMITER_COMMA);
tokenizer.setQuoteCharacter('"'); // handle quoted fields with embedded commas
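With a quote character configured, embedded delimiters inside quoted fields survive tokenization. A quick sketch with made-up data:
DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
tokenizer.setNames("customer", "amount", "status");
tokenizer.setQuoteCharacter('"');
FieldSet fs = tokenizer.tokenize("\"Smith, John\",250.00,COMPLETED");
// fs.readString("customer") returns: Smith, John  (comma kept, quotes stripped)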
Skipping Headers with a Callback
Use a callback to validate or log the header line rather than silently discarding it:
return new FlatFileItemReaderBuilder<Order>()
.name("orderCsvReader")
.resource(resource)
.lineMapper(lineMapper)
.linesToSkip(1)
.skippedLinesCallback(line -> {
String[] cols = line.split(",");
if (!cols[0].equalsIgnoreCase("customerId")) {
throw new IllegalStateException("Unexpected header: " + line);
}
})
.build();
Multi-Line Records
Some formats spread a single logical record across multiple physical lines. Use a RecordSeparatorPolicy to tell the reader when a record ends.
Records terminated by a semicolon
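Hypothetical input in which the first record spans two physical lines; the reader concatenates lines until isEndOfRecord returns true, then postProcess strips the terminator:
101,99.99,2026-05-01,
COMPLETED;
102,49.99,2026-05-02,PENDING;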
public class SemicolonRecordSeparatorPolicy implements RecordSeparatorPolicy {
@Override
public boolean isEndOfRecord(String line) {
return line.trim().endsWith(";");
}
@Override
public String postProcess(String record) {
// strip the trailing semicolon before handing to LineMapper
return record.endsWith(";")
? record.substring(0, record.length() - 1)
: record;
}
@Override
public String preProcess(String line) {
return line;
}
}
return new FlatFileItemReaderBuilder<Order>()
.name("multiLineReader")
.resource(resource)
.lineMapper(lineMapper)
.recordSeparatorPolicy(new SemicolonRecordSeparatorPolicy())
.build();
Reading Multiple Files with MultiResourceItemReader
When your input arrives as a set of daily or partitioned files, MultiResourceItemReader wraps a delegate FlatFileItemReader and reads them sequentially.
@Bean
public FlatFileItemReader<Order> delegateReader() {
// same CSV reader as above, but NO resource set here
DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
tokenizer.setNames("customerId", "amount", "orderDate", "status");
DefaultLineMapper<Order> lineMapper = new DefaultLineMapper<>();
lineMapper.setLineTokenizer(tokenizer);
lineMapper.setFieldSetMapper(new OrderFieldSetMapper());
return new FlatFileItemReaderBuilder<Order>()
.name("delegateOrderReader")
.lineMapper(lineMapper)
.linesToSkip(1)
.encoding("UTF-8")
.comments("#")
.build();
}
@Bean
public MultiResourceItemReader<Order> multiFileOrderReader(
FlatFileItemReader<Order> delegateReader,
ResourcePatternResolver resolver) throws IOException {
Resource[] files = resolver.getResources("file:/data/orders/orders_*.csv");
// sort for deterministic ordering (critical for restart)
Arrays.sort(files, Comparator.comparing(Resource::getFilename));
return new MultiResourceItemReaderBuilder<Order>()
.name("multiFileOrderReader")
.delegate(delegateReader)
.resources(files)
.build();
}
MultiResourceItemReader persists its current resource index and the delegate’s line number in ExecutionContext. On restart it resumes at the exact file + line where the job failed.
Complete Order Import Job
Putting it all together: CSV reader → chunk-oriented step → MySQL writer. (No processor is needed here; items flow straight from reader to writer.)
Writer
@Bean
public JdbcBatchItemWriter<Order> orderWriter(DataSource dataSource) {
return new JdbcBatchItemWriterBuilder<Order>()
.dataSource(dataSource)
.sql("INSERT INTO orders (customer_id, amount, order_date, status) " +
"VALUES (:customerId, :amount, :orderDate, :status)")
.beanMapped()
.build();
}
Step and Job
@Bean
public Step importOrdersStep(JobRepository jobRepository,
PlatformTransactionManager tx,
FlatFileItemReader<Order> orderCsvReader,
JdbcBatchItemWriter<Order> orderWriter) {
return new StepBuilder("importOrdersStep", jobRepository)
.<Order, Order>chunk(100, tx)
.reader(orderCsvReader)
.writer(orderWriter)
.faultTolerant()
.skip(FlatFileParseException.class)
.skipLimit(50)
.listener(new OrderSkipListener())
.build();
}
@Bean
public Job importOrdersJob(JobRepository jobRepository, Step importOrdersStep) {
return new JobBuilder("importOrdersJob", jobRepository)
.start(importOrdersStep)
.build();
}
Skip listener — log bad lines
public class OrderSkipListener implements SkipListener<Order, Order> {
private static final Logger log = LoggerFactory.getLogger(OrderSkipListener.class);
@Override
public void onSkipInRead(Throwable t) {
if (t instanceof FlatFileParseException ex) {
log.warn("Skipped malformed line {}: [{}]", ex.getLineNumber(), ex.getInput());
}
}
@Override public void onSkipInWrite(Order item, Throwable t) {}
@Override public void onSkipInProcess(Order item, Throwable t) {}
}
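A quick way to exercise the whole job is an integration test. This is a sketch, assuming spring-batch-test and JUnit 5 are on the test classpath; @SpringBatchTest provides the JobLauncherTestUtils bean and wires in the single Job:
@SpringBootTest
@SpringBatchTest
class ImportOrdersJobTest {

    @Autowired
    private JobLauncherTestUtils jobLauncherTestUtils;

    @Test
    void importsAllValidOrders() throws Exception {
        // same identifying parameter used when launching for real
        JobParameters params = new JobParametersBuilder()
                .addString("runDate", "2026-05-03")
                .toJobParameters();
        JobExecution execution = jobLauncherTestUtils.launchJob(params);
        assertEquals(BatchStatus.COMPLETED, execution.getStatus());
    }
}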
application.properties
spring.datasource.url=jdbc:mysql://localhost:3306/batch_db
spring.datasource.username=root
spring.datasource.password=secret
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
spring.batch.jdbc.initialize-schema=always
# don't auto-run jobs at startup; launch explicitly from the command line
spring.batch.job.enabled=false
Running
# Launch via Spring Boot, selecting the job by name (Boot 3 property)
java -jar target/app.jar \
  --spring.batch.job.enabled=true \
  --spring.batch.job.name=importOrdersJob \
  runDate=2026-05-03
Restart in Action
When the job fails at line 4,823 of a large CSV:
1. FlatFileItemReader stores {orderCsvReader.read.count: 4823} in BATCH_STEP_EXECUTION_CONTEXT.
2. You fix the bad data and relaunch with the same runDate parameter.
3. Spring Batch finds the existing FAILED JobInstance, creates a new JobExecution, and restores the step context.
4. FlatFileItemReader.open() reads the persisted count and seeks to line 4,824.
5. Processing resumes — lines 1–4,823 are not reprocessed.
This works automatically as long as:
- The reader has a unique .name(...).
- .saveState(true) is left at its default (true).
- The JobInstance uses the same identifying parameters on relaunch.
Summary
| Scenario | Class / setting | Key config |
|---|---|---|
| CSV file | FlatFileItemReader + DelimitedLineTokenizer | .delimiter(",") |
| Fixed-width | FlatFileItemReader + FixedLengthTokenizer | .setColumns(new Range(...)) |
| Skip header | linesToSkip(1) | Optional callback with skippedLinesCallback |
| Comment lines | .comments("#") | Multiple prefixes allowed |
| Multi-line records | Custom RecordSeparatorPolicy | Override isEndOfRecord |
| Multiple files | MultiResourceItemReader | Sort resources for restart safety |
| Custom type mapping | FieldSetMapper impl | Override mapFieldSet |
| Skip bad lines | .faultTolerant().skip(FlatFileParseException.class) | Log in SkipListener |
What’s Next
Article 6 covers reading from MySQL using JdbcCursorItemReader and JdbcPagingItemReader — including when to choose cursor vs. pagination, thread-safety considerations, and how to use JdbcPagingItemReader in multi-threaded steps.