Streams API: Lazy Pipelines and the Functional Data Model
The Problem Streams Solve
Consider filtering a list of orders to find the names of active premium customers who spent over $500, sorted alphabetically:
// Java 7
List<String> result = new ArrayList<>();
for (Order order : orders) {
    if (order.isActive() && order.isPremium() && order.getTotal() > 500) {
        result.add(order.getCustomerName());
    }
}
Collections.sort(result);
This code is correct but reveals nothing about the structure of the computation at a glance. You have to read every line to understand what’s happening.
// Java 8 Streams
List<String> result = orders.stream()
    .filter(Order::isActive)
    .filter(Order::isPremium)
    .filter(o -> o.getTotal() > 500)
    .map(Order::getCustomerName)
    .sorted()
    .collect(Collectors.toList());
The pipeline reads top-to-bottom like a specification: filter active, filter premium, filter by amount, extract names, sort, collect. The structure of the computation is immediately visible.
What Is a Stream?
A Stream<T> is a sequence of elements that supports sequential and parallel aggregate operations. Key properties:
- Not a data structure — a stream does not hold data; it moves data through a pipeline
- Lazy — intermediate operations are not executed until a terminal operation is called
- Single-use — once a terminal operation is called, the stream is consumed and cannot be reused
- Non-destructive — stream operations never modify the source collection
- Optionally parallel — swap .stream() for .parallelStream() and the pipeline runs in parallel
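Two of these properties can be checked directly. A minimal sketch (class and method names are illustrative):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamProperties {
    // Uppercases via a stream pipeline; the input list is never modified
    static List<String> upperCase(List<String> source) {
        return source.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("alpha", "beta", "gamma");
        System.out.println(upperCase(source)); // [ALPHA, BETA, GAMMA]
        System.out.println(source);            // unchanged: [alpha, beta, gamma]

        // Single-use: a second terminal operation on the same stream throws
        Stream<String> s = source.stream();
        s.count();
        try {
            s.count(); // IllegalStateException: already operated upon or closed
        } catch (IllegalStateException e) {
            System.out.println("stream already consumed");
        }
    }
}
```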
Creating Streams
From Collections
List<String> list = Arrays.asList("a", "b", "c");
Stream<String> stream = list.stream();
Stream<String> parallel = list.parallelStream();
From Arrays
String[] arr = {"x", "y", "z"};
Stream<String> stream = Arrays.stream(arr);
// Partial array
Stream<String> partial = Arrays.stream(arr, 1, 3); // "y", "z"
From Values (Stream.of)
Stream<String> stream = Stream.of("a", "b", "c");
Stream<String> empty = Stream.empty();
Stream<String> single = Stream.of("one");
From a Range (IntStream, LongStream)
IntStream range = IntStream.range(0, 10); // 0..9
IntStream rangeClosed = IntStream.rangeClosed(1, 5); // 1..5
LongStream longRange = LongStream.range(0L, 100L);
Infinite Streams
// Stream.iterate: seed + unary operator
Stream<Integer> naturals = Stream.iterate(0, n -> n + 1);
// 0, 1, 2, 3, 4, ...
// Stream.generate: supplier
Stream<Double> randoms = Stream.generate(Math::random);
// Always limit infinite streams before collecting
naturals.limit(10).forEach(System.out::println);
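From Java 9 onward, Stream.iterate also has a bounded three-argument overload (seed, hasNext predicate, next function), which removes the need for a separate limit. A sketch:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class BoundedIterate {
    public static void main(String[] args) {
        // Java 9+: iterate(seed, hasNext, next) stops when the predicate fails
        List<Integer> powers = Stream.iterate(1, n -> n <= 100, n -> n * 2)
                .collect(Collectors.toList());
        System.out.println(powers); // [1, 2, 4, 8, 16, 32, 64]
    }
}
```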
From Files and I/O
// Lines of a file (Java 8+)
try (Stream<String> lines = Files.lines(Paths.get("data.txt"))) {
    lines.filter(l -> !l.isEmpty()) // String::isBlank needs Java 11+; isEmpty works on Java 8
         .forEach(System.out::println);
}
// IMPORTANT: always use try-with-resources — streams backed by I/O hold a resource
// Files in a directory
try (Stream<Path> paths = Files.list(Paths.get("."))) {
    paths.filter(Files::isRegularFile)
         .forEach(System.out::println);
}
From String Characters
// IntStream of char values
"Hello".chars().forEach(c -> System.out.print((char) c));
The Pipeline Model
A stream pipeline has three parts:
source → [intermediate ops]* → terminal op
- Source — a collection, array, or generator
- Intermediate operations — transform the stream; they return a new Stream and are lazy
- Terminal operation — produces a result or side effect; it triggers execution of the entire pipeline
List<String> result = names.stream()   // 1. source
    .filter(s -> s.length() > 3)       // 2. intermediate
    .map(String::toUpperCase)          // 2. intermediate
    .sorted()                          // 2. intermediate
    .collect(Collectors.toList());     // 3. terminal — triggers execution
Without the terminal operation, nothing runs. This is the key insight:
Stream<String> pipeline = names.stream()
    .filter(s -> { System.out.println("filter: " + s); return s.length() > 3; })
    .map(s -> { System.out.println("map: " + s); return s.toUpperCase(); });
// Nothing printed yet — pipeline not started
System.out.println("Before terminal");
List<String> result = pipeline.collect(Collectors.toList()); // NOW it runs
Output:
Before terminal
filter: Alice
map: Alice
filter: Bob
filter: Charlie
map: Charlie
Note the interleaving: Java processes elements one at a time through the pipeline, not stage-by-stage.
Intermediate Operations
filter — keep elements matching a Predicate
stream.filter(s -> s.startsWith("A"))
stream.filter(Predicate.not(String::isEmpty)) // Java 11+
map — transform elements with a Function
stream.map(String::toUpperCase)
stream.map(s -> s.length()) // Stream<String> → Stream<Integer>
mapToInt / mapToLong / mapToDouble
Avoid boxing overhead when mapping to primitives:
// Creates IntStream, not Stream<Integer> — no boxing
IntStream lengths = names.stream().mapToInt(String::length);
int totalLength = lengths.sum();
// Box back to object stream if needed
Stream<Integer> boxed = IntStream.range(0, 10).boxed();
flatMap — flatten nested streams
// Each order has a list of items
List<Item> allItems = orders.stream()
    .flatMap(order -> order.getItems().stream())
    .collect(Collectors.toList());
// Split sentences into words
List<String> words = sentences.stream()
    .flatMap(s -> Arrays.stream(s.split(" ")))
    .distinct()
    .collect(Collectors.toList());
flatMap maps each element to a stream, then flattens all those streams into one.
distinct — remove duplicates
stream.distinct() // uses equals/hashCode
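Because distinct relies on equals/hashCode, a value class must implement both for de-duplication to work. A sketch with a hypothetical Point class:

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DistinctDemo {
    // Hypothetical value class; equals/hashCode make distinct() meaningful
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }
        @Override public int hashCode() { return Objects.hash(x, y); }
        @Override public String toString() { return "(" + x + "," + y + ")"; }
    }

    public static void main(String[] args) {
        List<Point> unique = Stream.of(new Point(1, 2), new Point(1, 2), new Point(3, 4))
                .distinct() // duplicate (1,2) dropped because equals/hashCode match
                .collect(Collectors.toList());
        System.out.println(unique); // [(1,2), (3,4)]
    }
}
```

Without the equals/hashCode overrides, both (1,2) instances would survive, since Object's defaults compare identity.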
sorted — sort elements
stream.sorted() // natural order (Comparable)
stream.sorted(Comparator.reverseOrder()) // reverse natural
stream.sorted(Comparator.comparing(Person::getAge)) // by field
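Comparators compose with thenComparing, which keeps multi-key sorts declarative. A sketch with a minimal Person class (the fields are illustrative):

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class SortDemo {
    static final class Person {
        final String name;
        final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
        String getName() { return name; }
        int getAge() { return age; }
    }

    public static void main(String[] args) {
        List<String> names = Stream.of(
                    new Person("Bea", 30), new Person("Al", 30), new Person("Cy", 25))
                // primary key: age ascending; tie-break: name ascending
                .sorted(Comparator.comparingInt(Person::getAge)
                        .thenComparing(Person::getName))
                .map(Person::getName)
                .collect(Collectors.toList());
        System.out.println(names); // [Cy, Al, Bea]
    }
}
```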
limit — take the first N elements
stream.limit(5)
skip — skip the first N elements
stream.skip(10) // useful for pagination
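Combining skip and limit gives simple in-memory pagination. A minimal sketch (the page method and zero-based page numbering are illustrative choices):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class Pagination {
    // Returns one page of results; page numbering starts at 0
    static List<Integer> page(List<Integer> items, int pageNumber, int pageSize) {
        return items.stream()
                .skip((long) pageNumber * pageSize) // drop earlier pages
                .limit(pageSize)                    // keep one page
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> all = IntStream.rangeClosed(1, 10).boxed().collect(Collectors.toList());
        System.out.println(page(all, 0, 3)); // [1, 2, 3]
        System.out.println(page(all, 2, 3)); // [7, 8, 9]
        System.out.println(page(all, 3, 3)); // [10]
    }
}
```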
peek — inspect elements without consuming (debugging)
stream.peek(s -> System.out.println("Before map: " + s))
      .map(String::toUpperCase)
      .peek(s -> System.out.println("After map: " + s))
Use peek only for debugging. Do not use it for side effects in production code — its behaviour with short-circuiting and parallel streams is unreliable.
Terminal Operations
collect — accumulate into a collection
The most common terminal operation. See Article 7 for the full Collectors guide.
List<String> list = stream.collect(Collectors.toList());
Set<String> set = stream.collect(Collectors.toSet());
String joined = stream.collect(Collectors.joining(", "));
forEach — execute a Consumer for each element
stream.forEach(System.out::println);
Order is not guaranteed for parallel streams. Use forEachOrdered if order matters.
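A sketch showing forEachOrdered preserving encounter order even on a parallel stream (the synchronized list is just a thread-safe sink for the example):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

public class OrderedDemo {
    public static void main(String[] args) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        IntStream.rangeClosed(1, 5)
                .parallel()
                .boxed()
                .forEachOrdered(seen::add); // encounter order preserved, even in parallel
        System.out.println(seen); // [1, 2, 3, 4, 5]
    }
}
```

With plain forEach the same pipeline could emit the elements in any order.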
count
long count = stream.filter(s -> s.length() > 3).count();
reduce — fold elements into one value
Optional<Integer> sum = numbers.stream().reduce((a, b) -> a + b);
int sumWithIdentity = numbers.stream().reduce(0, Integer::sum);
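reduce also has a three-argument form (identity, accumulator, combiner) that maps while it folds; parallel pipelines use the combiner to merge per-thread partial results. A sketch:

```java
import java.util.Arrays;
import java.util.List;

public class ReduceDemo {
    public static void main(String[] args) {
        List<String> words = Arrays.asList("alpha", "beta", "gamma");

        // identity, accumulator (folds one element in), combiner (merges partial sums)
        int totalLength = words.parallelStream()
                .reduce(0,
                        (sum, word) -> sum + word.length(), // Integer + String -> Integer
                        Integer::sum);                      // merges thread-local sums
        System.out.println(totalLength); // 14
    }
}
```

The combiner must be associative and consistent with the accumulator, otherwise parallel runs can produce wrong answers.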
findFirst / findAny
Optional<String> first = stream.filter(s -> s.startsWith("A")).findFirst();
Optional<String> any = parallelStream.filter(s -> s.startsWith("A")).findAny();
findAny is faster in parallel pipelines when you don’t care which matching element you get.
anyMatch / allMatch / noneMatch
Short-circuit terminal operations:
boolean hasLong = names.stream().anyMatch(s -> s.length() > 10);
boolean allShort = names.stream().allMatch(s -> s.length() < 20);
boolean noneEmpty = names.stream().noneMatch(String::isEmpty);
anyMatch stops as soon as it finds a match; allMatch stops as soon as it finds a non-match.
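The short-circuiting is observable by counting how many elements actually reach the predicate; a sketch using peek as a debugging counter (names are illustrative):

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ShortCircuitDemo {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("Al", "Beatrice", "Cy", "Dominique");
        AtomicInteger checked = new AtomicInteger();

        boolean hasLong = names.stream()
                .peek(s -> checked.incrementAndGet()) // counts elements that flow through
                .anyMatch(s -> s.length() > 5);

        System.out.println(hasLong);       // true
        System.out.println(checked.get()); // 2: stopped at "Beatrice"
    }
}
```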
min / max
Optional<String> shortest = names.stream().min(Comparator.comparingInt(String::length));
Optional<String> longest = names.stream().max(Comparator.comparingInt(String::length));
toArray
Object[] arr = stream.toArray();
String[] strArr = stream.toArray(String[]::new); // array constructor reference
sum / average / summaryStatistics (primitive streams)
int total = numbers.stream().mapToInt(Integer::intValue).sum();
OptionalDouble avg = numbers.stream().mapToInt(Integer::intValue).average();
IntSummaryStatistics stats = numbers.stream()
    .mapToInt(Integer::intValue)
    .summaryStatistics();
// stats.getCount(), getSum(), getMin(), getMax(), getAverage()
Laziness and Short-Circuiting
Laziness means intermediate operations accumulate a description of the pipeline without doing work. The terminal operation drives execution.
Short-circuiting means some operations stop processing early:
// Only processes elements until it finds the first match
Optional<String> first = Stream.iterate(0, n -> n + 1)
    .map(n -> n * n)
    .filter(n -> n > 100)
    .map(Object::toString)
    .findFirst();
// Does NOT process all integers — stops after finding 121
Short-circuiting terminal operations: findFirst, findAny, anyMatch, allMatch, noneMatch. limit is also short-circuiting, but it is an intermediate operation: it truncates the stream rather than terminating the pipeline.
Stateful vs. Stateless Operations
Stateless operations process each element independently: filter, map, flatMap, peek, mapToInt.
Stateful operations require seeing other elements to produce a result: sorted, distinct, limit, skip.
Stateful operations can be expensive in parallel streams because they require coordination across threads.
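The cost difference is observable even in a sequential pipeline by counting how many elements each version actually processes; a sketch:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

public class StatefulDemo {
    public static void main(String[] args) {
        // Stateless path: limit short-circuits, so map touches only 3 elements
        AtomicInteger statelessCalls = new AtomicInteger();
        IntStream.range(0, 1_000)
                .map(n -> { statelessCalls.incrementAndGet(); return n * 2; })
                .limit(3)
                .sum();
        System.out.println(statelessCalls.get()); // 3

        // Stateful path: sorted must buffer every element before emitting any
        AtomicInteger statefulCalls = new AtomicInteger();
        IntStream.range(0, 1_000)
                .map(n -> { statefulCalls.incrementAndGet(); return n * 2; })
                .sorted()
                .limit(3)
                .sum();
        System.out.println(statefulCalls.get()); // 1000
    }
}
```

Inserting sorted before limit forces the whole source through the pipeline, because no element can be emitted until all have been seen.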
Common Pitfalls
Reusing a Stream
Stream<String> stream = names.stream().filter(s -> s.length() > 3);
stream.collect(Collectors.toList()); // OK
stream.count(); // THROWS: IllegalStateException: stream has already been operated upon or closed
Always create a new stream from the source for each pipeline.
Forgetting the Terminal Operation
// Does nothing — no terminal op
names.stream().filter(s -> s.length() > 3).map(String::toUpperCase);
// Fix: add terminal op
names.stream().filter(s -> s.length() > 3).map(String::toUpperCase).forEach(System.out::println);
Using forEach When collect Is Better
// WRONG: side-effectful, not thread-safe, harder to reason about
List<String> result = new ArrayList<>();
names.stream().filter(s -> s.length() > 3).forEach(result::add);
// RIGHT
List<String> result = names.stream().filter(s -> s.length() > 3).collect(Collectors.toList());
Summary
| Concept | Key point |
|---|---|
| Lazy evaluation | Intermediate ops accumulate; terminal op triggers execution |
| Pipeline model | source → intermediate ops → terminal op |
| Stateless vs stateful | filter/map are stateless; sorted/distinct are stateful |
| Short-circuiting | findFirst, anyMatch, limit stop early |
| Single use | A consumed stream cannot be reused |
Next Step
Advanced Streams: flatMap, Collectors, Grouping, and Partitioning →
Part of the DevOps Monk Java tutorial series: Java 8 → Java 11 → Java 17 → Java 21