Part 6 of 16

Streams API: Lazy Pipelines and the Functional Data Model

The Problem Streams Solve

Consider filtering a list of orders to find the names of active premium customers who spent over $500, sorted alphabetically:

// Java 7
List<String> result = new ArrayList<>();
for (Order order : orders) {
    if (order.isActive() && order.isPremium() && order.getTotal() > 500) {
        result.add(order.getCustomerName());
    }
}
Collections.sort(result);

This code is correct but reveals nothing about the structure of the computation at a glance. You have to read every line to understand what’s happening.

// Java 8 Streams
List<String> result = orders.stream()
    .filter(Order::isActive)
    .filter(Order::isPremium)
    .filter(o -> o.getTotal() > 500)
    .map(Order::getCustomerName)
    .sorted()
    .collect(Collectors.toList());

The pipeline reads top-to-bottom like a specification: filter active, filter premium, filter by amount, extract names, sort, collect. The structure of the computation is immediately visible.


What Is a Stream?

A Stream<T> is a sequence of elements that supports sequential and parallel aggregate operations. Key properties:

  • Not a data structure — a stream does not hold data; it moves data through a pipeline
  • Lazy — intermediate operations are not executed until a terminal operation is called
  • Single-use — once a terminal operation is called, the stream is consumed and cannot be reused
  • Non-destructive — stream operations never modify the source collection
  • Optionally parallel — swap .stream() for .parallelStream() and the pipeline runs in parallel
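
The non-destructive property is easy to check: run a pipeline over a list and the source is untouched afterwards. A minimal sketch with a throwaway names list:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

List<String> names = Arrays.asList("Alice", "Bob", "Charlie");

List<String> upper = names.stream()
        .map(String::toUpperCase)       // produces new values
        .collect(Collectors.toList());  // accumulates into a NEW list

System.out.println(upper);  // [ALICE, BOB, CHARLIE]
System.out.println(names);  // [Alice, Bob, Charlie] — source unchanged
```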

Creating Streams

From Collections

List<String> list = Arrays.asList("a", "b", "c");
Stream<String> stream = list.stream();
Stream<String> parallel = list.parallelStream();

From Arrays

String[] arr = {"x", "y", "z"};
Stream<String> stream = Arrays.stream(arr);

// Partial array
Stream<String> partial = Arrays.stream(arr, 1, 3); // "y", "z"

From Values (Stream.of)

Stream<String> stream = Stream.of("a", "b", "c");
Stream<String> empty  = Stream.empty();
Stream<String> single = Stream.of("one");

From a Range (IntStream, LongStream)

IntStream range = IntStream.range(0, 10);      // 0..9
IntStream rangeClosed = IntStream.rangeClosed(1, 5); // 1..5
LongStream longRange = LongStream.range(0L, 100L);

Infinite Streams

// Stream.iterate: seed + unary operator
Stream<Integer> naturals = Stream.iterate(0, n -> n + 1);
// 0, 1, 2, 3, 4, ...

// Stream.generate: supplier
Stream<Double> randoms = Stream.generate(Math::random);

// Always limit infinite streams before collecting
naturals.limit(10).forEach(System.out::println);

From Files and I/O

// Lines of a file (Files.lines is Java 8+; String.isBlank is Java 11+)
try (Stream<String> lines = Files.lines(Paths.get("data.txt"))) {
    lines.filter(l -> !l.isBlank())
         .forEach(System.out::println);
}
// IMPORTANT: always use try-with-resources — streams backed by I/O hold a resource

// Files in a directory
try (Stream<Path> paths = Files.list(Paths.get("."))) {
    paths.filter(Files::isRegularFile)
         .forEach(System.out::println);
}

From String Characters

// IntStream of char values
"Hello".chars().forEach(c -> System.out.print((char) c));

The Pipeline Model

A stream pipeline has three parts:

source → [intermediate ops]* → terminal op

  1. Source — a collection, array, or generator
  2. Intermediate operations — transform the stream; they return a new Stream and are lazy
  3. Terminal operation — produces a result or side effect; it triggers execution of the entire pipeline

List<String> result = names.stream()           // 1. source
    .filter(s -> s.length() > 3)               // 2. intermediate
    .map(String::toUpperCase)                  // 2. intermediate
    .sorted()                                  // 2. intermediate
    .collect(Collectors.toList());             // 3. terminal — triggers execution

Without the terminal operation, nothing runs. This is the key insight:

Stream<String> pipeline = names.stream()
    .filter(s -> { System.out.println("filter: " + s); return s.length() > 3; })
    .map(s -> { System.out.println("map: " + s); return s.toUpperCase(); });

// Nothing printed yet — pipeline not started
System.out.println("Before terminal");

List<String> result = pipeline.collect(Collectors.toList()); // NOW it runs

Output:

Before terminal
filter: Alice
map: Alice
filter: Bob
filter: Charlie
map: Charlie

Note the interleaving: Java processes elements one at a time through the pipeline, not stage-by-stage.


Intermediate Operations

filter — keep elements matching a Predicate

stream.filter(s -> s.startsWith("A"))
stream.filter(Predicate.not(String::isEmpty))  // Java 11+

map — transform elements with a Function

stream.map(String::toUpperCase)
stream.map(s -> s.length())  // Stream<String> → Stream<Integer>

mapToInt / mapToLong / mapToDouble

Avoid boxing overhead when mapping to primitives:

// Creates IntStream, not Stream<Integer> — no boxing
IntStream lengths = names.stream().mapToInt(String::length);
int totalLength = lengths.sum();

// Box back to object stream if needed
Stream<Integer> boxed = IntStream.range(0, 10).boxed();

flatMap — flatten nested streams

// Each order has a list of items
List<Item> allItems = orders.stream()
    .flatMap(order -> order.getItems().stream())
    .collect(Collectors.toList());

// Split sentences into words
List<String> words = sentences.stream()
    .flatMap(s -> Arrays.stream(s.split(" ")))
    .distinct()
    .collect(Collectors.toList());

flatMap maps each element to a stream, then flattens all those streams into one.

distinct — remove duplicates

stream.distinct()  // uses equals/hashCode

sorted — sort elements

stream.sorted()                                    // natural order (Comparable)
stream.sorted(Comparator.reverseOrder())           // reverse natural
stream.sorted(Comparator.comparing(Person::getAge)) // by field

limit — take the first N elements

stream.limit(5)

skip — skip the first N elements

stream.skip(10)  // useful for pagination
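
Combined, skip and limit implement simple offset-based pagination. A sketch with hypothetical page and pageSize values over a generated list:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

int page = 2;      // zero-based page index (hypothetical)
int pageSize = 3;

List<Integer> items = IntStream.rangeClosed(1, 10).boxed().collect(Collectors.toList());

List<Integer> pageItems = items.stream()
        .skip((long) page * pageSize)  // skip the earlier pages
        .limit(pageSize)               // take exactly one page
        .collect(Collectors.toList());
// pageItems: [7, 8, 9]
```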

peek — inspect elements without consuming (debugging)

stream.peek(s -> System.out.println("Before map: " + s))
      .map(String::toUpperCase)
      .peek(s -> System.out.println("After map: " + s))

Use peek only for debugging. Do not use it for side effects in production code — its behaviour with short-circuiting and parallel streams is unreliable.


Terminal Operations

collect — accumulate into a collection

The most common terminal operation. See Article 7 for the full Collectors guide.

List<String> list   = stream.collect(Collectors.toList());
Set<String> set     = stream.collect(Collectors.toSet());
String joined       = stream.collect(Collectors.joining(", "));

forEach — execute a Consumer for each element

stream.forEach(System.out::println);

Order is not guaranteed for parallel streams. Use forEachOrdered if order matters.
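
The difference is visible on a parallel stream — forEach may emit elements in whatever order threads finish, while forEachOrdered preserves encounter order:

```java
import java.util.Arrays;
import java.util.List;

List<Integer> nums = Arrays.asList(1, 2, 3, 4, 5);

// May print in any order, e.g. 34152 — whichever thread finishes first wins
nums.parallelStream().forEach(System.out::print);
System.out.println();

// Always prints 12345 — encounter order preserved, at some cost to parallelism
nums.parallelStream().forEachOrdered(System.out::print);
```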

count

long count = stream.filter(s -> s.length() > 3).count();

reduce — fold elements into one value

Optional<Integer> sum = numbers.stream().reduce((a, b) -> a + b);
int sumWithIdentity = numbers.stream().reduce(0, Integer::sum);
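
reduce also has a three-argument form, reduce(identity, accumulator, combiner), meant for parallel pipelines where partial results from different threads must be merged. A sketch computing total name length:

```java
import java.util.Arrays;
import java.util.List;

List<String> names = Arrays.asList("Alice", "Bob", "Charlie");

// identity, accumulator (folds one element into a partial result),
// combiner (merges partial results from different threads)
int totalLength = names.parallelStream()
        .reduce(0,
                (partial, name) -> partial + name.length(),
                Integer::sum);
// totalLength == 15
```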

findFirst / findAny

Optional<String> first = stream.filter(s -> s.startsWith("A")).findFirst();
Optional<String> any   = parallelStream.filter(s -> s.startsWith("A")).findAny();

findAny is faster in parallel pipelines when you don’t care which matching element you get.

anyMatch / allMatch / noneMatch

Short-circuit terminal operations:

boolean hasLong   = names.stream().anyMatch(s -> s.length() > 10);
boolean allShort  = names.stream().allMatch(s -> s.length() < 20);
boolean noneEmpty = names.stream().noneMatch(String::isEmpty);

anyMatch stops as soon as it finds a match; allMatch stops as soon as it finds a non-match.
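
One subtlety worth knowing: on an empty stream, allMatch and noneMatch return true (the condition holds vacuously), while anyMatch returns false:

```java
import java.util.stream.Stream;

boolean all  = Stream.<String>empty().allMatch(s -> s.length() > 3);  // true (vacuously)
boolean none = Stream.<String>empty().noneMatch(String::isEmpty);     // true (vacuously)
boolean any  = Stream.<String>empty().anyMatch(s -> s.length() > 3);  // false
```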

min / max

Optional<String> shortest = names.stream().min(Comparator.comparingInt(String::length));
Optional<String> longest  = names.stream().max(Comparator.comparingInt(String::length));

toArray

Object[] arr   = stream.toArray();
String[] strArr = stream.toArray(String[]::new);  // constructor reference

sum / average / summaryStatistics (primitive streams)

int total = numbers.stream().mapToInt(Integer::intValue).sum();
OptionalDouble avg = numbers.stream().mapToInt(Integer::intValue).average();

IntSummaryStatistics stats = numbers.stream()
    .mapToInt(Integer::intValue)
    .summaryStatistics();
// stats.getCount(), getSum(), getMin(), getMax(), getAverage()

Laziness and Short-Circuiting

Laziness means intermediate operations accumulate a description of the pipeline without doing work. The terminal operation drives execution.

Short-circuiting means some operations stop processing early:

// Only processes elements until it finds the first match
Optional<String> first = Stream.iterate(0, n -> n + 1)
    .map(n -> n * n)
    .filter(n -> n > 100)
    .map(Object::toString)
    .findFirst();
// Does NOT process all integers — stops after finding 121

Short-circuiting operations: findFirst, findAny, anyMatch, allMatch, and noneMatch are terminal; limit is a short-circuiting intermediate operation, since it truncates the stream.


Stateful vs. Stateless Operations

Stateless operations process each element independently: filter, map, flatMap, peek, mapToInt.

Stateful operations require seeing other elements to produce a result: sorted, distinct, limit, skip.

Stateful operations can be expensive in parallel streams because they require coordination across threads.
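
The cost is visible with peek: a stateless stage sees elements one at a time, but sorted must buffer the entire stream before emitting anything downstream:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

List<Integer> sorted = Arrays.asList(3, 1, 2).stream()
        .peek(n -> System.out.println("upstream:   " + n))
        .sorted()  // stateful: buffers every element before emitting any
        .peek(n -> System.out.println("downstream: " + n))
        .collect(Collectors.toList());
// All three "upstream" lines print before the first "downstream" line
```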


Common Pitfalls

Reusing a Stream

Stream<String> stream = names.stream().filter(s -> s.length() > 3);
stream.collect(Collectors.toList()); // OK
stream.count(); // THROWS: IllegalStateException: stream has already been operated upon or closed

Always create a new stream from the source for each pipeline.
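
If you need the same pipeline more than once, one common pattern is to wrap its construction in a Supplier so each use gets a fresh stream:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;
import java.util.stream.Stream;

List<String> names = Arrays.asList("Alice", "Bob", "Charlie");

// Each get() builds a fresh stream over the same source
Supplier<Stream<String>> longNames =
        () -> names.stream().filter(s -> s.length() > 3);

long count = longNames.get().count();                             // 2
List<String> list = longNames.get().collect(Collectors.toList()); // [Alice, Charlie]
```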

Forgetting the Terminal Operation

// Does nothing — no terminal op
names.stream().filter(s -> s.length() > 3).map(String::toUpperCase);

// Fix: add terminal op
names.stream().filter(s -> s.length() > 3).map(String::toUpperCase).forEach(System.out::println);

Using forEach When collect Is Better

// WRONG: side-effectful, not thread-safe, harder to reason about
List<String> result = new ArrayList<>();
names.stream().filter(s -> s.length() > 3).forEach(result::add);

// RIGHT
List<String> result = names.stream().filter(s -> s.length() > 3).collect(Collectors.toList());

Summary

Concept                  Key point
Lazy evaluation          Intermediate ops accumulate; terminal op triggers execution
Pipeline model           source → intermediate ops → terminal op
Stateless vs stateful    filter/map are stateless; sorted/distinct are stateful
Short-circuiting         findFirst, anyMatch, limit stop early
Single use               A consumed stream cannot be reused

Next Step

Advanced Streams: flatMap, Collectors, Grouping, and Partitioning →

Part of the DevOps Monk Java tutorial series: Java 8 · Java 11 · Java 17 · Java 21