Advanced Streams: flatMap, Collectors, Grouping, and Partitioning
flatMap in Depth
flatMap is the Streams API's one-to-many transformation. It maps each element to a stream of zero or more results, then flattens all those streams into one continuous stream.
// Without flatMap — stream of lists
Stream<List<String>> nested = Stream.of(
Arrays.asList("a", "b"),
Arrays.asList("c", "d"),
Arrays.asList("e")
);
// Stream<String>: "a", "b", "c", "d", "e"
Stream<String> flat = nested.flatMap(Collection::stream);
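To make the contrast with map concrete, here is a minimal, self-contained sketch. The class name FlatMapBasics and the sample data are illustrative assumptions, not part of the original example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class FlatMapBasics {
    // Hypothetical sample data: three inner lists, five strings in total
    static List<List<String>> nested() {
        return Arrays.asList(
            Arrays.asList("a", "b"),
            Arrays.asList("c", "d"),
            Arrays.asList("e"));
    }

    // map produces one output per input: a stream of three inner streams
    static long countWithMap() {
        return nested().stream().map(List::stream).count();
    }

    // flatMap replaces each inner list with its elements: five strings
    static long countWithFlatMap() {
        return nested().stream().flatMap(List::stream).count();
    }

    static List<String> flattened() {
        return nested().stream().flatMap(List::stream).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(countWithMap());     // 3
        System.out.println(countWithFlatMap()); // 5
        System.out.println(flattened());        // [a, b, c, d, e]
    }
}
```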
Practical flatMap Patterns
Flatten orders into line items:
List<LineItem> allItems = orders.stream()
.flatMap(order -> order.getLineItems().stream())
.collect(Collectors.toList());
Extract unique words from sentences:
List<String> uniqueWords = sentences.stream()
.flatMap(s -> Arrays.stream(s.split("\\s+")))
.map(String::toLowerCase)
.distinct()
.sorted()
.collect(Collectors.toList());
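One caveat with the word-splitting pattern: split("\\s+") yields an empty first token when a sentence starts with whitespace, and a single empty token for an empty sentence. The sketch below filters those out before deduplicating. The class name UniqueWords and the sample sentences are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class UniqueWords {
    static List<String> uniqueWords(List<String> sentences) {
        return sentences.stream()
            .flatMap(s -> Arrays.stream(s.split("\\s+")))
            .filter(w -> !w.isEmpty())   // drop empty tokens from leading whitespace
            .map(String::toLowerCase)
            .distinct()
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical input; the second sentence starts with whitespace
        List<String> sentences = Arrays.asList("The quick fox", "  the lazy dog");
        System.out.println(uniqueWords(sentences)); // [dog, fox, lazy, quick, the]
    }
}
```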
Build permission sets from roles:
Set<Permission> permissions = user.getRoles().stream()
.flatMap(role -> role.getPermissions().stream())
.collect(Collectors.toSet());
Flattening Optionals (Java 8 uses filter and map; Java 9+ can use flatMap):
// Find the first non-empty optional in a list
Optional<String> result = optionals.stream()
.filter(Optional::isPresent)
.map(Optional::get)
.findFirst();
// In Java 9+: optionals.stream().flatMap(Optional::stream).findFirst()
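If you want the flatMap shape while staying on Java 8, one possible approach is to convert each Optional to a zero-or-one element stream by hand. This is a sketch only; the class and method names (FirstPresent, firstPresent, demo) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

public class FirstPresent {
    // An empty Optional becomes an empty stream; a present one becomes a one-element stream.
    static <T> Optional<T> firstPresent(List<Optional<T>> optionals) {
        return optionals.stream()
            .flatMap(o -> o.isPresent() ? Stream.of(o.get()) : Stream.<T>empty())
            .findFirst();
    }

    // Hypothetical sample input
    static Optional<String> demo() {
        List<Optional<String>> optionals = Arrays.asList(
            Optional.<String>empty(), Optional.of("first"), Optional.of("second"));
        return firstPresent(optionals);
    }

    public static void main(String[] args) {
        System.out.println(demo().get()); // first
    }
}
```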
Collectors in Depth
Collectors is a factory class with static methods that produce Collector instances for the collect() terminal operation.
Collecting to a List or Set
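// Each collect() is a terminal operation, so each call needs a fresh stream
List<String> list = names.stream().collect(Collectors.toList());
Set<String> set = names.stream().collect(Collectors.toSet());
// Guaranteed mutable list (toList() makes no guarantee about the type or mutability of the result)
List<String> arrayList = names.stream().collect(Collectors.toCollection(ArrayList::new));
// Unmodifiable (Java 10+); in Java 8, wrap the result with Collections.unmodifiableList
List<String> fixed = names.stream().collect(Collectors.toUnmodifiableList());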
Collecting to a Map (toMap)
// Simple key → value
Map<Long, String> idToName = users.stream()
.collect(Collectors.toMap(User::getId, User::getName));
// Value is the whole object
Map<Long, User> idToUser = users.stream()
.collect(Collectors.toMap(User::getId, Function.identity()));
// Handle duplicate keys with merge function
Map<String, Long> nameToId = users.stream()
.collect(Collectors.toMap(
User::getName,
User::getId,
(existing, replacement) -> existing // keep first on duplicate
));
// Control the map implementation
Map<String, User> nameToUser = users.stream()
.collect(Collectors.toMap(
User::getName,
Function.identity(),
(a, b) -> a,
LinkedHashMap::new // preserve insertion order
));
Pitfall: toMap throws IllegalStateException on duplicate keys unless you provide a merge function. Always provide a merge function if duplicates are possible.
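The pitfall is easy to reproduce. In this sketch (class name and word list are illustrative assumptions), two words share the same first letter, so the two-argument toMap fails while the three-argument form resolves the collision.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ToMapDuplicates {
    // Hypothetical data: "apple" and "avocado" collide on the key 'a'
    static final List<String> WORDS = Arrays.asList("apple", "avocado", "banana");

    // Without a merge function the duplicate key raises IllegalStateException.
    static boolean throwsWithoutMerge() {
        try {
            WORDS.stream().collect(Collectors.toMap(w -> w.charAt(0), Function.identity()));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // With a merge function the first value seen for each key is kept.
    static Map<Character, String> keepFirst() {
        return WORDS.stream().collect(
            Collectors.toMap(w -> w.charAt(0), Function.identity(), (first, second) -> first));
    }

    public static void main(String[] args) {
        System.out.println(throwsWithoutMerge()); // true
        System.out.println(keepFirst().get('a')); // apple
    }
}
```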
Joining Strings
String csv = names.stream().collect(Collectors.joining(", "));
String wrapped = names.stream().collect(Collectors.joining(", ", "[", "]"));
// [Alice, Bob, Charlie]
Counting
long count = stream.collect(Collectors.counting());
// equivalent to stream.count() but usable as a downstream collector
Summing, Averaging, Summarising
int totalAge = users.stream().collect(Collectors.summingInt(User::getAge));
double avgAge = users.stream().collect(Collectors.averagingInt(User::getAge));
IntSummaryStatistics stats = users.stream()
.collect(Collectors.summarizingInt(User::getAge));
// stats.getCount(), getMin(), getMax(), getSum(), getAverage()
minBy / maxBy
Optional<User> oldest = users.stream()
.collect(Collectors.maxBy(Comparator.comparingInt(User::getAge)));
groupingBy
groupingBy groups stream elements into a Map<K, List<T>>, keyed by the result of a classifier function:
// Group users by city
Map<String, List<User>> byCity = users.stream()
.collect(Collectors.groupingBy(User::getCity));
Downstream Collectors
The second argument to groupingBy is a downstream collector that post-processes each group:
// Count users per city
Map<String, Long> countByCity = users.stream()
.collect(Collectors.groupingBy(User::getCity, Collectors.counting()));
// Average age per city
Map<String, Double> avgAgeByCity = users.stream()
.collect(Collectors.groupingBy(
User::getCity,
Collectors.averagingInt(User::getAge)
));
// Collect names per city
Map<String, List<String>> namesByCity = users.stream()
.collect(Collectors.groupingBy(
User::getCity,
Collectors.mapping(User::getName, Collectors.toList())
));
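The three downstream examples can be run end to end with a small stand-in for User. The nested User class and the sample data below are assumptions made for this sketch; the real User type in your codebase will differ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingDemo {
    // Hypothetical stand-in for the User type used in the examples
    static final class User {
        private final String name;
        private final String city;
        private final int age;
        User(String name, String city, int age) {
            this.name = name;
            this.city = city;
            this.age = age;
        }
        String getName() { return name; }
        String getCity() { return city; }
        int getAge() { return age; }
    }

    static final List<User> USERS = Arrays.asList(
        new User("Alice", "London", 30),
        new User("Bob", "Paris", 40),
        new User("Carol", "London", 50));

    static Map<String, Long> countByCity() {
        return USERS.stream()
            .collect(Collectors.groupingBy(User::getCity, Collectors.counting()));
    }

    static Map<String, Double> avgAgeByCity() {
        return USERS.stream()
            .collect(Collectors.groupingBy(User::getCity, Collectors.averagingInt(User::getAge)));
    }

    static Map<String, List<String>> namesByCity() {
        return USERS.stream()
            .collect(Collectors.groupingBy(
                User::getCity,
                Collectors.mapping(User::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(countByCity().get("London"));  // 2
        System.out.println(avgAgeByCity().get("London")); // 40.0
        System.out.println(namesByCity().get("London"));  // [Alice, Carol]
    }
}
```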
Multi-Level Grouping
// Group by country, then by city
Map<String, Map<String, List<User>>> byCountryThenCity = users.stream()
.collect(Collectors.groupingBy(
User::getCountry,
Collectors.groupingBy(User::getCity)
));
// Access: byCountryThenCity.get("UK").get("London")
Controlling the Map Implementation
// TreeMap for sorted keys
TreeMap<String, List<User>> sorted = users.stream()
.collect(Collectors.groupingBy(User::getCity, TreeMap::new, Collectors.toList()));
partitioningBy
partitioningBy divides a stream into exactly two groups — true and false — based on a predicate:
// Partition users: active (true) vs inactive (false)
Map<Boolean, List<User>> partitioned = users.stream()
.collect(Collectors.partitioningBy(User::isActive));
List<User> active = partitioned.get(true);
List<User> inactive = partitioned.get(false);
With downstream:
// Count active vs inactive
Map<Boolean, Long> counts = users.stream()
.collect(Collectors.partitioningBy(User::isActive, Collectors.counting()));
Use partitioningBy over groupingBy with a boolean when you need both sides — it guarantees both true and false keys exist in the result, even if one side is empty.
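The difference shows up when every element lands on one side. In this sketch (class name and data are hypothetical) all numbers are even, so groupingBy never creates a false key while partitioningBy still maps false to an empty list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionVsGroup {
    // Hypothetical data: every element satisfies the predicate
    static final List<Integer> EVENS = Arrays.asList(2, 4, 6);

    static Map<Boolean, List<Integer>> partitioned() {
        return EVENS.stream().collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    static Map<Boolean, List<Integer>> grouped() {
        return EVENS.stream().collect(Collectors.groupingBy(n -> n % 2 == 0));
    }

    public static void main(String[] args) {
        System.out.println(partitioned().get(false));     // []
        System.out.println(grouped().get(false));         // null
        System.out.println(grouped().containsKey(false)); // false
    }
}
```

With groupingBy, code that reads the false side needs a null check or getOrDefault; with partitioningBy it does not.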
mapping and collectingAndThen
mapping — transform before collecting downstream
// Collect names (not User objects) grouped by city
Map<String, List<String>> namesByCity = users.stream()
.collect(Collectors.groupingBy(
User::getCity,
Collectors.mapping(User::getName, Collectors.toList())
));
collectingAndThen — post-process the collected result
// Collect to an unmodifiable list
List<String> immutable = names.stream()
.collect(Collectors.collectingAndThen(
Collectors.toList(),
Collections::unmodifiableList
));
// Get the longest name (null if the list is empty); the finisher unwraps the Optional, so the result is a String
String longest = names.stream()
.collect(Collectors.collectingAndThen(
Collectors.maxBy(Comparator.comparingInt(String::length)),
opt -> opt.orElse(null)
));
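collectingAndThen is also commonly used as a downstream collector, for example to unwrap the Optional that maxBy produces for each group. Inside groupingBy every group has at least one element, so Optional::get is safe there. A minimal sketch, with a hypothetical class name and word list:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class LongestByInitial {
    // Hypothetical data, chosen so no two words in a group have the same length
    static final List<String> WORDS =
        Arrays.asList("apple", "avocado", "banana", "blueberry");

    // Longest word per first letter; the finisher turns Optional<String> into String.
    static Map<Character, String> longestByInitial() {
        return WORDS.stream().collect(Collectors.groupingBy(
            w -> w.charAt(0),
            Collectors.collectingAndThen(
                Collectors.maxBy(Comparator.comparingInt(String::length)),
                Optional::get)));
    }

    public static void main(String[] args) {
        System.out.println(longestByInitial().get('a')); // avocado
        System.out.println(longestByInitial().get('b')); // blueberry
    }
}
```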
Custom Collectors
For complex aggregations not covered by built-in collectors, implement Collector<T, A, R>:
- T: type of stream elements
- A: mutable accumulation type (intermediate container)
- R: result type
public class JoiningCollector implements Collector<String, StringJoiner, String> {
    private final String delimiter;
    public JoiningCollector(String delimiter) { this.delimiter = delimiter; }
    @Override
    public Supplier<StringJoiner> supplier() {
        return () -> new StringJoiner(delimiter);
    }
    @Override
    public BiConsumer<StringJoiner, String> accumulator() {
        return StringJoiner::add;
    }
    @Override
    public BinaryOperator<StringJoiner> combiner() {
        return StringJoiner::merge; // for parallel streams
    }
    @Override
    public Function<StringJoiner, String> finisher() {
        return StringJoiner::toString;
    }
    @Override
    public Set<Characteristics> characteristics() {
        return Collections.emptySet();
    }
}
In practice, Collector.of is simpler:
Collector<String, StringJoiner, String> joiner =
Collector.of(
() -> new StringJoiner(", "), // supplier
StringJoiner::add, // accumulator
StringJoiner::merge, // combiner
StringJoiner::toString // finisher
);
String result = Stream.of("a", "b", "c").collect(joiner); // "a, b, c"
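To see how the four functions fit together, it can help to drive a collector by hand. The sketch below roughly mimics what a parallel collect() might do with two chunks of input; the class name CollectorByHand and the chunk boundaries are assumptions for illustration, and the real splitting is decided by the stream implementation.

```java
import java.util.StringJoiner;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class CollectorByHand {
    static final Collector<String, StringJoiner, String> JOINER = Collector.of(
        () -> new StringJoiner(", "), // supplier
        StringJoiner::add,            // accumulator
        StringJoiner::merge,          // combiner
        StringJoiner::toString);      // finisher

    // Manually performs supplier -> accumulator -> combiner -> finisher.
    static String collectByHand() {
        StringJoiner left = JOINER.supplier().get();   // container for chunk 1
        JOINER.accumulator().accept(left, "a");
        JOINER.accumulator().accept(left, "b");
        StringJoiner right = JOINER.supplier().get();  // container for chunk 2
        JOINER.accumulator().accept(right, "c");
        StringJoiner merged = JOINER.combiner().apply(left, right);
        return JOINER.finisher().apply(merged);
    }

    // The same collector used through the Streams API on a parallel stream.
    static String collectWithStream() {
        return Stream.of("a", "b", "c").parallel().collect(JOINER);
    }

    public static void main(String[] args) {
        System.out.println(collectByHand());     // a, b, c
        System.out.println(collectWithStream()); // a, b, c
    }
}
```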
Real-World Examples
Report: Sales by Region and Product Category
// Map<region, Map<category, totalRevenue>>
Map<String, Map<String, Double>> salesReport = orders.stream()
.collect(Collectors.groupingBy(
Order::getRegion,
Collectors.groupingBy(
Order::getCategory,
Collectors.summingDouble(Order::getRevenue)
)
));
Histogram: Character Frequency
Map<Character, Long> frequency = "hello world".chars()
.mapToObj(c -> (char) c)
.filter(c -> c != ' ')
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
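A histogram usually reads better with sorted keys. One way to get that, sketched here with a hypothetical CharHistogram class, is the three-argument groupingBy with TreeMap::new as the map factory.

```java
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CharHistogram {
    // Counts non-space characters; TreeMap keeps the keys in natural order.
    static TreeMap<Character, Long> frequency(String text) {
        return text.chars()
            .mapToObj(c -> (char) c)
            .filter(c -> c != ' ')
            .collect(Collectors.groupingBy(
                Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(frequency("hello world"));
        // {d=1, e=1, h=1, l=3, o=2, r=1, w=1}
    }
}
```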
Index: Build a Search Index from Documents
// Inverted index: Map<word, Set<documentId>>
// Map.entry requires Java 9+; on Java 8, use new AbstractMap.SimpleEntry<>(key, value)
Map<String, Set<Long>> invertedIndex = documents.stream()
.flatMap(doc ->
Arrays.stream(doc.getContent().split("\\s+"))
.map(word -> Map.entry(word.toLowerCase(), doc.getId()))
)
.collect(Collectors.groupingBy(
Map.Entry::getKey,
Collectors.mapping(Map.Entry::getValue, Collectors.toSet())
));
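For readers still on Java 8, here is a runnable sketch of the same inverted index that uses AbstractMap.SimpleEntry in place of Map.entry. The Doc class, the postings helper, and the sample documents are all hypothetical.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class InvertedIndexDemo {
    // Hypothetical minimal document type
    static final class Doc {
        final long id;
        final String content;
        Doc(long id, String content) {
            this.id = id;
            this.content = content;
        }
    }

    // One (lowercased word, documentId) pair per word in the document
    static Stream<Map.Entry<String, Long>> postings(Doc doc) {
        return Arrays.stream(doc.content.split("\\s+"))
            .map(word -> new AbstractMap.SimpleEntry<String, Long>(word.toLowerCase(), doc.id));
    }

    static Map<String, Set<Long>> build(List<Doc> docs) {
        return docs.stream()
            .flatMap(InvertedIndexDemo::postings)
            .collect(Collectors.groupingBy(
                Map.Entry::getKey,
                Collectors.mapping(Map.Entry::getValue, Collectors.toSet())));
    }

    // Hypothetical sample corpus
    static Map<String, Set<Long>> demo() {
        return build(Arrays.asList(
            new Doc(1L, "Java streams"),
            new Doc(2L, "java collectors")));
    }

    public static void main(String[] args) {
        System.out.println(demo().get("java").size());    // 2
        System.out.println(demo().get("streams"));        // [1]
    }
}
```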
Summary
| Collector | Output type | Use case |
|---|---|---|
| toList() | List<T> | Ordered collection |
| toSet() | Set<T> | Dedup, unordered |
| toMap(k, v) | Map<K,V> | Key-value lookup |
| joining(delim) | String | Concatenate strings |
| groupingBy(f) | Map<K, List<T>> | Group by field |
| groupingBy(f, downstream) | Map<K, R> | Group + post-process |
| partitioningBy(p) | Map<Boolean, List<T>> | Split into two groups |
| counting() | Long | Count per group |
| summingInt/Long/Double | Integer/Long/Double | Sum per group |
| averagingInt/Long/Double | Double | Average per group |
| mapping(f, downstream) | varies | Transform before collecting |
| collectingAndThen(c, f) | varies | Post-process result |
Part of the DevOps Monk Java tutorial series: Java 8 → Java 11 → Java 17 → Java 21