Consumer Groups: Parallel Processing and Partition Assignment Strategies

Consumer Group Fundamentals

A consumer group is how Kafka distributes work across multiple consumers. Each partition in a topic is assigned to exactly one consumer instance in the group at any given time.

flowchart TB
    subgraph Topic["Topic: orders — 6 partitions"]
        P0["P0"] 
        P1["P1"] 
        P2["P2"] 
        P3["P3"] 
        P4["P4"] 
        P5["P5"]
    end

    subgraph CG1["Group: inventory-service (3 instances)"]
        I1["Instance 1\nP0, P1"]
        I2["Instance 2\nP2, P3"]
        I3["Instance 3\nP4, P5"]
    end

    subgraph CG2["Group: notification-service (1 instance)"]
        N1["Instance 1\nP0,P1,P2,P3,P4,P5"]
    end

    P0 & P1 --> I1
    P2 & P3 --> I2
    P4 & P5 --> I3
    P0 & P1 & P2 & P3 & P4 & P5 --> N1

Both groups receive all events — they are independent. The inventory service spreads work across 3 instances; the notification service uses a single instance reading all 6 partitions.
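
In Spring Kafka this independence is just a matter of groupId. A minimal sketch, assuming String payloads and the topic from the diagram (class and method names are illustrative):

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class OrderListeners {

    // All instances started with this groupId join the same group and
    // share the 6 partitions of "orders" between them.
    @KafkaListener(topics = "orders", groupId = "inventory-service")
    public void reserveStock(String order) {
        // decrement stock for the order
    }

    // A different groupId is an independent group: it receives every
    // event again, no matter what the group above has consumed.
    @KafkaListener(topics = "orders", groupId = "notification-service")
    public void notifyCustomer(String order) {
        // send the confirmation email
    }
}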


The Four Built-in Partition Assignors

Kafka ships with four built-in partition assignment strategies. The consumer elected as group leader runs the chosen assignor; the coordinator then distributes the resulting assignment to every member.

1. RangeAssignor (the default)

Sorts consumers and partitions, then assigns consecutive partition ranges.

flowchart LR
    subgraph "RangeAssignor — Topic: orders (6 partitions), 3 consumers"
        C1["Consumer 1"] --- |"P0, P1"| A1[" "]
        C2["Consumer 2"] --- |"P2, P3"| A2[" "]
        C3["Consumer 3"] --- |"P4, P5"| A3[" "]
    end

Problem: the ranges are computed per topic, so the consumers at the front of the sorted order collect the extra partition of every topic (see the sketch after this list). With topics orders (3 partitions) and payments (3 partitions) and 2 consumers:

  • Consumer 1: orders-P0, orders-P1, payments-P0, payments-P1 (4 partitions)
  • Consumer 2: orders-P2, payments-P2 (2 partitions)
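
The skew compounds because the division runs per topic. A simplified sketch of RangeAssignor's per-topic arithmetic (illustrative Java, not the actual Kafka source):

public class RangeMath {
    public static void main(String[] args) {
        int partitionsPerTopic = 3;
        int consumers = 2;
        int base = partitionsPerTopic / consumers;   // 1 partition each
        int extra = partitionsPerTopic % consumers;  // first `extra` consumers get one more

        for (int c = 0; c < consumers; c++) {
            int start = c * base + Math.min(c, extra);
            int count = base + (c < extra ? 1 : 0);
            System.out.printf("consumer %d -> partitions [%d..%d)%n", c, start, start + count);
        }
        // Per topic: consumer 0 -> [0..2), consumer 1 -> [2..3).
        // Run independently for orders AND payments, consumer 0 always
        // collects the extra partition: 4 partitions versus 2.
    }
}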

2. RoundRobinAssignor

Distributes partitions one at a time, cycling through consumers.

flowchart LR
    subgraph "RoundRobinAssignor — 6 partitions, 3 consumers"
        P0 -->|"round 1"| C1["Consumer 1\nP0, P3"]
        P1 -->|"round 1"| C2["Consumer 2\nP1, P4"]
        P2 -->|"round 1"| C3["Consumer 3\nP2, P5"]
        P3 -->|"round 2"| C1
        P4 -->|"round 2"| C2
        P5 -->|"round 2"| C3
    end

More balanced than Range for multi-topic subscriptions, but still triggers full reassignment on membership change.

3. StickyAssignor

Distributes partitions evenly AND minimises partition movement on rebalance: consumers keep as many of their current partitions as possible.

flowchart TD
    subgraph Before["Before: 3 consumers, 6 partitions"]
        B1["C1: P0, P1"]
        B2["C2: P2, P3"]
        B3["C3: P4, P5"]
    end

    subgraph After["After C3 leaves — StickyAssignor"]
        A1["C1: P0, P1, P4"]
        A2["C2: P2, P3, P5"]
        Note["Only P4 and P5 moved\n(minimal disruption)"]
    end

    subgraph AfterRR["After C3 leaves — RoundRobinAssignor"]
        AR1["C1: P0, P2, P4"]
        AR2["C2: P1, P3, P5"]
        NoteRR["All 6 partitions reassigned\n(maximal disruption)"]
    end

    Before --> After
    Before --> AfterRR

4. CooperativeStickyAssignor (Best for production)

Same as StickyAssignor but uses incremental (cooperative) rebalancing — only the partitions that need to move are revoked. Consumers keep processing their unchanged partitions during the rebalance.

sequenceDiagram
    participant C1 as Consumer 1\n(P0, P1)
    participant C2 as Consumer 2\n(P2, P3)
    participant C3 as Consumer 3 (new)
    participant Coord as Coordinator

    Note over C1,C2: Eager (classic) rebalance: ALL stop
    Coord->>C1: Revoke ALL partitions
    Coord->>C2: Revoke ALL partitions
    Note over C1,C2: Processing STOPPED for all

    Note over C1,C3: Cooperative rebalance: only moved partitions stop
    Coord->>C2: Revoke only P3 (will go to C3)
    Note over C1,C2: C1 keeps processing P0,P1\nC2 keeps processing P2
    C2->>Coord: Revoke complete
    Coord->>C3: Assign P3
    Note over C1,C3: Zero interruption for unchanged partitions
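
The difference is visible from inside a consumer through the standard rebalance callbacks: under the cooperative protocol, onPartitionsRevoked receives only the partitions that actually move. A sketch (the logging is ours; the interface is Kafka's):

import java.util.Collection;

import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

public class LoggingRebalanceListener implements ConsumerRebalanceListener {

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // Cooperative protocol: only the partitions moving to another member.
        // Eager protocol: everything this consumer owned.
        System.out.println("Revoked: " + partitions);
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        System.out.println("Newly assigned: " + partitions);
    }
}

In Spring Kafka such a listener can be registered on the container through its ContainerProperties (setConsumerRebalanceListener).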

Configuring the Assignor

# application.properties — set for all consumers
spring.kafka.consumer.properties.partition.assignment.strategy=\
  org.apache.kafka.clients.consumer.CooperativeStickyAssignor

Or in @Bean configuration:

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.CooperativeStickyAssignor;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.support.serializer.JsonDeserializer;

@Bean
public ConsumerFactory<String, Object> consumerFactory() {
    Map<String, Object> config = new HashMap<>();
    config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers); // e.g. @Value-injected
    config.put(ConsumerConfig.GROUP_ID_CONFIG, "inventory-service");
    config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
    config.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
        List.of(CooperativeStickyAssignor.class)); // list-valued config
    return new DefaultKafkaConsumerFactory<>(config);
}

Scaling Consumers: The Rules

flowchart TD
    Partitions["Topic: 6 partitions"]

    subgraph Scale1["1 consumer instance"]
        C1_1["Instance 1: P0,P1,P2,P3,P4,P5\n(all partitions, 1 thread)"]
    end

    subgraph Scale2["2 consumer instances"]
        C2_1["Instance 1: P0,P1,P2"]
        C2_2["Instance 2: P3,P4,P5"]
    end

    subgraph Scale3["3 consumer instances"]
        C3_1["Instance 1: P0,P1"]
        C3_2["Instance 2: P2,P3"]
        C3_3["Instance 3: P4,P5"]
    end

    subgraph Scale6["6 consumer instances"]
        C6_1["Instance 1: P0"]
        C6_2["Instance 2: P1"]
        C6_3["Instance 3: P2"]
        C6_4["Instance 4: P3"]
        C6_5["Instance 5: P4"]
        C6_6["Instance 6: P5"]
    end

    subgraph Scale8["8 consumer instances (2 idle)"]
        C8_1["Instance 1: P0"]
        C8_2["..."]
        C8_7["Instance 7: idle ⚠"]
        C8_8["Instance 8: idle ⚠"]
    end

    Partitions --> Scale1
    Partitions --> Scale2
    Partitions --> Scale3
    Partitions --> Scale6
    Partitions --> Scale8

Rules:

  • Maximum parallelism = number of partitions
  • Adding instances beyond partition count creates idle consumers
  • Concurrency within one instance (concurrency=N) creates N consumer threads — each thread is one “consumer” in the group
  • Total active consumers in the group = (instances × concurrency) — capped at partition count

Setting Concurrency in Spring Boot

# Each instance runs 3 consumer threads — 3 partitions per instance
spring.kafka.listener.concurrency=3

With 2 instances and concurrency=3, you have 6 consumer threads total — matching 6 partitions.
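
The same knob exists on the listener container factory if you prefer code over properties. A sketch reusing the consumerFactory bean defined earlier:

import org.springframework.context.annotation.Bean;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;

@Bean
public ConcurrentKafkaListenerContainerFactory<String, Object> kafkaListenerContainerFactory(
        ConsumerFactory<String, Object> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, Object> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    // 3 consumer threads in this instance; with 2 instances that is 6 group
    // members, one per partition of the 6-partition topic.
    factory.setConcurrency(3);
    return factory;
}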


Consumer Group Coordinator

The group coordinator is a broker responsible for managing the group:

sequenceDiagram
    participant C1 as Consumer 1
    participant C2 as Consumer 2
    participant Coord as Group Coordinator\n(a Broker)

    C1->>Coord: FindCoordinator(group=inventory-service)
    Coord-->>C1: Coordinator is Broker 2

    C1->>Coord: JoinGroup (protocols: cooperative-sticky)
    C2->>Coord: JoinGroup (protocols: cooperative-sticky)

    Note over Coord: All members joined\nElect leader (first to join)

    Coord-->>C1: JoinGroupResponse (YOU are leader, member list)
    Coord-->>C2: JoinGroupResponse (follower)

    C1->>C1: Run CooperativeStickyAssignor\nC1→[P0,P1,P2], C2→[P3,P4,P5]

    C1->>Coord: SyncGroup (assignment)
    C2->>Coord: SyncGroup (no assignment — follower)

    Coord-->>C1: SyncGroupResponse [P0,P1,P2]
    Coord-->>C2: SyncGroupResponse [P3,P4,P5]

    Note over C1,C2: Both start consuming their partitions

The group leader (first consumer to join) runs the assignment algorithm and submits it to the coordinator. The coordinator distributes it to all members. This design keeps complex logic in the client, not the broker.
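
The outcome of this handshake can be inspected programmatically. A minimal sketch with Kafka's AdminClient (group name from this chapter; broker address assumed to be localhost:9092):

import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ConsumerGroupDescription;

public class DescribeGroup {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            Map<String, ConsumerGroupDescription> groups = admin
                .describeConsumerGroups(List.of("inventory-service"))
                .all().get();
            ConsumerGroupDescription group = groups.get("inventory-service");
            System.out.println("state = " + group.state());
            // One line per member with its current partition assignment
            group.members().forEach(m -> System.out.println(
                m.consumerId() + " -> " + m.assignment().topicPartitions()));
        }
    }
}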


Static Group Membership

With dynamic membership (the default), every consumer restart triggers a rebalance. With static membership, a consumer that restarts and rejoins within session.timeout.ms gets back its old partition assignment and no rebalance occurs.

sequenceDiagram
    participant C1 as Consumer 1\n(group.instance.id=inv-1)
    participant C2 as Consumer 2\n(group.instance.id=inv-2)
    participant Coord as Coordinator

    Note over C1,C2: Normal operation: C1→P0,P1  C2→P2,P3

    C1->>C1: Crashes (rolling deploy)
    Note over Coord: Static member inv-1 gone\nWaiting up to session.timeout.ms

    C1->>Coord: Re-joins as inv-1 within the timeout
    Coord-->>C1: Re-assign same partitions P0,P1
    Note over C1,C2: No rebalance occurred\nC2 never stopped processing

# application.properties (static membership)
# raise session.timeout.ms from its 45s default so a restart finishes in time
spring.kafka.consumer.properties.group.instance.id=inventory-instance-${HOSTNAME}
spring.kafka.consumer.properties.session.timeout.ms=60000

Static membership is especially useful for rolling deployments: it eliminates the rebalance storms that otherwise accompany deploying a new version of a service.
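
In code the same two settings map to ConsumerConfig constants; a sketch of the extra entries for the consumerFactory() map shown earlier (the hostname-based id mirrors the properties example):

import java.net.InetAddress;

// added to the config map in consumerFactory(); getLocalHost() can throw
// UnknownHostException, so declare or handle it
config.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG,
    "inventory-instance-" + InetAddress.getLocalHost().getHostName());
config.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 60000);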


Monitoring Consumer Groups with CLI

# List all groups
docker exec kafka kafka-consumer-groups.sh \
  --bootstrap-server localhost:9092 --list

# Describe with lag
docker exec kafka kafka-consumer-groups.sh \
  --bootstrap-server localhost:9092 \
  --describe --group inventory-service

# Output:
# GROUP             TOPIC   PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG   CONSUMER-ID
# inventory-service orders  0          1250            1250            0     inv-1-...
# inventory-service orders  1          1180            1185            5     inv-2-...
# inventory-service orders  2          1340            1340            0     inv-2-...

Lag on partition 1 = 5. That consumer is behind — either slower processing or a stuck thread.
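
The same lag numbers can be computed programmatically by comparing the group's committed offsets with each partition's log-end offset. A hedged sketch with the AdminClient (group and broker address as above):

import java.util.Map;
import java.util.Properties;
import java.util.stream.Collectors;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class GroupLag {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // CURRENT-OFFSET per partition: what the group has committed
            Map<TopicPartition, OffsetAndMetadata> committed = admin
                .listConsumerGroupOffsets("inventory-service")
                .partitionsToOffsetAndMetadata().get();
            // LOG-END-OFFSET for the same partitions
            Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> ends = admin
                .listOffsets(committed.keySet().stream()
                    .collect(Collectors.toMap(tp -> tp, tp -> OffsetSpec.latest())))
                .all().get();
            // LAG = log end - committed
            committed.forEach((tp, om) -> System.out.printf(
                "%s lag=%d%n", tp, ends.get(tp).offset() - om.offset()));
        }
    }
}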


Key Takeaways

  • Each partition is assigned to exactly one consumer in the group — this is the fundamental unit of parallel processing
  • CooperativeStickyAssignor is the best production choice: balanced distribution AND minimal partition movement on rebalance
  • Maximum parallelism = number of partitions; extra consumers beyond partition count are idle
  • concurrency=N creates N consumer threads per application instance — total group size = instances × concurrency
  • Static group membership (group.instance.id) eliminates rebalance storms during rolling deployments
  • The group leader (not the coordinator) runs the assignment algorithm — the coordinator only stores and distributes it

Next: Offset Management: Auto-Commit vs Manual Acknowledgment — control exactly when offsets are committed to prevent data loss and duplicate processing.