group

Processor that groups the data in a list based on a grouping key.

Properties

Name Summary

sourceExpr

Optional expression to select the list to be grouped. If not set, the whole payload will be used as the source list. Processing will fail if the source is not iterable.

groupingKeyExpr

Required expression to select the key to group the list elements on. Expression is executed in the standard expression context with an additional element context property that contains the element to be grouped. If no value is produced, the processor will fall back on the defaultGroupingKeyExpr. Processing will fail if no value can be produced by either expression.

defaultGroupingKeyExpr

Optional expression to select a default key if groupingKeyExpr does not produce a value. Expression is executed in the same context as groupingKeyExpr. Processing will fail if no value can be produced by either expression.

outputFormat

Required OutputFormat that defines whether to produce a map of entry lists keyed on the grouping keys (MAP_OF_LISTS) or a list of grouped entry lists (LIST_OF_LISTS).

name

Optional, descriptive name for the processor.

id

Required identifier of the processor, unique across all processors within the flow. Must be between 3 and 30 characters long; contain only lower and uppercase alphabetical characters (a-z and A-Z), numbers, dashes ("-"), and underscores ("_"); and start with an alphabetical character. In other words, it adheres to the regex pattern [a-zA-Z][a-zA-Z0-9_-]{2,29}.

exchangeProperties

Optional set of custom properties, in a simple JDK format, that are added to the message exchange properties before the incoming payload is processed. Any existing properties with the same name are replaced by the properties defined here.

retainPayloadOnFailure

Whether the incoming payload is available for error processing on failure. Defaults to false.
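
The constraints on id described above can be checked before a flow is deployed. The following is a minimal Python sketch that only applies the documented regex pattern; the function name is_valid_processor_id is hypothetical and not part of the product, and uniqueness within the flow is not checked here.

```python
import re

# Pattern quoted in the description of the id property.
ID_PATTERN = re.compile(r"[a-zA-Z][a-zA-Z0-9_-]{2,29}")


def is_valid_processor_id(candidate: str) -> bool:
    """Return True if the whole string matches the documented id pattern."""
    return ID_PATTERN.fullmatch(candidate) is not None


print(is_valid_processor_id("group-by-provider"))  # True
print(is_valid_processor_id("1st-group"))          # False: starts with a digit
print(is_valid_processor_id("ab"))                 # False: shorter than 3 characters
```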

Sub-builders

Name Summary

messageLoggingStrategy

Strategy for describing how a processor’s message is logged on the server.

payloadArchivingStrategy

Strategy for archiving payloads.

inboundTransformationStrategy

Strategy that customizes the conversion of an incoming payload by a processor (e.g., string to object). Should be used when the processor’s default conversion logic cannot be used.

Details

Modes

The processor has two modes of grouping key selection: direct lookup or resolution via expressions.

Direct Grouping Key Lookup

This mode is geared for performance. It only supports cases where the input is a list of maps (such as a list of JSON-compliant maps). If this is your use case, we strongly recommend this mode, because it performs much better than expression based lookup.

Direct mode is activated by prefixing the value of groupingKeyExpr with direct:, so that the format is "direct:<groupingKey>".

In the following example, each map in the input list must contain the providerId key:

groupingKeyExpr = "direct:providerId"

The value under this key will be used as the grouping key.
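
To make the effect of a direct lookup concrete, here is a minimal Python sketch of grouping a list of maps on a top-level key and shaping the result as either output format. It illustrates the documented behavior only and is not the processor's implementation. The function name group_direct and the sample rows are hypothetical, and the ordering of groups in the LIST_OF_LISTS result (first appearance of each key) is an assumption.

```python
from collections import defaultdict


def group_direct(rows, key):
    """Group a list of maps on a top-level key, keeping first-seen key order."""
    groups = defaultdict(list)
    for row in rows:
        # Raises KeyError if the key is missing; fallback handling is left out here.
        groups[row[key]].append(row)
    return dict(groups)


rows = [
    {"providerId": "p1", "value": 10},
    {"providerId": "p2", "value": 20},
    {"providerId": "p1", "value": 30},
]

# Shape corresponding to MAP_OF_LISTS: grouping key -> list of entries.
map_of_lists = group_direct(rows, "providerId")

# Shape corresponding to LIST_OF_LISTS: only the grouped entry lists.
list_of_lists = list(map_of_lists.values())

print(map_of_lists)
print(list_of_lists)
```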

It is also possible to provide a default key in defaultGroupingKeyExpr to use as a fallback if the key in groupingKeyExpr cannot be found. In the following example, the grouping key unknown will be used if a value cannot be found under providerId:

groupingKeyExpr = "direct:providerId"
defaultGroupingKeyExpr = "unknown"
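
The fallback rule can be sketched as follows in Python. This is an illustration under stated assumptions, not the processor's code: the function name resolve_direct_key is hypothetical, and treating a null value the same as a missing key is an assumption.

```python
def resolve_direct_key(row, key, default_key=None):
    """Return the grouping key for one row, falling back on the default key."""
    value = row.get(key)
    if value is not None:
        return value
    if default_key is not None:
        # In direct mode the default is used as a literal key.
        return default_key
    # Mirrors the documented failure when neither setting produces a value.
    raise ValueError(f"no grouping key found under '{key}' and no default set")


print(resolve_direct_key({"providerId": "p1"}, "providerId", "unknown"))  # p1
print(resolve_direct_key({"value": 42}, "providerId", "unknown"))         # unknown
```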

sourceExpr is an expression and is not affected by the grouping key lookup mode. See below for usage of expressions.

Direct key lookup only works on top-level entries, so values nested in maps or lists inside the row map cannot be looked up.

Sharding

Sometimes it is useful to distribute data across a predefined number of groups instead of creating one group per key value. For those cases, direct mode supports generating a bounded shard number from the value of the key lookup. Sharding is activated by appending an entry in the format %<maxNumberOfShards> to the groupingKeyExpr property. For example, "direct:providerId%20".

When sharding is active, the data will be distributed across a maximum of maxNumberOfShards groups. The key for each group will be a non-negative integer, and the number of groups will never exceed maxNumberOfShards. The shard number generation is consistent, so the same grouping key value will always be assigned to the same shard number. If defaultGroupingKeyExpr is used, the key defined there will also be replaced by a shard number.

In the following example, data will be grouped into a maximum of 20 groups:

groupingKeyExpr = "direct:providerId%20"
defaultGroupingKeyExpr = "unknown"

If the data does not contain providerId, the shard number assigned to unknown will be used as the grouping key.
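
The following Python sketch illustrates how a bounded, consistent shard number could be derived from a key value. The hash function the processor actually uses is not documented here, so CRC32 is purely an assumption chosen because it is stable across runs. The function names parse_direct_expr and shard_number are hypothetical.

```python
import zlib


def parse_direct_expr(expr):
    """Split 'direct:<key>' or 'direct:<key>%<maxShards>' into (key, max_shards)."""
    prefix = "direct:"
    if not expr.startswith(prefix):
        raise ValueError("not a direct grouping key expression")
    key, sep, shards = expr[len(prefix):].partition("%")
    return key, (int(shards) if sep else None)


def shard_number(value, max_shards):
    """Map a key value to a non-negative integer below max_shards."""
    # Assumption: CRC32 stands in for the real hash. Unlike Python's built-in
    # hash() on strings, it returns the same number in every process.
    return zlib.crc32(str(value).encode("utf-8")) % max_shards


key, max_shards = parse_direct_expr("direct:providerId%20")
print(key, max_shards)  # providerId 20

for row in [{"providerId": "p1"}, {"providerId": "p1"}, {"value": 42}]:
    # The default key "unknown" is sharded as well when providerId is missing.
    print(shard_number(row.get(key, "unknown"), max_shards))
```

The first two rows always receive the same shard number because they share a key value, and every result lies in the range 0 to 19.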

Expression Based Grouping Key Lookup

This is the default mode for group processing.

Expressions are executed in a context that expands the standard expression context with the element property. element holds the current list element, from which the key is typically looked up.

In the following example, each map in the input list must contain the providerId key:

groupingKeyExpr = "element.providerId"

The value under this key will be used as the grouping key.

It is also possible to provide a default key lookup in defaultGroupingKeyExpr to use as a fallback if the key in groupingKeyExpr cannot be found. In the following example, the grouping key will be looked up from the ownerId property if a value cannot be found under providerId:

groupingKeyExpr = "element.providerId"
defaultGroupingKeyExpr = "element.ownerId"

You can also provide a hardcoded default grouping key by providing a string literal. In the following example, the grouping key will be unknown if a value cannot be found under providerId:

groupingKeyExpr = "element.providerId"
defaultGroupingKeyExpr = "'unknown'"
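
The resolution order in expression mode can be sketched in Python by modeling each expression as a function of the current element. This is only an illustration of the documented fallback behavior: the real expressions are strings evaluated by the expression engine, and the function name group_by_expression and the sample rows are hypothetical.

```python
from collections import defaultdict


def group_by_expression(rows, key_expr, default_key_expr=None):
    """Group rows on key_expr(element), falling back on default_key_expr(element)."""
    groups = defaultdict(list)
    for element in rows:
        key = key_expr(element)
        if key is None and default_key_expr is not None:
            key = default_key_expr(element)
        if key is None:
            # Mirrors the documented failure when neither expression yields a value.
            raise ValueError("no grouping key produced for element")
        groups[key].append(element)
    return dict(groups)


rows = [
    {"providerId": "p1", "ownerId": "o1"},
    {"ownerId": "o2"},
]

# Comparable to defaultGroupingKeyExpr = "element.ownerId"
by_owner = group_by_expression(
    rows, lambda e: e.get("providerId"), lambda e: e.get("ownerId")
)
print(list(by_owner))  # ['p1', 'o2']

# Comparable to defaultGroupingKeyExpr = "'unknown'"
with_literal = group_by_expression(
    rows, lambda e: e.get("providerId"), lambda e: "unknown"
)
print(list(with_literal))  # ['p1', 'unknown']
```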