receiveFromKafka

Source processor that consumes messages from a Kafka topic and forwards them to the flow pipeline. The processor reads each record’s value as binary data. To transform the binary data to another format, such as JSON, add an inboundTransformationStrategy in a downstream processor.

Scaling with Consumer Groups

receiveFromKafka is a singleton source — only one flow instance will be active at a time. To balance work across the cluster with multiple consumers, create multiple flows that subscribe to the same topic and use the same groupId. Kafka’s consumer group protocol will automatically distribute partitions across the consumers in the group.

For this to work, the topic needs multiple partitions (each partition can only be consumed by one consumer in a group at a time). The number of partitions sets the upper bound on parallelism within a single consumer group.
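
The partition limit can be illustrated with a simplified model. The sketch below deals partitions out to consumers one at a time; it is not Kafka's actual assignor implementation, and the flow names are hypothetical. It only shows that consumers beyond the partition count receive nothing.

```python
def assign_round_robin(partitions, consumers):
    """Deal partitions out to consumers one at a time (simplified model)."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Hypothetical setup: a topic with 3 partitions and 4 flows sharing one groupId.
assignment = assign_round_robin(
    partitions=[0, 1, 2],
    consumers=["flow-a", "flow-b", "flow-c", "flow-d"],
)

# Consumers that ended up with no partition do no work.
idle = [consumer for consumer, owned in assignment.items() if not owned]

print(assignment)
print("idle consumers:", idle)
```

With three partitions, the fourth flow stays idle, so adding flows beyond the partition count does not increase throughput.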

Consumers in different groups each receive a full, independent copy of the stream and track their own offsets separately, which is useful when multiple applications need to independently consume the same data.

Delivery Modes

This source supports two delivery modes, selected by the flow’s deliveryGuarantee property.

ByFlowServer

This is Flow Server Managed Delivery. It is the default mode and applies when deliveryGuarantee is set to ByFlowServer.

In this mode, the flow server persists each message before processing it. The Kafka consumer commits the offset only after the message is stored safely. This provides an at-least-once delivery guarantee on the flow server side, independent of Kafka. If the flow server restarts, persisted messages are replayed.

BySourceSystem

This is Source System Managed Delivery, enabled when deliveryGuarantee is set to BySourceSystem.

In this mode, messages go directly to the flow pipeline without persistence in the flow server. The Kafka consumer manages delivery guarantees by committing offsets only after the flow server confirms successful processing. If the flow server restarts, Kafka redelivers any uncommitted messages.

Use this mode for high-throughput workloads when Flow Server Managed Delivery persistence becomes a bottleneck. The tradeoff is that delivery guarantees depend on Kafka offset management rather than the flow server.

When a message fails, the flow server retries according to the flow’s redeliveryStrategy. If retries are exhausted and a dead letter flow is configured, the failed message is forwarded to it. If flow server-level recovery is exhausted, the configured nackStrategy decides the final action:

SKIP: The message is skipped, and the partition offset advances.
PAUSE: The partition is paused until the problem is resolved.

Behavioral Characteristics

Design downstream processors to be idempotent. Offsets are committed sequentially. If message 6 is still processing while messages 7 and 8 have already succeeded, the committed offset remains at 5. If the flow server crashes at that point, Kafka redelivers messages 6, 7, and 8 on restart, even though 7 and 8 were already processed.
A failed message does not block the partition when nackStrategy is SKIP. When a message fails, NACK blocks offset commits from advancing, but later messages continue processing. This prevents one problematic message from blocking the partition, but it widens the redelivery window if a crash occurs before the failure is resolved.
Backpressure does not trigger rebalancing. When the pipeline is slower than Kafka, partitions pause automatically. The consumer continues sending heartbeats and remains in the group.
Consumer group changes can cause redelivery. Scaling, restarts, and rebalances transfer partition ownership to another consumer. Any in-flight messages on the old consumer are abandoned, and the new consumer resumes from the last committed offset.
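
The sequential commit behavior in the first item above can be sketched as a small tracker that only advances across a contiguous run of completed offsets. This is an illustrative model, not the flow server's implementation; the class name is hypothetical, and it follows the text's convention that the committed offset names the last fully processed message.

```python
class OffsetTracker:
    """Tracks completed offsets and exposes the highest contiguous one."""

    def __init__(self, last_committed):
        self.committed = last_committed
        self.done = set()  # completed offsets that are still ahead of a gap

    def ack(self, offset):
        """Record a completed offset and advance across any closed gap."""
        self.done.add(offset)
        while self.committed + 1 in self.done:
            self.committed += 1
            self.done.remove(self.committed)
        return self.committed


tracker = OffsetTracker(last_committed=5)

# Messages 7 and 8 finish while 6 is still processing.
tracker.ack(7)
tracker.ack(8)
committed_with_gap = tracker.committed

# A crash here would redeliver everything after the committed offset.
redelivered_after_crash = [6, 7, 8]

# Once 6 completes, the committed offset jumps past the whole run.
tracker.ack(6)
committed_after_gap_closed = tracker.committed

print(committed_with_gap, committed_after_gap_closed)
```

Because 7 and 8 would be processed a second time after such a crash, downstream processors need to tolerate duplicates.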

Memory Considerations

In Source System Managed Delivery, the flow server keeps in-flight messages in memory instead of writing them to persistence. The flow server can buffer up to twice maxPollRecords messages because a new poll can complete while the previous batch is still draining. It also buffers messages that the pipeline is currently processing:

worst-case memory ≈ (2 x maxPollRecords + maxNumberOfInFlightMessages) x average message size
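
As a worked example of the formula, the sketch below plugs in the default maxPollRecords of 500 together with two assumed values: an in-flight limit of 100 messages and an average message size of 10 KiB. Both assumed values are hypothetical; substitute the figures from your own flow.

```python
def worst_case_buffer_bytes(max_poll_records, max_in_flight, avg_message_bytes):
    """Apply the estimate: (2 x maxPollRecords + in-flight limit) x average size."""
    return (2 * max_poll_records + max_in_flight) * avg_message_bytes


estimate_bytes = worst_case_buffer_bytes(
    max_poll_records=500,           # documented default
    max_in_flight=100,              # assumed value for illustration
    avg_message_bytes=10 * 1024,    # assumed 10 KiB average message
)

print(f"{estimate_bytes} bytes = {estimate_bytes / (1024 * 1024):.2f} MiB")
```

This yields 11,264,000 bytes, or roughly 10.74 MiB, for a single flow. The estimate covers message payloads only and applies per flow.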

All flows deployed on the same node run inside the same flow server process and therefore share runtime resources. Large messages and high traffic can increase memory use quickly. Use conservative settings: keep poll batches small, keep in-flight limits moderate, and leave enough memory headroom.

Configuration Properties

Use the following configuration properties only with Source System Managed Delivery, when deliveryGuarantee is BySourceSystem. If you set them for any other delivery guarantee, deployment fails with a validation error.

nackStrategy — Required. Specifies what to do when a message permanently fails: SKIP (drop and advance) or PAUSE (block the partition until resolved). You must set this property explicitly; there is no default.
commitIntervalMillis — How often the processor commits offsets to Kafka.
nackMaxRetries — How many seek-back attempts the processor makes before the NACK strategy triggers.
nackRetryDelayMillis — Delay between seek-back retry attempts.

All of these properties except nackStrategy are optional. When omitted, they fall back to flow server-level defaults.
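
The rules above can be expressed as a pre-deployment check. The following sketch models them as described in this section; it is not the server's validator, and the function name and error wording are hypothetical.

```python
SOURCE_MANAGED_ONLY = {
    "nackStrategy",
    "commitIntervalMillis",
    "nackMaxRetries",
    "nackRetryDelayMillis",
}


def validate_delivery_config(delivery_guarantee, source_props):
    """Return a list of rule violations; an empty list means the config passes."""
    errors = []
    if delivery_guarantee == "BySourceSystem":
        strategy = source_props.get("nackStrategy")
        if strategy is None:
            errors.append("nackStrategy is required for BySourceSystem")
        elif strategy not in ("SKIP", "PAUSE"):
            errors.append("nackStrategy must be SKIP or PAUSE")
    else:
        # These properties are rejected under any other delivery guarantee.
        for name in sorted(SOURCE_MANAGED_ONLY & source_props.keys()):
            errors.append(f"{name} is only allowed with BySourceSystem")
    return errors


ok = validate_delivery_config("BySourceSystem", {"nackStrategy": "PAUSE"})
missing = validate_delivery_config("BySourceSystem", {"commitIntervalMillis": 5000})
misplaced = validate_delivery_config("ByFlowServer", {"nackMaxRetries": 3})

print(ok, missing, misplaced)
```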

Properties

Name	Summary
`clientId`	An ID string to pass to the server when making requests. Used to track the source of requests beyond just the IP/port by allowing a logical application name to be included in server-side request logging. Optional and defaults to `blank`.
`brokerList`	A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. This string must be in the form of `host1:port1,host2:port2`, etc. Required.
`topics`	Comma-separated list of topic names to which the consumer should subscribe. Required.
`groupId`	The consumer group ID. Enables multiple consumer instances to process messages from different partitions in parallel and support individual message offset tracking for multiple consuming applications. Required.
`groupInstanceId`	A unique identifier of the consumer instance provided by the end user. Only non-empty strings are permitted. If set, the consumer is treated as a static member, which means that only one instance with this ID is allowed in the consumer group at any time. This can be used in combination with a larger session timeout to avoid group re-balances caused by transient unavailability (e.g., process restarts). If not set, the consumer will join the group as a dynamic member, which is the traditional behavior.
`maxPollRecords`	Maximum number of records returned in a single poll. This value is only treated as a hint and may be limited by server/tenant wide configuration. Optional and defaults to `500`.
`maxPollIntervalMs`	The maximum delay between invocations of `poll()` when using consumer group management. This places an upper bound on the amount of time that the consumer can be idle before fetching more records. If `poll()` is not called before expiration of this timeout, then the consumer is considered failed, and the group will re-balance in order to reassign the partitions to another member. For consumers using a non-null `groupInstanceId` that reach this timeout, partitions will not be immediately reassigned. Instead, the consumer will stop sending heartbeats, and partitions will be reassigned after expiration of `sessionTimeoutMs`. This mirrors the behavior of a static consumer that has shut down. Optional and defaults to `300_000`.
`pollTimeoutMs`	The maximum time to block in the call to `poll()`. However, the `poll()` function might block beyond this timeout in order to execute custom `ConsumerRebalanceListener` callbacks. Optional and defaults to `10_000`.
`sessionTimeoutMs`	The timeout used to detect client failures when using Kafka’s group management facility. The client sends periodic heartbeats to indicate its liveness to the broker. If no heartbeats are received by the broker before the expiration of this session timeout, then the broker will remove this client from the group and initiate a re-balance. Note that the value must be in the allowable range as configured in the broker configuration by `group.min.session.timeout.ms` and `group.max.session.timeout.ms`. Optional and defaults to `10_000`.
`heartbeatIntervalMs`	The expected time between heartbeats to the consumer coordinator when using Kafka’s group management facilities. Heartbeats are used to ensure that the consumer’s session stays active and to facilitate re-balancing when new consumers join or leave the group. The value must be set lower than `session.timeout.ms` but typically should be no higher than 1/3 of that value. It can be adjusted even lower to control the expected time for normal re-balances. Optional and defaults to `3_000`.
`partitionAssignmentStrategy`	Optional list of class names or class types, ordered by preference, of supported partition assignment strategies. The client will use these to distribute partition ownership amongst consumer instances when group management is used. Provided implementations: `org.apache.kafka.clients.consumer.RangeAssignor` (default) and `org.apache.kafka.clients.consumer.RoundRobinAssignor`.
`autoOffsetReset`	The `KafkaAutoOffsetReset` option to use when there is no initial offset in Kafka, or if the current offset no longer exists on the server (e.g., because that data has been deleted). Can be one of the following: `EARLIEST`: Automatically reset the offset to the earliest offset (this can result in consuming a very large number of messages after startup). `LATEST`: Automatically reset the offset to the latest offset (default).
`fetchMinBytes`	Optional, minimum amount of data the server should return for a fetch request. If insufficient data is available, the request will wait for that much data to accumulate before answering the request. The default setting of `1` byte means that fetch requests are answered as soon as a single byte of data is available, or the fetch request times out waiting for data to arrive. Setting this to something greater than `1` will cause the server to wait for larger amounts of data to accumulate, which can improve server throughput a bit at the cost of some additional latency.
`maxPartitionFetchBytes`	The maximum amount of data per partition the server will return. Records are fetched in batches by the consumer. If the first record batch in the first non-empty partition of the fetch is larger than this limit, the batch will still be returned to ensure that the consumer can make progress. The maximum record batch size accepted by the broker is defined via `message.max.bytes` (broker config) or `max.message.bytes` (topic config). This value is only treated as a hint and may be limited by server/tenant wide configuration. Optional and defaults to `1_048_576`.
`requestTimeoutMs`	Maximum amount of time (in milliseconds) the client will wait for the response of a request. Optional and defaults to `30_000`.
`authenticationConfigKey`	A secret key the server uses to look up credentials needed for connecting to the Kafka broker. Optional and uses no authentication by default.
`commitIntervalMillis`	Periodic offset commit interval in milliseconds for Source-System Managed Delivery. Only applicable when the flow’s `deliveryGuarantee` is `BySourceSystem`. Optional and defaults to the server-level configuration.
`nackStrategy`	Strategy to apply when a message permanently fails processing in Source-System Managed Delivery after all seek-back retry attempts are exhausted. Can be one of the following: SKIP: Skip the failed message and advance the offset watermark. At this point the message has already been through the configured redelivery attempts and any configured error handling (such as dead letter flow forwarding) has been executed. The message is dropped as a final fallback after all other recovery mechanisms are exhausted. PAUSE: Pause the affected partition and continue retrying indefinitely. The partition resumes automatically once the message processes successfully. If the root cause is never resolved (e.g., a poison message that always fails validation, or a permanently unavailable downstream system), the partition remains blocked indefinitely. Only applicable when the flow’s `deliveryGuarantee` is `BySourceSystem`. Required — must be explicitly set on all `BySourceSystem` Kafka flows. There is no default; flow designers must make a deliberate choice between data loss (SKIP) and partition stalling (PAUSE) for their use case.
`nackMaxRetries`	Maximum number of seek-back retry attempts before the NACK strategy triggers in Source-System Managed Delivery. Only applicable when the flow’s `deliveryGuarantee` is `BySourceSystem`. Optional and defaults to the server-level configuration.
`nackRetryDelayMillis`	Delay in milliseconds between NACK retry attempts in Source-System Managed Delivery. Only applicable when the flow’s `deliveryGuarantee` is `BySourceSystem`. Optional and defaults to the server-level configuration.
`spreadKey`	A cluster placement hint. Sources that share the same spreadKey will be distributed evenly throughout cluster nodes, improving resilience and distributing load. Note that unlike `groupId`, which governs message consumption within Kafka, this value is used exclusively by the server for cluster node placement. If not configured, default placement logic will be applied. Optional.
`name`	Optional, descriptive name for the processor.
`id`	Required identifier of the processor, unique across all processors within the flow. Must be between 3 and 30 characters long; contain only lower and uppercase alphabetical characters (a-z and A-Z), numbers, dashes ("-"), and underscores ("_"); and start with an alphabetical character. In other words, it adheres to the regex pattern `[a-zA-Z][a-zA-Z0-9_-]{2,29}`.
`exchangeProperties`	Optional set of custom properties in a simple jdk-format, that are added to the message exchange properties before processing the incoming payload. Any existing properties with the same name will be replaced by properties defined here.
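
Two of the constraints in the table lend themselves to a quick local check before deployment: the `id` pattern and the relationship between `heartbeatIntervalMs` and `sessionTimeoutMs`. The sketch below is a hypothetical helper, not part of the product; it only encodes the rules as stated in the table.

```python
import re

# Pattern quoted in the `id` row: 3 to 30 characters, starting with a letter.
ID_PATTERN = re.compile(r"[a-zA-Z][a-zA-Z0-9_-]{2,29}")


def is_valid_processor_id(value: str) -> bool:
    """True when the whole string matches the documented id pattern."""
    return ID_PATTERN.fullmatch(value) is not None


def check_heartbeat(heartbeat_ms: int, session_timeout_ms: int) -> list[str]:
    """Report heartbeat settings that break or strain the documented guidance."""
    problems = []
    if heartbeat_ms >= session_timeout_ms:
        problems.append("heartbeatIntervalMs must be lower than sessionTimeoutMs")
    elif heartbeat_ms * 3 > session_timeout_ms:
        problems.append("heartbeatIntervalMs is above 1/3 of sessionTimeoutMs")
    return problems


print(is_valid_processor_id("kafka-source"))  # valid: starts with a letter
print(is_valid_processor_id("1source"))       # invalid: starts with a digit
print(check_heartbeat(3_000, 10_000))         # documented defaults
print(check_heartbeat(5_000, 10_000))         # above the recommended ratio
```

The documented defaults of `3_000` and `10_000` pass both heartbeat checks.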

Sub-builders

Name	Summary
externalSystemDetails	Strategy for describing the external system integration. Optional.
messageLoggingStrategy	Strategy for describing how a processor’s message is logged on the server.
payloadArchivingStrategy	Strategy for archiving payloads.

Details

Authentication

The authenticationConfigKey property supports Sasl and Tls secrets.

You can specify multiple secrets by providing them as a comma-separated list. Comma-separated list support is designed to allow users to combine a Tls secret with a Sasl secret. This is useful when you are using SASL authentication and also need to add a server trust store or other TLS configuration. For example:

authenticationConfigKey: "tlsSecret,saslSecret"

The behavior is undefined if multiple secrets of the same type are provided.

UserNameAndPassword secrets remain supported for backward compatibility and use SASL_SSL with SCRAM-SHA-256. Do not use this secret type for new Kafka configurations; use a Sasl secret instead.

See the Secret Types documentation for formatting details.

Record Key

The receiveFromKafka processor does not include the record key when producing payloads.
