Core Concepts
Flow
Integrations in Utilihive are represented as flows. During development, you define a flow by writing a flow specification using the Utilihive SDK. After deployment to the Utilihive Platform or the Utilihive Agent, the flow becomes a message-processing pipeline that acts as the integration layer between systems.
Flow Server
A flow server is the runtime engine that manages and runs integration flows. During development with the Utilihive SDK, flows are deployed to an in-memory flow server instance for testing and validation.
When deployed to the Utilihive Platform, flows run on a distributed cluster that provides scalability, resilience, and high availability.
In contrast, the Utilihive Agent includes a lightweight, standalone flow server.
Flow Specification
Flows are created by writing a flow specification, which the flow server then converts into a message-processing pipeline. Messages arrive over the protocol defined by the flow's inbound endpoint.
A flow specification offers several configuration options to consider, which fall into three main categories: Processors, Message Flow Control, and Reliability.
Processors
Processors represent the sequential journey, or flow, of a message. Starting from the inbound endpoint, the message passes through a chain of processors that either manipulate the data internally or connect to an external outbound service before passing the result to the next step.
For example, a flow might receive XML via a SOAP API endpoint. A processor can then transform the XML to a specific JSON format. Another processor could send the newly formatted JSON to a different REST API on another server. The output of the last processor is returned to the source to indicate whether the flow was successful.
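Conceptually, a flow is just a chain of functions where each processor's output feeds the next. The sketch below models the XML-to-JSON example in plain Python; names like `run_flow` are illustrative, not Utilihive SDK calls, and the final "send to another REST API" step is omitted:

```python
import json
import xml.etree.ElementTree as ET

def parse_xml(payload: str) -> dict:
    # Parse the incoming XML body into a plain dictionary.
    root = ET.fromstring(payload)
    return {child.tag: child.text for child in root}

def to_json(record: dict) -> str:
    # Transform the parsed record into the target JSON format.
    return json.dumps(record)

def run_flow(payload, processors):
    # Each processor's output becomes the next processor's input.
    result = payload
    for step in processors:
        result = step(result)
    return result

print(run_flow("<user><name>Ada</name></user>", [parse_xml, to_json]))
# prints {"name": "Ada"}
```

In a real flow, the chain would end with a processor that forwards the JSON to the downstream REST API and returns the outcome to the source.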
Utilihive provides many pre-built processors to receive, send, transform, parse, convert, and compress data. Integration connectors can also be implemented using these processors. For example, a flow can consume messages from a Kafka cluster by configuring the receiveFromKafka processor, or it can publish messages to a Google Cloud Pub/Sub topic using the sendToGooglePubSub processor.
For a complete list of supported processors, see the Flow Processors Index.
Message Flow Control
When a message is received by the flow server, it is immediately persisted in reliable storage. Unconfirmed messages, also called in-flight messages, are those that have been persisted and sent to the corresponding flow for processing but have not yet received a response. The number of concurrent in-flight messages can be configured in the flow specification.
If the in-flight message limit is reached, the server holds any additional messages in memory until a slot becomes available in the flow. The number of messages to buffer in memory can be configured in the flow specification.
If the memory buffer limit is reached, messages can still be temporarily persisted in a stash. Messages in the stash move to the buffer as soon as memory is available. The number of messages allowed in the stash can be configured in the flow specification. Messages that exceed this limit are rejected, and an error is returned.
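The three tiers above (in-flight slots, memory buffer, stash) can be modeled as a simple admission check. This is only a toy illustration of the behavior; the class, method names, and limits are made up, and the real limits are set in the flow specification:

```python
from collections import deque

class FlowControl:
    """Toy model of in-flight, buffer, and stash limits."""

    def __init__(self, max_in_flight, max_buffered, max_stashed):
        self.max_in_flight, self.max_buffered, self.max_stashed = \
            max_in_flight, max_buffered, max_stashed
        self.in_flight, self.buffer, self.stash = deque(), deque(), deque()

    def accept(self, message):
        # A new message lands in the first tier that has room.
        if len(self.in_flight) < self.max_in_flight:
            self.in_flight.append(message)
            return "in-flight"
        if len(self.buffer) < self.max_buffered:
            self.buffer.append(message)
            return "buffered"
        if len(self.stash) < self.max_stashed:
            self.stash.append(message)
            return "stashed"
        return "rejected"  # all tiers full: the caller gets an error

    def confirm(self):
        # A response frees an in-flight slot; waiting messages move up
        # from the buffer, and stashed messages move into the buffer.
        self.in_flight.popleft()
        if self.buffer:
            self.in_flight.append(self.buffer.popleft())
        if self.stash:
            self.buffer.append(self.stash.popleft())

fc = FlowControl(max_in_flight=1, max_buffered=1, max_stashed=1)
print([fc.accept(m) for m in "ABCD"])
# prints ['in-flight', 'buffered', 'stashed', 'rejected']
```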
Reliability
The flow server operates on an "at least once" delivery principle: messages that fail at any point in the pipeline can be retried before they are rejected. A redelivery strategy can be added to a flow specification to define how many redelivery attempts to make and the delay between them.
If redelivery fails completely, the error and its related content can still be captured. These failed messages are called dead letters. Dead letters can be logged directly to the server or sent to another flow specifically designed for error handling.
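Redelivery and dead letters fit together roughly as sketched below. The parameter names (`max_attempts`, `delay`, `dead_letters`) are illustrative, not actual flow-spec fields, and the fixed delay stands in for whatever redelivery strategy the flow defines:

```python
import time

def deliver(message, handler, max_attempts=3, delay=0.01, dead_letters=None):
    """Retry a failing step, then capture the message as a dead letter."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return handler(message)
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts:
                time.sleep(delay)  # fixed delay between redelivery attempts
    # Redelivery exhausted: keep the message and its error for inspection,
    # e.g. by logging it or handing it to a dedicated error-handling flow.
    if dead_letters is not None:
        dead_letters.append({"message": message, "error": str(last_error)})
    return None

dead = []
deliver("order-42", lambda m: 1 / 0, dead_letters=dead)
print(dead[0]["error"])  # prints division by zero
```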
Example Use Cases
Single Flow
Consider a scenario where user accounts are being migrated to a separate service. At the same time, you want to upload a user’s avatar to an AWS S3 bucket during account creation. This can be implemented using a single flow. The inbound endpoint (the source) is a REST API. The next step in the pipeline is the S3 processor, with the final processor being the REST API of the account service.
The flow setup looks like this:
Note: The connectors required for this flow can be implemented using Utilihive’s writeToAwsS3 and restRequest processors.
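The single-flow pipeline might be modeled as below. The S3 upload and account-service call are stubbed with in-memory lists; `writeToAwsS3` and `restRequest` are the real processor names, but everything else here is illustrative:

```python
uploads, accounts = [], []

def store_avatar(request):
    # Stand-in for a writeToAwsS3 step (a real upload needs AWS credentials).
    uploads.append(request["avatar"])
    return request

def create_account(request):
    # Stand-in for a restRequest step calling the account service.
    accounts.append(request["name"])
    return {"status": "created", "name": request["name"]}

# The inbound REST endpoint would invoke this chain for each request:
response = create_account(store_avatar({"name": "Ada", "avatar": b"\x89PNG"}))
print(response)  # prints {'status': 'created', 'name': 'Ada'}
```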
Multiple Flows
For this example, consider a scenario where you need to consolidate data from several weather services. Some of these services provide XML, while others provide CSV files. In both cases, the data must be converted to JSON before reaching the final destination, which is a database.
This can be accomplished using three flows. The first two act as inbound endpoints, distributing their results to a third, shared flow, also known as a handoff flow.
The setup looks like this:
A dead letter strategy can also be added to the shared flow to capture any failures when writing to the database, leveraging the flow’s built-in reliability.
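The handoff pattern can be sketched as two inbound flows that normalize their formats and deliver the same JSON shape to one shared flow. All function names are illustrative, and the `db` list stands in for the database write:

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

def xml_inbound(payload):
    # First inbound flow: a weather service that sends XML.
    root = ET.fromstring(payload)
    return {child.tag: child.text for child in root}

def csv_inbound(payload):
    # Second inbound flow: a weather service that sends CSV.
    return dict(next(csv.DictReader(io.StringIO(payload))))

def shared_flow(record, db):
    # Handoff flow: both inbound flows deliver the same shape here,
    # so conversion to JSON and the database write live in one place.
    db.append(json.dumps(record))

db = []
shared_flow(xml_inbound("<obs><city>Oslo</city><temp>4</temp></obs>"), db)
shared_flow(csv_inbound("city,temp\nBergen,7"), db)
print(db)
# prints ['{"city": "Oslo", "temp": "4"}', '{"city": "Bergen", "temp": "7"}']
```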