Scaling Connect Services

This page describes how GridOS Connect services can be scaled and which configuration changes are required when doing so. It covers only the services managed by the connect Helm chart, not the charts of third-party tools distributed with Connect (connect-openbao, connect-postgresql, connect-victoria-metrics).

1. Scaling Recommendations

Specific recommendations for the number of pods or resource limits cannot be provided, as the optimal configuration depends entirely on the use case — the number of flows, message throughput, payload sizes, and integration patterns all play a role.

Scaling decisions should be based on evidence: conduct performance testing that reflects your actual workload, observe the results, and adjust accordingly.

2. Connect Services Scaling Support

Connect is deployed as a set of Kubernetes services via Helm. Of all Connect services, only the flow-server supports running as a cluster (multiple pods). All other Connect services are designed to run as a single pod (replicas: 1) and must not be scaled horizontally.

Do not increase replicas beyond 1 for any Connect service other than the flow-server. Running multiple replicas of singleton services is not supported and may cause undefined behavior.

3. Scaling the Flow Server

The flow-server uses Apache Pekko Cluster Sharding to distribute flow pipeline execution across multiple pods. This allows the flow-server to scale horizontally to increase message throughput and resilience.

The default replica count is 3, as defined in the Connect chart values. You can increase this value to add more flow-server pods to the cluster.

3.1. Updating maxNumberOfShards

When scaling the number of flow-server replicas, you must also update the mc.flow-server.maxNumberOfShards configuration property.

This property controls the number of shards used by Pekko Cluster Sharding to distribute flow pipelines across pods. The recommended value follows the Pekko documentation formula:

maxNumberOfShards = 10 × number of flow-server pods

The default value is 30, which corresponds to 3 pods.
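As a quick sanity check, the formula can be expressed as a one-line calculation (a sketch; the factor of 10 comes from the Pekko recommendation quoted above):

```python
def max_number_of_shards(replicas: int, factor: int = 10) -> int:
    """Recommended Pekko shard count: 10 x the number of flow-server pods."""
    return factor * replicas

print(max_number_of_shards(3))  # default deployment: 3 pods -> 30 shards
print(max_number_of_shards(5))  # scaling out to 5 pods -> 50 shards
```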

Set maxNumberOfShards before scaling, and do not decrease it on a running cluster: decreasing it changes the assignment of existing shards and can lead to data inconsistencies or flow execution errors.

To override this value, add the following to your values.yaml override file (see Chart Value Override Recommendations):

flowserver:
  replicas: 5              (1)
  config:
    application.yml:
      mc.flow-server.maxNumberOfShards: 50  (2)
1 Number of flow-server pods.
2 Set to 10 × replicas (e.g. 10 × 5 = 50).

Apply the changes using Helm or Helmfile as described in Deploy with Helmfile.
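For example, a plain Helm upgrade with an override file might look like the sketch below. The release name, chart reference, and namespace are placeholders for your environment; the override file is the values.yaml shown above:

```shell
helm upgrade connect <chart-reference> \
  --namespace <namespace> \
  -f values.yaml
```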

3.2. Source Processor Distribution

When running multiple flow-server pods, source processors are distributed across the cluster depending on their type:

  • Most source processors (e.g. restApi) are deployed on every node in the cluster, so incoming work is naturally spread across all pods.

  • Some source processors are singletons and run on only one node at a time, regardless of how many pods are running. Examples include readFiles, schedule, and receiveFromRabbitMq.

Singleton sources are not parallelised automatically by adding more replicas. If parallel execution is required, the flow developer can apply manual sharding — for example, deploying multiple copies of a flow where each instance reads files with a different name prefix.
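To illustrate the manual-sharding idea, the sketch below deploys two copies of a hypothetical file-reading flow, each filtering on a different filename prefix so the two instances never process the same files. The flow syntax and key names here are illustrative only and do not reflect the actual Connect flow definition format:

```yaml
# Hypothetical illustration only - not actual Connect flow syntax.
flows:
  - name: import-files-a
    source:
      readFiles:
        filenamePattern: "a*.csv"   # first copy: files starting with "a"
  - name: import-files-b
    source:
      readFiles:
        filenamePattern: "b*.csv"   # second copy: files starting with "b"
```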

4. Vertical Scaling

Any Connect service can be vertically scaled by increasing its CPU and memory limits. This is particularly relevant for the flow-server when handling large message payloads, but applies to all services. See Chart Value Override Recommendations for guidance on adjusting resource limits.
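As a sketch, CPU and memory limits for the flow-server could be raised via a values.yaml override like the following. The resources block uses the standard Kubernetes request/limit structure, but its exact placement under the service entry is an assumption about the chart layout; consult Chart Value Override Recommendations for the authoritative keys:

```yaml
flowserver:
  resources:           # placement under the service entry is assumed
    requests:
      cpu: "1"
      memory: 2Gi
    limits:
      cpu: "2"
      memory: 4Gi
```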