Hardware Sizing Guidelines

This document provides guidance for sizing GridOS clusters with Foundation and Connect. The small, medium, and large sizing options are general benchmarks and not exact specifications.

Connect is a scalable and resilient integration service. The optimal cluster size depends on how the service is used. Actual resource requirements may vary based on workload characteristics, data retention policies, and future growth.

Use these guidelines as a starting point, and adjust sizing based on observed usage patterns and performance metrics.

Workload Characteristics

Different workload characteristics affect how services should be scaled and how resource limits are applied. The main factors that drive resource requirements are message throughput and message size.

When processing large volumes of small messages, scale Connect horizontally (more replicas) to increase throughput through concurrency. When handling large payloads, scale Connect vertically (more memory per replica) to ensure each instance has sufficient memory to process incoming data.
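As a rough illustration, the following Python sketch captures this decision rule. The payload-size threshold and per-replica throughput constants are hypothetical placeholders, not Connect defaults; calibrate them against observed throughput and memory metrics.

    # Back-of-envelope helper for choosing a scaling direction.
    # LARGE_PAYLOAD_KB and MSGS_PER_SEC_PER_REPLICA are assumed values,
    # not Connect defaults; tune them from observed metrics.

    LARGE_PAYLOAD_KB = 512          # above this, favor vertical scaling (assumed)
    MSGS_PER_SEC_PER_REPLICA = 60   # assumed sustainable rate per Connect replica

    def scaling_recommendation(avg_msg_kb: float, msgs_per_hour: int) -> str:
        msgs_per_sec = msgs_per_hour / 3600
        if avg_msg_kb >= LARGE_PAYLOAD_KB:
            return "vertical: raise memory limits so large payloads fit in memory"
        replicas = max(1, round(msgs_per_sec / MSGS_PER_SEC_PER_REPLICA))
        return f"horizontal: run ~{replicas} replicas for concurrency"

    # Example: many small (~2 KB) messages favor horizontal scaling.
    print(scaling_recommendation(avg_msg_kb=2, msgs_per_hour=200_000))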

Note that increased integration workloads impact not only the integration service, but also the persistence provider, log indexer, and other GridOS Foundation services.

For any questions on specific use cases or integration scenarios, contact Connect Support.

Cluster Sizing Guidelines

To meet the different processing needs of Connect, three standard GridOS cluster sizes are defined. Each size specifies CPU and memory resources per node, with separate storage recommendations for Elasticsearch and PostgreSQL.

These guidelines are designed to support the expected levels of message throughput handled by Connect. As the number of messages processed by Connect increases, additional compute resources might be required to maintain stability, reliability, and performance across the integration service.

The following table outlines the recommended hardware sizing for small, medium, and large deployments, based on anticipated hourly message volumes. All compute resource values are specified per node. Each cluster should include at least four nodes: three run control plane components, and all nodes run worker components.

Always validate your configuration under expected load conditions before production deployment.

Cluster Size   Hourly Message Volume (size ~2 KB)   Node Count   vCPUs (Cores) per Node   Memory (GiB) per Node
Small          Up to 200,000                        4            8                        32
Medium         Up to 1,000,000                      4            16                       64
Large          Up to 5,000,000                      4            32                       128
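For quick estimates, the thresholds in the table can be expressed as a simple lookup. The following Python sketch merely restates the table; it is not a Connect API.

    # Maps an anticipated hourly message volume (~2 KB messages) to the
    # cluster size tiers in the table above. Illustrative only.
    SIZING = [
        # (max hourly messages, tier, node count, vCPUs per node, GiB per node)
        (200_000,   "Small",  4, 8,  32),
        (1_000_000, "Medium", 4, 16, 64),
        (5_000_000, "Large",  4, 32, 128),
    ]

    def recommend_tier(msgs_per_hour: int):
        for max_volume, tier, nodes, vcpus, mem_gib in SIZING:
            if msgs_per_hour <= max_volume:
                return tier, nodes, vcpus, mem_gib
        raise ValueError("Volume exceeds the large tier; contact Connect Support")

    tier, nodes, vcpus, mem_gib = recommend_tier(750_000)
    print(f"{tier}: {nodes} nodes, {vcpus} vCPUs / {mem_gib} GiB per node")
    # -> Medium: 4 nodes, 16 vCPUs / 64 GiB per node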

Elasticsearch and PostgreSQL Sizing Guidelines

When deploying Elasticsearch and PostgreSQL, it is important to consider their specific resource requirements. For best performance, use storage designed for critical, Input/Output Operations Per Second (IOPS)-intensive, and throughput-intensive workloads that require low latency, such as Amazon EBS Provisioned IOPS SSD volumes.
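For example, on AWS, a Provisioned IOPS SSD volume could be created with boto3 as sketched below. The region, Availability Zone, volume size, IOPS value, and tags are placeholders; adapt them to your storage tier and workload.

    # Hypothetical example: provision an io2 (Provisioned IOPS SSD) EBS
    # volume for database storage. All values are placeholders.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

    volume = ec2.create_volume(
        AvailabilityZone="us-east-1a",   # placeholder zone
        VolumeType="io2",                # Provisioned IOPS SSD
        Size=200,                        # GiB, e.g. the medium Elasticsearch tier
        Iops=4000,                       # provisioned IOPS; workload-dependent
        TagSpecifications=[{
            "ResourceType": "volume",
            "Tags": [{"Key": "app", "Value": "gridos-connect"}],
        }],
    )
    print(volume["VolumeId"])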

Elasticsearch

Elasticsearch data nodes are categorized as dataHot or dataCold. Their usage is controlled by the Index Lifecycle Management (ILM) policy, which defines where data is stored, how long it is retained, and when it transitions between storage phases (for example, from hot to cold). In the context of Connect, only dataHot nodes are used. The following sizing guidelines apply only to dataHot nodes.
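As an illustration, a hot-only policy might resemble the following sketch, shown via the Elasticsearch Python client (8.x). The policy name, rollover thresholds, and retention period are placeholder assumptions, not the policy shipped with Connect.

    # Hypothetical ILM policy: data stays on dataHot nodes, indices roll
    # over on size/age, and old indices are deleted after retention.
    from elasticsearch import Elasticsearch

    es = Elasticsearch("https://localhost:9200")  # placeholder endpoint

    es.ilm.put_lifecycle(
        name="connect-hot-only",  # placeholder policy name
        policy={
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_primary_shard_size": "50gb",  # assumed threshold
                            "max_age": "7d",                   # assumed threshold
                        }
                    }
                },
                "delete": {
                    "min_age": "30d",  # placeholder retention period
                    "actions": {"delete": {}},
                },
            }
        },
    )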

Cluster Size   Number of Data Nodes   CPU Limits (vCPUs)   Memory Limits (GiB)   Storage (GiB) per Node
Small          3                      2                    6                     50
Medium         3                      4                    8                     200
Large          3                      8                    12                    600

PostgreSQL

The primary database for Connect is a highly available PostgreSQL instance, connect-postgresql, deployed alongside the Connect microservices. Resource usage and storage requirements depend mainly on the database growth rate and the size of the Write-Ahead Log (WAL).
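To track both growth drivers, periodically sample the database and WAL sizes. The following Python sketch uses psycopg2 with placeholder connection details; pg_ls_waldir() requires PostgreSQL 10 or later and superuser or pg_monitor privileges.

    # Hypothetical monitoring snippet: report database size and current
    # WAL footprint on the connect-postgresql instance.
    import psycopg2

    conn = psycopg2.connect(host="connect-postgresql", dbname="connect",
                            user="monitor", password="<password>")  # placeholders
    with conn, conn.cursor() as cur:
        cur.execute("SELECT pg_size_pretty(pg_database_size(current_database()))")
        print("database size:", cur.fetchone()[0])

        cur.execute("SELECT pg_size_pretty(sum(size)) FROM pg_ls_waldir()")
        print("WAL size:", cur.fetchone()[0])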

The following table lists the recommended resource allocations for each cluster size.

Cluster Size   Number of Servers   CPU Limits (vCPUs)   Memory Limits (GiB)   Storage (GiB) per Server
Small          3                   2                    4                     5
Medium         3                   4                    8                     20
Large          3                   8                    16                    50