Sizing and Scaling Guidelines
This page provides a practical starting point for sizing and scaling GridOS clusters running Foundation and Connect. Use these values as baseline guidance, then validate and tune with workload-specific performance tests.
Scaling Model
For best performance, use low-latency storage designed for IOPS-intensive and throughput-intensive workloads, such as Amazon EBS Provisioned IOPS SSD volumes.
The two main sizing drivers are message throughput and message size, and the general guideline is as follows:
- Use horizontal scaling (more pods/nodes) for high-concurrency, high-throughput traffic.
- Use vertical scaling (more CPU/memory per pod/node) for large payloads and memory-heavy processing.
This page is split into two parts:
- General sizing guideline for small, medium, and large baseline environments.
- Scenario-specific sizing guideline for workload tuning.
Configuration Ownership: Foundation vs Connect Installation
| Configure During | Components |
|---|---|
| Foundation installation | Kubernetes cluster/node sizing and Foundation-managed services such as Elasticsearch. |
| Connect installation | Connect services and Connect third-party charts, including |
1) General Sizing Guideline (Small / Medium / Large)
The following table outlines recommended Kubernetes cluster baselines based on anticipated hourly message volumes in Connect. At least four nodes are recommended: three control plane nodes, with worker capacity available across the cluster.
| Cluster Size | Hourly Message Volume (~2 KB avg) | Node Count | vCPUs per Node | Memory (GiB) per Node |
|---|---|---|---|---|
| Small | Up to 200,000 | 4 | 8 | 32 |
| Medium | Up to 1,000,000 | 4 | 16 | 64 |
| Large | Up to 5,000,000 | 4 | 32 | 128 |
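As an illustration, the baseline table can be expressed as a small lookup that picks the smallest tier covering an expected hourly volume and converts that volume into average per-second rates. This is a minimal sketch: the names `Baseline`, `pick_baseline`, and `average_rates` are hypothetical helpers, not part of GridOS, and the 2 KB default simply mirrors the average message size assumed in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    name: str
    max_hourly_messages: int
    nodes: int
    vcpus_per_node: int
    memory_gib_per_node: int


# Values copied from the baseline table above.
BASELINES = (
    Baseline("Small", 200_000, 4, 8, 32),
    Baseline("Medium", 1_000_000, 4, 16, 64),
    Baseline("Large", 5_000_000, 4, 32, 128),
)


def pick_baseline(hourly_messages: int) -> Baseline:
    """Return the smallest baseline whose hourly ceiling covers the volume."""
    for baseline in BASELINES:
        if hourly_messages <= baseline.max_hourly_messages:
            return baseline
    raise ValueError("Volume exceeds the Large baseline; size from load tests.")


def average_rates(hourly_messages: int, avg_message_kb: float = 2.0):
    """Convert an hourly volume into average (messages/s, KB/s)."""
    per_second = hourly_messages / 3600
    return per_second, per_second * avg_message_kb
```

For example, `pick_baseline(750_000)` selects the Medium tier, and 1,000,000 messages per hour averages roughly 278 messages per second. Averages hide bursts, so treat the result as a starting tier and confirm it with load tests.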
As a sizing rule of thumb, set memory requests equal to memory limits for JVM-based services to keep heap behavior predictable. For non-JVM services, start with lower memory requests than limits and tune them based on observed usage. For CPU, set requests to expected steady-state demand and limits to expected peak demand.
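The rule of thumb above can be sketched as a helper that builds a Kubernetes-style `resources` mapping. The function name, its parameters, and the 0.5 request-to-limit ratio for non-JVM services are illustrative assumptions, not GridOS defaults; only the `requests`/`limits` layout follows standard Kubernetes conventions.

```python
def build_resources(cpu_steady, cpu_peak, memory_limit_gib, jvm,
                    non_jvm_request_ratio=0.5):
    """Build a requests/limits mapping from the sizing rule of thumb.

    cpu_steady / cpu_peak are in vCPUs; memory_limit_gib is in GiB.
    non_jvm_request_ratio is an assumed starting point to tune from usage.
    """
    if cpu_peak < cpu_steady:
        raise ValueError("CPU limit (peak) must not be below the request (steady state).")

    # JVM services: request == limit so heap behavior stays predictable.
    # Non-JVM services: start with a lower request than the limit.
    if jvm:
        memory_request_gib = memory_limit_gib
    else:
        memory_request_gib = memory_limit_gib * non_jvm_request_ratio

    def cpu(vcpus):
        return f"{round(vcpus * 1000)}m"   # millicores

    def mem(gib):
        return f"{round(gib * 1024)}Mi"    # mebibytes

    return {
        "requests": {"cpu": cpu(cpu_steady), "memory": mem(memory_request_gib)},
        "limits": {"cpu": cpu(cpu_peak), "memory": mem(memory_limit_gib)},
    }
```

For a JVM service, `build_resources(1, 2, 4, jvm=True)` yields equal memory request and limit; for a non-JVM service the memory request starts at half the limit and should then be adjusted from observed usage.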
Elasticsearch Baseline (Foundation Installation)
The following values apply to Elasticsearch dataHot nodes used by Connect workloads.
CPU request values are practical starting points (not product-mandated values) and should be tuned from observed indexing/search load.
| Cluster Size | Data Nodes | CPU Request (vCPUs) | CPU Limit (vCPUs) | Memory Request (GiB) | Memory Limit (GiB) | Storage (GiB) per Node |
|---|---|---|---|---|---|---|
| Small | 3 | 2 | 2 | 6 | 6 | 50 |
| Medium | 3 | 4 | 4 | 12 | 12 | 200 |
| Large | 3 | 8 | 8 | 24 | 24 | 600 |
MinIO Baseline (Foundation Installation)
MinIO is deployed and managed by Foundation (MinIO Operator + MinIO Tenant).
Connect uses MinIO as side-channel storage for large messages and as a backup target for PostgreSQL and OpenBao.
Size MinIO Tenant storage based on expected large message volume and backup retention requirements. Configure MinIO sizing during Foundation installation.
Connect Services Baseline (Connect Installation)
For Connect installation sizing, prioritize tuning Flow Server and PostgreSQL.
In most environments, the shipped chart defaults for other Connect services and third-party services are sufficient and can remain unchanged unless monitoring shows sustained resource pressure.
1. PostgreSQL Baseline
PostgreSQL is deployed in HA mode (replica: 3) and can be tuned vertically by adjusting CPU and memory requests and limits.
| Cluster Size | Replicas | CPU Request (vCPUs) | CPU Limit (vCPUs) | Memory Request (GiB) | Memory Limit (GiB) | Storage per Server (GiB) |
|---|---|---|---|---|---|---|
| Small | 3 | 1 | 2 | 2 | 4 | 5 |
| Medium | 3 | 2 | 4 | 4 | 8 | 20 |
| Large | 3 | 4 | 8 | 8 | 16 | 50 |
2. Flow Server Baseline
Flow server is the only Connect service that supports horizontal scaling. All other services (for example, identity services, insights, resource registry, deployment operator, and frontend services) should remain at replicas: 1 and be tuned vertically.
| Cluster Size | replicas | CPU Request (vCPUs) | CPU Limit (vCPUs) | Memory Request (GiB) | Memory Limit (GiB) | mc.flow-server.maxNumberOfShards |
|---|---|---|---|---|---|---|
| Small | 3 | 1 | 1 | 2 | 2 | 30 |
| Medium | 5 | 2 | 2 | 4 | 4 | 50 |
| Large | 7 | 3 | 3 | 8 | 8 | 70 |
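As a rough sanity check, the request values from the Elasticsearch, PostgreSQL, and flow server tables can be summed and compared with the raw cluster capacity of each tier. The sketch below is an assumption-laden illustration: it counts only these three components, ignores all other services and system overhead, and treats the full capacity of all four nodes as schedulable, which may overstate what is actually allocatable. The names `REQUESTS`, `CLUSTER`, `requested_totals`, and `request_share` are hypothetical.

```python
# (replicas, cpu_request_vcpus, memory_request_gib), copied from the baseline tables.
REQUESTS = {
    "Small": {"elasticsearch": (3, 2, 6), "postgresql": (3, 1, 2), "flow-server": (3, 1, 2)},
    "Medium": {"elasticsearch": (3, 4, 12), "postgresql": (3, 2, 4), "flow-server": (5, 2, 4)},
    "Large": {"elasticsearch": (3, 8, 24), "postgresql": (3, 4, 8), "flow-server": (7, 3, 8)},
}

# (node_count, vcpus_per_node, memory_gib_per_node), from the cluster baseline table.
CLUSTER = {"Small": (4, 8, 32), "Medium": (4, 16, 64), "Large": (4, 32, 128)}


def requested_totals(size):
    """Sum CPU (vCPUs) and memory (GiB) requests across the three components."""
    components = REQUESTS[size].values()
    cpu = sum(replicas * cpu_req for replicas, cpu_req, _ in components)
    memory = sum(replicas * mem_req for replicas, _, mem_req in components)
    return cpu, memory


def request_share(size):
    """Return the fraction of raw cluster CPU and memory these requests consume."""
    nodes, vcpus, memory_gib = CLUSTER[size]
    cpu, memory = requested_totals(size)
    return cpu / (nodes * vcpus), memory / (nodes * memory_gib)
```

For the Small tier, these three components request 12 vCPUs and 30 GiB in total against 32 vCPUs and 128 GiB of raw node capacity. The remaining headroom has to cover every other workload on the cluster, so verify actual allocatable capacity before relying on these figures.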
When scaling flow server horizontally, keep the replica count odd to reduce the risk of split-brain scenarios during leader election. For example, if you would otherwise scale to 6 replicas, consider scaling to 7 instead.
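The odd-replica guidance can be captured in a one-line helper for automation or review scripts. This is a minimal sketch; `next_odd_replicas` is a hypothetical name, and rounding up (rather than down) is an assumption that follows the 6-to-7 example above.

```python
def next_odd_replicas(desired: int) -> int:
    """Round a desired replica count up to the nearest odd number."""
    if desired < 1:
        raise ValueError("Replica count must be at least 1.")
    # Odd counts are kept as-is; even counts gain one replica.
    return desired if desired % 2 == 1 else desired + 1
```

With this helper, a request for 6 replicas becomes 7, while 5 and 7 are returned unchanged.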
2) Scenario-Specific Sizing Guideline
Use these scenarios to adjust the general baseline for your workload. Apply the listed changes in priority order.
| Scenario | Foundation Installation Changes | Connect Installation Changes |
|---|---|---|
| High-Throughput and Small Messages | Scale worker-node CPU first; verify Elasticsearch indexing CPU and storage IOPS. | Scale flow server horizontally first; keep |
| Large-Payload Integrations | Scale node memory first; verify Elasticsearch memory/storage throughput and MinIO Tenant storage capacity. | Scale flow server vertically (memory first, then CPU); verify MinIO storage/throughput because payloads above the |
| Mixed Workloads | Balance node CPU and memory; monitor Elasticsearch indexing latency and disk saturation. | Combine moderate flow server horizontal scaling with moderate vertical scaling across flow server, PostgreSQL, and Insights. |
| Burst / Peak Traffic | Size nodes and Elasticsearch for peak windows, not average traffic; validate storage burst behavior. | Maintain flow server replica headroom; verify PostgreSQL CPU/memory and WAL (write-ahead log) storage growth during peak periods. |
Implementation Notes
- Configure Kubernetes capacity and Foundation-managed services during Foundation installation.
- Configure Connect services and Connect-managed third-party charts during Connect installation.
- Define CPU and memory requests for all workloads; define limits according to your runtime policy and test results.
- Validate under representative load before production rollout.