Metrics and Dashboards
Connect services publish a range of metrics that are available in the Grafana dashboards included with Connect and Foundation. For more information about the Grafana dashboards provided with Foundation, see the Monitoring section of the Foundation Base documentation.
You can access the Foundation Grafana instance at https://admin.YOUR_DOMAIN/monitoring/grafana.
Having these dashboards available is helpful when working with Connect support to investigate performance issues or service instability.
Connect Dashboard
The Connect dashboard displays both business and technical metrics. It provides an overview of message processing activity, organized by flow. These metrics help you identify potential integration issues, such as a high number of failed messages, and understand the system load for the selected time period.
Inflight, Buffered, and Stashed Messages
Three panels display inflight, buffered, and stashed messages. These panels are key tools for monitoring integration traffic and can help identify potential flow misconfigurations.
Connect persists all incoming messages. A limited number of messages are processed concurrently (inflight), while additional messages are buffered in memory. When the buffer reaches its limit, further incoming messages are still persisted but are no longer held in memory. These are referred to as stashed messages. Stashed messages must be read back from persistence once buffer space becomes available before they can be processed.
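The relationship between the three states can be illustrated with a small toy model. This is a sketch for reasoning about the panels, not Connect's implementation: the class name, the max_inflight and buffer_size parameters, and the first-in-first-out ordering are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class FlowModel:
    """Toy model of inflight, buffered, and stashed messages (hypothetical)."""
    max_inflight: int   # assumed limit on concurrently processed messages
    buffer_size: int    # assumed limit on messages held in memory
    inflight: set = field(default_factory=set)
    buffer: deque = field(default_factory=deque)
    stash: deque = field(default_factory=deque)  # persisted only, not in memory

    def receive(self, msg_id):
        # Every message is persisted on arrival (not modeled here).
        queue_empty = not self.buffer and not self.stash
        if len(self.inflight) < self.max_inflight and queue_empty:
            self.inflight.add(msg_id)
        elif len(self.buffer) < self.buffer_size and not self.stash:
            self.buffer.append(msg_id)
        else:
            # Buffer is full: the message stays in persistence only.
            self.stash.append(msg_id)

    def complete(self, msg_id):
        self.inflight.remove(msg_id)
        # A free processing slot is filled from the in-memory buffer.
        if self.buffer:
            self.inflight.add(self.buffer.popleft())
        # Free buffer space is refilled by reading back from the stash.
        while self.stash and len(self.buffer) < self.buffer_size:
            self.buffer.append(self.stash.popleft())


model = FlowModel(max_inflight=2, buffer_size=3)
for i in range(8):
    model.receive(i)
print(len(model.inflight), len(model.buffer), len(model.stash))
```

With eight messages arriving at once, two are inflight, three are buffered, and the remaining three are stashed. Completing one message moves a buffered message into processing and pulls one stashed message back into the buffer.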
A consistently high inflight message count is not always a problem, but it may indicate inefficient configuration. For large messages in particular, high inflight counts can consume significant memory and should be monitored closely.
If the buffer remains full for extended periods, it could indicate a configuration issue. Buffered messages are kept in memory to handle short bursts of load without reading from persistence. During extended periods of high load, the buffer can fill up, causing additional messages to go to the stash. In this situation, the buffer continues to consume memory, but it must be refilled periodically from the stash. This effectively removes the performance gain you get by keeping messages in memory.
If you observe consistently high numbers of both buffered and stashed messages, review your buffer configuration. In some cases, reducing the buffer size may improve overall performance.
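One way to make "consistently high" concrete is to measure how often the buffer is full while messages are also stashed. The following is a minimal sketch, assuming you have exported (buffered, stashed) samples from the panels yourself; the function name, the sample format, and the buffer_limit value are hypothetical and not part of Connect.

```python
def buffer_saturation_ratio(samples, buffer_limit):
    """Return the fraction of samples where the buffer is full and the stash is in use.

    samples: list of (buffered, stashed) message counts taken at regular intervals.
    buffer_limit: the configured buffer size (an assumed, user-supplied value).
    """
    if not samples:
        return 0.0
    saturated = sum(
        1 for buffered, stashed in samples
        if buffered >= buffer_limit and stashed > 0
    )
    return saturated / len(samples)


# Hypothetical samples: the buffer is full with a non-empty stash in 2 of 4 samples.
samples = [(100, 0), (100, 5), (100, 20), (40, 0)]
print(buffer_saturation_ratio(samples, buffer_limit=100))
```

A ratio close to 1.0 over a long time window suggests the buffer is no longer absorbing bursts and that its configuration is worth reviewing. What counts as "too high" depends on the flow, so no threshold is implied here.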
Processing and Persistence Time
The panels displaying processing and persistence times are useful for identifying performance issues. When Connect is under heavy load, the persistence provider can sometimes become the main bottleneck. Monitoring average persistence time can help determine whether this is the case.
If processing time per message is high, it may indicate problematic processors, for example, a suboptimal map implementation.
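To compare the two panels, it can help to express persistence time as a share of the combined time per message. The sketch below assumes that the two averages are measured separately and do not overlap, which the dashboard description does not confirm; the function name and the input values are illustrative.

```python
def persistence_share(avg_persistence_ms, avg_processing_ms):
    """Return persistence time as a fraction of persistence plus processing time.

    Assumes the two averages are independent measurements (an assumption,
    not something the dashboard guarantees).
    """
    total = avg_persistence_ms + avg_processing_ms
    if total <= 0:
        return 0.0
    return avg_persistence_ms / total


# Hypothetical readings taken from the two panels for the same time range.
print(persistence_share(avg_persistence_ms=30.0, avg_processing_ms=10.0))
```

A share that rises under load points toward the persistence provider, while a low share combined with high processing time points toward the processors in the flow.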
JVM Metrics
All microservices deployed as part of Connect expose standard JVM metrics, including memory usage, CPU utilization, and garbage collection statistics.
These metrics are available in the Spring Boot 2.3 Statistics dashboard included with Foundation. To view statistics for a specific Connect microservice, select the job that corresponds to that service (for example, connect-flowserver). Note that you can view only one instance or pod at a time.
While this dashboard provides detailed insight into JVM memory and garbage collection activity, it does not display the usage patterns or business metrics that influence them. The JVM metrics are most helpful for investigating performance or stability issues when correlated with metrics from the Connect dashboard.
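If you want to explore the same data outside the dashboard, a query can be scoped by job in the same way the dashboard selector is. The helper below is a sketch that builds a PromQL expression for heap usage. It assumes the services export the standard Micrometer metric jvm_memory_used_bytes with an area label and that a job label identifies the service; verify both against your Foundation installation. The function name and the example instance value are hypothetical.

```python
def heap_used_query(job, instance=None):
    """Build a PromQL expression for JVM heap usage of one service.

    Assumes the Micrometer metric name jvm_memory_used_bytes and the
    labels job, area, and instance are present in your environment.
    """
    selector = f'job="{job}",area="heap"'
    if instance:
        # Restrict to a single instance or pod, as the dashboard does.
        selector += f',instance="{instance}"'
    return f"sum(jvm_memory_used_bytes{{{selector}}})"


print(heap_used_query("connect-flowserver"))
```

The resulting expression can be pasted into a Grafana panel or query editor that is connected to the metrics data source.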
Kubernetes Dashboards
You can access several Kubernetes dashboards in the GridOS Grafana installation.
The Kubernetes dashboards provide cluster-wide, infrastructure-focused metrics that support resource management. You can view resource usage across all containerized workloads and use the dashboards to assess how workloads utilize their resources, whether they stay within their allocated resource limits, and how resources are utilized at the Kubernetes node level.
These dashboards do not explain why a metric such as memory consumption is high, or what drives CPU usage. For deeper insights into Connect service performance, use the JVM metrics.