The Starburst.io native UI provides only limited cluster metrics. This article explains how to set up a full Starburst monitoring pipeline using Prometheus, JMX Exporter, and Grafana, with the Grafana Trino Plugin for running live SQL queries. You’ll learn how to monitor CPU, memory, query stats, and trends across 10+ Starburst clusters in one Grafana dashboard, and how to add Slack/PagerDuty alerts for your Starburst platform.
Starburst is a powerful distributed SQL engine that allows teams to query data across multiple sources. However, when it comes to monitoring Starburst clusters, there are some limitations:
We manage 50+ Starburst clusters running across multiple environments (OpenShift, Kubernetes, Linux).
Manually going into the Starburst UI or CLI to run queries and check metrics was becoming too complex.
That’s why we came up with a solution using Prometheus, JMX Exporter, and Grafana with the Grafana Trino Plugin.
Out of the box, Starburst exposes basic system metrics:
But for deeper monitoring, you need:
Starburst does not expose all of these metrics directly in the UI or API. This is why we need an additional metrics pipeline.
To fully monitor Starburst, we built this architecture:
+---------------------+   JMX Exporter    +------------+
| Starburst Cluster 1 | --> Port 8081 --> | Prometheus |
+---------------------+                   +------------+

+---------------------+   JMX Exporter    +------------+
| Starburst Cluster N | --> Port 8081 --> | Prometheus |
+---------------------+                   +------------+

Prometheus --> Grafana --> Dashboards / Alerts
               + Grafana Trino Plugin --> SQL-based dashboards
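One quick way to confirm that every cluster in this pipeline is actually being scraped is to ask Prometheus for its built-in `up` metric. The sketch below is only an illustration: it assumes a scrape job named `starburst` and a Prometheus base URL that you supply, and it uses the standard `/api/v1/query` instant-query endpoint. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_up_query_url(prometheus_base_url: str, job: str) -> str:
    """Build an instant-query URL for the `up` metric of one scrape job."""
    query = f'up{{job="{job}"}}'
    params = urllib.parse.urlencode({"query": query})
    return f"{prometheus_base_url.rstrip('/')}/api/v1/query?{params}"


def down_instances(response: dict) -> list[str]:
    """Return the instances whose last scrape failed (up == 0)."""
    results = response.get("data", {}).get("result", [])
    return sorted(
        r["metric"].get("instance", "<unknown>")
        for r in results
        if float(r["value"][1]) == 0.0
    )


def fetch_down_instances(prometheus_base_url: str, job: str = "starburst") -> list[str]:
    """Query Prometheus over HTTP and list the targets that are down.

    The job name "starburst" is an assumption; use whatever your
    scrape configuration defines.
    """
    url = build_up_query_url(prometheus_base_url, job)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return down_instances(json.load(resp))
```

Calling `fetch_down_instances("http://prometheus.example:9090")` (a placeholder address) would return the exporter endpoints that Prometheus currently cannot reach.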
The Grafana Trino Plugin is extremely useful for Starburst monitoring because:
Without this plugin, you would need to:
With Grafana Trino Plugin:
The Trino datasource plugin allows you to query and visualize Trino (and Starburst) data inside Grafana.
Run Grafana with the plugin using Docker:
docker run -d -p 3000:3000 \
-v "$(pwd):/var/lib/grafana/plugins/trino" \
-e "GF_PLUGINS_ALLOW_LOADING_UNSIGNED_PLUGINS=trino-datasource" \
--name=grafana \
grafana/grafana-oss
The plugin provides the following query macros:
$__timeFrom($column): lower boundary of the time range
$__timeTo($column): upper boundary of the time range
$__timeGroup($column, $interval): group by time
$__dateFilter($column): date range filter
$__timeFilter($column): timestamp range filter
$__unixEpochFilter($column): unix timestamp range
Example query using the time filter macro:
SELECT
atimestamp AS time,
metric_value AS value
FROM starburst_metrics_table
WHERE $__timeFilter(atimestamp) AND cluster_name IN($cluster)
ORDER BY atimestamp ASC
Configure the JMX Exporter agent in your Starburst deployment:
start.args=-javaagent:/opt/starburst/jmx_prometheus_javaagent.jar=8081:/opt/starburst/jmx_exporter_config.yaml
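Once the agent is running, port 8081 should serve metrics in the Prometheus text format. As a rough sanity check, the sketch below parses that text into a dictionary so you can inspect what the exporter exposes. It is a simplified parser, not a full implementation of the exposition format, and the metric names in the usage example are placeholders; the real names depend on your exporter version and `jmx_exporter_config.yaml`.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus text-format lines into {series: value}.

    Simplified: skips comment lines and ignores optional timestamps.
    """
    samples: dict[str, float] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or HELP/TYPE comment
        if "}" in line:
            # Series with labels: everything up to the closing brace.
            idx = line.rfind("}") + 1
            series, rest = line[:idx], line[idx:]
        else:
            # Series without labels: name ends at the first space.
            series, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue  # malformed line without a value
        samples[series] = float(fields[0])
    return samples


if __name__ == "__main__":
    # Placeholder sample; real output comes from http://<host>:8081/metrics
    sample = (
        "# TYPE example_heap_bytes gauge\n"
        'example_heap_bytes{area="heap"} 1.5e+09\n'
        "example_running_queries 7\n"
    )
    for name, value in parse_metrics(sample).items():
        print(name, value)
```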
Add all your Starburst clusters to the Prometheus scrape configuration:
- job_name: 'starburst'
  static_configs:
    - targets: ['starburst-cluster-1:8081', 'starburst-cluster-2:8081']
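With dozens of clusters, typing the target list by hand gets error-prone. A minimal sketch of generating the scrape job from a list of hostnames is shown here; the function name and the host naming pattern are illustrative assumptions, and port 8081 matches the agent port used above.

```python
def render_scrape_job(job_name: str, hosts: list[str], port: int = 8081) -> str:
    """Render one Prometheus scrape job entry as YAML text.

    The result is meant to be pasted under `scrape_configs:` in
    prometheus.yml.
    """
    targets = ", ".join(f"'{host}:{port}'" for host in hosts)
    return (
        f"- job_name: '{job_name}'\n"
        f"  static_configs:\n"
        f"    - targets: [{targets}]\n"
    )


if __name__ == "__main__":
    # Hypothetical naming scheme: starburst-cluster-1 .. starburst-cluster-50
    hosts = [f"starburst-cluster-{i}" for i in range(1, 51)]
    print(render_scrape_job("starburst", hosts))
```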
SELECT state, count(*) FROM system.runtime.queries GROUP BY state
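The rows returned by this query can also drive a simple alert. The sketch below assumes you already have the `(state, count)` rows from some Trino client and shows one possible way to turn them into a Slack incoming-webhook message when too many queries are queued. The threshold, function names, and webhook URL are all hypothetical; the only Slack detail relied on is that incoming webhooks accept a JSON body with a `text` field.

```python
import json
import urllib.request


def summarize_states(rows) -> dict[str, int]:
    """Turn (state, count) rows into a {state: count} dictionary."""
    return {state: int(count) for state, count in rows}


def build_slack_alert(cluster: str, counts: dict[str, int], max_queued: int = 20):
    """Return a Slack payload if queued queries exceed the threshold, else None.

    max_queued is an arbitrary example threshold; tune it per cluster.
    """
    queued = counts.get("QUEUED", 0)
    if queued <= max_queued:
        return None
    running = counts.get("RUNNING", 0)
    return {
        "text": f"{cluster}: {queued} queries queued "
                f"(threshold {max_queued}), {running} running"
    }


def send_slack_alert(webhook_url: str, payload: dict) -> None:
    """POST the payload to a Slack incoming webhook (URL supplied by you)."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10):
        pass
```

In practice Grafana alert rules can cover the same case without custom code; a script like this is mainly useful for ad hoc checks.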
If you want full Starburst monitoring, this is the architecture I recommend:
Starburst --> JMX Exporter --> Prometheus --> Grafana Panels + Trino Plugin SQL Queries --> Slack / Alerts / Visual Dashboards
This is now my preferred way of managing Starburst monitoring in large environments.