# Gateway Node Monitoring Metrics

atomic crag

Hi NetApp Discord Community,

We’ve got a 2‑site StorageGRID cluster (already live) and are running a POC to add 3 VM Gateway Nodes per site. I’m researching Gateway‑specific monitoring, but the docs are pretty light on QoS and VIP monitoring.

Some ideas we've brainstormed for monitoring:
QoS / throttling - Detect when a tenant or bucket is being throttled by a traffic policy, so we can adjust the limit or deal with noisy clients.
Gateway-to-Storage-Node communication issues - Spot connection/TLS errors or latency problems.
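
To make the QoS / throttling idea concrete, here is a rough sketch of the kind of check I have in mind. The metric name and the `tenant` / `bucket` labels are placeholders made up for illustration, not real StorageGRID metric names; the only standard parts are the Prometheus `/api/v1/query` endpoint and its JSON response shape.

```python
import json
from urllib.parse import urlencode

# Hypothetical metric name (assumption): replace with whatever the
# Gateway Nodes actually expose for traffic-policy throttling.
THROTTLE_METRIC = "example_gateway_throttled_requests_total"


def build_query_url(base_url, metric=THROTTLE_METRIC, window="5m"):
    """Build a Prometheus instant-query URL for the per-second rate of `metric`.

    The `tenant` and `bucket` label names are assumptions as well.
    """
    promql = f"sum by (tenant, bucket) (rate({metric}[{window}]))"
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def throttled_series(response_body, threshold=0.0):
    """Return (labels, rate) pairs whose rate exceeds `threshold`, highest first."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError(f"Prometheus query failed: {payload.get('error', 'unknown error')}")
    hits = []
    for series in payload["data"]["result"]:
        # Instant-vector samples arrive as [unix_timestamp, "value_as_string"].
        rate = float(series["value"][1])
        if rate > threshold:
            hits.append((series["metric"], rate))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

The response body would come from an HTTP GET on the URL returned by `build_query_url`; anything `throttled_series` returns is a tenant/bucket pair that is currently being limited and is a candidate for an alert.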

We already monitor the basics (CPU, node up/down, etc.), but want better visibility into Gateway performance and traffic policy enforcement.

I’ve scraped a bunch of load‑balancer‑related metrics (mostly private NetApp metrics) into Prometheus. I'm happy to share them, but for brevity I won't post them all here.

Has anyone here implemented monitoring specific to Gateway Nodes and have suggestions to share? Which metrics do you alert on?

Appreciate the time!