#Can some experts please explain what this Prometheus graph is all about?
You should be able to see more if you just log in to the SG admin GUI. It will most likely tell you why the nodes are read-only, and you should also be able to see how full they are. The graph doesn't make sense to me: it looks like you are looking at some TCP traffic counters, but without any context I can't interpret it.
This is the so-called "TCP Retransmission Rates" graph. For privacy, I removed the legends, which list all the SG internal storage nodes. I believe it shows retransmissions between nodes in SG for replication/resync/resend: because of the RO nodes, they have to retry, and those retries caused the high retransmission rates. I just want to confirm with you whether what I am saying is true.
TCP retransmissions do not occur because of some logic in the software (like "oh, this node is read only, try again with some other node")
they occur because packets are lost/dropped/corrupted on the network
Is there any proof that there won't be any retrying, resync, etc. when multiple nodes are forced to RO?
Multiple RO nodes are only seen when SG is 99% full.
There is a minimum number of writable nodes required, I believe 2. Once it drops to that number, the trouble starts: the user cannot write anymore, and at that point it slows down the entire ONTAP cluster or the aggregates using FabricPool.
if this is indeed "TCP retransmissions", then this is all part of the TCP protocol, no application layer visibility to those
I didn't say that the nodes will not retry at some level when they are full. I just said that TCP retransmissions are not this kind of retries
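To illustrate the point that these are kernel-level counters with no application visibility, here is a minimal Python sketch of how a retransmission rate can be derived on a Linux host. It assumes the graph is built from something like the `OutSegs` and `RetransSegs` fields of `/proc/net/snmp`; I have not confirmed that this is what SG actually plots. The sample text and the function names are made up for illustration, and the ratio is only a common approximation.

```python
# Hypothetical samples in the format of the "Tcp:" lines in /proc/net/snmp,
# taken at two points in time. The numbers are invented for illustration.
FIELDS = ("Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens "
          "AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs "
          "InErrs OutRsts InCsumErrors")
SAMPLE_BEFORE = FIELDS + "\nTcp: 1 200 120000 -1 10 5 0 0 3 1000 2000 10 0 0 0\n"
SAMPLE_AFTER = FIELDS + "\nTcp: 1 200 120000 -1 12 6 0 0 3 1800 3000 60 0 0 0\n"


def parse_tcp_counters(snmp_text):
    """Return the Tcp counters from /proc/net/snmp-style text as a dict."""
    tcp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Tcp:")]
    header, values = tcp_lines[0], tcp_lines[1]
    # Skip the leading "Tcp:" label on both the header and the value line.
    return {name: int(value) for name, value in zip(header[1:], values[1:])}


def retransmission_rate(before, after):
    """Retransmitted segments divided by segments sent between two samples."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    return retrans / sent if sent > 0 else 0.0


if __name__ == "__main__":
    before = parse_tcp_counters(SAMPLE_BEFORE)
    after = parse_tcp_counters(SAMPLE_AFTER)
    print(f"{retransmission_rate(before, after):.1%}")  # prints 5.0%
```

Nothing in these counters says which application request was being sent or why. The kernel only counts segments it had to send again, which is why an application-level retry against a read-only node would not show up here as a retransmission.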
It's not clear under what circumstances the retransmissions would happen. But, to your point, why do they happen only when multiple nodes are forced to RO or SG is 99% full, and 500 error codes start to increase? If it were packet drops, it could happen anytime.
Ultimately, my question is: is this "TCP Retransmission Rates" graph under Support-Diagnostics about just the layer 2 network (the internal grid network inside SG), or about networks outside of or interfacing with SG?
It doesn't explicitly say either way. It looks like a layer 2 network to me.
TCP is layer 4 (transport), running over IP at layer 3, irrespective of client or grid network.