#harvestsnapmirror.yaml at main · NetApp...


serene hull
#

hi @stable crater the source_node is probably added by the plugin listed on line 37. Let me check

#

what version of Harvest are you on @stable crater ?

stable crater
#

22.05.0-1

#

which should have the fix in it I would suspect

serene hull
#

line 27 lists source-node and I believe the zapi code converts dashes to underscores, which is why source_node is referenced later. hat tip to @shadow topaz for reminding me of that conversion
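
A minimal sketch of that dash-to-underscore conversion, written in Python purely for illustration. It is not Harvest's actual code, and the function name `zapi_to_label` is hypothetical.

```python
def zapi_to_label(zapi_name: str) -> str:
    """Turn a ZAPI attribute name into a label name by swapping dashes for underscores."""
    return zapi_name.replace("-", "_")


print(zapi_to_label("source-node"))  # source_node
```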

stable crater
#

at least in our poller data; this is a dump of zapi -n POLLER show data --api snapmirror-get-iter

serene hull
#

right, that's the real problem. we've been improving this code over the last week. let me dig up the PR and see if that fixes your problem. This is the issue https://github.com/NetApp/harvest/issues/1192 which has been fixed for REST. @civic badge is working on a ZAPI fix now

stable crater
#

hold on, it's not ^snap anymore it's ^Netapp_snap

#

ok, here's the source data

#

total records

serene hull
#

by chance are your snapmirror relationships inter-cluster? If so, that's what that issue is about and is not fixed for ZAPI yet

stable crater
#

no, not inter-cluster, these are snapmirrors to external clusters that house 'snapvault'

serene hull
#

gotcha. ok, in that case, the zapi won't see the external cluster - it's explained in the issue

Snapmirror relationships are destination driven. So, snapmirror info is available in zapi/cli/rest destination cluster, but no info available from source cluster via zapi/cli/rest.

#

that's the bug that @civic badge is fixing in the Zapi plugin. Was fixed earlier this week for REST

stable crater
#

right, this is a destination cluster that I am querying

#

not the source of the data

#

ok, got it to sort out the counts:

Here's the count of everything that is a netapp_snap metric (not policy) and has source_node=""

#

here's checking what shows up with source_node != ""

#

clearly missing data

#

the reason I found this is that the "dashboard" in harvest for it presents back "Source - " data in a graph

serene hull
#

yes, data will be missing until the bug is fixed

stable crater
#

which bug

#

?

#

you have a bug id?

serene hull
#

I pasted above but may have gotten missed. This is the same issue you are hitting. That issue has been fixed for the REST collector (because REST returns a bit more information) and @civic badge is fixing for ZAPI now

stable crater
#

I see the issue, but don't see comments about zapi in that issue

#

the one you sent shows 'fixed:' and closed

serene hull
#

yes, it was auto-closed when the REST PR was fixed

stable crater
#

maybe I missed that

serene hull
#

I reopened it

stable crater
#

Thanks

#

I have to run to a meeting

stable crater
#

is there a timeline for the next release? I noticed that there are nightly builds but I don't see when / what decides a release is coming

shadow topaz
stable crater
#

ah, ok. Though, with having these issues, is it recommended to deviate from 'releases' and use the 'nightly builds'? I would like to have this fixed but not sure what is different between them other than likely a code freeze on that branch you use for it

serene hull
#

nightly is built from the main branch and passes all CI and unit tests. We work hard to keep main green because it's the best way to get feedback from customers on current features and fixes. Generally it's safe to use main, but of course, sometimes things slip through. Documentation may be lagging from features in main and main will go through more testing before being promoted to a release.

The snapmirror issue you'd like fixed is still in review for the ZAPI side and hasn't hit main yet: https://github.com/NetApp/harvest/pull/1307

stable crater
#

oh, ok

#

I'll check again next week

serene hull
stable crater
#

Thank you. I will try it out today

stable crater
#

getting the same result with

curl -s 'http://localhost:13003/metrics' | grep ^netapp_snap | grep source_node | grep -v netapp_snapshot_policy | grep 'source_node=""' | wc -l
16996
here's checking what shows up with source_node != ""
curl -s 'http://localhost:13003/metrics' | grep ^netapp_snap | grep source_node | grep -v netapp_snapshot_policy | grep -v 'source_node=""' | wc -l
0
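
The two curl pipelines above can be mirrored roughly in a small script that reports both counts in one pass. This is only a sketch: the sample lines are made up rather than real poller output, the function name `count_source_node` is hypothetical, and the `netapp_snap` prefix is simply copied from the grep commands.

```python
import re

# Matches a label pair such as source_node="node1" inside a metric line.
SOURCE_NODE = re.compile(r'(?<![A-Za-z0-9_])source_node="([^"]*)"')


def count_source_node(metrics_text, prefix="netapp_snap"):
    """Return (empty, non_empty) counts of source_node for matching metric lines."""
    empty = non_empty = 0
    for line in metrics_text.splitlines():
        # Same filters as the grep chain: prefix match, skip snapshot policy metrics.
        if not line.startswith(prefix) or "netapp_snapshot_policy" in line:
            continue
        match = SOURCE_NODE.search(line)
        if match is None:
            continue  # line has no source_node label at all
        if match.group(1) == "":
            empty += 1
        else:
            non_empty += 1
    return empty, non_empty


# Hypothetical sample in Prometheus text format, not real poller output.
sample = "\n".join([
    'netapp_snapmirror_labels{cluster="c2",source_node=""} 1',
    'netapp_snapmirror_labels{cluster="c2",source_node="n1"} 1',
    'netapp_snapshot_policy_labels{cluster="c2",source_node=""} 1',
    'netapp_volume_labels{cluster="c2",node="n1"} 1',
])
print(count_source_node(sample))  # (1, 1)
```

To run it against a live poller, the text returned by the `/metrics` endpoint would be passed in place of `sample`.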

#

it's down from 18200

#

the snapmirror dashboard is populated now though....

#

for some of the clusters, not all of them

serene hull
#

hi @stable crater progress! you won't see the source nodes via the curl, that was covered in the FAQ. Glad to hear the dashboard is populated. Would be better if it was populated for all clusters 🙂 let's see if we can figure out why

brisk geode
#

@stable crater could you share the poller logs from the clusters which haven't been populated in the snapmirror dashboard? Also, could you run this command on a poller which hasn't been populated, just to confirm how many relationships exist there: curl -s 'http://localhost:xxxxx/metrics' | grep ^netapp_snapmirror_labels | wc -l

stable crater
#

do you want the ones that have source_node = "" ?

#

or just all of them for that poller?

#

the snapmirror dashboard isn't populating the 'source' filer when trying to select a node that is the destination of the snapmirrors (vault)

#

I took the snapmirror dashboard from the github site but not sure on what version I should be using of the dashboard

stable crater
#

pulled the latest version from github

#

still broken

#

"expr": "count (count by (relationship_id) (snapmirror_labels{datacenter="$Datacenter",cluster=~"$Cluster"}))",

#

there is no metric called 'snapmirror_labels' anymore

#

it's netapp_snapmirror_labels....

#

and everything else in the dashboard that has 'snapmirror_' needs to be replaced with 'netapp_snapmirror_'

#

I suspect all the other dashboards are also in that same state, where / when the netapp_ was added, those weren't updated

brisk geode
#

Hi @stable crater I would need the logs covering all the snapmirrors for that poller. Regarding the statement "the snapmirror dashboard isn't populating the 'source' filer when trying to select a node that is the destination of the snapmirrors (vault)": with the fix for https://github.com/NetApp/harvest/issues/1192, the snapmirror dashboard now shows the source-side view. A simple example: if there is a relationship with volume v1 in cluster C1 as the source and volume v2 in cluster C2 as the destination, that relationship is visible only when you select cluster C1 in the Cluster variable dropdown, not C2. In the same way, data is shown only if you choose a node on the source side, not the other way around.

#

Regarding "there is no metric called 'snapmirror_labels' anymore, it's netapp_snapmirror_labels....": we haven't made any changes for this. As per the zapi template https://github.com/NetApp/harvest/blob/main/conf/zapi/cdot/9.8.0/snapmirror.yaml#L3 and the rest template https://github.com/NetApp/harvest/blob/main/conf/rest/9.12.0/snapmirror.yaml#L5, the object name is snapmirror, which ensures that any metric from this template starts with snapmirror_xxxxxxx. It seems you might have a custom template for snapmirror that makes this change. Could you confirm?

serene hull
#

@stable crater how are you installing Harvest? we aren't prefixing any of the metrics with netapp_

stable crater
#

ah, nevermind then, I thought it was set to that by harvest, it looks like when we moved from the old harvest version to the github one, we used the prefix 'netapp' to match the old one

#

aka. we should have NOT done that

#

I was able to change the dashboard to match our prefix

#

sed is your friend
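
For anyone who would rather script the same rewrite than use sed, here is a rough sketch of prefixing metric names inside a dashboard expression. The object list, the `add_prefix` name, and the `netapp_` default are assumptions for illustration; a real dashboard references more objects than these two.

```python
import re

# Hypothetical list of object names whose metrics should gain the prefix.
OBJECTS = ("snapmirror", "volume")


def add_prefix(expr: str, prefix: str = "netapp_") -> str:
    """Prefix metric names such as snapmirror_labels without touching label names."""
    # The lookbehind skips names that are already prefixed (netapp_snapmirror_...)
    # and labels such as source_volume, where the object name follows an underscore.
    pattern = re.compile(r"(?<![A-Za-z0-9_])(?:%s)_" % "|".join(OBJECTS))
    return pattern.sub(lambda m: prefix + m.group(0), expr)


expr = 'count(count by (relationship_id) (snapmirror_labels{cluster=~"$Cluster"}))'
print(add_prefix(expr))
```

Because already-prefixed names are skipped, running the function twice over the same expression leaves it unchanged, which a plain substitution of 'snapmirror_' would not guarantee.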

serene hull
#

if you need to do that again, another handy way to rewrite is with the --prefix arg of grafana import

stable crater
#

I am using the Grafana import via web browser, not command line, unfortunately

stable crater
#

I think I am confused as to how we are getting the source server, is it via the volume labels?

#

nevermind, I found the issue

#

well, I found MY issue, due to 'prefix' but the data is still not appearing

#

it's using 'cluster' to check the source as well, which won't work, since cluster is the 'target' cluster where the snapmirrors are going to. Shouldn't we be checking the 'source_volume' against the 'volume' in volume_labels instead?

#

count by(source_node, relationship_status) (snapmirror_labels{datacenter=~"$Datacenter", source_cluster=~"$Cluster"}) * on(source_volume, source_vserver, source_cluster) label_replace(label_replace(label_replace(label_replace(volume_labels{datacenter=~"$Datacenter", cluster=~"$Cluster", node=~"$SourceNode", node!=""}, "source_volume", "$1", "volume", "(.*)"), "source_vserver", "$1", "svm", "(.*)"), "source_node", "$1", "node", "(.*)"), "source_cluster", "$1", "cluster", "(.*)")

#

when I read this, the snapmirror labels are looking at the 'source_cluster' of the snapmirrors (which is where the snapmirror is originating from, on the destination side). if you use that same label to search volume_labels, it won't work, since that cluster doesn't contain the 'source volume'

serene hull
#

which panel is your pasted expr for?

stable crater
#

Source Relationships per Node

serene hull
#

this one? I ask because what you pasted does not match the expression in github for that panel and I want to make sure we're talking about the same thing

stable crater
#

yep

#

I pulled this out of the github yesterday

#

is there a new one?

#

the entire dashboard

serene hull
#

the PR was checked in 2 days ago https://github.com/NetApp/harvest/pull/1307 so you should be good on that, but it included dashboard and snapmirror template/plugin changes so you need to take the entire build, not just the dashboard

stable crater
#

where I pulled it from was 'harvest', is there a separate location for the dashboards?

serene hull
stable crater
#

I pulled and installed nightly as suggested and it's working from harvest-22.09.26-nightly.x86_64

serene hull
#

ok so you have the new poller and templates too sounds like

stable crater
#

is there an easy way to check?

#

from the binary side?

serene hull
#

yes, bin/harvest --version

stable crater
#

harvest version 22.09.26-nightly (commit 81858dac) (build date 2022-09-26T08:11:30-0400) linux/amd64

serene hull
#

is it possible an older version of the dashboard was imported?

stable crater
#

shouldn't have been

#

I can try to import it again and see

serene hull
#

it's possible I missed something. Double check whether that panel is missing group_left, and maybe double check that the version of the file you downloaded does include it

stable crater
#

weird, I assumed I had the latest version from 'main' but clearly I didn't get the latest one

#

now that panel is populating

serene hull
#

not sure if it's too early for confetti ball so I'll give an enthusiastic 👍

stable crater
#

but anytime I select cluster, the panel has 'no data'

serene hull
#

ok panel is populating and works until you select a cluster?

stable crater
#

Yep, if I set cluster to a node that has snapmirrors, that panel goes to 'no data' immediately

#

if it has 'all' in the field, it works as expected, to show them all

#

Here's the query

count by (source_node, relationship_status) (netapp_snapmirror_labels{datacenter=~"$Datacenter",source_cluster=~"$Cluster"} * on (source_volume, source_vserver, source_cluster) group_left(source_node) label_replace( label_replace( label_replace( label_replace (netapp_volume_labels{datacenter=~"$Datacenter",cluster=~"$Cluster",node=~"$SourceNode",node!=""}, "source_volume", "$1", "volume", "(.*)") , "source_vserver", "$1", "svm", "(.*)"), "source_node", "$1", "node", "(.*)") , "source_cluster", "$1", "cluster", "(.*)") )
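
The join in this query depends on the nested label_replace calls copying volume, svm, node and cluster from volume_labels into source_* labels, so that both sides carry the labels named in on(...). A rough Python model of that copying step, assuming the usual `(.*)` capture pattern and using made-up label values; only the `$1` placeholder is handled, and this is not how Prometheus itself is implemented.

```python
import re


def label_replace(labels: dict, dst: str, replacement: str, src: str, regex: str) -> dict:
    """Rough model of PromQL label_replace for a single series.

    Only the "$1" placeholder is supported in this sketch.
    """
    match = re.fullmatch(regex, labels.get(src, ""))
    if match is None:
        return dict(labels)  # no match: the series is returned unchanged
    out = dict(labels)
    out[dst] = replacement.replace("$1", match.group(1))
    return out


# Hypothetical volume_labels series, trimmed to the labels the query uses.
volume = {"cluster": "C1", "svm": "svm1", "node": "n1", "volume": "v1"}

for dst, src in [("source_volume", "volume"), ("source_vserver", "svm"),
                 ("source_node", "node"), ("source_cluster", "cluster")]:
    volume = label_replace(volume, dst, "$1", src, "(.*)")

print(volume["source_cluster"], volume["source_volume"])  # C1 v1
```

Since source_cluster is copied from the volume's own cluster label, the join can only succeed when the cluster selected in the dashboard is the one that holds the source volume.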

#

the source node based on the snapmirror isn't going to be based on the sourcenode for the volume labels

#

snapmirror is sourced from the destination filer and the source would only be found via volume name, I suspect

#

and/or cluster

serene hull
#

checking

stable crater
#

I am working on giving you some data as an example

serene hull
#

always helpful, thanks

stable crater
#

Mirror location:

netapp_snapmirror_labels{cluster="XXXrcfs70", datacenter="HPC", derived_relationship_type="mirror_vault", destination_location="XXXrcfsv70a:XXXccfs01_build_rw_1a_mirror_vault", destination_node="XXXrcfs70n02a", destination_volume="XXXccfs01_build_rw_1a_mirror_vault", destination_vserver="XXXrcfsv70a", group_type="none", healthy="XXXue", instance="XXXciharv01.fqdn.com:12996", job="harvest_scrape", last_XXXansfer_type="update", policy_type="mirror_vault", protectedBy="volume", protectionSourceType="volume", relationship_id="db1dba11-6082-11e9-acb1-00a098d03d56", relationship_status="idle", relationship_type="extended_data_protection", schedule="sv_1110_2310", source_cluster="XXXccfs01", source_volume="build_rw_1a", source_vserver="XXXccfsv01a"} 1

netapp_volume_labels{aggregate="aggr2_XXXrcfs70n02a_L", cluster="XXXrcfs70", datacenter="HPC", instance="XXXciharv01.fqdn.com:12996", isEncrypted="false", isHardwareEncrypted="false", is_sis_volume="XXXue", job="harvest_scrape", node="XXXrcfs70n02a", protectedBy="not_applicable", protectionRole="destination", snapshot_policy="none", state="online", style="flexvol", svm="XXXrcfsv70a", type="dp", volume="XXXccfs01_build_rw_1a_mirror_vault"}

Source location:

netapp_volume_labels{aggregate="aggr1_XXXrcfs01n50a_H", all_sm_healthy="XXXue", cluster="XXXrcfs01", datacenter="HPC", instance="XXXciharv01.fqdn.com:12992", isEncrypted="false", isHardwareEncrypted="false", is_sis_volume="XXXue", job="harvest_scrape", node="XXXrcfs01n50a", protectedBy="snapmirror", protectionRole="protected", snapshot_policy="2perday_5day_retention", state="online", style="flexvol", svm="XXXrcfsv01a", type="rw", volume="build_rw_1a"}

#

so between the mirror and the source, the cluster is NOT the same

#

even though that query assumes they are

#

hope that helps

#

source_cluster isn't a label in snapmirror_labels at all

serene hull
#

what you pasted includes it?

stable crater
#

based on selecting the 'mirror' cluster

#

so, if in the overall panel, I select datacenter = HPC and cluster=XXXrcfs70, it shows no data, because that is put into the query as 'source_cluster=XXXrcfs70', which doesn't have source data, only mirror data

#

maybe that is expected?

#

it seems when I select a cluster node where there IS source volumes, that works fine

serene hull
#

yes i follow your example, when you set the variable cluster=XXXrcfs70 you get no data because in the query that becomes source_cluster="XXXrcfs70" which matches nothing. On the other hand, if you set the variable cluster=XXXccfs01 you will get data. It would be clearer if the cluster variable was named SourceCluster
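
A small sketch of why the selection behaves this way, using one series reduced to a few labels from the thread. The `select` helper is hypothetical and only mimics a `=~` matcher, which has to match the whole label value.

```python
import re

# One snapmirror_labels series, reduced to the labels relevant here (values from the thread).
series = [
    {"cluster": "XXXrcfs70", "source_cluster": "XXXccfs01", "source_volume": "build_rw_1a"},
]


def select(series, label, pattern):
    """Mimic a PromQL =~ matcher, which must match the whole label value."""
    return [s for s in series if re.fullmatch(pattern, s.get(label, ""))]


print(len(select(series, "source_cluster", "XXXrcfs70")))  # 0
print(len(select(series, "source_cluster", "XXXccfs01")))  # 1
print(len(select(series, "source_cluster", ".*")))         # 1
```

The last call corresponds to leaving the Cluster variable on "All", which is why the panel shows data until a destination cluster is picked.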

serene hull
#

open to suggestions on ways to improve this @stable crater if you have any. Seems like it would be clearer to rename the Cluster variable to SourceCluster, although that's only applicable to some of the panels and not all of them. We could create a separate dashboard to address that variable name change. We could add some hover description on the panels where source_cluster=~"$Cluster". snapmirror_labels{cluster="XXXrcfs70", source_cluster="XXXccfs01"} could be made clearer too, since cluster is really destination_cluster. I think what you want is an or, something like this: snapmirror_labels{source_cluster=~"$Cluster"} or snapmirror_labels{destination_cluster=~"$Cluster"}

stable crater
#

is pondering yet

#

I am trying to understand another relationship that isn't showing up, whether it's something on our end or something in the data in harvest

#

we have several XDP relationships on source filers and they are NOT appearing in the list from harvest, I am still investigating

serene hull
#

I need to run to a meeting, but I'll discuss with the team tomorrow and see if we can improve. Really appreciate you taking the time to try nightly and give us such valuable feedback 💯

stable crater
#

sure, I think it should be 'destination_cluster' since that is what we are looking at and what the other queries are pulling out for the list of 'cluster' in that panel, using 'source_cluster' would mean that you'd have to add all clusters to the list, not just those with snapmirror labels

#

I think I would call it "$Destination_Cluster" as a field name and leave the existing queries the same for 'cluster='

#

but that would mean changes to the labels, if that is possible. you are correct that in the snapmirror_labels that 'destination_cluster' would be more clear than just 'cluster'

brisk geode
#

To summarize your PromQL output for volume_labels and snapmirror_labels: volume build_rw_1a, which resides on node XXXrcfs01n50a in cluster XXXrcfs01, is in a protected state. So when you choose XXXrcfs01 as the cluster and/or XXXrcfs01n50a as the node, you will find this relationship, as well as all other relationships whose source resides on that node or cluster. This is the source-side perspective. As for your questions: "so, if in the overall panel, I select datacenter = HPC and cluster=XXXrcfs70, it shows no data, because that is put into the query as 'source_cluster=XXXrcfs70', which doesn't have source data, only mirror data, maybe that is expected?" --> Yes, it's expected.
"it seems when I select a cluster node where there IS source volumes, that works fine" --> Yeah, absolutely.