#┊・harvest-nabox🔒

loud ocean
#

hi all, glad to see you made it

fossil bane
#

Thanks @loud ocean

lyric mulch
#

Really glad you’re all here! I know there are lots of folks excited about harvest

willow comet
#

Is this for the harvester for Grafana?

loud ocean
dusty lance
#

Nice! We'd been holding out for a new release of Harvest at my old job... never got to upgrade though. But what the heck I can run it at home now.

#

If I have neither running yet... Prometheus or InfluxDB?

#

(looks like the influxdb exporter is a bit less work.)

fossil bane
#

@dusty lance we suggest using Prometheus, which has more default Harvest dashboards than InfluxDB.

loud ocean
dusty lance
tough kestrel
#

Hi all, I'm planning to move from harvest 1.6 to 2.0. Regarding the prometheus disk requirements I've seen that wiki page https://github.com/NetApp/harvest/wiki/FAQ#sizing

In order to estimate disk usage, can I start from the metrics/min data that comes from the "netapp detail: harvest poller" dashboard in the 1.6 installation? Can it be converted to ingested_samples_per_second?

Thanks!
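
A rough sketch of the conversion asked about above. It assumes the 1.6 "metrics/min" figure counts one sample per metric per minute, and it uses the sizing formula from the Prometheus storage documentation (retention_time_seconds * ingested_samples_per_second * bytes_per_sample, at roughly 1-2 bytes per sample). The input numbers are placeholders, and Harvest 2.0 may export a different set of metrics than 1.6, so treat the result as a ballpark only.

```python
# Back-of-the-envelope Prometheus sizing. All input numbers are placeholders.
SECONDS_PER_DAY = 86_400


def samples_per_second(metrics_per_minute: float) -> float:
    """Convert a per-minute metric count into a per-second sample rate."""
    return metrics_per_minute / 60.0


def disk_bytes(rate: float, retention_days: float, bytes_per_sample: float = 2.0) -> float:
    """retention_time_seconds * ingested_samples_per_second * bytes_per_sample."""
    return retention_days * SECONDS_PER_DAY * rate * bytes_per_sample


rate = samples_per_second(600_000)  # hypothetical 600k metrics/min from the 1.6 dashboard
print(f"{rate:.0f} samples/s")
print(f"{disk_bytes(rate, retention_days=365) / 1e9:.1f} GB for one year of retention")
```

Once Prometheus is running, comparing this estimate against what Prometheus itself reports is a more reliable check than the 1.6 numbers.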

ocean kite
fossil bane
fossil bane
tough kestrel
fossil bane
tough kestrel
#

Hi, sorry if these are dumb questions: what is the suggested way to customize Grafana? I'm trying with environment variables but it seems they're ignored. I would like to have Grafana listening via HTTPS and using LDAP authentication.

Thanks

fossil bane
tough kestrel
tough kestrel
# fossil bane <@1004763303618301973> As mentioned in the guide, You should start with harvest ...

Hello, @fossil bane
could you please better explain which data should be gathered from the endpoint? Here's what came from one of my pollers:

Below is the list of metrics provided by my collectors and plugins.
Exposing data from 2 collectors and 35 objects, 429 metrics in total.

It seems another option to get ingested_samples_per_second is to query prometheus itself. How can I access the GUI?

Thank you

tough kestrel
#

Hello, in the containerized environment, is prom-stack.yml overwritten when I generate a new compose file (for instance, when I add new pollers)?
If I make some customizations for Grafana/Prometheus in that file, how can I make them persistent?

fossil bane
#

If you don't want your changes overwritten, you can pass a different prom-stack file to the docker-compose command, like below:
docker-compose -f custom-prom-stack.yml -f harvest-compose.yml up -d --remove-orphans

agile oracle
#

Hi gang!

First of all, Thanks for this wonderful project!

I’ve got a question. I have been tweaking Harvest to measure the QoS limit “DELAY_CENTER_QOS_LIMIT: throttle”. I uncommented the 4 workload objects at the bottom of /conf/zapiperf/default.yaml and ran docker-compose again, but I don’t see the counter or the workload objects in Grafana. Can someone guide me on what I am missing? Thanks!

agile oracle
lyric mulch
#

Glad you were able to solve it @agile oracle! I’ll make sure one of the Harvest folks comes along and sees 🙂

agile oracle
sturdy crest
#

Rahul Hello!

#

MrObvious here.

fossil bane
#

Hi, could you share the poller logs?

sturdy crest
#

Where are they located?

fossil bane
#

Are you using nabox?

sturdy crest
#

Yes.

fossil bane
sturdy crest
#

Whoops took a screenshot of the wrong window

fossil bane
#

Are you able to access prometheus?

sturdy crest
#

Should be able to.

fossil bane
#

Can you see if there is any metric named qos_latency?

sturdy crest
fossil bane
#

OK, share the logs with us. We'll take a look there.

fossil bane
#

@sturdy crest As discussed, default.yaml still had the workload objects commented out, hence there were no workload metrics.

young steeple
#

Mmmm, shouldn't the big fat red warning on that page be fixed? 😄

sturdy crest
#

I'll have to check the clock...

#

But the metrics are working now.

#

On the bright side I did find a bug in the volume dashboard.

fossil bane
#

@sturdy crest What's that bug?

sturdy crest
#

The volume dashboard references qos_detail_volume_resource_latency but that doesn't exist.

#

The ones with no data haven't been fixed yet.

fossil bane
sturdy crest
#

Hmm. I mean workload_detail_volume is definitely an object in ONTAP CCMA.

fossil bane
#

Yes, all volumes with workload-class autovolume are tracked under this template in Harvest.

sturdy crest
#

Weird I don't have it in prometheus.

#

Does it come from the ONTAP side, or should it at least exist in prometheus?

fossil bane
#

You may not have any volume matching autovolume. See the response to the ZAPI request below:
`<?xml version="1.0" encoding="UTF-8"?>

<netapp xmlns="http://www.netapp.com/filer/admin" version="1.160">
<qos-workload-get-iter>
<query>
<qos-workload-info>
<workload-class>autovolume</workload-class>
</qos-workload-info>
</query>
</qos-workload-get-iter>
</netapp>`

#

Also logs should show if any instances were found for this template

sturdy crest
#

Hmm, ok.

fossil bane
#

For me, I have the relevant data.

sturdy crest
#

I wonder how to check that. Would I need to use ZExplore?

fossil bane
#

Zoom tool or harvest cli would help
bin/zapi -p POLLERNAME show data --api qos-workload-get-iter

sturdy crest
#

You should have CLI access if you're on VPN. I think SSH works.

fossil bane
#

I have checked your machine and it still has WorkloadDetailVolume disabled.

sturdy crest
#

Derp.

#

Fixed. Rebooting now.

#

Thanks Rahul.

fossil bane
#

You are welcome!

sturdy crest
#

Still not seeing it???

fossil bane
#

takes around 5 minutes for first poll

sturdy crest
#

Ah

#

Ok it is there now. Thanks!

fossil bane
#

awesome!

sturdy crest
#

Now if we can just figure out how to build a dashboard similar to delay center view that'd be amazing.

fossil bane
#

let us know your requirement via Github. You can add any reference from PAS and we can take a look

sturdy crest
#

It was that Github ticket you responded on.

#

Hang on I'll find the PA link for my vSIM once it loads.

fossil bane
#

sure you can dm me the details of PAS instance

agile oracle
#

Hi gang, is there a way to check the max throughput and min throughput of the QoS policy-group with Harvest? I see the qos_detail_ops and the other QoS related metrics but I couldn't figure out how to get those figures.

fossil bane
young steeple
#

When it comes to enabling Workloads and QoS counters, can they be enabled in custom.yaml instead of altering default.yaml?

young steeple
#

Ok, so I have this :

nabox-api:/opt/harvest2-conf/conf/zapiperf# cat custom.yaml 
collector: ZapiPerf
objects:
  Volume: custom_volume_blacklist.yaml
  Workload: workload.yaml
  WorkloadDetail: workload_detail.yaml
  WorkloadVolume: workload_volume.yaml
  WorkloadDetailVolume: workload_detail_volume.yaml
#

And not getting much in Top Volume End-to-End QoS Drilldown

#

or... I just have to be patient

#

All good thanks !!

fossil bane
loud ocean
#

which means frequency * 2 or six minutes before Harvest will export metrics for these
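
The arithmetic behind that six-minute figure, as a tiny sketch. It assumes the 180s workload schedule mentioned elsewhere in this channel; the function name is made up for illustration and is not part of Harvest.

```python
def first_export_delay_seconds(poll_interval_seconds: int, polls_needed: int = 2) -> int:
    """Time until the first metrics are exported: frequency * 2, per the message above."""
    return poll_interval_seconds * polls_needed


delay = first_export_delay_seconds(180)  # 180s is the assumed workload schedule
print(f"{delay} s = {delay // 60} min")
```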

young steeple
#

OK, just so you know, the next version of NAbox will have workload/QoS turned on by default.

agile oracle
tight iron
loud ocean
#

FAQ · NetApp/harvest Wiki

uneven pumice
#

👋🏻 I see we moved over from Slack :). I keep seeing some outliers in Harvest 2.0 that I used to manage with latency_io_reqd in 1.1x. I have volumes that do very little IO and show crazy latency figures when I'm convinced this is not the case. Is there any way of using a similar parameter to stop the false latency from displaying?

fossil bane
uneven pumice
#

topk($TopResources, volume_read_latency{datacenter="$Datacenter",cluster="$Cluster",svm=~"$SVM",volume=~"$Volume"})/1000 this or volume_write_latency gives me good results

#

I think I may see the issue actually

#

The default query did not have the /1000 on the end

fossil bane
#

Let me check

#

Dividing by 1000 is used to display values in ms for tables. ONTAP returns this counter in microseconds. Grafana charts take care of the unit depending on the values received. Also, the table shows the last value only, so the value in the table should match the last value in the graph.

#

I think you are comparing 2 different counters... volume_avg_latency and volume_read_latency both are different counters.

loud ocean
# uneven pumice I am seeing it reporting under `avg_latency` yet when I check max write_latency ...

hi @uneven pumice not sure which version of Harvest you are running, but in addition to what Rahul has pointed out, there was a bug, described in https://github.com/NetApp/harvest/issues/1175#issuecomment-1198327824, that affected latency calculations. This bug was fixed in https://github.com/NetApp/harvest/pull/1154 about 29 days ago. Can you confirm which version of Harvest you are using with `harvest version`? If you want to try these out before the next release, you can grab the latest nightly build: https://github.com/NetApp/harvest/releases/tag/nightly

uneven pumice
primal basin
#

@loud ocean , @young steeple - good afternoon - we had a VMware vCenter outage that required ESXi host reboots and now our nabox instance is being assigned the wrong IP Address within VMware vCenter.
Can you please help me in assigning the correct IP address?

young steeple
#

Followup question : can you connect to the web interface on this IP address ?

primal basin
#

@young steeple - nope, web interface doesn't connect

young steeple
smoky portal
#

What address are you getting, and what address are you expecting? Mask the last two octets if you want.

primal basin
young steeple
primal basin
young steeple
primal basin
young steeple
#

What’s in /etc/network/interfaces ?

primal basin
#

cat /etc/network/interfaces
auto lo
iface lo inet loopback

young steeple
#

That hardly makes any sense 😀 are you sure that’s NAbox ?

primal basin
#

yes sir - VMware vCenter shows the following network interface is connected:
VMNetwork-xxx (connected) | 00:50:56:b7:e1:bf

young steeple
#

What type of interface is it ? Vmxnet3 ?

#

so ifconfig eth0 doesn't return anything ?

primal basin
young steeple
#

Ah ok cool

#

You need a static IP address or DHCP ?

primal basin
#

static

young steeple
#

This is what /etc/network/interfaces should look like :

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0

iface eth0 inet static
address %s
netmask %s
gateway %s
loud ocean
#

The Harvest team is happy to announce the release of 22.08 https://github.com/NetApp/harvest/releases/tag/v22.08.0
Highlights of this major release include:

  • an ONTAP event management system (EMS) events collector with 64 events out-of-the-box
  • Two new dashboards added in this release:
    • Headroom dashboard
    • Quota dashboard
  • We've made lots of improvements to the REST Perf collector. The REST Perf collector should be considered early-access as we continue to improve it. This feature requires ONTAP versions 9.11.1 and higher.
  • New max plugin that creates new metrics from the maximum of existing metrics by label.
  • New compute_metric plugin that creates new metrics by combining existing metrics with mathematical operations.
  • 48 features, 45 bug fixes, and 11 documentation commits this release
young steeple
#

I have no explanation on how this file has been reset to the current state though

#

before changing it, if it's not too late, can you check its modification time?

primal basin
primal basin
young steeple
#

Yes

smoky portal
#

Any idea of the uptime before it was reset?

primal basin
smoky portal
#

so that puts it after March 29

young steeple
#

Is it possible you somehow brought up eth0 manually in the CLI, figured it was working, but the config files were never updated?

primal basin
#

nope - I've never manually configured eth0 in the CLI

#

I've also rebooted the VM several times - also did a VM power cycle

young steeple
#

Ok you can overwrite /etc/network/interfaces or get nabox-api internal docker IP and issue the proper API call, but editing the file is simpler

primal basin
young steeple
#

Just for the sake of nerding, you can use the internal API to reset network config :

curl -X POST -uadmin:Netapp01 -H "Content-type: application/json" -d '{
  "hostname": "nabox",
  "ip": {
    "dns": [
      "192.168.0.100"
    ],
    "domain": "company.com",
    "gateway": "192.168.0.1",
    "ip_address": "192.168.0.100",
    "netmask": "255.255.255.0"
  },
  "use_dhcp": false
}' http://`docker inspect nabox-api|jq -r '.[0].NetworkSettings.Networks["docker-compose_default"].IPAddress'`:5000/api/1.0/system/network-config
primal basin
young steeple
#

Has the file been reset somehow ?

primal basin
primal basin
#

Our network team currently has ICMP disabled throughout the environment
SSH to the NAbox failed...

young steeple
#

Are you positive gateway and ip are properly set ?

primal basin
#

IP = yes, gateway = I don't know - checking internally with our network team

young steeple
#

I’m thinking maybe interfaces has a syntax error

#

Ok

primal basin
# young steeple Ok

@young steeple - this is what I copied and adjusted from another nabox instance

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0

iface eth0 inet static
address 172.24.x.x
netmask 255.255.255.0
gateway 172.24.x.x

young steeple
#

Looking good. How about ip a s eth0 ?

primal basin
young steeple
#

Gateway shows in ip r s ?

#

Not sure about the syntax.

primal basin
#

@young steeple - are you available for a quick Zoom session?

young steeple
#

Not right now but give me an hour

primal basin
#

okie - I'll try to vMotion the VM to another host...

young steeple
#

Not sure if I can help though, looks like IP misconfiguration but we’ll confirm.

primal basin
#

wait!

#

It's working now

smoky portal
#

after moving to another host?

primal basin
#

no - I fixed the incorrect default gateway in /etc/network/interfaces and I guess it took some time to propagate, maybe?

#

I'm able to hit both the admin interface and Grafana dashboards successfully

#

also SSH is functioning perfectly to the OVA

#

the nice thing about Grafana, is we can tell exactly when the VMware vCenter issue started 2 nights ago because the metrics immediately dropped! lol - now onto root cause for this SEV-1 the other night...

young steeple
#

I forgot to mention you needed to reboot or issue service networking restart

young steeple
primal basin
primal basin
smoky portal
#

^ Congrats on the new release 🙂

tight iron
sturdy crest
#

Well...on my vSIM lol.

young steeple
tight iron
young steeple
tight iron
young steeple
#

Did some dashboards get deprecated?

fossil bane
young steeple
#

Oh I think I see... @fossil bane did you guys rename the dashboards to "ONTAP: *" ?

young steeple
#

OK, that's why. I overwrite dashboards on import, but if the dashboard name changes it won't be overwritten. I should probably empty the folder when importing dashboards with overwrite

#

Ideally I should use the dashboard provisioning feature in Grafana to read them from disk, so they're immutable and dynamically reflect what's in the folder

#

Is there a reason you don't use tags on stock dashboards ?

fossil bane
young steeple
#

Would be nice to have to quickly identify default dashboards

#

I'll open an issue on github and see what people think, or see if there is one already

stiff dove
#

The new harvest release makes a very good first impression. It matured really well since the first release 21.05.0. Keep up the good work!

tight iron
#

Hi guys, two more questions about Harvest 🙈 In the Quota dashboard I only see three entries, but since we have many quotas, some are missing. Does Harvest require some specific setting on the SVM to collect the data? The second question is about the Headroom dashboard. Is there something like a guide which explains what the specific panels show?

fossil bane
fossil bane
tight iron
fossil bane
fossil bane
loud ocean
tight iron
loud ocean
#

Thanks, we'll double-check 9.9.1 P9. In the meantime, can you hit the REST endpoint with curl or Harvest's bin/rest so we can make sure the same number is returned? Something like this should do the trick: curl --insecure --user admin:pass 'https://10.193.48.11/api/storage/quota/reports?return_timeout=120&fields=type%2Cvolume%2Csvm%2Cqtree%2Cusers%2Cgroup%2Cspace%2Cfiles&show_default_records=false&return_unmatched_nested_array_objects=true'
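
For anyone who prefers scripting the check, here is a small sketch that builds the same URL from its query parameters and counts the records in a response body. The helper names and the sample response are made up for illustration. The only assumption about ONTAP is that the REST response is JSON with a top-level `records` list; no request is actually sent here.

```python
import json
from urllib.parse import urlencode


def quota_report_url(host: str) -> str:
    """Build the quota report URL with the same parameters as the curl call above."""
    params = {
        "return_timeout": 120,
        "fields": "type,volume,svm,qtree,users,group,space,files",
        "show_default_records": "false",
        "return_unmatched_nested_array_objects": "true",
    }
    # urlencode percent-encodes the commas in "fields" as %2C.
    return f"https://{host}/api/storage/quota/reports?{urlencode(params)}"


def count_records(response_body: str) -> int:
    """Count the entries in the 'records' list of a JSON response body."""
    return len(json.loads(response_body).get("records", []))


print(quota_report_url("10.193.48.11"))

# Hypothetical, heavily trimmed response body, only to exercise count_records.
sample_body = '{"records": [{"type": "tree"}, {"type": "user"}, {"type": "tree"}]}'
print(count_records(sample_body))
```

The count can then be compared with the number of rows in the dashboard's Reports table.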

tight iron
loud ocean
#

Thanks. And how many does the Reports table on the ONTAP: Quota dashboard have? If you want to check Prometheus instead of Grafana, use the metric qtree_disk_limit. The qtree and quota metrics should be better named. We considered changing the metric names for 22.08 but didn't want to break customers already using these metrics. E.g. qtree_labels are qtrees, while qtree_disk_limit are quotas. We're going to deprecate the misnamed ones and fix them in the next release.

tight iron
#

Over the whole farm (19 Clusters) I only see 3 entries

loud ocean
tight iron
#

@young steeple is it possible to show logs for a specific poller or would you say just use grep?

loud ocean
tight iron
#

ahhh there is a plugin error - duplicated instance key

loud ocean
#

now we're getting somewhere 🙂

#

can you share the log file @tight iron so we can track down the duplicate instance key?

tight iron
#

Sure just struggling how to collect only the logs for one poller 😅

loud ocean
#

maybe something like this would work? dc logs NAME_OF_POLLER > log.txt

#

where NAME_OF_POLLER will be from the name column when running dc ps

tight iron
#

Mail is sent, hope this is enough

loud ocean
#

thanks @tight iron that did the trick. @fossil bane found the problem and is working on a fix. We'll post when it is fixed and hits nightly

viscid agate
#

@loud ocean my Volume Details look like this rn. A lot of points.

loud ocean
viscid agate
fossil bane
sturdy crest
#

@young steeple I have a weird issue where I can't upgrade packages with FF on my lab Harvest.

#

Chrome works fine.

young steeple
sturdy crest
#

It browses for the file, and it will say sometimes "uploaded successfully" and sometimes it times out.

young steeple
#

what is the current version?

sturdy crest
#

I just did it in Chrome, but latest.

#

NAbox 3.1.1 (2022-06-08) - Alpine Linux 3.14.2
Grafana8.4.6
Graphite1.2.0-dev
NetApp Harvest22.08.0-1
NetApp NMSDK9.8P4
Prometheus

#

2.36

young steeple
#

Did you upgrade to the beta, or did you have a version prior to that one before the upgrade?

sturdy crest
#

I installed this when nabox was in beta 3 yeah.

young steeple
#

so you downgraded ?

sturdy crest
#

No...I mean this isn't a new NAbox.

young steeple
#

Let's start over 😂

#

do you know the version you came from ?

sturdy crest
#

Task failed successfully.

young steeple
#

lol

sturdy crest
#

3.0x...

#

Pretty sure.

#

I honestly forgot.

young steeple
#

I would have, too !

#

Did you have the progress bar during upload ?

sturdy crest
#

Yeah.

#

But it does freeze.

young steeple
#

before the end or at the end ?

sturdy crest
#

Before the end.

young steeple
#

Ok, I think I changed something a while back to fix an issue with FF, maybe that was it. I'm upgrading 3.1.1 to 3.1.2b3 now and it seems fine

sturdy crest
#

Hmm ok.

young steeple
#

If you get a chance you can try to reapply the update you just did with chrome

sturdy crest
#

Yeah Chrome worked.

tight iron
tight iron
fossil bane
viscid agate
#

If i am allowed.
It is a lil harder to follow here in discord.
Is there something like the Threads in Slack in Discord?

fossil bane
viscid agate
viscid agate
fossil bane
#

I created one

young steeple
#

How 😄

tight iron
#

Hi all, is it intentional that the unit in the volume - details dashboard / Per volume Space Used panel is set to bytes(SI) instead of bytes(IEC)? This is a bit confusing: e.g. our volumes are 24TB if we do a vol show, and also in vCenter, but in the dashboard they are 26.4TB.
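
The gap between the two numbers is exactly the SI versus IEC difference. A quick sketch of the arithmetic, assuming the 24TB reported by vol show is really 24 TiB (binary units):

```python
TIB = 2**40  # IEC tebibyte, in bytes
TB = 10**12  # SI terabyte, in bytes


def tib_to_tb(tib: float) -> float:
    """Express a size given in TiB (IEC) as TB (SI)."""
    return tib * TIB / TB


print(round(tib_to_tb(24), 1))  # 26.4
```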

young steeple
#

Hi all it is intentional that the units

sour gale
#

Hi everyone.
I have a question regarding metrics exported by Harvest. What is the best way to find the description of a metric? For example, there is the "node_disk_data_written" metric. It is used in one of the default Grafana dashboards as a source for disk latency, but it seems to me that the dashboard shows unrealistic latency values near 200ms. So I want to dive deeper and check the values on the NetApp cluster itself.

fossil bane
# sour gale Hi everyone. I have a question regarding metrics exported by harvest. What is th...

@sour gale Most Harvest metrics are prefixed by object. Let's take the metric you mentioned, node_disk_data_written. After removing the object prefix node, we get disk_data_written. If you search for this string in our GitHub repo or codebase, that should take you to the template mentioned here: https://github.com/NetApp/harvest/blob/main/conf/zapiperf/cdot/9.8.0/system_node.yaml#L24. From the template, you can find the object name (it's the query field in the template): https://github.com/NetApp/harvest/blob/main/conf/zapiperf/cdot/9.8.0/system_node.yaml#L3. Now you can run the command below to learn more about the counters for that object.

bin/zapi -p POLLERNAME show counters --object system:node
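
The first step of that lookup (dropping the object prefix to get the string to search for) as a minimal sketch. The helper name is hypothetical and not part of Harvest.

```python
def strip_object_prefix(metric: str, obj: str) -> str:
    """Remove the leading '<object>_' from a Harvest metric name."""
    prefix = obj + "_"
    if not metric.startswith(prefix):
        raise ValueError(f"{metric!r} does not start with {prefix!r}")
    return metric[len(prefix):]


# The resulting string is what you would search for in the conf/ templates.
print(strip_object_prefix("node_disk_data_written", "node"))  # disk_data_written
```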

sour gale
tight iron
#

Hi all, I've got a question about volume_avg_latency. As far as I can see, this is not the average of read and write; can you specify what metric this is? I'm asking because we have a backup volume which has 10ms volume_avg_latency but 0ms read or write.

tight iron
#

@sturdy crest Thank you for the link, will be added to my OneNote. I'm just wondering how a backup volume with no IOPs or throughput can have 10ms latency on it

sturdy crest
#

Are they CIFS/NFS IOPS or how are backups configured?

tight iron
#

just did a statistics collection and there is no CIFS/NFS IOPs traffic on the volume. It is a backup volume which has a snapmirror relation to the productive one but currently there is no job running

sturdy crest
#

If you do a qos statistics volume latency show, does it give a delay center for that volume?

#

And check qos statistics volume performance show to make sure that volume's latency is counted at the qos statistics level.

tight iron
#

The junction-active state is false for that specific volume and all the values of qos statistics volume show are 0ms. But if I do a statistics volume show, I see the Total OPS and latency that match Harvest.

loud ocean
sturdy crest
#

Is there a perf archive?

#

I'll be able to find why from that.

tight iron
tight iron
sturdy crest
#

Are you in EMEA?

#

Just trigger perf archives for a problem time frame. I don't need a case.

tight iron
#

Yes I'm in EMEA

sturdy crest
#

Ok. If you have access to trigger now please do, otherwise hit me up tomorrow.

#

I'm in Kansas, so I'm a few hours behind.

wispy raft
#

getting this error in Harvest on a couple of our clustered NA systems....trying to track it down.

#

oller_XXXYYY.log:{"level":"error","Poller":"XXXYYY","plugin":"Zapi:Volume","object":"Volume","error":"duplicate instance key => ","relationshipId":"","caller":"goharvest2/cmd/collectors/zapi/plugins/volume/volume.go:172","time":"2022-08-24T12:21:53-07:00","message":"Failed to create snapmirror cache instance"}

#

this is working on other clustered systems in the harvest config but this one is failing on 3 of 5 in one location.

#

we recently did volume moves in those clusters, so I would expect that this could be a reason for it but unsure of how to fix it

#

harvest version 22.05.0-1 (commit 2bc2942) (build date 2022-05-11T07:56:11-0400) linux/amd64

wispy raft
#

suggestions on where I should post this if this is not the appropriate place, I saw that some were describing github harvest, which isn't nabox

young steeple
fossil bane
#

Snapmirror issue

tight iron
sturdy crest
#

And volume name?

#

If you don't want to broadcast you can DM me.

#

Ok this is weird.

tight iron
#

you mean an interesting/strange behavior 😅

sturdy crest
#

Yeah. I'm bamboozled.

#

I guess go ahead and open a case.

#

Hmm. It's not coming from outside WAFL. I don't see it in the spinhi counters at all.

#

Nevermind, found it.

#

Dedupe.

#

That's a counter manager version of that command.

tight iron
#

Hi all, does anyone by chance have a Prometheus query to filter volumes based on the node model type? For example, based on the "issue" above, I'd like to filter out every volume that is on a FAS system and only show volumes on AFFs with the metric volume_avg_latency.

fossil bane
tight iron
fossil bane
sturdy crest
#

@tight iron did I help you ok? Or did you have any questions?

tight iron
#

Hi @sturdy crest I see that there is the SIS scan running. Since I'm in the military next week I will open a case after that

sturdy crest
#

Nah no need for a case now.

#

I explained it...

#

If the scan isn't hurting anything you could leave it.

tight iron
#

I see. I'm now trying to write a query which takes the top 5 (or TopResources) only from the AFF systems. Since we only use FAS systems for backup or SnapLock, this makes more sense for us 🙂

young steeple
tight iron
agile oracle
#

Hi gang I wanna ask for quick help. I am trying to get just numeric results from the "max_throughput" that I get from the qos_label. So max_throughput from qos_label returns 4000IOPS for example and I only want the 4000. Tried the Value mappings from the Grafana but this won't change the data type to numeric from the string so it's not gonna work for me. At the moment I am trying to work with the plugin LabelAgent in the template but I am not sure I can get it done with this plugin. Any idea would be appreciated!

<qos_labels Result from the prometheus>
qos_labels{cluster="nas", datacenter="Test", instance="nabox-harvest2:12991", job="harvest2", max_throughput="4000IOPS", min_throughput="0", num_workloads="1", policy_group="file-03f22cec-99562eca1d73-wid13042", svm="nas_fbrsvm"}

<Template>
name: QoSLimit
query: qos-policy-group-get-iter
object: qos

collect_only_labels: true

counters:
  qos-policy-group-info:
    - ^policy-group => policy_group
    - ^^vserver => svm
    - ^^pgid
    - ^policy-group-class => policy_group_class
    - ^max-throughput => max_throughput
    - ^min-throughput => min_throughput
    - ^num-workloads => num_workloads

plugins:
  - LabelAgent:
      value_to_num:
        - status status up 0
      split:
        - max_throughput 'IOPS' ,max_num,placeholder

export_options:
  instance_labels:
    - svm
    - max_throughput
    - min_throughput
    - num_workloads
  instance_keys:
    - policy_group
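
To make the goal concrete, this is the transformation being asked for, sketched outside Harvest in plain Python. It is not a LabelAgent rule and says nothing about how the plugin behaves; the function name is hypothetical. It only shows splitting a label value such as "4000IOPS" into a number and a unit.

```python
import re
from typing import Optional, Tuple

# Leading number (integer or decimal) followed by an optional unit such as "IOPS".
_THROUGHPUT = re.compile(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z/]*)\s*")


def split_throughput(value: str) -> Optional[Tuple[float, str]]:
    """Split a value like '4000IOPS' into (4000.0, 'IOPS'); None if it has no leading number."""
    match = _THROUGHPUT.fullmatch(value)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


print(split_throughput("4000IOPS"))  # (4000.0, 'IOPS')
print(split_throughput("0"))         # (0.0, '')
```

Values without a leading number return None, so a caller has to decide how to treat those.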

fossil bane
#

qos_labels iops

young steeple
#

Hey team. Collection of QoS policy requires QoS/Workloads collection ? Anything else ? I've got a user getting No Data I'm wondering if that's just the workload collection that's necessary

fossil bane
glacial tree
#

Hi Team, I have a request from one of our customers asking whether there is any documentation on the metrics collected and what they mean.
His question is about QoS metrics; here it is:

"what exactly does something like qos_volume_ops represent? How can that be used for QOS performance tracking? How is the value different from “normal” volume read and writes? The same question is for all of the QOS metrics."

main drum
#

Hi Team, I am looking to enable SNMP requests for NAbox to monitor the resources of the VM.
Is it possible to add a switch to enable and configure SNMP for NAbox, or maybe just to have it preinstalled? I can configure it myself at the CLI.

loud ocean
# glacial tree Hi Team, I have request from one of our customers if there is any documentation ...

hi @glacial tree ONTAP's documentation for performance metrics is sparse. Generally the recommendation is to use the ONTAP provided metadata - for example, bin/zapi --poller aff-250 show counters --object workload_volume | less will query the cluster named aff-250 for all the performance counter metadata associated with the workload_volume object. If you look at the Harvest template conf/zapiperf/cdot/9.8.0/workload_volume.yaml you can see that template queries the ONTAP object workload_volume and exports those metrics as qos_volume - in this case, qos_volume_ops is a rate (per second) of the number of operations that completed for a workload

young steeple
#

Hi Team I am looking to enable snmp

calm folio
#

Hello! Since I updated to NAbox 3.1.2 & Harvest 22.08 I don't see the storage nodes & storage shelves on the power dashboard anymore. Does anyone else have this problem?

agile oracle
#

Hi gang.
I am trying to make the intervals shorter than 60s but it doesn't seem to work. I was getting the QoS latency from zapiperf/cdot/9.8.0/workload_detail.yaml every 180s with the default schedule config. I changed the config in workload_detail.yaml and /zapiperf/default.yaml to 60s and it works, but when I set the schedule to 15s, it doesn't seem to work. Any idea what else I should check? Thanks! 🙂

brave timber
#

Hi, I have a question related to certificate authentication. How should I fill out server_cert.cnf (CN and alternative names) in the certificate generation process if I want to use just one signed certificate for multiple ONTAP clusters? The example contains the SAN data (with FQDN and IP). https://github.com/NetApp/harvest/blob/main/docs/samples/server_cert.cnf

fossil bane
#

QoS scheduling

loud ocean
#

Hi I have a question related to

glacial tree
loud ocean
ivory dune
#

Has anyone here successfully imported NAbox into AWS and created an EC2 instance? I'm having issues with the import

gentle lion
#

Hi guys, love the product, been running Harvest for 6 or 7 years now.
I ran into what seems to be a bug when updating the root password on the appliance. There was a penetration-test done in my company and it detected a default login existed for my NABox 3.1 setup. I webbed into it and when I attempt to set a new password, it says "Wrong password or username". That is with and without the box checked for "change root account instead of admin", though obviously I want it checked.

#

The password I'm setting it to meets the requirements listed.
I resorted to copy & pasting the default root password just to make sure I wasn't mistyping it (I wasn't)

#

I figure I'll update to v3.1.2 but wanted to see if anyone knows what might be up with this. Thanks!

tight iron
#

Hi @young steeple we have an interesting phenomenon with NAbox 3.1.1. We have errors in Grafana when loading the dashboards. When we look on the Admin - Systems page we do not see any clusters, but all the containers are up and running. A week ago we had the same and I needed to do a restore

#

and the nabox-api log looks like

young steeple
#

Hi Yann8373 we have an interesting

main drum
#

Hello guys! I am using Prometheus replication with Node Exporter and there is no problem. Then I configured Harvest to monitor NetApp, but I get this error: msg="Out of order sample from remote write" err="duplicate sample for timestamp" series="{__name__="node_uptime"
When I curl my cluster I see this output. It's strange that uptime is there twice, isn't it?

[root@server003 harvest-21.08.0-6_linux_amd64]# curl 0.0.0.0:12993/metrics | grep node_uptime
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0node_uptime{datacenter="dc03",cluster="NAPP01",node="NAPP01-01"} 159378
node_uptime{datacenter="dc03",cluster="NAPP01",node="NAPP01-02"} 160569
node_uptime{datacenter="dc03",cluster="NAPP01",node="NAPP01-02"} 160439
node_uptime{datacenter="dc03",cluster="NAPP01",node="NAPP01-01"} 159247
100 1152k 0 1152k 0 0 495M 0 --:--:-- --:--:-- --:--:-- 562M

Has anyone seen the same error? Thanks in advance.
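
For anyone wanting to check their own poller for this, here is a rough Python sketch that scans `/metrics` output for series that appear more than once with identical labels, which is the kind of thing that can trigger "duplicate sample for timestamp". It assumes plain `name{labels} value` lines with no trailing timestamp; the helper name `find_duplicate_series` is made up for illustration and is not part of Harvest or Prometheus.

```python
from collections import Counter


def find_duplicate_series(exposition_text):
    """Return {series: count} for series that appear more than once.

    Assumption: every sample line is "name{labels} value" with no
    trailing timestamp, as in the curl output above.
    """
    seen = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Everything before the last space is the series identity.
        series, _, _value = line.rpartition(" ")
        if series:
            seen[series] += 1
    return {series: count for series, count in seen.items() if count > 1}


sample = """\
node_uptime{datacenter="dc03",cluster="NAPP01",node="NAPP01-01"} 159378
node_uptime{datacenter="dc03",cluster="NAPP01",node="NAPP01-02"} 160569
node_uptime{datacenter="dc03",cluster="NAPP01",node="NAPP01-02"} 160439
node_uptime{datacenter="dc03",cluster="NAPP01",node="NAPP01-01"} 159247
"""

duplicates = find_duplicate_series(sample)
for series, count in duplicates.items():
    print(f"{count}x {series}")
```

With the output pasted above, both `NAPP01-01` and `NAPP01-02` are reported twice, so the same metric is apparently being exported by two sources within the poller.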

gentle lion
#

Following up regarding failures when trying to reset the root password:
Log in with admin, go to reset the password, check the box to do it for root, put in the admin password for "current pw" and the new root pw in for the new and confirm password.

#

I was trying to do it while logged in as root and it wouldn't work. I believe you HAVE to be in as admin, not root.

young steeple
#

Following up regarding failures when

#

Has anyone here successfully imported

young steeple
tight iron
#

Hi all, Yann activated the workload/QOS counters with NABox 3.1.2. I just ran a query for the qos_detail_volume_resource_latency counter but I only see the SVM root volumes. Does Harvest need specific settings on the volume to collect this information?

calm folio
#

hello! about 90% of the dashboard "ONTAP: SVM" has no data. highlights, volume performance & capacity are ok, but the rest (all the protocols) displays "no data". do you see the same, or do you have metrics? thanks

opal summit
#

hello about 90 of the dashboard ONTAP

fossil bane
#

duplicate metrics

tight iron
# young steeple Did you wait a bit ?

Hi Yann, after almost 24h I still only see the root volumes. Each volume has its own QoS policy. Could it be because we do not use the default policy groups?

fossil bane
#

qos workload detail volume

sturdy crest
#

Volumes not showing up

viscid agate
#

@loud ocean Any news on possible integration of Storagegrid for Harvest?

loud ocean
sterile junco
#

tenant is probably a better place to start... then buckets -> object count, used capacity ...

#

or even farther up/next to the tree... site capacity, node capacity (bla all the hardware counters)

#

there's a lot of layers, hehe... but not so different from ontap in principle... just cassandra is a bit bigger element than wafl

loud ocean
#

thanks @sterile junco if you have a GitHub account, those comments would be a nice addition to the current issue https://github.com/NetApp/harvest/issues/170. If you don't want to bother, I'll add them. Last time I looked closely at StorageGRID, it didn't have a general /metrics endpoint that returns all Prometheus metrics (like what Harvest does). Instead, StorageGRID requires you to query by name. One idea was since StorageGRID already provides open metrics performance data, don't add that to a Harvest collector, but instead focus on the capacity and system health info that it only provides via REST. In other words, build a StorageGRID REST collector

sterile junco
#

it seems one can access the internal prometheus UI at https://admin_node/metrics/graph ... but i can't really tell you exactly what that means... if you get a test system you can dig into the internal configuration (just a linux box) on the admin node and see what they've done

#

i guess the only advantage of having harvest in the mix is just to have a single point for monitoring, etc ... would be nice to be able to configure custom dashboards and alerts (not a fan of the alarms now... they hang too long) ...

#

i always hope for the convergence of the flexibility of grafana+prometheus and the semi-intelligence of *UM

loud ocean
#

yep that makes sense. looks like StorageGRID bundles Prometheus and Grafana. They ship with a set of dashboards, but if you want those StorageGRID dashboards available in a different Grafana instance, looks like you need to export them from StorageGRID and import into a different instance. If you do that, not sure if it's then possible to change the Prometheus datasource to point to the metrics coming from StorageGRID. I'm not sure yet if StorageGRID exposes the metrics in a way that an external dashboard can use. Maybe you could setup remote writes from the internal Prom instance

sterile junco
#

we have a number of external SG dashboards as well and I wish I had time to explore making some better ones with the SG sources

dusky mauve
#

Dear users, I'm using harvest on Alpine Linux. When I try to start harvest I get message: "fork/exec /conf/bin/daemonize: no such file or directory". Does anyone have an idea how to start harvest with docker?

loud ocean
dusky mauve
#

First of all, I'm not sure if it should work inside the container or with Alpine. It fails when I try to use the command bin/harvest start

#

I also try to start using command "docker exec" for specific container

#

When I try to use starting command from alpine I get "bin/harvest: not found"

loud ocean
#

gotcha, are you wanting to run on Docker and if so, which of the five bullets listed here are you trying? https://github.com/NetApp/harvest/tree/main/docker#harvest-and-containers
or maybe you mean that you want to run Harvest on Alpine without Docker? If you want to run natively without containers we only publish builds for amd64 (easy to build for other platforms, but we haven't gotten requests outside macos). If you want to run a containerized version it should just work

dusky mauve
#

Before I installed the latest release, there was also an error related to pgrep

loud ocean
#

sounds like you're trying to run native amd64 build on Alpine which won't work. it should work fine with docker though

young steeple
#

on Apple Silicon platform you can force x86 image iirc

loud ocean
#

with Rosetta 2 I assume

young steeple
#

Probably yes. And with Docker for Mac

#

which handles that pretty nicely

dusky mauve
loud ocean
#

thanks Pawel, ah I bet I know what's happening - you are probably at step 3 and that's failing when you run it on Alpine Linux. The containers run fine on Alpine, but as I mentioned above, in step 1, you downloaded an amd64 version which will not run on Alpine

young steeple
#

FYI you will have libc issues with Alpine to run Harvest. I finally let go and moved to FROM --platform=linux/amd64 python:3.8-slim-buster

dusky mauve
loud ocean
#

ah did not realize you were using nabox

young steeple
#

What are you trying to achieve with NAbox ?

loud ocean
#

@dusky mauve we'll get things sorted out for you tomorrow then

dusky mauve
young steeple
#

If the poller fails and stops, we should start with a dc logs nabox-harvest2

young steeple
#

Also note that the harvest container in NAbox does not embed NetApp Harvest; it is mounted from another directory

young steeple
#

Any thoughts on setting custom retention for a given metric ? I don't think Prometheus lets us do that but I figured I'd ask anyway

calm folio
#

hello, has anyone made a vscan dashboard already? would like to steal it 😉

young steeple
#

May be this helps httpsfaun pubhow to

fossil bane
loud ocean
dusky mauve
loud ocean
#

@young steeple will help you with adding Harvest to NAbox

dusky mauve
#

Thank you Chris.

young steeple
#

Harvest issue in NAbox

viscid agate
dusk siren
#

Hi guys,
Is it possible to add metrics in the ONTAP nic area? If we do a ifstat e0d for example, there is in the output " Bus overruns". Is it possible to also grab this with Harvest to show this in the Network Dashboard?

loud ocean
dusk siren
#

frd-ntap41n::*> node run -node frda46104 -command ifstat e0c

-- interface e0c (0 hours, 22 minutes, 26 seconds) --

RECEIVE
Total frames: 238m | Frames/second: 177k | Total bytes: 334g
Bytes/second: 248m | Total errors: 0 | Errors/minute: 0
Total discards: 26 | Discards/minute: 1 | Multi/broadcast: 205
Non-primary u/c: 0 | Errored frames: 0 | Unsupported Op: 0
CRC errors: 0 | Runt frames: 0 | Fragment: 0
Long frames: 0 | Jabber: 0 | Length errors: 0
Alignment errors: 0 | No buffer: 0 | Pause: 0
Jumbo: 0 | Error symbol: 0 | ||Bus overruns: 26||
Queue drops: 0 | LRO segments: 23983k | LRO bytes: 319g
LRO6 segments: 0 | LRO6 bytes: 0 | Bad UDP cksum: 0
Bad UDP6 cksum: 0 | Bad TCP cksum: 0 | Bad TCP6 cksum: 0
Mcast v6 solicit: 0 | Lagg errors: 0 | Lacp errors: 0
Lacp PDU errors: 0
TRANSMIT
Total frames: 274m | Frames/second: 203k | Total bytes: 357g
Bytes/second: 265m | Total errors: 0 | Errors/minute: 0
Total discards: 0 | Queue overflow: 0 | Multi/broadcast: 310
Collisions: 0 | Pause: 0 | Jumbo: 227m
Cfg Up to Downs: 0 | TSO segments: 8981k | TSO bytes: 334g
TSO6 segments: 0 | TSO6 bytes: 0 | HW UDP cksums: 0
HW UDP6 cksums: 0 | HW TCP cksums: 51224k | HW TCP6 cksums: 0
Mcast v6 solicit: 0 | Lagg drops: 0 | Lagg no buffer: 0
Lagg no entries: 0
DEVICE
Mcast addresses: 3 | Rx MBuf Sz: 4096
LINK INFO
Speed: 100G | Duplex: full | Flowcontrol: none
Media state: active | Up to downs: 5 | HW assist: 5655

loud ocean
#

ah! a nodeshell CLI command - Harvest does not run any of those at the moment. I'll see if we can find this via REST or REST private cli

#

hopefully that is surfaced somewhere else

dusk siren
#

Thx Chris

loud ocean
# dusk siren Thx Chris

@dusk siren looks like this counter may be exposed via nic_common. Can you use the following command to verify that the counter gives us what we want? Replace u2 with the name of your system with 26 receive overruns.
bin/zapi -p u2 show data --object nic_common --counter tx_bus_overruns --counter rx_bus_overruns
If you have the handy https://github.com/tomwright/dasel you can throw in a | dasel -r xml -w json at the end to get pretty-printed JSON; otherwise add --write color to get output that is more readable than XML

dusk siren
#

@loud ocean These are the right counters:
{ "counters": { "counter-data": [ { "name": "rx_bus_overruns", "value": "670" }, { "name": "tx_bus_overruns", "value": "0" } ] }, "name": "e0d", "sort-id": "0", "uuid": "frda46104:kernel:e0d"
The number increased the last hours

dusk siren
#

Hi guys;
We are using the Prometheus Service Discovery. Since I updated Harvest to version 22.08.0-1 Harvest seems to expose not the whole address for the target. I only can see the port but not the FQDN. E.g.: {"__meta_poller":"vig-ntap11"}},{"targets":[":13099"] in Prometheus I also can only see the ports. Prometheus and Harvest are running on different VMs

fossil bane
dusk siren
#

Sure:
Admin:
  httpsd:
    listen: :8887
    auth_basic:
      username: -REDACTED-
      password: -REDACTED-
Tools:
  grafana_api_token: -REDACTED-
Exporters:
  prometheus-zf:
    exporter: Prometheus
    port_range: 13000-13999
Defaults:
  collectors:
    - Zapi
    - ZapiPerf
  use_insecure_tls: true
  auth_style: basic_auth
  username: -REDACTED-
  password: -REDACTED-
  exporters:
    - prometheus-zf
Pollers:
  abt-ntap91:
    datacenter: ABT
    addr: -REDACTED-
  alf-ntap11:
    datacenter: ALF
    addr: -REDACTED-
  alf-ntap91:
    datacenter: ALF
    addr: -REDACTED-
  als-ntap91:
    datacenter: ALS
    addr: -REDACTED-
..... More systems

fossil bane
#

I see local_http_addr is missing in the exporter configuration. Could you add it? That should add the address of the target

dusk siren
#

0.0.0.0 is the default value, right? Even if it is not set in the yml?

fossil bane
#

yes

#

given your prometheus is running on a different machine, it will need the target ip address

dusk siren
#

It worked until I did the update 😄

fossil bane
#

hmm that is not something we changed in latest version.

dusk siren
#

So, the targets are named in harvest and Prometheus is using them right?

fossil bane
#

Yes, prometheus is scraping end points created by harvest

dusk siren
#

I didn't update to every version. So the source version was older

fossil bane
#

hmm let's give local_http_addr a try and see

dusk siren
#

ok

#

Ahhh, now harvest exposes also the IP

fossil bane
#

Ideally 0.0.0.0 should have worked as well. Could you share logs where local_http_addr is not set?

dusk siren
#

I set 0.0.0.0 in conf now. I did not set the IP of the Prometheus server

#

sure

fossil bane
#

0.0.0.0 is default only. It should not be the prometheus server IP but the IP of machine on which Harvest is running.

dusk siren
#

Hm, strange

#

Can I trigger an asup to send the logs?

fossil bane
#

That is not yet available in Harvest

dusk siren
#

Yes, I did

fossil bane
#

ok great. So default configuration 0.0.0.0 should have exposed the fqdn with port. If it doesn't then we should see an error in logs (could be related with resolving fqdn). To workaround this, you can mention the fqdn of harvest machine in local_http_addr which should work.

dusk siren
#

ok

fossil bane
#

is it working now with local_http_addr?

dusk siren
#

yeah. I set the IP 0.0.0.0 and it is working

viscid agate
#

I have a question regard Aggregate Dashboard
I think Physical Space Used is not correct, because it also shows data that is tiered to our StorageGRID.

Can someone look into this?

Here an Example:

loud ocean
#

that dashboard is displaying the aggregate physical_used metric returned by ONTAP. Let me check why the CLI may not agree

#

ah! I see the problem, units, that panel is using bytes (SI) when it should be bytes (IEC). If you change the units for that panel in the dashboard do you see the amount?

#

I'll open a PR to fix

viscid agate
#

The other Panels are wrong as well (have checked 3-4).
I will edit it through JSON until it is fixed in the next release

loud ocean
#

yes, the PR will include updates to all panels

loud ocean
#

hi @dusk siren thanks for raising the service discovery issue - we found and fixed the issue. As you discovered, when a poller's Prometheus exporter was missing the local_http_addr param or if that param was the empty string, the poller published an incomplete address that causes service discovery to fail. The workaround is to specify local_http_addr: 0.0.0.0 - this PR fixes the problem https://github.com/NetApp/harvest/pull/1278 so that the local_http_addr can be missing or empty. Thanks!
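
If you want to spot this on your own setup before upgrading, a small Python sketch like the one below can flag service discovery targets that have a port but no host. It assumes the discovery payload is a JSON list of objects with `targets` and `labels` keys, as in the snippet pasted earlier; the function name `targets_missing_host` and the second example entry are made up for illustration.

```python
import json


def targets_missing_host(sd_json):
    """Return (poller, target) pairs whose target lacks a host or a port."""
    bad = []
    for group in json.loads(sd_json):
        poller = group.get("labels", {}).get("__meta_poller", "?")
        for target in group.get("targets", []):
            # Split "host:port" on the last colon; ":13099" yields an empty host.
            host, _, port = target.rpartition(":")
            if not host or not port:
                bad.append((poller, target))
    return bad


# First entry mirrors the broken target reported above; the second is a
# hypothetical healthy target for comparison.
payload = json.dumps([
    {"targets": [":13099"], "labels": {"__meta_poller": "vig-ntap11"}},
    {"targets": ["harvest-host.example.com:13100"],
     "labels": {"__meta_poller": "example-poller"}},
])

print(targets_missing_host(payload))
```

Any entry that shows up in the result is one Prometheus on another machine will not be able to scrape, which is what setting local_http_addr works around.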

loud ocean
young steeple
#

@quartz wave I think I know what's going on, wrote you an email

wanton pecan
wanton pecan
#

I'm running Harvest inside nabox. In the 22.08 Node Details page, the CPU Busy Domains is very different from 22.05. It looks like an order of magnitude in Idle and host OS data. The rest of the values didn't appear to change.

agile oracle
#

Hi Team, I hope y'all had a good weekend. Does anyone have any experience deploying the harvest on a massive scale? I'm planning to deploy the harvest to monitor over 500 ONTAP cluster environments and I wonder if there is any hiccup that I might face thru the service or the deployment.

fossil bane
#

CPU domain busy

fossil bane
young steeple
#

nathan33851 We have customers monitoring

dusk siren
loud ocean
#

awesome!

empty knot
#

Hi all! Can Harvest scrape any S3 metrics? If not, is it planned? Thanks!

fossil bane
fossil bane
empty knot
fossil bane
#

ok thanks.

mortal stirrup
whole knot
#

making the switch from nabox to nabox3/harvest 2.0 - some of the metrics in nabox/graphite have not come over into nabox3/prometheus - looking for the nfs connection count that I had in the old world

ivory dune
#

does anyone know how to reset the admin password for NAbox? I am able to ssh in as root, so there should be a way..

loud ocean
#

Hi all Can Harvest scrape any S3 metrics

wanton pecan
#

I just installed 22 08 in nabox In the

glacial tree
#

Hi Team, I have got below questions from one of our customers. We have deployed Harvest for them. They are looking for qtree metrics.
Need some help.

For the 7-mode to ONTAP migrated volumes, there may be one or more qtrees within a volume.
* If they apply QOS at the qtree for their different workloads, is it possible to collect statistics at the qtree level.
For newly provisioned workload, there will be one qtree per volume.
* could apply QOS to the volume and collect statistics at the volume but it would be preferable to use one consistent practice of applying QOS at one level.
We will be applying QOS to the qtrees.
* For NFS, this will limit the IOPS at the qtree level.
* For CIFS, this will limit the IOPS at the qtree level once we are at ONTAP 9.9.1

fossil bane
#

qtree workload

calm folio
#

hello! is it possible to bring a netapp e-series to grafana (nabox)?

young steeple
#

hello is it possible to bring a netapp e

mossy coyote
#

I am needing to migrate harvest (21.08.0-6) from an OpenStack environment to Azure. I am looking for the easiest method to accomplish this task. Harvest is running on a Centos VM and will be migrated to a Centos VM. I would like to migrate the applications (grafana, prometheus, and harvest) and keep the historical data. Thanks.

young steeple
#

Do we have metrics for tiering activity ? packets in/out or throughput ?

quick berry
#

Having trouble using influxdb exporter.
i keep seeing these errors in harvest container logs

3:06PM ERR collector/collector.go:433 > export data to [my-influx]: error="Post "http://0.0.0.0:8086/api/v2/write?org=harvest&bucket=harvest&precision=s\": dial tcp 0.0.0.0:8086: connect: connection refused" Poller=cluster-01 collector=ZapiPerf:Volume stack="goroutine 519 [running]:\ngithub.com/netapp/harvest/v2/pkg/logging.MarshalStack({0xbadb60?, 0xc000f9e090?})\n\tgithub.com/netapp/harvest/v2/pkg/logging/logger.go:152 +0x88\ngithub.com/rs/zerolog.(*Event).Err(0xc0008ea000, {0xbadb60, 0xc000f9e090})\n\tgithub.com/rs/zerolog@v1.27.0/event.go:381 +0x63\ngithub.com/netapp/harvest/v2/cmd/poller/collector.(*AbstractCollector).Start(0xc000279ee0, 0xc0005c6040?)\n\tgithub.com/netapp/harvest/v2/cmd/poller/collector/collector.go:433 +0x10ee\ncreated by main.(*Poller).Start\n\t./poller.go:399 +0x2c5\n"

Pollers:
cluster-01:
datacenter: DC-01
addr: 10.193.48.163
auth_style: basic_auth

credentials_file: path/to/credentials.yml # read credentials from the file

 username: admin
 password: netapp1!
 use_insecure_tls: true  # Disable TLS verification when connecting to ONTAP cluster
 exporters:
   - my-influx

my-influx:
exporter: InfluxDB
addr: localhost
bucket: harvest
org: harvest
token: mXHXQF2y3wsC3D3ItwC6RH3Wd3xtZCMEqBC_07AoPWELxGl4DBGGnhycOPrxviQcOUVU9JxqalXqn_NTk0RsxQ==

quick berry
quick berry
#

Hi,
I'm trying to deploy harvest pod in my k8s cluster. Able to generate k8s deployment.yaml using kompose.
can someone tell me importance of these two volume mounts

volumes:
- hostPath:
path: /root/harvest-22.08.0-1_linux_amd64/conf
name: cluster-01-hostpath0
- hostPath:
path: /root/harvest-22.08.0-1_linux_amd64/cert
name: cluster-01-hostpath1

My plan is to deploy harvest dynamically from a service. i wont have access to these folders from my service.
How do I go about it?

fossil bane
#

k8 Harvest

mossy coyote
#

i just noticed that I am not getting logs in /var/log/harvest. In fact the directory /var/log/harvest does not even exist. How can I start getting logs?

mossy coyote
#

i just noticed that I am not getting

young steeple
sturdy crest
#

Ooh I like it.

safe hornet
#

Hi,
is compute-metric plugin supported for REST only?

fossil bane
safe hornet
#

It should work in Zapi also

wispy raft
#

it shows all our data with source_node="" in all cases

#

which seems strange

#

on all our metrics

blissful moon
#

Using Histograms with Harvest

viscid agate
#

Hello,

we have a Problem.
From the Variable cluster (from the screenshot) sometimes we miss Systems. After we disable and reenable them in Nabox System Configs they reappear in the variable.
Can someone look into that?

fossil bane
#

dashboard issue NABox

calm folio
fossil bane
#

NetApp Overview CDot dashboard

sturdy crest
#

Hey. So my Harvest lab instance isn't collecting qos delay centers again, despite my enabling in the zapiperf/default.yaml.
pstejska-harvest:/opt/harvest2-conf/conf/zapiperf# tail -n 5 default.yaml

Uncomment to collect workload/QOS counters

Workload: workload.yaml
WorkloadDetail: workload_detail.yaml
WorkloadVolume: workload_volume.yaml
WorkloadDetailVolume: workload_detail_volume.yamlpstejska-harvest:/opt/harvest2-conf/conf/zapiperf#
pstejska-harvest:/opt/harvest2-conf/conf/zapiperf#

pstejska-harvest:/opt/packages/harvest2/conf/zapiperf# tail -n 5 default.yaml

Uncomment to collect workload/QOS counters

Workload: workload.yaml
WorkloadDetail: workload_detail.yaml
WorkloadVolume: workload_volume.yaml
WorkloadDetailVolume: workload_detail_volume.yamlpstejska-harvest:/opt/packages/harvest2/conf/zapiperf#

#

Could it be a 9.11 thing?

rigid relic
#

Good Morning All - I have installed harvest using docker compose and wanted to set the retention as below. I restarted the prometheus services for it to take effect, but after a day I still see only the default 15 days of metrics.

'--storage.tsdb.retention.time=184d'

Am I missing something here?

long merlin
#

I am trying the new Ems collector in the 22.08 release but it does not collect any event.
poller_flc1-noprod-ash-storage.log:{"level":"info","Poller":"flc1-noprod-ash-storage","collector":"Ems:Ems","path":"conf/ems/9.6.0/ems.yaml","v":"9.8.0","caller":"collector/helpers.go:133","time":"2022-09-28T08:44:25-07:00","message":"best-fit template"}
poller_flc1-noprod-ash-storage.log:{"level":"info","Poller":"flc1-noprod-ash-storage","collector":"Ems:Ems","total instances":0,"caller":"ems/ems.go:387","time":"2022-09-28T08:44:25-07:00"}
poller_flc1-noprod-ash-storage.log:{"level":"info","Poller":"flc1-noprod-ash-storage","collector":"Ems:Ems","queried":61,"caller":"ems/ems.go:456","time":"2022-09-28T08:44:26-07:00","message":"No EMS events returned"}
What am I missing?

long merlin
#

we'd like to use ems collector to monitor callhome.spares.low. also want to add resolve_when_ems: condition
is there an ems message that indicates spares low is resolved?

loud ocean
#

we d like to use ems collector to

tight iron
#

Hi all, short question about the NABox. What is the maximum supported size of the data disk?

restive trellis
#

Hi All!
I am trying to get the cluster network interconnect usage by using these two metrics in Grafana:

  1. lif_recv_data{datacenter=~"$Datacenter",cluster=~"$Cluster",node=~"$Node",port=~"$Eth",svm=~"Cluster"}
    This metric tells me which ports are cluster ports.
  2. nic_util_percent{datacenter=~"$Datacenter",cluster=~"$Cluster",node=~"$Node",nic=~"$Eth"}
    This metric gives the NIC utilization percentage.

How can I merge these two, i.e. take the ports identified by metric 1 (the cluster ports) and use them to select the values from metric 2?
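
To make the intended join concrete, here is a rough Python sketch of the logic: collect the (cluster, node, port) combinations that carry a LIF in the Cluster SVM, then keep only the utilization samples whose `nic` label matches one of those ports. The sample values and the function name `cluster_port_utilization` are invented for illustration. In PromQL itself the same idea would probably need `label_replace` to copy `port` into a `nic` label plus a vector match with `on(...)`, but that is an untested assumption for these particular metrics.

```python
def cluster_port_utilization(lif_samples, nic_samples):
    """Keep nic_util_percent samples only for ports used by the Cluster SVM."""
    # Step 1: which (cluster, node, port) combinations carry a Cluster LIF?
    cluster_ports = {
        (s["cluster"], s["node"], s["port"])
        for s in lif_samples
        if s["svm"] == "Cluster"
    }
    # Step 2: match the "nic" label of metric 2 against the "port" label of metric 1.
    return [
        s for s in nic_samples
        if (s["cluster"], s["node"], s["nic"]) in cluster_ports
    ]


# Hypothetical samples; label names follow the two queries above.
lif_samples = [
    {"cluster": "c1", "node": "n1", "port": "e0a", "svm": "Cluster", "value": 1200.0},
    {"cluster": "c1", "node": "n1", "port": "e0c", "svm": "svm1", "value": 300.0},
]
nic_samples = [
    {"cluster": "c1", "node": "n1", "nic": "e0a", "value": 42.0},
    {"cluster": "c1", "node": "n1", "nic": "e0c", "value": 7.5},
]

result = cluster_port_utilization(lif_samples, nic_samples)
print(result)
```

The key point is that the two metrics name the same thing differently (`port` vs `nic`), so the join has to map one label onto the other before matching.
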
loud ocean
#

Hi All

hybrid night
#

@loud ocean Hello Chris; running into a weird one, every couple of days I lose my main cluster from the reports, even though it is listed in my configuration.

ancient stone
#

looking to export metrics from older netapp grafana with graphite into nabox 3.1.2 with prometheus. anybody else run into this? want to keep history.

main drum
#

Missing user role in Harvest for NABox

young steeple
crisp loom
#

Hi team. IHAC that is running Harvest in their environment, and they are asking us for a historical report from 2018-2022 to show their storage growth over that time for all their systems. Is this the type of data that harvest would collect and report on?

tight iron
ancient stone
# calm folio https://nabox.org/faq/#general-questions

Thanks the older version isn't the nabox packaged VM. it's a separate grafana4.6, graphite 1.3.10, and harvest 1.3 install with imported dashboards. I wanted to import the last year of metrics from the old database which looks like it's stored in graphite to the new nabox 3.1.2 which has grafana8.2.7, graphite 1.3.14, harvest2, and prometheus.

young steeple
#

We had 150 GB filled in around 3 months

young steeple
#

Thanks the older version isn t the nabox

viscid agate
#

@young steeple i must ask again because my colleagues keep asking me over and over...
Is it possible to change default from grafanaserver/admin to grafanaserver/grafana?

young steeple
young steeple
#

@hybrid night could be a memory issue indeed.

agile oracle
#

Hi Gang, I am doing a pitch for partners about Harvest and I am trying to include some examples in the slides. Can anyone let me know if there are any enterprises that are utilizing them? Thanks!

static heart
#

Do we have metrics for Flexcache in Harvest?

fossil bane
static heart
hybrid night
fossil bane
#

flexcache metrics

calm folio
#

ontap: "vol show -fields used" show me 5.52TB.
AIQUM: show me 5.52 TiB used.
grafana: dashboard volume / per volume space used show me "volume size used" 6.07TB.

what I'd like to say:

  • ONTAP shows TiB but labels it TB (terabyte).
  • AIQUM gets it right (tebibyte).
  • Grafana shows us TiB but labels it TB in the legend.

can you confirm that and correct it in the next version of harvest?
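
A quick arithmetic check of the numbers above, as a small Python sketch. It assumes the 5.52 figure from ONTAP and AIQUM is in tebibytes (1024^4 bytes) and converts the same byte count to SI terabytes (1000^4 bytes).

```python
TIB = 1024 ** 4  # tebibyte (IEC)
TB = 1000 ** 4   # terabyte (SI)

# Assumption: the value reported by ONTAP / AIQUM is 5.52 TiB.
used_bytes = 5.52 * TIB

print(f"{used_bytes / TIB:.2f} TiB")  # 5.52 TiB
print(f"{used_bytes / TB:.2f} TB")    # 6.07 TB
```

Under that assumption the same volume comes out as 6.07 TB, which matches the Grafana value, so the three tools seem to agree on the byte count and differ only in which unit (and unit label) they display.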

safe hornet
#

Hi, Can anyone tell me how does harvest find the best-fit template?

fossil bane
#

volume dashboard IEC units

main drum
#

I have a question about certificate authentication.
At my customer no self-signed certificates may be used. If I understand the documentation correctly, the mapping to the ONTAP user to be allowed to read the counters is made using the CN that is used for the CSR. https://github.com/NetApp/harvest/blob/main/docs/AuthAndPermissions.md#using-certificate-authentication
In the example harvest2. When creating the certificate, however, the FQDN (e.g. harvest-host1.domain.com) must be specified as the CN. My question is if the hostname alone can be specified as alt_name (e.g. harvest-host1) which is then mapped to an ONTAP user (harvest-host1 in this case). Is this the right way?

quick oyster
#

This is probably a newbie question,
Is there a way to send the collected data to an external graphite server? I couldn't find it in the documentation.

loud ocean
#

This is probably a newbie question

manic aurora
#

Hi i have just installed nabox 3.1.2 with netapp-harvest 22.08.0-1

#

It's quite cool software but I have some problems with environmental monitoring

#

I get no data from my DS460-12 shelf (no temperature or power consumption)

#

and no power consumption for all my FAS80x node

#

all other shelf or node are working well

#

is there some hardware that is not supported?

fossil bane
#

Missing Power metrics for FAS80x and DS460-12 shelf

empty cairn
#

Hello guys, just a very basic question - If I want to get Harvest running, do I need to have an external Grafana Server or something like that to visualise things, or can Harvest itself show dashboards somehow?

mortal stirrup
empty cairn
mortal stirrup
fossil bane
loud ocean
#

Any way to do that on a linux system

gentle pike
#

In some graphs I have a vertical dotted line that moves with the mouse cursor, but in most graphs I don't. Is there a way to have the dotted line in all graphs? Btw, in the old Grafana (harvest 2.x) the dotted line moved in all graphs in the dashboard simultaneously.

opal summit
#

@gentle pike Could you share which version of Grafana you are using with which version of Harvest? I could see vertical & horizontal lines in both the mentioned graphs in my setup.

gentle pike
safe hornet
#

Hi,
I tried adding multiple pollers in the harvest.yml, but I see only one poller logs, there are no other errors. How to identify what went wrong with the other poller
Pollers:
  ontap1:
    datacenter: a330e568-63de-462a-bfb9-1f03c3cd04a7-DC
    addr: <ip>
    auth_style: basic_auth
    credentials_file: /opt/secret/ontapcred.yml
    use_insecure_tls: true
    exporters:
      - influx
    collectors:
      - Zapi:
          - zapi_custom.yaml
      - ZapiPerf:
          - zapiperf_custom.yaml
  umeng-aff300-01-02:
    datacenter: a330e568-63de-462a-bfb9-1f03c3cd04a7-DC
    addr: <ip>
    auth_style: basic_auth
    credentials_file: /opt/secret/ontapcred.yml
    use_insecure_tls: true
    exporters:
      - influx
    collectors:
      - Zapi:
          - zapi_custom.yaml
      - ZapiPerf:
          - zapiperf_custom.yaml

fossil bane
#

Grafana dotted line

empty cairn
#

Hello guys, does nabox support storagegrid/e-series?

gentle pike
#

Rene Meier3282 Could you share which

loud ocean
#

@mental vale can you paste docker version again?

loud ocean
young peak
#

Hey all: Is there a metric for measuring the amount of tiered cloud storage that I'm missing?

loud ocean
#

Hey all Is there a metric for measuring

empty cairn
#

Hello guys, in nabox the power consumption of NVME/A250 Systems just shows as 0w or empty, is this a known thing?

safe hornet
#

Hi all, where can I check for the unit of the metrics? Could someone point me to documentation if any.

dusky mauve
#

Hello everyone! I have a strange problem when I try to configure LDAP in NAbox - Grafana. When I upload and submit data in NAbox, the configuration file (grafana.ini) automatically falls back to the default path to ldap.toml. I have a connection (check mark next to LDAP settings) but the user mapping is not working - it shows that it cannot find the user. I am using Linux with Docker. Additionally, I would like to ask how to enable logs in Docker (grafana.log does not appear after uncommenting in the configuration file) to see what exactly is going on?

hardy oracle
#

Hello everyone, i am using the new version of Harvest. Is it possible to add some object.counters from CDOT Ontap to the Prometheus as metrics?

sturdy crest
#

Possible? Yes. How? I'm not actually sure.

loud ocean
#

Hello everyone i am using the new

main drum
#

NAbox 3 Cluster Dashboard-SVM Performance Drilldown Latency

mental vale
#

Team, in the latest Harvest version the Grafana dashboard shows no data for quotas for one of my clusters, and no data is collected by Harvest for it either. The other poller is collecting the data perfectly

this error i can see in the docker logs

:39AM ERR collector/collector.go:394 > plugin [Qtree]: error="duplicate instance key => st-svm.vol12345.qt_bluser.4.file-limit" Poller=Pollername collector=Zapi:Qtree stack="goroutine 434 [running]:\ngithub.com/netapp/harvest/v2/pkg/logging.MarshalStack({0xbae240?, 0xc003b41ea0?})\n\tgithub.com/netapp/harvest/v2/pkg/logging/logger.go:152 +0x88\ngithub.com/rs/zerolog.(*Event).Err(0xc00040c000, {0xbae240, 0xc003b41ea0})\n\tgithub.com/rs/zerolog@v1.27.0/event.go:381 +0x63\ngithub.com/netapp/harvest/v2/cmd/poller/collector.(*AbstractCollector).Start(0xc0002e8000, 0x0?)\n\tgithub.com/netapp/harvest/v2/cmd/poller/collector/collector.go:394 +0x17ae\ncreated by main.(*Poller).Start\n\t./poller.go:399 +0x2c5\n"

uneven pumice
#

I am trying to get QOS metrics working in harvest2, I have uncommented the workload metrics and rebooted and each time I reboot the metrics get re commented out

#

The location I am editing default.yaml in /opt/harvest2-conf/conf/zapiperf

#
```
  Workload:               workload.yaml
  WorkloadDetail:         workload_detail.yaml
  WorkloadVolume:         workload_volume.yaml
  WorkloadDetailVolume:   workload_detail_volume.yaml
********:/opt/harvest2-conf/conf/zapiperf# reboot
******:/opt/harvest2-conf/conf/zapiperf# Connection to ****** closed by remote host.
```
#

Reconnect and ```

Uncomment to collect workload/QOS counters

Workload: workload.yaml

WorkloadDetail: workload_detail.yaml

WorkloadVolume: workload_volume.yaml

WorkloadDetailVolume: workload_detail_volume.yaml*****

:/opt/harvest2-conf/conf/zapiperf```

sturdy crest
wispy raft
#

I am trying to add information related to network port status, where they are and whether they are home, related to this post:

https://github.com/NetApp/harvest/issues/471

I am not able to get this to work to be able to have a metric that has this information so we can graph changes in the environment and possibly even put alerts in based on this to inform us when this happens.

#

Any thoughts on how I can do that?

#

I get this error when trying to add that one in

fossil bane
#

net-interface template

rustic steppe
#

ISSUE WITH HARVESTER

safe hornet
#

Hi, could someone tell me why is the counter display name appended with the unit for node metrics?
I'm using Rest templates, system node for ONTAP v.9.12.1
`name: SystemNode
query: api/cluster/counter/tables/system:node
object: node

counters:
  - ^^id
  - ^node.name => node
  - total_data
  - total_latency
  - total_ops

export_options:
  instance_keys:
    - node`

loud ocean
#

Hi could someone tell me why is the

compact breach
loud ocean
#

The Harvest team is happy to announce a beta release of the StorageGRID collector is available in the latest nightly build. Please try it out and let us know how it works for you. Details can be found here https://github.com/NetApp/harvest/issues/170#issuecomment-1297448602

GitHub

As a storage administrator, I'm looking to have a single tool to monitor my NetApp environment. ActiveIQ Unified Manager, Harvest and NABox are greats tools but focused only to ONTAP. It wi...

empty cairn
#

Hey guys, I now installed the nightly release on the nabox, and even managed to add the storagegrid as described on github. However, I can't seem to find the storagegrid dashboard. Help would be appreciated 😀

muted wigeon
#

Hi, I hope this is the right forum. I have a question about the custom.yaml files for ONTAP. Can we choose between the REST and ZAPI yamls at the object level for a given ONTAP version? For example, disks.yaml using ZAPI and aggregate.yaml using REST?

uneven pumice
safe hornet
#

Hi, can anyone tell me if harvest allows only snake case for custom display name?

loud ocean
#

Hi can anyone tell me if harvest allows

uneven pumice
#

hey, following on from #┊・harvest-nabox🔒 message do folks have the volume qos resource latency drill down metrics working? I can now see qos metrics in end-end but I am wanting to see "throttled" by qos metrics

main drum
#

Hi, I am trying to get nabox to work with an old netapp (7-mode) , but I get an ssl handshake error (zapi.py), I can give more details if anyone is available to help me...
Thank you in advance!

loud ocean
#

Hi I am trying to get nabox to work with

#

The team has been working hard on improving Harvest documentation. Many of you shared that it is difficult to find the docs on GitHub, so we've prioritized moving to a separate documentation site this release https://netapp.github.io/harvest/ We have lots more improvements planned, but this is a strong step in the right direction

glacial tree
#

Hi team, where can I find the metric units for harvest metrics?

loud ocean
#

Hi team where can I find the metric

echo shard
#

Sudden change of NetApp clusters temperature measures after NAbox and NetApp Harvest update

long merlin
#

Hi, does Harvest export its own metrics? We are interested in the error stats. We can see them in the log, but we are wondering if we can collect them with Prometheus.

warm anchor
#

Hello,

I have configured a new NABOX instance and added two clusters. I don't see a netapp option when I try to add a new dashboard in Grafana. Can anyone help me with this?

main drum
#

Hello everyone, I would like to know why i am not seeing the metrics from netapp in prometheus. I am able to see the raw metrics arriving from all the collectors with the command "dc logs -f --tail 20 nabox-harvest2"

#

Where can I see all those raw metrics?

#

I would like to see raw metrics in order to be able to create new metrics

#

Thanks for your help.

dreamy sorrel
#

Have a customer questioning if Harvest itself has any APIs they can pull information from? They understand Harvest is utilizing ZAPIs / APIs from Ontap to pull info from...but their ask is specific to any Harvest APIs/corresponding documentation. I'm not finding any, so thinking it's a 'no', but would love validation. Thanks!

clever grotto
#

Hello all.

We have a large enterprise customer that currently has three AIQUM clusters, but is asking the following:

“I would like to know about how Netapp integrates with Prometheus as that is a priority for us to integrate metrics with other compute platforms to get a detailed end to end view.”

We are working on setting up a meeting for this Friday, but for now those are all the details we have.

I am fairly certain we will not need the included Grafana / UIs, but that this will end up just being a bridge from the ONTAP clusters into their internal Prometheus-based monitoring tool.

My assumption is that Harvest will be the best route to go, over working with the APIs directly or using something like https://github.com/sapcc/netapp-api-exporter. Wanted to see if anyone else thinks that seems like the best route as well.

Thanks!

@verbal hazel

viscid agate
tough kestrel
#

Hi all,

I'm running the containerized version of Harvest 2.0. What is the safest way to add a new poller? Do I have to stop the running pollers when generating compose files after editing harvest.yml?

Thanks!

viscid agate
#

We love the StorageGrid collector so far.
Is there an ETA when the next release of harvest will be available?

fossil bane
#

most likely next week.

slender bane
#

Hi @young steeple

Is there a way to expose the NABox's Prometheus to use it as a datasource in my own Grafana instance?

loud ocean
#

The Harvest team is happy to announce the release of 22.11 https://github.com/NetApp/harvest/releases/tag/v22.11.0 This is one of our biggest releases with the largest number of external contributors to date. Go team! Highlights of this major release include: a StorageGRID collector with a Tenant/Buckets dashboard, production ready REST collectors with a full set of REST templates that export ZAPI identical metrics, and a new documentation site that consolidates Harvest documentation into one place

GitHub

22.11.0 / 2022-11-21
📌 Highlights of this major release include:

✨ Harvest now includes a StorageGRID collector and a Tenant/Buckets dashboard. We're just getting started with StorageGRID das...

long merlin
#

Hi, question about the metrics lif_recv_errors and lif_sent_errors. What time period are they calculated over? How can we find out what the error is? Thanks!

tight iron
ivory field
#

Harvest DB cleanup

mental vale
fossil bane
#

Updated now

mental vale
#

@fossil bane After upgrading harvest to 22.11 i still cant see the fabricpool dashboards

tight iron
#

Hi guys, I've just upgraded harvest 22.11 on some of our systems but we have an interesting behavior. In the SnapMirror dashboard we do not see any source cluster and only the destination nodes. Also it says we have two SnapMirrors but no last transfer

empty cairn
#

Hello guys, love the new release so far!

fossil bane
safe hornet
#

Hi, can anyone tell me how do I get a tagged version of harvest image and from where?

mental vale
#

@fossil bane i upgraded harvest for one of my customer from 22.08 to 22.11 , i used the migration steps as part of it for docker volume migration.

mortal stirrup
#

Following the update instructions of the 22.11 harvest package, I tried replacing "qtree" with "quota" for my dashboards but I am missing an equivalent of "qtree_total_ops".
Is this intentional?

opal summit
#

Following the update instructions of the

empty cairn
#

Hello guys, I'm currently looking at the harvest dashboard for nfs clients - is my understanding correct that we need to have ontap 9.12.1 for it to work (due to the rest interfaces)?

loud ocean
#

Hello guys I m currently looking at the

mortal stirrup
#

Where do I see what unit a metric available in Prometheus is in?
For example, I'm running NAbox and want to look at the metric quota_disk_used

safe hornet
#

Hi, can anyone tell me why is the datacenter id different every time I try to collect metrics from the same cluster? Is there a way I could ignore this field before it is exported?

long merlin
#

Hi, it seems the grafana tool released in 22.11 creates new UIDs when importing dashboards. This breaks many of our links because the UID is part of the dashboard link. We were using the grafana tool in 21.08 and it did not create new UIDs.

mental vale
#

@young steeple why was the Prometheus container in NAbox stuck in a reboot loop? When I checked the logs, it looks like the data LVM has filled up.

sinful wren
#

Hi all, is there a Harvest-Dashboard to show FSA metrics?

fossil bane
#

Hi all is there a Harvest Dashboard to

ivory dune
#

is there a way to get a list of all dashboards & metrics contained in those dashboards? like from a text file or .yml or..

loud ocean
#

is there a way to get a list of all

calm folio
#

good morning! first: 22.11.0 looks good - thanks for that. I checked all the dashboards again and saw that "Harvest Metadata" didn't show anything (no data). I can't select a datacenter / hostname / poller ....
Harvest 22.08.0 also shows nothing.
did this work for you? my installation is nabox 3.1.2.

fossil bane
#

metadata metrics

wanton pecan
#

Feature request: On the maintenance page, when upgrading Harvest, can you please add a comment/help button (or change the file requestor) that says to upload the tarball and not the rpm? I accidentally uploaded the rpm, it took it, went through some motions, and then didn't change the version 🙂

hybrid night
#

@loud ocean CDOT/ONTAP Node dashboard is showing me one of my four FAS9000 running up to 6x the IOPs of the other three. When I look at CLI, they appear to be closer to each other. Have you or your team heard of this before?

sturdy crest
#

That chart uses sum(volume_total_ops{datacenter="$Datacenter",cluster="$Cluster",node=~"$Node"})

#

So stats on volume object total ops counter.

#

The statistics node show may show nblade counters, which can include indirect i/o.

#

So if you have a node with no data LIFs it will show 0 IOPS for example, because 0 IOPS are coming into the nblade from users.

#

The volume object is dblade, so it gets translated through nblade of data LIF node to cluster network to disk node where volume object is measured.

#

That could be one reason.
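
#

A rough sketch of what that panel computes, in case it helps when comparing against the CLI. The sample rows are made up for illustration (node names, volume names, and ops values are all assumptions); the function just sums volume ops by the node that owns the volume, which is what grouping the panel query by node would give you.

```python
from collections import defaultdict

# Hypothetical volume_total_ops samples: (owning node, volume, ops/s).
# These values are invented for illustration only.
samples = [
    ("node-01", "vol_a", 1200.0),
    ("node-01", "vol_b", 300.0),
    ("node-02", "vol_c", 250.0),
]

def ops_per_node(rows):
    """Sum volume-level ops by the node that hosts each volume."""
    totals = defaultdict(float)
    for node, _volume, ops in rows:
        totals[node] += ops
    return dict(totals)

totals = ops_per_node(samples)
print(totals)
```

A node that hosts the busy volumes ends up with a high total here even if most client traffic enters the cluster through LIFs on other nodes, so the dashboard and a node-level CLI view can legitimately disagree.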

young steeple
#

EMS integration is now live in NAbox 3.2b

fossil bane
#

cgrindst3618 CDOTONTAP Node dashboard is

bronze arrow
#

Our data is full from Nabox. We have resized the disk 3 to 400g . I have extended the lv but how can i resize the disk? Can someone help?

#

Resize2fs is Not installed

lament jasper
#

hello, which configuration should I change to keep more than 2 weeks of data? netapp harvest only shows the last 2 weeks.

fossil bane
#

data retention

storm shard
#

Hi there, hope you are doing well. On my Harvest server I have customized almost every conf file. Is there a way to tell which of the provided configuration files have changed from one version of Harvest to another?

echo shard
#

How to upgrade Grafana

storm shard
#

Hi, is there any technical explanation on why on the grafana dashboards the option "Connect Null values" is set to "never"?

young steeple
#

Null value connect

shut lintel
#

So we did the upgrade to v21.11.1, and did some data migrations with prometheus, is there additional migrations for grafana? as I see a bunch of other volumes for grafana_data and harvest? Also, when I go into grafana there is no data being displayed? So not quite sure if was the data migration with prometheus or something else?

wispy raft
#

22.11.0 upgrade and seeing this in the logs:

#

{"level":"warn","Poller":"XXXYYY01","error":"unable to import template=[] no best-fit template found","collector":"Zapi","object":"Status_7mode","caller":"./poller.go:684","time":"2022-12-08T10:12:06-08:00","message":"init collector-object"

XXXharv:/opt/harvest/conf # grep Status_7mode /
zapi/default.yaml: Status_7mode: status_7.yaml

XXXharv:/opt/harvest/conf/zapi/7mode/8.6.0 # ls -atlr
-rw-r--r-- 1 harvest harvest 2021 Nov 21 07:34 volume.yaml
-rw-r--r-- 1 harvest harvest 686 Nov 21 07:34 subsystem.yaml
-rw-r--r-- 1 harvest harvest 596 Nov 21 07:34 status_7.yaml

I don't have 7mode, but it's weird that this exists and it says it can't find it.

wispy raft
#

is there a way to define a label for either a node, or set of nodes such that we can identify uniqueness across nodes? I know that datacenter can define it for a cluster, but really we need to drill down to the node level

opal summit
#

@wispy raft Could you share some detail about the use-case for label a node/nodes? you would like to see THAT label in prometheus metric level or grafana panels?

wispy raft
#

example: A cluster could be in a 'Datacenter' but individual nodes may be in a different 'ROOM' within the cluster, so having a way to denote which 'node' is in which 'room' would be beneficial to our alerting system.

#

additionally, it would be nice to see that 'data' in grafana as well, since we would be able to sort based on 'room' AND 'datacenter'

ivory dune
#

can anyone share a Harvest screenshot that shows some FlexCache statistics?

fossil bane
#

node tagging

#

flexcache stats

uneven pumice
#

afternoon, i've recently upgraded harvest to 22.11 and i'm looking for the new dashboards. I've reset the dashboards in NAbox but i'm still only seeing the previous dashboards, all the containers are running

main drum
#

Hi All, I've installed Harvest (22.11.0). It's talking to a Filer and (as a test) I configured the Prometheus exporter; I can fetch data from there using curl. So far so good. But the customer would rather use InfluxDB ...

#

The Filer and the Harvest system are both in the "public" network zone, but the Grafana host is in a (more) secure "management network" zone.
I see that I can configure the InfluxDB exporter with a URL parameter to do HTTPS communication, but what does that do, does Harvest write to that URL? Or does Grafana read from it?
(It's unlikely I'll be allowed to allow open communication from the public network into the management network.)

young steeple
wanton pecan
#

I just put up nabox 3.2. I'm playing around with the headroom dashboard. On a new AFF-250, with no active data being served, it's projecting 300-400 iops of headroom per aggregate. Looks a wee bit on the low side 🙂

wanton pecan
#

I found an issue with the environmental display in nabox for an AFF-A250. In the Cluster dashboard, nabox is showing fan speeds in orange and red whereas sensor show on the node cli shows them in normal range.

fossil bane
#

Node Fan speed in Red

calm folio
#

had a look at the audit logs on ontap with "security audit log show -application http" and found these entries:

Tue Dec 06 06:47:55 2022 <MYFILER> [kern_audit:info:2460] 8503e80001e6e7cd :: <MYFILER>:http :: <MYHARVEST>:33922 :: <MYFILER>:harvest2 :: GET /api/private/cli/network/connections/active?return_records=true&fields=proto,remote_ip,remote_host,vserver,cid,blocks_lb,local_address,node,service,lif_name,local_port,lru :: Pending
Tue Dec 06 06:47:55 2022 <MYFILER> [kern_audit:info:2460] 8503e80001e6e7cd :: <MYFILER>:http :: <MYHARVEST>:33922 :: <MYFILER>:harvest2 :: GET /api/private/cli/network/connections/active?return_records=true&fields=proto,remote_ip,remote_host,vserver,cid,blocks_lb,local_address,node,service,lif_name,local_port,lru :: Error: invalid operation

on the 6th of dec. i updated from harvest 22.08 to 22.11.

can you check this please on your ontap system and how can i fix this?

thank you!

long merlin
#

Hi, question about disk_labels. we replaced a failed disk and assigned ownership to container and partitions but disk_labels still has outage="unassigned". what should we look for? we use 22.11 harvest and the filer is FAS 2750 with ontap 9.8P11. Thanks!

loud ocean
#

Hi question about disk labels we

keen furnace
keen furnace
#

I've used Chrome and Edge on both my work and personal PC and same issue

fossil bane
#

NABox download issue

clever grotto
#

I'm re-posting here to see if anyone has extended Harvest in this way before

Greetings all,

I was just curious if anyone else has had to integrate ONTAP with Splunk recently and what direction you may have taken? The officially supported Splunk add-on is EOL in a month, and the third-party add-on is listed as not supported. This makes it a non-starter option for my large enterprise customer. Has anyone tried to make it work with Harvest, or taken another approach?

Thanks and happy holidays!

fossil bane
#

Harvest with splunk

gentle pike
#

NaBox 3.2 fresh installation. But Systems Dashboard is empty. Tried with Chrome, IE, Edge, Safari from different source clients. Any idea ?

tight iron
#

Hi all, short question for Harvest/NABox about collection. Will data from nodes that are being replaced be kept, or will it be deleted? E.g. we had a tech refresh of an A700 Metro to A800, but now the "old" nodes are missing/lost

young steeple
#

NaBox 3 2 fresh installation But Systems

#

Hi all short question for HarvestNABox

viscid agate
#

@young steeple on my installation only the harvest CDOT and 7MODE Dashboards get imported if i reset in webgui. I am missing the StorageGRID Dashboards. Can you look into this ?

fossil bane
#

StorageGrid NABox

fossil bane
#

FYI, the Harvest team will be out of office next week. Wishing you a Merry Christmas and Happy New Year in advance!

tough vault
#

Hi, is it possible to set default url of NABox 3.2 to /grafana like for NABox 2.x ?
(i.e : redirect to ./grafana when you enter URL of the NABox server)

leaden vigil
#

Hello, I have installed harvest and prometheus but I am unable to collect data from ActiveIQ UM. Is that something that can be done, or should I create a poller for each of my NetApp boxes?

stiff dove
opal summit
#

@leaden vigil Harvest 2 does not collect data from ActiveIQ Unified Manager.

young steeple
#

Default URL to grafana

leaden vigil
#

thanks guys

covert schooner
#

my customer is running harvest 1.6 and 2.0 in their environment. they are looking for help to import counters for CIFS into Harvest:

statistics start -object volume -instance * -counter cifs_total_ops -sample-id mu1
statistics show -sample-id mu1 -instance * -object volume -fields instance,counter,value -counter cifs_total_ops -sort-order descending -sort-key cifs_total_ops -max 50
statistics stop -sample-id mu1
statistics sample delete -sample-id mu1

scc29::*> statistics show -sample-id mu1 -instance * -object volume -fields instance,counter,value -counter cifs_total_ops -sort-order descending -sort-key cifs_total_ops -max 50
object instance           counter        value
------ ------------------ -------------- -----
volume swbld_releases_5   cifs_total_ops 6291
volume swbld_rel_hld      cifs_total_ops 2654
volume swbld_releases_hld cifs_total_ops 226
volume swip_pai_1         cifs_total_ops 126
volume swbld_hammer_7     cifs_total_ops 33
volume psg_data_31        cifs_total_ops 31

wispy raft
#

Trying to find documentation for what the counters are in system_node.yaml under zapiperf, specifically, what is system:node -> net_data_recv / sent (as well as others, but that's a starting point). I have hunted around, clearly these mean something but there is no description on any of them.

fossil bane
#

Perf counter information

hot belfry
#

Any known impact to clusters when polling from the old NABOX 2 and new NABOX 3.1 at the same time whilst we cycle out old stats? I should also say the old NABOX is using Harvest 1.6 and the new one is planning to use Harvest 2 only

calm folio
hot belfry
hot belfry
#

Has the network LIF dashboard been retired when using Harvest 2.0? I don't seem to be able to see it under the Harvest - CDOT dashboard folder in Grafana. It obviously still appears under the General folder but a lot of these dashboards do not work when using Harvest 2.0.

fossil bane
#

lif dashboard

uneven tide
#

Is there a list, where I can lookup the meaning of a specific metric?

hot belfry
#

Is or has NMSDK been phased out? is it not required anymore for use with NABOX? wasnt sure what its requirement was with NABOX previously.

uneven tide
#

Hey guys I hope that's the right place for some unsolicited feedback for NAbox,
currently I am a dual study student and the past month I was tasked with setting up a NAbox instance as prototype in our environment.
First thing I've got to say is that doing so was surprisingly easy even though that was done in a very restricted environment.
There were only two things that stuck with me as not as nice as the rest of it. First thing is that the /prometheus path is accessible without any authentication and deactivating it causes the homepage in the NAbox admin interface to be empty, maybe the web interface could get the data in the backend without an XHR Request on client side?
Second "issue" is that setting up the LDAP Connection for Grafana isn't as much fun if you have to change the ldap.toml with vi afterwards anyway. This problem came from the web interface replacing the whole file upon clicking submit and because we use an AD I had to comment out the group_search_filter settings, and changing the member_of attribute to "memberOf" to avoid 15s-1m login time. Same thing different attributes were username and email which I had to update regularly because the web interface set it to default. Maybe you could use sed instead of overwriting the file or offer those settings in the web interface.
I hope this doesn't sound too negative, those are minor issues and the whole project is still incredible work^^
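
#

To illustrate the "edit instead of overwrite" idea, here is a minimal sketch. It is not how NAbox works today; the helper names and the sample ldap.toml content (host, filter value) are my own assumptions. It only touches the lines for the given keys and leaves everything else in the file as it was.

```python
import re

def set_toml_value(text, key, value):
    """Replace the value on a `key = ...` line, leaving all other lines untouched."""
    pattern = re.compile(rf'^([ \t]*){re.escape(key)}\s*=.*$', re.MULTILINE)
    return pattern.sub(lambda m: f'{m.group(1)}{key} = "{value}"', text)

def comment_out(text, key):
    """Prefix an active `key = ...` line with '# '; already-commented lines do not match."""
    pattern = re.compile(rf'^([ \t]*)({re.escape(key)}\s*=.*)$', re.MULTILINE)
    return pattern.sub(lambda m: f'{m.group(1)}# {m.group(2)}', text)

# Hypothetical ldap.toml content, for illustration only.
sample = '''[[servers]]
host = "ldap.example.com"
group_search_filter = "(&(objectClass=posixGroup)(memberUid=%s))"

[servers.attributes]
member_of = "dn"
email = "email"
'''

updated = comment_out(set_toml_value(sample, "member_of", "memberOf"),
                      "group_search_filter")
print(updated)
```

This is line-based, so it would not cope with multi-line TOML values, but for the handful of attributes mentioned above it keeps manual changes intact between submits.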

loud ocean
#

Hey guys I hope that s the right place

long merlin
#

Hi, we have a problem with disk_labels. After replacing and assigning a failed disk, outage="unassigned" is stuck in disk_labels. This also causes disk_new_status == 0.

disk  outage-reason
----- -------------
1.0.4 -

disk_labels{cluster="flc1-sys-phx1-coresys", container_type="spare", datacenter="PHX", disk="1.0.4", failed="false", firmware_revision="NA02", instance="flc1-sys-phx1-coresys", job="netapp-harvest", model="X341_SSKBE900A10", node="flc1-01-sys-phx1-coresys", outage="unassigned", owner_node="flc1-01-sys-phx1-coresys", serial_number="WFK9G16F", shared="false", shelf="8262126240392612432", shelf_bay="4", type="SAS"} 1

disk_new_status{cluster="flc1-sys-phx1-coresys", datacenter="PHX", disk="1.0.4", instance="flc1-sys-phx1-coresys", job="netapp-harvest", node="flc1-01-sys-phx1-coresys"} 0
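
#

In case it is useful for finding other affected disks, here is a small sketch that parses a disk_labels line and flags spares that still carry outage="unassigned". The parser is a simplification I wrote for this example (it assumes no escaped quotes inside label values), and the sample line is a shortened copy of the one above. It does not explain why the label is stuck, it only helps list the disks.

```python
import re

LABEL_RE = re.compile(r'(\w+)="([^"]*)"')

def parse_labels(line):
    """Split one Prometheus exposition line into (metric name, labels, value)."""
    name, rest = line.split("{", 1)
    label_part, value = rest.rsplit("}", 1)
    return name, dict(LABEL_RE.findall(label_part)), float(value)

# Shortened copy of the disk_labels sample from this message.
line = ('disk_labels{cluster="flc1-sys-phx1-coresys", container_type="spare", '
        'disk="1.0.4", failed="false", outage="unassigned"} 1')

name, labels, value = parse_labels(line)
# A spare that still reports outage="unassigned" is the case described above.
stale = labels["container_type"] == "spare" and labels["outage"] == "unassigned"
print(name, labels["disk"], stale)
```
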
hot belfry
#

I am new to writing queries in Grafana using Prometheus. I am trying to build a node read latency query (and one for write) across all nodes, then create an alert when a certain latency is breached. At the moment I am just trying to get the query working, but I don't think it likes the wildcard, and I am not sure I have the right query. There used to be a node latency metric in Graphite; I am trying to find a similar one in Prom.

opal summit
#

I am new to writing queries in Grafana

untold field
#

Hello All, I may be missing something simple, but what are the container image tags available for cr.netapp.io/harvest? I can't seem to figure out the pattern other than just latest and I don't like using that

hot belfry
#

Whats the easiest way to configure email alerting, smtp server etc within Grafana, I see you can do it from cmd line but there is also a section under Admin under the Alerting section on the Grafana web page.

smoky portal