#EMS setting ignored


serene saddle
#

In the current NAbox version, up to 3.3.0, the EMS checkbox is ignored: the EMS setting stays enabled even if you uncheck it.
This will be fixed in the next release, 3.3.1b.

ruby frost
#

Hello @serene saddle, short question about EMS. It was implemented in version 3.2 and is enabled by default, right?

serene saddle
#

Correct

ruby frost
#

Thanks 😊 I need to take a deeper look at that

ruby frost
#

Do you have an example command to send test events?

alpine heart
#

@ruby frost You can try the following from the CLI:

set diag
event generate -message-name arw.volume.state -values 1 2 3 4 5 6 7 8 9 10 11 12
ruby frost
#

Thanks @alpine heart just testing with sk.panic

ruby frost
#

Executed the command but did not see any entry in Prometheus on the NAbox 🤔

alpine heart
#

@ruby frost I've tested this locally (not on NAbox) and it works. Could you please share your logs (https://netapp.github.io/harvest/23.08/help/log-collection/)? Also, check whether the record is in the Prometheus history by running the query ems_events[24h]. Running ems_events alone only shows instant records.

event generate -message-name sk.panic -values 1
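
The range query suggested above can also be sent to the Prometheus HTTP API rather than typed into the UI. A minimal sketch, assuming Prometheus is reachable at a base URL such as `http://prometheus.example:9090` (a placeholder, not taken from this thread); `build_query_url` is a hypothetical helper name:

```python
from urllib.parse import urlencode


def build_query_url(base_url: str, expr: str) -> str:
    """Return the Prometheus instant-query URL (/api/v1/query) for expr."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


# Placeholder base URL (assumption); replace it with your Prometheus address.
url = build_query_url("http://prometheus.example:9090", "ems_events[24h]")
print(url)
```

Fetching that URL with any HTTP client should return the samples recorded in the last 24 hours, whereas the bare `ems_events` expression only returns the current value.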

ruby frost
#

Hi @alpine heart, as requested, I've tested it again and sent the NAbox logs to the support address

alpine heart
#

@ruby frost Thanks. Can you check whether the Ems collector is enabled? I don't see any logs for the Ems collector in the shared logs.

ruby frost
#

Thanks for checking. The checkbox in the nabox configuration for ems is enabled. Could it be that the config file also needs to be adjusted? @serene saddle is there something more to do?

alpine heart
#

I tried enabling Ems collector in NAbox 3.3 (2023-07-25) and it shows me EMS logs as below.

nabox-harvest2 | 2023-08-28T09:19:42Z INF collector/collector.go:483 > Collected Poller=A250-41-42-43 apiMs=1062 calcMs=0 collector=Ems:Ems instances=0 metrics=0 parseMs=0 pluginMs=0
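
When scanning shared logs for a line like the one above, it can help to pull out the key=value fields. A minimal sketch; `parse_fields` is a hypothetical helper and assumes the fields follow the ` > ` marker as they do in this sample:

```python
def parse_fields(log_line: str) -> dict[str, str]:
    """Collect the key=value pairs that follow the '>' marker in a poller log line."""
    _, _, tail = log_line.partition(" > ")
    fields = {}
    for token in tail.split():
        key, sep, value = token.partition("=")
        if sep:  # skip plain words such as "Collected"
            fields[key] = value
    return fields


line = (
    "nabox-harvest2 | 2023-08-28T09:19:42Z INF collector/collector.go:483 > "
    "Collected Poller=A250-41-42-43 apiMs=1062 calcMs=0 collector=Ems:Ems "
    "instances=0 metrics=0 parseMs=0 pluginMs=0"
)
fields = parse_fields(line)
print(fields["collector"], fields["instances"])
```

A line with `collector=Ems:Ems` shows that the collector ran a poll; if no such line appears at all, the collector is probably not running.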

#

@ruby frost From the logs you've provided, it appears that the EMS collector is set up in the Harvest config, but it doesn't seem to be running. We can try restarting the pollers.

ruby frost
#

I've just restarted the whole NAbox and generated a new EMS event; it still looks the same

alpine heart
#

Could you share the output of the command below?

dc exec -w /conf nabox-harvest2 /netapp-harvest/bin/harvest doctor --print

ruby frost
#

<cluster>:
  addr: -REDACTED-
  autostart: '1'
  collectors:
  - Zapi
  - ZapiPerf
  - Rest
  - Ems
  datacenter: ZOI
  password: -REDACTED-
  prometheus_port: 12994
  use_insecure_tls: true
  username: -REDACTED-
default:
  send_autosupport_stats: '0'

It looks like the EMS collector is enabled
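
The same check can be done mechanically on the pasted output. A minimal sketch; `collectors_in` is a hypothetical helper that only looks at the first `collectors:` list and ignores indentation:

```python
def collectors_in(doctor_output: str) -> list[str]:
    """Return the '- Name' items listed under the first 'collectors:' key."""
    names, in_block = [], False
    for raw in doctor_output.splitlines():
        line = raw.strip()
        if line == "collectors:":
            in_block = True
        elif in_block and line.startswith("- "):
            names.append(line[2:].strip())
        elif in_block:
            break  # first line that is not a list item ends the block
    return names


# Lines as pasted in this thread (indentation is ignored by the helper).
sample = """\
<cluster>:
addr: -REDACTED-
autostart: '1'
collectors:
- Zapi
- ZapiPerf
- Rest
- Ems
datacenter: ZOI
"""
print("Ems" in collectors_in(sample))
```

Being listed only shows what the configuration asks for. As the rest of the thread shows, a listed collector can still fail to start.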

alpine heart
#

Yes, it looks enabled. Could you share your NAbox and Harvest versions?

ruby frost
#

NAbox 3.3 and Harvest 23.08

alpine heart
#

@ruby frost Let's examine the startup logs for any potential issues that might not have been evident in the logs you previously shared. Below are the steps.

  1. Restart the Harvest service using the dc down command.
  2. Wait for approximately 10 minutes to allow the service to fully restart and generate new logs.
  3. Share the new logs with ng-harvest-files@netapp.com.

Additionally, let's try running the Ems collector on its own for debugging. To do this, please execute the following commands, replacing POLLERNAME as applicable, and share the output.

docker exec -it nabox-harvest2 sh
/netapp-harvest/bin/poller --poller POLLERNAME --loglevel=2 --collectors Ems
ruby frost
#

@alpine heart The poller log contains the following error: init collector-object error="auth failed => 401 Unauthorized" Poller=<cluster> collector=Ems object=Ems
That is interesting, because the user we have defined has the readonly role for everything

alpine heart
#

Thanks. Okay, we have found the problem. Let's verify your role permissions, as the EMS collector requires permissions for REST calls. Could you run the following CLI commands, replacing ROLE with your role name? Permission details are available here: https://netapp.github.io/harvest/23.08/prepare-cdot-clusters/

security login show -role ROLE
security login role show -role ROLE
ruby frost
#

User/Group                 Authentication                 Acct   Authentication
Name           Application Method        Role Name        Locked Method
-------------- ----------- ------------- ---------------- ------ --------------
nabox          ontapi      password      readonly         no     none

#

           Role          Command/                                       Access
Vserver    Name          Directory                                Query Level
---------- ------------- ---------------------------------------- ----- --------
<cluster>  readonly      DEFAULT                                        readonly
                         security                                       readonly
                         security login password                        all
                         security login publickey                       all
                         security login role show-user-capability       all
                         set                                            all

#

we currently use the default readonly role

alpine heart
#

Okay. Can you give HTTP permission to the user nabox via System Manager, as shown in the attached screenshot?

#

You can do the same via CLI as below, replace USER/ROLE as applicable

security login create -user-or-group-name USER -application http -authentication-method password -role ROLE
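
Whether a user already has the http application can be read off the `security login show` rows. A minimal sketch using the rows pasted in this thread; `applications_for` is a hypothetical helper that assumes whitespace-separated columns in the order user, application, authentication method, role:

```python
def applications_for(rows: list[str], user: str) -> set[str]:
    """Return the applications listed for user in 'security login show' data rows."""
    apps = set()
    for row in rows:
        cols = row.split()
        # Data rows start with: user, application, authentication method, role.
        if len(cols) >= 4 and cols[0] == user:
            apps.add(cols[1])
    return apps


# Row from the output above: only ontapi is present, so REST calls fail.
nabox_rows = ["nabox ontapi password readonly no none"]
print("http" in applications_for(nabox_rows, "nabox"))
```

With only `ontapi` listed the check is false, which matches the 401 seen by the Ems collector in this thread.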
ruby frost
#

The Unauthorized error is gone, but what I now see in the log is:
Issue while reading prometheus port Poller=<cluster> exporter=nabox-prometheus
Unable to init exporter error="missing parameter => port" Poller=<cluster> name=nabox-prometheus
exporter (nabox-prometheus) requested by (Ems:Ems) not available Poller=<cluster>

alpine heart
#

You can ignore this error, as we were only running the collector in debug mode.

#

Let's restart the Harvest pollers now and generate your EMS test event again. It should work this time.

ruby frost
#

Now it works. Thank you @alpine heart for your great support 😊

alpine heart
#

Gr8!

woven fulcrum
#

hello, i enabled the following collectors:

  • Zapi
  • ZapiPerf
  • Rest
  • Ems

have the right role & application:
                                                                 Second
User/Group                 Authentication                 Acct   Authentication
Name           Application Method        Role Name        Locked Method
-------------- ----------- ------------- ---------------- ------ --------------
harvest2       http        password      harvest2-role    no     none
harvest2       ontapi      password      harvest2-role    no     none

and also have an event with severity EMERGENCY:
Time                Node             Severity      Event
------------------- ---------------- ------------- ----------------------------------------
9/7/2023 14:10:00   <NODE>           EMERGENCY     monitor.globalStatus.critical: Power Supply Status Critical: PSU2. Disk shelf fault.

but I still don't see the EMS event in the Health dashboard

alpine heart
woven fulcrum
#

thank you! i thought every EMERGENCY event is tracked: "The EMS collector gathers EMS events as defined in your ems.yml file. This panel displays events with emergency severity that occurred within the selected time range."