somber meteor Dec 8, 2022, 2:50 PM

#

hi @fresh epoch how did you install? nabox, docker compose, something else?

fresh epoch Dec 8, 2022, 2:51 PM

#

docker compose

#

when I go to http://harvest-app001:9090/alerts it's showing things as inactive

#

I was following the upgrade steps at https://netapp.github.io/harvest/22.11/install/containers/ and https://github.com/NetApp/harvest/blob/main/docs/MigratePrometheusDocker.md

#

basically stopped all the containers, did the data migrations and then brought everything back up.

#

[root@harvest-app001 current]# docker volume ls
DRIVER VOLUME NAME
local harvest-21080-6_linux_amd64_grafana_data
local harvest-21080-6_linux_amd64_harvest
local harvest-21110-1_linux_amd64_grafana_data
local harvest-21110-1_linux_amd64_harvest
local harvest-21111-1_linux_amd64_grafana_data
local harvest-21111-1_linux_amd64_harvest
local harvest-22020-4_linux_amd64_grafana_data
local harvest-22020-4_linux_amd64_harvest
local harvest-22110-1_linux_amd64_grafana_data
local harvest-22110-1_linux_amd64_harvest
local harvest-22110-1_linux_amd64_prometheus_data
local harvest-220401-nightly_linux_amd64_grafana_data
local harvest-220401-nightly_linux_amd64_harvest
local harvest_prometheus_data

#

those are the volumes I currently have

#

lot of older ones for older versions so not sure I need them?

#

or if those also need data moved over?

#

I moved everything to the "harvest_prometheus_data" volume

#

for prometheus data migration

somber meteor Dec 8, 2022, 2:57 PM

#

thanks! yep that looks good. it's probably sufficient to cp the data from the most recent version you were using previously to 22.11. Those older volumes can be cleaned up, but no rush if you want to move slower. Cleaning up is step 5 on that page, but the more important thing for us to figure out is why you aren't seeing data in 22.11

fresh epoch Dec 8, 2022, 2:59 PM

#

That would be great

#

So there is a build-specific volume, but then the harvest_prometheus_data one, is that more the archive one then? It'd be nice to not have to worry about moving data between upgrades

somber meteor Dec 8, 2022, 3:02 PM

#

agreed. you will not need to do that in the future. That's the change we made in 22.11, to use named volumes that are not named after the release

fresh epoch Dec 8, 2022, 3:03 PM

#

Even though I see one for this version?

#

local harvest-22110-1_linux_amd64_grafana_data
local harvest-22110-1_linux_amd64_harvest
local harvest-22110-1_linux_amd64_prometheus_data

#

plus this one ; local harvest_prometheus_data

somber meteor Dec 8, 2022, 3:05 PM

#

let's check your prom-stack - can you paste the output of head -10 prom-stack.tmpl

#

and head -10 prom-stack.yml

fresh epoch Dec 8, 2022, 3:08 PM

#

version: '3.7'

volumes:
    prometheus_data: {}
    grafana_data: {}
    harvest: {}

networks:
    frontend:
    backend:

#

humm, seems a bit different from the template

somber meteor Dec 8, 2022, 3:10 PM

#

was that from yml or .tmpl ? that's the old version that we fixed in 22.11. It should look like this

fresh epoch Dec 8, 2022, 3:10 PM

#

yml

#

as I copied it from the old version

somber meteor Dec 8, 2022, 3:11 PM

#

ok that makes sense, can you check the tmpl file and paste its contents too?

fresh epoch Dec 8, 2022, 3:12 PM

#

version: '3.7'

volumes:
    prometheus_data:
        name: harvest_prometheus_data
    grafana_data:
        name: harvest_grafana_data

networks:
    frontend:
    backend:

somber meteor Dec 8, 2022, 3:12 PM

#

can you rerun this command bin/harvest generate docker full --port --output harvest-compose.yml

#

and double check that your prom-stack.yml now matches the tmpl

fresh epoch Dec 8, 2022, 3:13 PM

#

Now it does yes

#

dang so maybe missed that step
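A quick way to confirm that the regenerate step took effect, as a minimal sketch. It only looks for the named-volume line shown in the tmpl paste above; the helper name `uses_named_prom_volume` is made up for illustration.

```shell
# Hypothetical helper: succeeds if the compose file declares the
# release-independent Prometheus volume name seen in prom-stack.tmpl.
uses_named_prom_volume() {
    grep -q 'name: harvest_prometheus_data' "$1"
}

# Usage (run from the Harvest install directory):
#   uses_named_prom_volume prom-stack.yml \
#       || echo "prom-stack.yml looks stale, regenerate it"
```

If the check fails, prom-stack.yml is probably a copy carried over from an older release, which is what happened in this thread.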

somber meteor Dec 8, 2022, 3:14 PM

#

no i think our documentation should be explicit about that step. We said this

fresh epoch Dec 8, 2022, 3:15 PM

#

ah ok.

#

Would be good to add that full command then

somber meteor Dec 8, 2022, 3:15 PM

#

and that's too vague 🙂 sorry about the confusion. If you run docker-compose -f prom-stack.yml -f harvest-compose.yml up -d --remove-orphans now do you see your earlier copied prom data?

fresh epoch Dec 8, 2022, 3:17 PM

#

humm, missing our pollers now

#

also getting " Error updating options: Bad Gateway " in Grafana

#

oh, it regenerated the harvest.yml

somber meteor Dec 8, 2022, 3:20 PM

#

when you regenerated the harvest-compose.yml it by default used the harvest.yml file in the current directory, which if you downloaded a new release, will be an example harvest.yml. we don't rewrite that file. Maybe you did not copy your earlier one into this directory?

fresh epoch Dec 8, 2022, 3:20 PM

#

that could be

#

let me recopy that in and do it again
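A small sketch of that "recopy, then regenerate" step. The directory `../harvest-previous` is a made-up path standing in for wherever the earlier release was unpacked, and `same_file` is a hypothetical helper; the `bin/harvest generate` command is the one quoted earlier in the thread.

```shell
# Hypothetical helper: succeeds when two files are byte-for-byte identical.
same_file() {
    cmp -s "$1" "$2"
}

# Usage, assuming the earlier release lives in ../harvest-previous (made-up path):
#   same_file ../harvest-previous/harvest.yml harvest.yml \
#       || cp ../harvest-previous/harvest.yml harvest.yml
#   bin/harvest generate docker full --port --output harvest-compose.yml
```

Comparing before copying makes it obvious whether the file in the current directory is still the example harvest.yml that ships with the release.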

#

Prometheus seems to be having an issue, restarting

#

ts=2022-12-08T15:25:48.160Z caller=main.go:798 level=info msg="Stopping scrape discovery manager..."
ts=2022-12-08T15:25:48.160Z caller=main.go:812 level=info msg="Stopping notify discovery manager..."
ts=2022-12-08T15:25:48.160Z caller=main.go:834 level=info msg="Stopping scrape manager..."
ts=2022-12-08T15:25:48.160Z caller=main.go:808 level=info msg="Notify discovery manager stopped"
ts=2022-12-08T15:25:48.160Z caller=main.go:794 level=info msg="Scrape discovery manager stopped"
ts=2022-12-08T15:25:48.160Z caller=manager.go:945 level=info component="rule manager" msg="Stopping rule manager..."
ts=2022-12-08T15:25:48.160Z caller=manager.go:955 level=info component="rule manager" msg="Rule manager stopped"
ts=2022-12-08T15:25:48.160Z caller=notifier.go:600 level=info component=notifier msg="Stopping notification manager..."
ts=2022-12-08T15:25:48.160Z caller=main.go:1054 level=info msg="Notifier manager stopped"
ts=2022-12-08T15:25:48.160Z caller=main.go:828 level=info msg="Scrape manager stopped"
ts=2022-12-08T15:25:48.160Z caller=main.go:1063 level=error err="opening storage failed: get segment range: segments are not sequential"

somber meteor Dec 8, 2022, 3:30 PM

#

that makes it sound like the earlier "Copy the historical Prometheus data" step had issues? We kinda skipped over it, but which of the many prom volumes did you copy in step 4? I wonder if that was a "good" source to copy from https://github.com/NetApp/harvest/blob/main/docs/MigratePrometheusDocker.md#copy-the-historical-prometheus-data

#

we can keep helping you try to migrate the prometheus data if you want. Or if you don't care about that data, we can blow it away, recreate the volume, and up everything again to get you going

#

one idea - the previous prometheus volume with the most data is probably the one you want? if so,
docker system df -v will show you the largest in the Local Volumes space usage: section.
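To narrow down the candidates before comparing sizes, a minimal sketch that lists only the older, release-named Prometheus volumes. The helper name is hypothetical; the volume name it excludes is the new one from this thread.

```shell
# Hypothetical helper: print volumes whose names contain "prometheus_data",
# leaving out the new release-independent volume.
list_old_prom_volumes() {
    docker volume ls --quiet --filter name=prometheus_data \
        | grep -v '^harvest_prometheus_data$'
}

# Usage:
#   list_old_prom_volumes
#   docker system df -v    # then compare sizes under "Local Volumes space usage:"
```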

fresh epoch Dec 8, 2022, 3:38 PM

#

there were like 4 older volumes I copied from to get all the data into the new one

somber meteor Dec 8, 2022, 3:39 PM

#

ah! ok that makes sense. I doubt Prometheus supports copying multiple into the same folder
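One plausible reading of the "segments are not sequential" error: Prometheus names its write-ahead-log segments with zero-padded sequence numbers (00000000, 00000001, ...), and merging several volumes into one directory can leave gaps in that numbering. The sketch below checks a directory for such gaps. It assumes you can see the `wal` directory of the volume from a shell; the helper name is made up.

```shell
# Hypothetical helper: succeeds if the purely numeric file names in the given
# directory form an unbroken sequence; prints the first gap otherwise.
wal_is_sequential() {
    prev=""
    for f in $(ls "$1" | grep -E '^[0-9]+$' | sort); do
        n=$(echo "$f" | sed 's/^0*//')   # strip zero padding
        n=${n:-0}                        # "00000000" becomes 0
        if [ -n "$prev" ] && [ "$n" -ne $((prev + 1)) ]; then
            echo "gap between segment $prev and $n"
            return 1
        fi
        prev=$n
    done
    return 0
}

# Usage, with the volume mounted somewhere readable (path is an assumption):
#   wal_is_sequential /path/to/prometheus-data/wal
```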

#

what if we delete the new volume, recreate it, copy the most recent or the largest, re up, and see if that unblocks you?

fresh epoch Dec 8, 2022, 3:40 PM

#

Sure, so I guess if we can get it going and then not have to worry about data migrations again then could just create a new one

somber meteor Dec 8, 2022, 3:41 PM

#

yes, we should never need to have this conversation again in the future 😆

fresh epoch Dec 8, 2022, 3:41 PM

#

Nice 🙂

#

ok how do I delete, recreate it again

somber meteor Dec 8, 2022, 3:42 PM

#

docker volume rm harvest_prometheus_data will remove the recently created one

fresh epoch Dec 8, 2022, 3:42 PM

#

oh i see it in the docs

#

docker volume create --name harvest_prometheus_data

#

then what you said

somber meteor Dec 8, 2022, 3:43 PM

#

yep, the rm first and then that volume create, you got it
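Putting the rm, create, and single-volume copy together as a rough sketch. The `docker volume rm` and `docker volume create` commands are the ones from this thread; the `docker run ... alpine` copy is a generic pattern and an assumption here, so prefer the copy command from the migration page if it differs. The helper name and the `/from` and `/to` mount points are made up.

```shell
NEW_VOLUME=harvest_prometheus_data   # name from the 22.11 template

# Hypothetical helper: recreate the new volume and copy ONE old volume into it.
# Stop the containers first; docker refuses to remove a volume that is in use.
reseed_prom_volume() {
    old_volume=$1
    docker volume rm "$NEW_VOLUME" &&
    docker volume create --name "$NEW_VOLUME" &&
    docker run --rm \
        -v "$old_volume":/from:ro \
        -v "$NEW_VOLUME":/to \
        alpine sh -c 'cp -a /from/. /to/'
}

# Usage (pick the most recent or largest old volume, only one):
#   reseed_prom_volume harvest-22020-4_linux_amd64_prometheus_data
```

The old volume is mounted read-only so a mistake in the copy cannot damage the source data.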

fresh epoch Dec 8, 2022, 3:47 PM

#

ok better.. docker ps shows the pollers running, but not sure they are collecting. harvest metadata showing there is only the unix poller

#

maybe just need to wait a bit?

#

oh, that's just that chart...

somber meteor Dec 8, 2022, 3:48 PM

#

sounds promising! you can check one of the pollers if you want with docker logs -f name-of-poller-from-docker-ps

fresh epoch Dec 8, 2022, 3:48 PM

#

Yah I can see it collecting data for the pollers

#

lots of ZapiPerf data

somber meteor Dec 8, 2022, 3:49 PM

#

so much data from perf

fresh epoch Dec 8, 2022, 3:49 PM

#

🙂

somber meteor Dec 8, 2022, 3:49 PM

#

Here's what I captured, two changes to the documentation:

you must regenerate your harvest-compose file
mention that you should only copy one previous prometheus_data into the new volume. Not multiple.

fresh epoch Dec 8, 2022, 3:50 PM

#

sounds good

somber meteor Dec 8, 2022, 3:50 PM

#

and sounds like you're all set?

fresh epoch Dec 8, 2022, 3:50 PM

#

Had a couple other questions if you could

somber meteor Dec 8, 2022, 3:50 PM

#

sure, shoot

fresh epoch Dec 8, 2022, 3:51 PM

#

When we login to Grafana is there a way to put in better authentication? ldap to ad ?

somber meteor Dec 8, 2022, 3:51 PM

#

yes, let me find the link

#

https://github.com/NetApp/harvest/discussions/580

fresh epoch Dec 8, 2022, 3:52 PM

#

How do we add back the "ALL" to select multiple DC's and Clusters?

somber meteor Dec 8, 2022, 3:53 PM

#

you're in luck - that was one of the features in 22.11 😄

fresh epoch Dec 8, 2022, 3:55 PM

#

Is there a way to keep data longer ?

#

now with the new volume for prometheus / assuming grafana gets data from prometheus ?

#

can we keep data for like 13 months ?

somber meteor Dec 8, 2022, 3:57 PM

#

yes, Grafana uses Prometheus as its datasource. you can change Prometheus's retention https://prometheus.io/docs/prometheus/latest/storage/#operational-aspects
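For the 13-month question, a small sketch of the retention flag. Prometheus durations have no month unit (they use y, w, d, h, m, s, ms), so the helper below converts months to days, rounding up with 31-day months. The helper name is made up, and where the flag goes is an assumption: typically it is added to the Prometheus container's startup arguments in prom-stack.yml.

```shell
# Hypothetical helper: print a Prometheus retention flag covering at least
# the given number of months (31 days per month, so it never falls short).
retention_flag_for_months() {
    echo "--storage.tsdb.retention.time=$(( $1 * 31 ))d"
}

retention_flag_for_months 13
# prints: --storage.tsdb.retention.time=403d
```

Without this flag Prometheus keeps 15 days of data by default, and longer retention needs correspondingly more disk in the harvest_prometheus_data volume.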

fresh epoch Dec 8, 2022, 10:47 PM

#

Trying to get the ldap going based on the above example.

#

t=2022-12-08T22:45:19+0000 lvl=eror msg="Failed to read plugin provisioning files from directory" logger=provisioning.plugins path=/etc/grafana/provisioning/plugins error="open /etc/grafana/provisioning/plugins: no such file or directory"
t=2022-12-08T22:45:19+0000 lvl=eror msg="Can't read alert notification provisioning files from directory" logger=provisioning.notifiers path=/etc/grafana/provisioning/notifiers error="open /etc/grafana/provisioning/notifiers: no such file or directory"
t=2022-12-08T22:45:19+0000 lvl=info msg="warming cache for startup" logger=ngalert
t=2022-12-08T22:45:19+0000 lvl=info msg="starting MultiOrg Alertmanager" logger=ngalert.multiorg.alertmanager
t=2022-12-08T22:45:19+0000 lvl=info msg="HTTP Server Listen" logger=http.server address=[::]:3000 protocol=http subUrl= socket=
t=2022-12-08T22:45:53+0000 lvl=info msg="LDAP enabled, reading config file" logger=ldap file=/etc/grafana/ldap.toml
t=2022-12-08T22:45:53+0000 lvl=eror msg="Error while trying to authenticate user" logger=context userId=0 orgId=0 uname= error="LDAP Result Code 200 "Network Error": dial tcp 127.0.0.1:389: connect: connection refused" remote_addr=10.94.8.240
t=2022-12-08T22:45:53+0000 lvl=eror msg="Request Completed" logger=context userId=0 orgId=0 uname= method=POST path=/login status=500 remote_addr=10.94.8.240 time_ms=3 size=53 referer=http://harvest-app001:3000/login
t=2022-12-08T22:45:54+0000 lvl=info msg="LDAP enabled, reading config file" logger=ldap file=/etc/grafana/ldap.toml
t=2022-12-08T22:45:54+0000 lvl=eror msg="Error while trying to authenticate user" logger=context userId=0 orgId=0 uname= error="LDAP Result Code 200 "Network Error": dial tcp 127.0.0.1:389: connect: connection refused" remote_addr=10.94.8.60
t=2022-12-08T22:45:54+0000 lvl=eror msg="Request Completed" logger=context userId=0 orgId=0 uname= method=POST path=/login status=500 remote_addr=10.94.8.60 time_ms=3 size=53 referer=http://harvest-app001:3000/login
[root@harvest-app001 current]#

#

from the docker logs.

#

from prom-stack.yml

#

grafana:
    container_name: grafana
    image: grafana/grafana:8.3.4
    depends_on:
        - prometheus
    ports:
        - 3000:3000
    volumes:
        - grafana_data:/var/lib/grafana
        - ./grafana:/etc/grafana/provisioning # import Harvest dashboards
        - ./docker/grafana/ldap.toml:/etc/grafana/ldap.toml
        - /etc/grafana/ldap.toml:/etc/grafana/ldap.toml

    networks:
        - backend
        - frontend
    restart: unless-stopped
    labels:
        kompose.service.type: nodeport
    environment:
        - GF_AUTH_LDAP_ENABLED=true
        - GF_AUTH_LDAP_CONFIG_FILE=/etc/grafana/ldap.toml

somber meteor Dec 9, 2022, 2:16 PM

#

hi @fresh epoch maybe your ldap.toml has the wrong address? Looks like Grafana reads the file fine but when it tries to connect to the ldap server it fails with connection refused. Firewall, wrong ip? t=2022-12-08T22:45:54+0000 lvl=eror msg="Error while trying to authenticate user" logger=context userId=0 orgId=0 uname= error="LDAP Result Code 200 "Network Error": dial tcp 127.0.0.1:389: connect: connection refused" remote_addr=10.94.8.60

fresh epoch Dec 9, 2022, 2:22 PM

#

yah I noticed that error, seems to be trying to send it to localhost? Can I DM/email you with that file info?

somber meteor Dec 9, 2022, 2:23 PM

#

yes, I'll take a look, but you might get better help from Grafana. https://github.com/NetApp/harvest/wiki/FAQ#how-do-i-share-sensitive-log-files-with-netapp

#

maybe this https://github.com/grafana/grafana/issues/12344#issuecomment-398423462

#

what version of Grafana?

fresh epoch Dec 9, 2022, 2:25 PM

#

fd221b0b9278 grafana/grafana:8.3.4 "/run.sh" 16 hours ago Up 16 hours 0.0.0.0:3000->3000/tcp, :::3000->3000/tcp grafana

#

looks like 8.3.4 ?

somber meteor Dec 9, 2022, 2:26 PM

#

yep

#

i know something we can check, let me form the command

#

i suspect the ldap.toml grafana is using is different from the one you set up. let's try to confirm that by looking at the one Grafana is using by running docker exec -it grafana less /etc/grafana/ldap.toml

fresh epoch Dec 9, 2022, 2:32 PM

#

yep, looks like default config

#

vs what we have entered, so may not be reading it in right
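A compact way to make the same comparison, as a sketch: pull the `host = "..."` values out of an ldap.toml and see whether they still point at 127.0.0.1, which is what the log lines above show Grafana dialing. The helper name and the temp file path are made up.

```shell
# Hypothetical helper: print every host value configured in an ldap.toml,
# skipping commented-out lines.
ldap_hosts() {
    sed -n 's/^[[:space:]]*host[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Usage: compare what the container sees with what you wrote on the host.
#   docker exec grafana cat /etc/grafana/ldap.toml > /tmp/effective-ldap.toml
#   ldap_hosts /tmp/effective-ldap.toml
#   ldap_hosts ./docker/grafana/ldap.toml
```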

somber meteor Dec 9, 2022, 2:35 PM

#

you can copy your file into the container or set up a volume mount so the container "sees" your version

#

generally a volume mount is better

#

i can dig up the volume mount for you if needed

fresh epoch Dec 9, 2022, 2:41 PM

#

I believe we have that in the prom-stack.yml

#

    volumes:
        - grafana_data:/var/lib/grafana
        - ./grafana:/etc/grafana/provisioning # import Harvest dashboards
        - /etc/grafana/ldap.toml:/etc/grafana/ldap.toml

#

    environment:
        - GF_AUTH_LDAP_ENABLED=true
        - GF_AUTH_LDAP_CONFIG_FILE=/etc/grafana/ldap.toml

#

although, just saw my co-worker remmed that out yesterday so just un-remmed it out again and stopped/started

#

now when I do

#

docker exec -it grafana less /etc/grafana/ldap.toml
less: can't open '/etc/grafana/ldap.toml': Permission denied

somber meteor Dec 9, 2022, 2:42 PM

#

progress!!! 😄

fresh epoch Dec 9, 2022, 2:42 PM

#

-rw-r----- 1 root grafana 2654 Dec 8 16:44 ldap.toml

#

unless it's taking it from here

#

ls -al ./docker/grafana/ldap.toml
-rw-r----- 1 root root 2641 Dec 8 16:40 ./docker/grafana/ldap.toml

#

that's not owned by grafana

somber meteor Dec 9, 2022, 2:45 PM

#

your volume mount says the file is on the host machine at /etc/grafana/ldap.toml

#

the permission issue should be on that file or potentially the parent directory /etc/grafana. you could rule those out I suppose with some permissive chmods

#

could be selinux also? I made the mapping locally and i see the ldap.toml inside the container without issue. This SO thread has some things that look relevant https://stackoverflow.com/questions/60175493/docker-container-cant-access-mapped-directory-from-host

fresh epoch Dec 9, 2022, 2:52 PM

#

drwxr-xr-x. 3 root root 84 Dec 8 16:44 grafana

#

that is /etc/grafana

#

ls -al /etc/grafana/ldap.toml
-rw-r----- 1 root grafana 2654 Dec 8 16:44 /etc/grafana/ldap.toml

#

[root@harvest-app001 current]# sestatus
SELinux status: disabled

somber meteor Dec 9, 2022, 2:56 PM

#

#

those are the permissions seen inside the container on my side

#

I added the volume like so, just in my current Harvest directory
- ./ldap.toml:/etc/grafana/ldap.toml

fresh epoch Dec 9, 2022, 2:58 PM

#

strange

#

[root@harvest-app001 current]# docker exec -it grafana ls -la /etc/grafana/
total 48
drwxr-xr-x 3 root root 62 Jan 17 2022 .
drwxr-xr-x 1 root root 66 Dec 9 14:44 ..
-rw-r--r-- 1 root root 43461 Jan 17 2022 grafana.ini
-rw-r----- 1 root 984 2654 Dec 8 22:44 ldap.toml
drwxr-xr-x 5 root root 54 Dec 8 22:42 provisioning

#

it's lost its group

#

i'll try how you have it..

somber meteor Dec 9, 2022, 3:00 PM

#

hmm yeah id 984 is the problem i reckon and that was from yesterday. Maybe stop the grafana container, rm it and reup?

#

perhaps that was from yesterday's experiment

fresh epoch Dec 9, 2022, 3:01 PM

#

oooh

#

that seemed to help

#

for whatever reason

#

[root@harvest-app001 current]# docker exec -it grafana ls -al /etc/grafana/
total 48
drwxr-xr-x 3 root root 62 Jan 17 2022 .
drwxr-xr-x 1 root root 66 Dec 9 15:01 ..
-rw-r--r-- 1 root root 43461 Jan 17 2022 grafana.ini
-rw-r----- 1 root root 2654 Dec 9 15:00 ldap.toml
drwxr-xr-x 5 root root 54 Dec 8 22:42 provisioning

somber meteor Dec 9, 2022, 3:02 PM

#

the stop, rm, and reup or moving the file locally?

fresh epoch Dec 9, 2022, 3:02 PM

#

now it's root:root

#

copied the ldap.toml to the current version directory same as where harvest.yml and prom-stack.yml are

#

then I stopped the containers and started it

somber meteor Dec 9, 2022, 3:03 PM

#

cool, probably caused by yesterday's experiments. we undid those and now hopefully you're all set
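One possible explanation for why root:root with mode 640 works while root:984 with the same mode did not, sketched as a permission check. I believe the official Grafana image runs as UID 472 with GID 0, but treat that as an assumption and confirm it with `docker exec grafana id`. The helper below is hypothetical and ignores ACLs and supplementary groups.

```shell
# Hypothetical helper: can a process with the given uid/gid read a file with
# this octal mode (e.g. 640) and numeric owner/group?
# Arguments: mode file_uid file_gid process_uid process_gid
can_read() {
    mode=$((0$1)); file_uid=$2; file_gid=$3; uid=$4; gid=$5
    if [ "$uid" -eq 0 ]; then return 0; fi               # root can read regardless
    if [ "$uid" -eq "$file_uid" ]; then bits=$(( (mode >> 6) & 7 ))   # owner bits
    elif [ "$gid" -eq "$file_gid" ]; then bits=$(( (mode >> 3) & 7 )) # group bits
    else bits=$(( mode & 7 ))                                         # other bits
    fi
    [ $(( bits & 4 )) -ne 0 ]                            # is the read bit set?
}

# Usage, with the numbers from the listings above and the assumed 472:0 process:
#   can_read 640 0 984 472 0 || echo "root:984 -> Permission denied"
#   can_read 640 0 0   472 0 && echo "root:root -> readable via group 0"
```

To get the real numbers, `docker exec grafana ls -ln /etc/grafana/ldap.toml` shows the numeric owner and group as the container sees them.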

fresh epoch Dec 9, 2022, 3:08 PM

#

Still not letting me in so maybe I have some sort of other config problem in the ldap file

somber meteor Dec 9, 2022, 3:08 PM

#

hopefully something new in the grafana log file?

fresh epoch Dec 9, 2022, 3:18 PM

#

not seeing much as far as errors

#

t=2022-12-09T15:06:39+0000 lvl=info msg="LDAP enabled, reading config file" logger=ldap file=/etc/grafana/ldap.toml
t=2022-12-09T15:06:39+0000 lvl=eror msg="Invalid username or password" logger=context userId=0 orgId=0 uname= error="invalid username or password" remote_addr=10.94.8.60
t=2022-12-09T15:06:39+0000 lvl=info msg="Request Completed" logger=context userId=0 orgId=0 uname= method=POST path=/login status=401 remote_addr=10.94.8.60 time_ms=17 size=42 referer=http://harvest-app001:3000/login

somber meteor Dec 9, 2022, 3:20 PM

#

looks like it's talking to the ldap server now so that's good - but looks like the uname is blank and/or the username/password is wrong? t=2022-12-09T15:06:39+0000 lvl=eror msg="Invalid username or password" logger=context userId=0 orgId=0 uname= error="invalid username or password" remote_addr=10.94.8.60

fresh epoch Dec 9, 2022, 3:24 PM

#

I'm entering the username/password and it is correct.. comparing a sample to what I have

fresh epoch Dec 9, 2022, 4:53 PM

#

OH! got it! went to a simplified version of ldap.toml and that worked!

#

I took the AD example from grafana docs and removed all the #'s basically
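If "removed all the #'s" means dropping the commented-out lines, a sketch like this shows what is actually in effect in an ldap.toml, which makes it easier to compare a working file against a failing one. The helper name is made up.

```shell
# Hypothetical helper: print an ldap.toml without comment lines and blank
# lines, i.e. only the settings Grafana will actually read.
effective_toml() {
    grep -v -E '^[[:space:]]*(#|$)' "$1"
}

# Usage:
#   effective_toml ldap.toml
```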

#Upgrade to 22.11
