#Upgrade to 22.11


somber meteor
#

hi @fresh epoch how did you install? nabox, docker compose, something else?

fresh epoch
#

docker compose

#

basically stopped all the containers, did the data migrations and then brought everything back up.

#

[root@harvest-app001 current]# docker volume ls
DRIVER VOLUME NAME
local harvest-21080-6_linux_amd64_grafana_data
local harvest-21080-6_linux_amd64_harvest
local harvest-21110-1_linux_amd64_grafana_data
local harvest-21110-1_linux_amd64_harvest
local harvest-21111-1_linux_amd64_grafana_data
local harvest-21111-1_linux_amd64_harvest
local harvest-22020-4_linux_amd64_grafana_data
local harvest-22020-4_linux_amd64_harvest
local harvest-22110-1_linux_amd64_grafana_data
local harvest-22110-1_linux_amd64_harvest
local harvest-22110-1_linux_amd64_prometheus_data
local harvest-220401-nightly_linux_amd64_grafana_data
local harvest-220401-nightly_linux_amd64_harvest
local harvest_prometheus_data

#

those are the volumes I currently have

#

lot of older ones for older versions so not sure I need them?

#

or if those also need data moved over?

#

I moved everything to the "harvest_prometheus_data" volume

#

for prometheus data migration

somber meteor
#

thanks! yep that looks good. it's probably sufficient to cp the data from the most recent version you were using previously to 22.11. Those older volumes can be cleaned up, but no rush if you want to move slower. Cleaning up is step 5 on that page, but the more important thing for us to figure out is why you aren't seeing data in 22.11

fresh epoch
#

That would be great

#

So there is a build specific volume, but then the harvest_prometheus_data is that more the archive one then? It'd be nice to not have to worry about moving data between upgrades

somber meteor
#

agreed. you will not need to do that in the future. That's the change we made in 22.11, to use named volumes that are not named after the release

fresh epoch
#

Even though I see one for this version?

#

local harvest-22110-1_linux_amd64_grafana_data
local harvest-22110-1_linux_amd64_harvest
local harvest-22110-1_linux_amd64_prometheus_data

#

plus this one ; local harvest_prometheus_data

somber meteor
#

let's check your prom-stack - can you paste the output of head -10 prom-stack.tmpl

#

and head -10 prom-stack.yml

fresh epoch
#

version: '3.7'

volumes:
  prometheus_data: {}
  grafana_data: {}
  harvest: {}

networks:
  frontend:
  backend:

#

humm, seems a bit different from the template

somber meteor
#

was that from yml or .tmpl ? that's the old version that we fixed in 22.11. It should look like this

fresh epoch
#

yml

#

as I copied it from the old version

somber meteor
#

ok that makes sense, can you check the tmpl file and paste its contents too?

fresh epoch
#

version: '3.7'

volumes:
  prometheus_data:
    name: harvest_prometheus_data
  grafana_data:
    name: harvest_grafana_data

networks:
  frontend:
  backend:

somber meteor
#

can you rerun this command bin/harvest generate docker full --port --output harvest-compose.yml

#

and double check that your prom-stack.yml now matches the tmpl

fresh epoch
#

Now it does yes

#

dang so maybe missed that step

somber meteor
#

no i think our documentation should be explicit about that step. We said this

fresh epoch
#

ah ok.

#

Would be good to add that full command then

somber meteor
#

and that's too vague 🙂 sorry about the confusion. If you run docker-compose -f prom-stack.yml -f harvest-compose.yml up -d --remove-orphans now do you see your earlier copied prom data?

fresh epoch
#

humm, missing our pollers now

#

also getting " Error updating options: Bad Gateway " in Grafana

#

oh, it regenerated the harvest.yml

somber meteor
#

when you regenerated the harvest-compose.yml it by default used the harvest.yml file in the current directory, which if you downloaded a new release, will be an example harvest.yml. we don't rewrite that file. Maybe you did not copy your earlier one into this directory?
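A minimal sketch of that sequence, assuming your own harvest.yml still lives in an earlier release directory. The `previous_dir` argument and the `run` helper are hypothetical names added for illustration; the harvest and docker-compose commands are the ones quoted in this thread. With `DRY_RUN=1` the commands are only printed, so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: carry your own harvest.yml into the new release directory,
# regenerate harvest-compose.yml from it, then bring the stack up.
# Run from the new release directory.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1: hypothetical path to the earlier release directory holding your harvest.yml
regenerate_stack() {
    previous_dir="$1"
    [ -n "$previous_dir" ] || { echo "usage: regenerate_stack PREVIOUS_DIR" >&2; return 1; }

    # Overwrite the example harvest.yml shipped with the release, regenerate
    # the compose file from it, and start everything. Stops at the first failure.
    run cp "$previous_dir/harvest.yml" ./harvest.yml &&
        run bin/harvest generate docker full --port --output harvest-compose.yml &&
        run docker-compose -f prom-stack.yml -f harvest-compose.yml up -d --remove-orphans
}
```

For example, `DRY_RUN=1 regenerate_stack ../previous-release` prints the three commands without touching anything.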

fresh epoch
#

that could be

#

let me recopy that in and do it again

#

prometheus seems to be having an issue, restarting

#

ts=2022-12-08T15:25:48.160Z caller=main.go:798 level=info msg="Stopping scrape discovery manager..."
ts=2022-12-08T15:25:48.160Z caller=main.go:812 level=info msg="Stopping notify discovery manager..."
ts=2022-12-08T15:25:48.160Z caller=main.go:834 level=info msg="Stopping scrape manager..."
ts=2022-12-08T15:25:48.160Z caller=main.go:808 level=info msg="Notify discovery manager stopped"
ts=2022-12-08T15:25:48.160Z caller=main.go:794 level=info msg="Scrape discovery manager stopped"
ts=2022-12-08T15:25:48.160Z caller=manager.go:945 level=info component="rule manager" msg="Stopping rule manager..."
ts=2022-12-08T15:25:48.160Z caller=manager.go:955 level=info component="rule manager" msg="Rule manager stopped"
ts=2022-12-08T15:25:48.160Z caller=notifier.go:600 level=info component=notifier msg="Stopping notification manager..."
ts=2022-12-08T15:25:48.160Z caller=main.go:1054 level=info msg="Notifier manager stopped"
ts=2022-12-08T15:25:48.160Z caller=main.go:828 level=info msg="Scrape manager stopped"
ts=2022-12-08T15:25:48.160Z caller=main.go:1063 level=error err="opening storage failed: get segment range: segments are not sequential"

somber meteor
#

we can keep helping you try to migrate the prometheus data if you want. Or if you don't care about that data, we can blow it away, recreate the volume, and up everything again to get you going

#

one idea - the previous prometheus volume with the most data is probably the one you want? if so,
docker system df -v will show you the largest in the Local Volumes space usage: section.

fresh epoch
#

there were like 4 older volumes I copied from to get all the data into the new one

somber meteor
#

ah! ok that makes sense. I doubt Prometheus supports copying multiple data directories into the same folder

#

what if we delete the new volume, recreate it, copy the most recent or the largest, re up, and see if that unblocks you?

fresh epoch
#

Sure, so I guess if we can get it going and then not have to worry about data migrations again then could just create a new one

somber meteor
#

yes, we should never need to have this conversation again in the future 😆

fresh epoch
#

Nice 🙂

#

ok how do I delete, recreate it again

somber meteor
#

docker volume rm harvest_prometheus_data will remove the recently created one

fresh epoch
#

oh i see it in the docs

#

docker volume create --name harvest_prometheus_data

#

then what you said

somber meteor
#

yep, the rm first and then that volume create, you got it
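Put together, the reset could look like the sketch below. It assumes the stack is stopped first (a volume in use cannot be removed) and that the `alpine` image can be pulled for the copy step; that helper container, the `run` function, and the function name are illustrative choices, not something from the Harvest docs. Only one old volume is copied into the fresh one.

```shell
#!/bin/sh
# Sketch: recreate harvest_prometheus_data and seed it from exactly ONE
# earlier Prometheus volume. Stop the containers before running this.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1: old volume to copy from (for example the most recent or the largest)
# $2: target volume, defaults to the shared name introduced in 22.11
reseed_prometheus_volume() {
    old_volume="$1"
    new_volume="${2:-harvest_prometheus_data}"
    [ -n "$old_volume" ] || { echo "usage: reseed_prometheus_volume OLD_VOLUME [NEW_VOLUME]" >&2; return 1; }

    # Remove, recreate, then copy the old volume's contents with a throwaway
    # container (assumption: alpine, old volume mounted read-only).
    run docker volume rm "$new_volume" &&
        run docker volume create --name "$new_volume" &&
        run docker run --rm -v "$old_volume:/from:ro" -v "$new_volume:/to" alpine cp -a /from/. /to/
}
```

For example, `DRY_RUN=1 reseed_prometheus_volume harvest-22110-1_linux_amd64_prometheus_data` prints the three docker commands for review; drop `DRY_RUN=1` to run them.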

fresh epoch
#

ok better.. docker ps shows the pollers running, but not sure they are collecting. harvest metadata showing there is only the unix poller

#

maybe just need to wait a bit?

#

oh, thats just that chart...

somber meteor
#

sounds promising! you can check one of the pollers if you want with docker logs -f name-of-poller-from-docker-ps

fresh epoch
#

Yah I can see it collecting data for the pollers

#

lots of ZapiPerf data

somber meteor
#

so much data from perf

fresh epoch
#

🙂

somber meteor
#

Here's what I captured, two changes to the documentation:

  1. you must regenerate your harvest-compose file
  2. mention that you should copy only one previous prometheus_data volume into the new volume, not multiple.

fresh epoch
#

sounds good

somber meteor
#

and sounds like you're all set?

fresh epoch
#

Had a couple other questions if you could

somber meteor
#

sure, shoot

fresh epoch
#

When we login to Grafana is there a way to put in better authentication? ldap to ad ?

somber meteor
#

yes, let me find the link

fresh epoch
#

How do we add back the "ALL" to select multiple DC's and Clusters?

somber meteor
#

you're in luck - that was one of the features in 22.11 😄

fresh epoch
#

Is there a way to keep data longer ?

#

now with the new volume for prometheus / assuming grafana gets data from prometheus ?

#

can we keep data for like 13 months ?

somber meteor
fresh epoch
#

Trying to get the ldap going based on the above example.

#

t=2022-12-08T22:45:19+0000 lvl=eror msg="Failed to read plugin provisioning files from directory" logger=provisioning.plugins path=/etc/grafana/provisioning/plugins error="open /etc/grafana/provisioning/plugins: no such file or directory"
t=2022-12-08T22:45:19+0000 lvl=eror msg="Can't read alert notification provisioning files from directory" logger=provisioning.notifiers path=/etc/grafana/provisioning/notifiers error="open /etc/grafana/provisioning/notifiers: no such file or directory"
t=2022-12-08T22:45:19+0000 lvl=info msg="warming cache for startup" logger=ngalert
t=2022-12-08T22:45:19+0000 lvl=info msg="starting MultiOrg Alertmanager" logger=ngalert.multiorg.alertmanager
t=2022-12-08T22:45:19+0000 lvl=info msg="HTTP Server Listen" logger=http.server address=[::]:3000 protocol=http subUrl= socket=
t=2022-12-08T22:45:53+0000 lvl=info msg="LDAP enabled, reading config file" logger=ldap file=/etc/grafana/ldap.toml
t=2022-12-08T22:45:53+0000 lvl=eror msg="Error while trying to authenticate user" logger=context userId=0 orgId=0 uname= error="LDAP Result Code 200 "Network Error": dial tcp 127.0.0.1:389: connect: connection refused" remote_addr=10.94.8.240
t=2022-12-08T22:45:53+0000 lvl=eror msg="Request Completed" logger=context userId=0 orgId=0 uname= method=POST path=/login status=500 remote_addr=10.94.8.240 time_ms=3 size=53 referer=http://harvest-app001:3000/login
t=2022-12-08T22:45:54+0000 lvl=info msg="LDAP enabled, reading config file" logger=ldap file=/etc/grafana/ldap.toml
t=2022-12-08T22:45:54+0000 lvl=eror msg="Error while trying to authenticate user" logger=context userId=0 orgId=0 uname= error="LDAP Result Code 200 "Network Error": dial tcp 127.0.0.1:389: connect: connection refused" remote_addr=10.94.8.60
t=2022-12-08T22:45:54+0000 lvl=eror msg="Request Completed" logger=context userId=0 orgId=0 uname= method=POST path=/login status=500 remote_addr=10.94.8.60 time_ms=3 size=53 referer=http://harvest-app001:3000/login
[root@harvest-app001 current]#

#

from the docker logs.

#

from prom-stack.yml

#

grafana:
    container_name: grafana
    image: grafana/grafana:8.3.4
    depends_on:
        - prometheus
    ports:
        - 3000:3000
    volumes:
        - grafana_data:/var/lib/grafana
        - ./grafana:/etc/grafana/provisioning # import Harvest dashboards
        - ./docker/grafana/ldap.toml:/etc/grafana/ldap.toml
        - /etc/grafana/ldap.toml:/etc/grafana/ldap.toml
    networks:
        - backend
        - frontend
    restart: unless-stopped
    labels:
        kompose.service.type: nodeport
    environment:
        - GF_AUTH_LDAP_ENABLED=true
        - GF_AUTH_LDAP_CONFIG_FILE=/etc/grafana/ldap.toml

somber meteor
#

hi @fresh epoch maybe your ldap.toml has the wrong address? Looks like Grafana reads the file fine but when it tries to connect to the ldap server it fails with connection refused. Firewall, wrong ip? t=2022-12-08T22:45:54+0000 lvl=eror msg="Error while trying to authenticate user" logger=context userId=0 orgId=0 uname= error="LDAP Result Code 200 "Network Error": dial tcp 127.0.0.1:389: connect: connection refused" remote_addr=10.94.8.60

fresh epoch
#

yah I noticed that error, seems to be trying to send it to the localhost? Can I DM/email you with that file info?

somber meteor
#

what version of Grafana?

fresh epoch
#

fd221b0b9278 grafana/grafana:8.3.4 "/run.sh" 16 hours ago Up 16 hours 0.0.0.0:3000->3000/tcp, :::3000->3000/tcp grafana

#

looks like 8.3.4 ?

somber meteor
#

yep

#

i know something we can check, let me form the command

#

i suspect the ldap.toml grafana is using is different from the one you set up. let's try to confirm that by looking at the one Grafana is using by running docker exec -it grafana less /etc/grafana/ldap.toml
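If eyeballing the file is awkward, the same check can be done with a diff. This is a rough sketch: the function name and the default local path `./ldap.toml` are made up for illustration, and the container name `grafana` is taken from prom-stack.yml. It reads the file as root inside the container (`-u 0`) so restrictive file modes do not get in the way.

```shell
#!/bin/sh
# Sketch: report whether the ldap.toml Grafana sees inside the container
# is the same as your local copy.
# $1: local file to compare (illustrative default ./ldap.toml)
# $2: container name (default "grafana", as in prom-stack.yml)
compare_ldap_config() {
    local_file="${1:-./ldap.toml}"
    container="${2:-grafana}"

    # diff reads the container's copy from stdin ("-") and compares it with
    # the local file. A failed docker exec yields empty input, so it also
    # reports "different".
    if docker exec -u 0 "$container" cat /etc/grafana/ldap.toml | diff - "$local_file" >/dev/null; then
        echo "match"
    else
        echo "different"
    fi
}
```

"different" here would mean the container is not picking up your file, for example because the volume mount is missing or commented out.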

fresh epoch
#

yep, looks like default config

#

vs what we have entered, so may not be reading it in right

somber meteor
#

you can copy your file into the container or set up a volume mount so the container "sees" your version

#

generally a volume mount is better

#

i can dig up the volume mount for you if needed

fresh epoch
#

I believe we have that in the prom-stack.yml

#

volumes:
- grafana_data:/var/lib/grafana
- ./grafana:/etc/grafana/provisioning # import Harvest dashboards
- /etc/grafana/ldap.toml:/etc/grafana/ldap.toml

#

environment:
- GF_AUTH_LDAP_ENABLED=true
- GF_AUTH_LDAP_CONFIG_FILE=/etc/grafana/ldap.toml

#

although, just saw my co-worker remmed that out yesterday so just un-remmed it out again and stopped/started

#

now when I do

#

docker exec -it grafana less /etc/grafana/ldap.toml
less: can't open '/etc/grafana/ldap.toml': Permission denied

somber meteor
#

progress!!! 😄

fresh epoch
#

-rw-r----- 1 root grafana 2654 Dec 8 16:44 ldap.toml

#

unless its taking it from here

#

ls -al ./docker/grafana/ldap.toml
-rw-r----- 1 root root 2641 Dec 8 16:40 ./docker/grafana/ldap.toml

#

that's not owned by grafana

somber meteor
#

your volume mount says the file is on the host machine at /etc/grafana/ldap.toml

#

the permission issue should be on that file or potentially the parent directory /etc/grafana. You could rule those out, I suppose, with some permissive chmods

fresh epoch
#

drwxr-xr-x. 3 root root 84 Dec 8 16:44 grafana

#

that is /etc/grafana

#

ls -al /etc/grafana/ldap.toml
-rw-r----- 1 root grafana 2654 Dec 8 16:44 /etc/grafana/ldap.toml

#

[root@harvest-app001 current]# sestatus
SELinux status: disabled

somber meteor
#

those are the permissions seen inside the container on my side

#

I added the volume like so, just in my current Harvest directory
- ./ldap.toml:/etc/grafana/ldap.toml

fresh epoch
#

strange

#

[root@harvest-app001 current]# docker exec -it grafana ls -la /etc/grafana/
total 48
drwxr-xr-x 3 root root 62 Jan 17 2022 .
drwxr-xr-x 1 root root 66 Dec 9 14:44 ..
-rw-r--r-- 1 root root 43461 Jan 17 2022 grafana.ini
-rw-r----- 1 root 984 2654 Dec 8 22:44 ldap.toml
drwxr-xr-x 5 root root 54 Dec 8 22:42 provisioning

#

its lost its group

#

i'll try how you have it..

somber meteor
#

hmm yeah id 984 is the problem i reckon and that was from yesterday. Maybe stop the grafana container, rm it and reup?

#

perhaps that was from yesterdays experiment

fresh epoch
#

oooh

#

that seemed to help

#

for whatever reason

#

[root@harvest-app001 current]# docker exec -it grafana ls -al /etc/grafana/
total 48
drwxr-xr-x 3 root root 62 Jan 17 2022 .
drwxr-xr-x 1 root root 66 Dec 9 15:01 ..
-rw-r--r-- 1 root root 43461 Jan 17 2022 grafana.ini
-rw-r----- 1 root root 2654 Dec 9 15:00 ldap.toml
drwxr-xr-x 5 root root 54 Dec 8 22:42 provisioning

somber meteor
#

the stop, rm, and reup or moving the file locally?

fresh epoch
#

now its root:root

#

copied the ldap.toml to the current version directory same as where harvest.yml and prom-stack.yml are

#

then I stopped the containers and started it

somber meteor
#

cool, probably caused by yesterday's experiments. we undid those and now hopefully you're all set

fresh epoch
#

Still not letting me in so maybe I have some sort of other config problem in the ldap file

somber meteor
#

hopefully something new in the grafana log file?

fresh epoch
#

not seeing much as far as errors

#

t=2022-12-09T15:06:39+0000 lvl=info msg="LDAP enabled, reading config file" logger=ldap file=/etc/grafana/ldap.toml
t=2022-12-09T15:06:39+0000 lvl=eror msg="Invalid username or password" logger=context userId=0 orgId=0 uname= error="invalid username or password" remote_addr=10.94.8.60
t=2022-12-09T15:06:39+0000 lvl=info msg="Request Completed" logger=context userId=0 orgId=0 uname= method=POST path=/login status=401 remote_addr=10.94.8.60 time_ms=17 size=42 referer=http://harvest-app001:3000/login

somber meteor
#

looks like it's talking to the ldap server now so that's good - but looks like the uname is blank and/or the username/password is wrong? t=2022-12-09T15:06:39+0000 lvl=eror msg="Invalid username or password" logger=context userId=0 orgId=0 uname= error="invalid username or password" remote_addr=10.94.8.60

fresh epoch
#

entering username/password and is correct.. comparing a sample to what I have

fresh epoch
#

OH! got it! went to a simplified version of ldap.toml and that worked!

#

I took the AD example from grafana docs and removed all the #'s basically