#I just upgraded my HA Blue yesterday

gentle relic
#

HA OS is killing off processes and crippling the migration. I can reboot from the shell, but migration just starts again and I end up here.

copper sentinel
#

Please install the glances addon and share what it looks like. Make sure to press z and m to show processes and sort by memory usage before.

fallow rapids BOT
#

Please use imgur or other image sharing web sites, and share the link here.

Image posting is blocked in most channels to discourage people from sharing text as images. Sharing text as images assumes that everybody sees the world as you do, which isn't the case. Some people are colour blind, or have visual impairment that means they can't make sense of an image of text.

gentle relic
#

After the OS killed some stuff off, it settled back a bit. Here is glances now: https://imgur.com/a/vzs2MSI

copper sentinel
#

Make sure to press z and m to show processes and sort by memory usage before.

gentle relic
copper sentinel
#

This is still not sorted by memory.

gentle relic
#

Argh. Now when I press any key in glances (like h or m to sort) it just beeps at me and does nothing. I tried leaving glances and coming back, and refreshing the app (cmd+r).

copper sentinel
#

:<

gentle relic
#

It did that a minute ago and refreshing brought it back.

copper sentinel
#

Some things might be out of view this way but I can see that HA itself uses 2G alone which I think is a bit high.

gentle relic
#

I had to use the browser. The macOS App is not always fun 🙂

copper sentinel
gentle relic
#

I might need to reboot it again. I get this when trying to restart in safe mode:

#

Failed to restart Home Assistant
The system cannot restart while a database upgrade is in progress.

copper sentinel
#

Do you know how big your database is roughly?

#

You can check with ls -lh /config/ via the SSH addon.

#

You can monitor the core logs with ha core logs -vf.

#

Might take a while if the database is large or the host is slow. I saw that there was a decent amount of IO. For this kind of storage at least.

gentle relic
#

Yup, I've been watching both OS and Core logs in an SSH session. Here is the db:

#

-rw-r--r-- 1 root root 675037184 Nov 25 2021 home-assistant_v2.db

#

-rw-r--r-- 1 root root 643.8M Nov 25 2021 home-assistant_v2.db

#

(-lh)
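
The two listings show the same file size, once in bytes and once in 1024-based units. A minimal Python sketch of that conversion follows; `human_size` is a hypothetical helper written for illustration, not something taken from `ls` or Home Assistant.

```python
def human_size(num_bytes: int) -> str:
    """Format a byte count with 1024-based units, similar to `ls -h`."""
    units = ["B", "K", "M", "G", "T"]
    size = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if size < 1024 or unit == units[-1]:
            return f"{size:.1f}{unit}"
        size /= 1024


# Byte count copied from the first listing above.
print(human_size(675037184))
```

So the SQLite file is roughly 0.63 GiB, far smaller than the MariaDB data directory measured later in the thread.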

#

But it seems OS has killed off the containers and so the migrations get halted until I reboot

copper sentinel
#

I wouldn't reboot just yet. The writes hint towards the upgrade being underway.
You can also monitor progress with watch -n1 'ls -lah /config/ | sort -h' (the quotes keep the pipe inside watch). A temporary file should be written somewhere.

gentle relic
#

Lots of OS log entries like:

#

2024-08-17 17:49:33.490 owl kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/docker-bbde1c54fe766666822523dae17d9b1f40dcc44b4b26894a74b700cb5648d5dd.scope,task=beam.smp,pid=26335,uid=0
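
An `oom-kill` kernel line like the one above names the killed task, its PID, and the Docker scope it ran in. The sketch below pulls those fields out so the container can be identified. It is a rough illustration that assumes the line format shown here, and `parse_oom_kill` is a hypothetical helper.

```python
import re

# Log line copied from the message above, split only for readability.
OOM_LINE = (
    "2024-08-17 17:49:33.490 owl kernel: oom-kill:constraint=CONSTRAINT_NONE,"
    "nodemask=(null),cpuset=/,mems_allowed=0,global_oom,"
    "task_memcg=/system.slice/docker-"
    "bbde1c54fe766666822523dae17d9b1f40dcc44b4b26894a74b700cb5648d5dd.scope,"
    "task=beam.smp,pid=26335,uid=0"
)


def parse_oom_kill(line: str) -> dict[str, str]:
    """Return the key=value fields that follow 'oom-kill:' in the line."""
    _, _, payload = line.partition("oom-kill:")
    fields = {}
    for item in payload.split(","):
        key, sep, value = item.partition("=")
        if sep:  # skip bare flags such as 'global_oom'
            fields[key] = value
    return fields


info = parse_oom_kill(OOM_LINE)
# The first 12 hex characters are the short container ID that Docker prints.
container = re.search(r"docker-([0-9a-f]{12})", info["task_memcg"])
print(info["task"], info["pid"], container.group(1))
```

The short ID can then be compared with the container IDs Docker reports on the host to see which addon was killed.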

#

"watch -n1 'ls -lah /config/ | sort -h'" - very cool! Watching...

#

Ah, and now the app has lost its connection. 😦

#

but I am still SSH'ed in from a macOS terminal

copper sentinel
#

SSH runs in its own container. Maybe the homeassistant container was killed.

#

ha host logs -vf should tell you this too.

#

You could try stopping the addons (besides SSH, of course) to gain back some head room but I think this should not be necessary.

gentle relic
#

no output from watch or ha host logs now... Seems like it's ground to a halt.

#

top in ssh session shows basically zero activity.

copper sentinel
#

I have a somewhat poor idea if you have backups. Stop core, rename the database file and let it create a new one. Restore core later.

gentle relic
#

yup, daily backups on a SMB share to a Mac Mini

copper sentinel
#

Another way would be to re-install HA completely, restore core and if that upgraded fine restore the addons.
I don't really like these options but I don't have another idea (aside from what I shared initially) right now if the logs are quiet.

gentle relic
#

Let's try the first one. I issued ha core stop in the terminal but it took several seconds and then:

#

ha core stop

Processing... Done.

Error: Another job is running for job group home_assistant_core

copper sentinel
#

ha jobs info should tell you what they are. Likely the upgrade.

fallow rapids BOT
#

To format your text as code, enter three backticks on the first line, press Enter for a new line, paste your code, press Enter again for another new line, and lastly three more backticks.
```yaml
example: here
```
Don't forget you can edit your post rather than repeatedly posting the same thing.

gentle relic
#

Huh, some watchdog must have kicked in because the app just reconnected and is going through startup as if I had restarted Core.

#

~ # ha jobs info
ignore_conditions: []
jobs:

  • child_jobs: []
    done: false
    errors: []
    name: addon_restart_after_problem
    progress: 0
    reference: core_samba
    stage: null
    uuid: d627493951d04821b0f9ff50a94e6225
  • child_jobs: []
    done: false
    errors: []
    name: home_assistant_core_restart
    progress: 0
    reference: null
    stage: null
    uuid: 3706b3c713494a22b25e63f3dff7946a
    ~ id:browse

copper sentinel
#

Lol. That last link 😄
That's why we use code blocks 🙂

#

Gotta go for now.

gentle relic
#

ok, thanks @copper sentinel for the help!

#

675037184 Nov 25 2021 home-assistant_v2.db

#

It seems this file has not been updated in a looong time. I moved to MariaDB - is that the reason? Is the database in some other place?

gentle relic
#

OK, so I renamed home-assistant_v2.db and rebooted. The system came back up and said it is migrating the database. Core logs say the same: The database is about to upgrade from schema version 44 to 45

#

But there is no new home-assistant_v2.db

gentle relic
#

And I'm back at 100% SWAP etc.

#

and OS just killed off a bunch of stuff

gentle relic
#

I disabled both the influxdb and mariadb addons and rebooted. Stable (but no recorder). I re-enabled and started influxdb. So far stable. The homeassistant process is only using 743M. 64% MEM use, no SWAP use. Now to enable mariadb...

#

Now I started mariadb and curiously there is no MEM/SWAP creep yet. Also no message on upgrading the database. Maybe I need to restart to trigger that?

gentle relic
#

I found the mariadb database itself mounted in the container

#

Strike that. SWAP is at 100% and OS is killing things off.

#

mariadb is huge...
```
config # docker exec 426efbbf3f62 du -sh /data/databases
24.2G /data/databases
```

copper sentinel
#

What are your recorder settings?

gentle relic
#

Recorder in configuration.yaml

#

  db_url: !secret recorder_url
  purge_keep_days: 120

#

recorder_url: mysql://homeassistant:Chr0n1cle@core-mariadb/homeassistant?charset=utf8mb4

#

426efbbf3f62 in my docker command above is the mariadb container

copper sentinel
#

120 can be quite a lot.

gentle relic
#

is 120 the days of full resolution data? Will it also keep a sub-sampled version for much longer?

#

aha. So let me set that to much smaller.

#

Here is the final error in the log that got cut off in the imgur: sqlalchemy.exc.PendingRollbackError: Can't reconnect until invalid transaction is rolled back. Please rollback() fully before proceeding (Background on this error at: https://sqlalche.me/e/20/8s2b)

copper sentinel
#

Yeah, and HA's recorder is very inefficient at storing such data. If you want granular data for a long time I recommend Grafana and a time series database.

#

HA's database is not quite my expertise.

gentle relic
#

I do have Grafana installed. What time-series DB? MySQL?

copper sentinel
#

I use VictoriaMetrics but InfluxDB is available as an addon.

gentle relic
#

I also have InfluxDB but it is not used now

copper sentinel
#

MySQL is not a time series database.

#

But back to topic, I'm not quite sure how to fix the SQL error.

gentle relic
#

So the immediate question is how do I recover at least some of the data I have. I keep water bill (3 month) stats etc...

#

The help in the referenced webpage says: When a connection is invalidated, any Transaction that was in progress is now in an invalid state, and must be explicitly rolled back in order to remove it from the Connection.

#

I suspect when OS terminates the container due to OOM (out of memory) it has corrupted the mariadb.

#

I was reading in other posts that mariadb was recommended some time ago due to DB corruption issues and that that is no longer a problem so I should use the built-in DB support for recorder.

#

That's why I had switched to mariadb back then

copper sentinel
#

You could try to temporarily switch the recorder to the default just so you can start. Disable all your non-essential addons, then restore both HA and the MySQL addon to an earlier state.
Have you tried safe mode yet?

gentle relic
#

I'm in safe mode now 🙂

copper sentinel
#

HA tends to get very slow when the database is bigger than 1G or so.

gentle relic
#

so comment out the configuration.yaml recorder entry and reboot? Then it will use the default (mysql?)

copper sentinel
#

The default is SQLite but yes.

#

The .db file is SQLite.

gentle relic
#

ah. ok. sigh. I wonder if I can recover some of the mariadb - maybe I can manually do a rollback?

copper sentinel
#

Perhaps. I'm very rusty. Haven't played DBA in a while now.

#

If you restart MySQL the transactions should be invalidated so I don't think that's it. Like it says, a transaction has to succeed.

gentle relic
#

the mariadb log shows it does a check on startup and it seems to say all the tables are OK.

#

I set the days to keep to 14 - I'll try one more reboot in the hope that 1. mariadb is not actually corrupted and 2. HA will trim the DB back BEFORE trying to migrate it...

copper sentinel
#

"Trim" only happens on Sundays.

#

It's kind of an intense process.

gentle relic
#

Ah. OK, commenting out the recorder entry in config. I can still start mariadb and if I get ambitious I might try to salvage some data from it.

copper sentinel
#

bdraco is the database wizard here but I don't recommend pinging them.
Rather, search for the last error in the GitHub organization's issues.

gentle relic
#

good idea. Thanks for sticking with me @copper sentinel !

#

cc: @lean reef in case you have any ideas...

#

OK, commented out recorder and rebooted. Now I have a shiny new empty historical DB :(
-rw-r--r-- 1 root root 4943872 Aug 17 14:21 home-assistant_v2.db
-rw-r--r-- 1 root root 32768 Aug 17 14:21 home-assistant_v2.db-shm
-rw-r--r-- 1 root root 4214792 Aug 17 14:21 home-assistant_v2.db-wal

copper sentinel
#

That was just one step 🙂

gentle relic
#

Well, it's growing like a banshee!
-rw-r--r-- 1 root root 181.0M Aug 17 15:20 home-assistant_v2.db
-rw-r--r-- 1 root root 32.0K Aug 17 15:20 home-assistant_v2.db-shm
-rw-r--r-- 1 root root 6.1M Aug 17 15:20 home-assistant_v2.db-wal

gentle relic
#

What the heck is it recording?
-rw-r--r-- 1 root root 287.2M Aug 17 15:57 home-assistant_v2.db
-rw-r--r-- 1 root root 32.0K Aug 17 15:56 home-assistant_v2.db-shm
-rw-r--r-- 1 root root 6.6M Aug 17 15:57 home-assistant_v2.db-wal
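
To put the growth in numbers, the sketch below extrapolates from the 15:20 and 15:57 listings of home-assistant_v2.db. It assumes growth stays linear, which is only a rough approximation, and it takes the year from the kernel log timestamp earlier in the thread.

```python
from datetime import datetime

# (timestamp, size in MiB) copied from the two listings above.
first = (datetime(2024, 8, 17, 15, 20), 181.0)
second = (datetime(2024, 8, 17, 15, 57), 287.2)


def growth_per_day(a, b):
    """Linearly extrapolate growth between two samples to MiB per day."""
    minutes = (b[0] - a[0]).total_seconds() / 60
    return (b[1] - a[1]) / minutes * 24 * 60


rate = growth_per_day(first, second)
print(f"{rate:.0f} MiB/day, about {rate / 1024:.1f} GiB/day")
```

Under that assumption the file grows by roughly 4 GiB per day, which fits with something writing states far more often than a typical sensor.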

copper sentinel
gentle relic
#

Wow, DbStats is exactly what I needed! I can immediately see what the issue is. One of my home-brew devices (water usage monitor) is spamming the system!
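
The same kind of answer can be approximated with a row count per entity. The sketch below is not how DbStats works. It runs against an in-memory SQLite database with an assumed, simplified subset of the recorder schema (`states` joined to `states_meta`), and the entity names and row counts are hypothetical sample data.

```python
import sqlite3

# Assumed, simplified subset of the recorder schema; real tables have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE states_meta (metadata_id INTEGER PRIMARY KEY, entity_id TEXT);
    CREATE TABLE states (state_id INTEGER PRIMARY KEY, metadata_id INTEGER, state TEXT);
""")
conn.executemany(
    "INSERT INTO states_meta VALUES (?, ?)",
    [(1, "sensor.water_usage"), (2, "sensor.outdoor_temperature")],
)
# Hypothetical sample rows: one chatty sensor and one quiet one.
conn.executemany(
    "INSERT INTO states (metadata_id, state) VALUES (?, ?)",
    [(1, str(i)) for i in range(500)] + [(2, "21.5")] * 3,
)

# Count recorded state rows per entity, busiest first.
TOP_ENTITIES = """
    SELECT m.entity_id, COUNT(*) AS row_count
    FROM states AS s
    JOIN states_meta AS m ON m.metadata_id = s.metadata_id
    GROUP BY m.entity_id
    ORDER BY row_count DESC
    LIMIT 10
"""
top = conn.execute(TOP_ENTITIES).fetchall()
for entity_id, row_count in top:
    print(f"{row_count:>6}  {entity_id}")
```

Against a real database, the table and column names should be checked first, since the recorder schema changes between versions (this thread alone involves a migration from schema 44 to 45).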