#I just upgraded my HA Blue yesterday
HA OS is killing off processes and crippling the migration. I can reboot from the shell, but migration just starts again and I end up here.
Please install the glances addon and share what it looks like. Make sure to press z and m to show processes and sort by memory usage before.
Please use imgur or other image sharing web sites, and share the link here.
Image posting is blocked in most channels to discourage people from sharing text as images. Sharing text as images assumes that everybody sees the world as you do, which isn't the case. Some people are colour blind, or have visual impairment that means they can't make sense of an image of text.
After the OS killed some stuff off, it settled back a bit. Here is glances now: https://imgur.com/a/vzs2MSI
Make sure to press z and m to show processes and sort by memory usage before.
This is still not sorted by memory.
Argh. Now when I press any key in glances (like h or m to sort) it just beeps at me and does nothing. I tried leaving glances and coming back, and refreshing the app (cmd+r).
:<
It did that a minute ago and refreshing brought it back.
Some things might be out of view this way but I can see that HA itself uses 2G alone which I think is a bit high.
I'd try starting HA in safe mode and see if the memory usage goes down. Also check this: #general-archived message
I might need to reboot it again. I get this when trying to restart in safe mode:
Failed to restart Home Assistant
The system cannot restart while a database upgrade is in progress.
Do you know how big your database is roughly?
You can check with ls -lh /config/ via the SSH addon.
You can monitor the core logs with ha core logs -vf.
Might take a while if the database is large or the host is slow. I saw that there was a decent amount of IO. For this kind of storage at least.
Yup, I've been watching both OS and CORE logs in a ssh. Here is the db:
-rw-r--r-- 1 root root 675037184 Nov 25 2021 home-assistant_v2.db
-rw-r--r-- 1 root root 643.8M Nov 25 2021 home-assistant_v2.db
(-lh)
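For reference, the two listings above describe the same file: plain ls -l prints the size in bytes, while -lh prints it in binary units. A minimal sketch of that conversion, where the helper name to_mebibytes is made up for illustration:

```python
def to_mebibytes(size_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return size_bytes / (1024 * 1024)


# Byte count taken from the ls -l listing above.
size = to_mebibytes(675037184)
print(f"{size:.2f} MiB")
```

This comes out at about 643.77 MiB, which matches the 643.8M shown by -lh after rounding.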
But it seems OS has killed off the containers and so the migrations get halted until I reboot
I wouldn't reboot just yet. The writes hint towards the upgrade being underway.
You can also monitor progress with watch -n1 ls -lah /config/ | sort -h. A temporary file should be written somewhere.
Lots of OS log entries like:
2024-08-17 17:49:33.490 owl kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/docker-bbde1c54fe766666822523dae17d9b1f40dcc44b4b26894a74b700cb5648d5dd.scope,task=beam.smp,pid=26335,uid=0
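An oom-kill line like the one above is a comma-separated list of key=value fields, so the killed task and its container can be pulled out mechanically. The following is only a sketch: the function name parse_oom_kill and the container_id key are illustrative, and it assumes the docker-<id>.scope naming seen in this log line.

```python
import re

# The kernel log line quoted above, split across string literals for readability.
OOM_LINE = (
    "2024-08-17 17:49:33.490 owl kernel: oom-kill:constraint=CONSTRAINT_NONE,"
    "nodemask=(null),cpuset=/,mems_allowed=0,global_oom,"
    "task_memcg=/system.slice/docker-"
    "bbde1c54fe766666822523dae17d9b1f40dcc44b4b26894a74b700cb5648d5dd.scope,"
    "task=beam.smp,pid=26335,uid=0"
)


def parse_oom_kill(line: str) -> dict:
    """Extract the key=value fields from a kernel oom-kill log line."""
    _, _, payload = line.partition("oom-kill:")
    fields = {}
    for part in payload.split(","):
        key, sep, value = part.partition("=")
        if sep:  # skip bare flags such as "global_oom"
            fields[key] = value
    # Assumption: the memory cgroup is named docker-<container id>.scope
    match = re.search(r"docker-([0-9a-f]+)\.scope", fields.get("task_memcg", ""))
    fields["container_id"] = match.group(1) if match else None
    return fields


info = parse_oom_kill(OOM_LINE)
print(info["task"], info["pid"], info["container_id"][:12])
```

Here the killed task is beam.smp (the Erlang VM binary) with pid 26335. The short container ID can then be compared against the output of docker ps to see which container it belonged to.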
"watch -n1 ls -lah /config/ | sort -h" - very cool! Watching...
Ah, and now the app has lost its connection. 😦
but I am still SSH'ed in from a macOS terminal
SSH runs in its own container. Maybe the homeassistant container was killed.
ha host logs -vf should tell you this too.
You could try stopping the addons (besides SSH, of course) to gain back some head room but I think this should not be necessary.
no output from watch or ha host logs now... Seems like it's ground to a halt.
top in ssh session shows basically zero activity.
I have a somewhat poor idea if you have backups. Stop core, rename the database file and let it create a new one. Restore core later.
yup, daily backups on a SMB share to a Mac Mini
Another way would be to re-install HA completely, restore core and if that upgraded fine restore the addons.
I don't really like these options but I don't have another idea (aside from what I shared initially) right now if the logs are quiet.
let's try the first one. I issued ha core stop in the terminal but it took several seconds and then:
ha core stop
Processing... Done.
Error: Another job is running for job group home_assistant_core
ha jobs info should tell you what they are. Likely the upgrade.
To format your text as code, enter three backticks on the first line, press Enter for a new line, paste your code, press Enter again for another new line, and lastly three more backticks.
```yaml
example: here
```
Don't forget you can edit your post rather than repeatedly posting the same thing.
huh, some watchdog must have kicked in because the app just reconnected and is going through startup as if I had restarted CORE
~ # ha jobs info
ignore_conditions: []
jobs:
- child_jobs: []
  done: false
  errors: []
  name: addon_restart_after_problem
  progress: 0
  reference: core_samba
  stage: null
  uuid: d627493951d04821b0f9ff50a94e6225
- child_jobs: []
  done: false
  errors: []
  name: home_assistant_core_restart
  progress: 0
  reference: null
  stage: null
  uuid: 3706b3c713494a22b25e63f3dff7946a
~ id:browse
glances is back up. https://imgur.com/a/xZMaBIA
ok, thanks @copper sentinel for the help!
675037184 Nov 25 2021 home-assistant_v2.db
It seems this file has not been updated in a looong time. I moved to mariadb - is that the reason? Is the database in some other place?
OK, so I renamed home-assistant_v2.db and rebooted. The system came back up and said it is migrating the database. Core logs say the same: The database is about to upgrade from schema version 44 to 45
But there is no new home-assistant_v2.db
I disabled both the influxdb and mariadb addons and rebooted. Stable (but no recorder). I re-enabled and started influxdb. So far stable. Homeassistant process only using 743M. 64% MEM use, no SWAP use. Now to enable mariadb...
Now I started mariadb and curiously there is no MEM/SWAP creep yet. Also no message on upgrading the database. Maybe I need to restart to trigger that?
I found the mariadb database itself mounted in the container
Strike that. SWAP is at 100% and OS is killing things off.
mariadb is huge...
```
config # docker exec 426efbbf3f62 du -sh /data/databases
24.2G /data/databases
```
Oh. That explains it. Yeah. This is the default sqlite file.
What are your recorder settings?
Recorder in configuration.yaml
db_url: !secret recorder_url
purge_keep_days: 120
recorder_url: mysql://homeassistant:Chr0n1cle@core-mariadb/homeassistant?charset=utf8mb4
426efbbf3f62 in my docker command above is the mariadb container
migration just failed. Here is the log message. https://imgur.com/a/R7g2wDe
120 can be quite a lot.
You don't need more than 10 usually due to long term statistics: #1216777289270951957 message
is 120 the days of full resolution data? Will it also keep a sub-sampled version for much longer?
aha. So let me set that to much smaller.
Here is the final error in the log that got cut off in the imgur: sqlalchemy.exc.PendingRollbackError: Can't reconnect until invalid transaction is rolled back. Please rollback() fully before proceeding (Background on this error at: https://sqlalche.me/e/20/8s2b)
Yeah and HA's recorder is very inefficient at storing such data. If you want granular data for a long time I recommend grafana and a time series database.
HA's database is not quite my expertise.
I do have Grafana installed. What time-series DB? MySQL?
I use VictoriaMetrics but InfluxDB is available as addon.
I also have InfluxDB but it is not used now
MySQL is not a time series database.
But back to topic, I'm not quite sure how to fix the SQL error.
So the immediate question is how do I recover at least some of the data I have. I keep water bill (3 month) stats etc...
The help in the referenced webpage says: When a connection is invalidated, any Transaction that was in progress is now in an invalid state, and must be explicitly rolled back in order to remove it from the Connection.
I suspect when OS terminates the container due to OOM (out of memory) it has corrupted the mariadb.
I was reading in other posts that mariadb was recommended some time ago due to DB corruption issues and that that is no longer a problem so I should use the built-in DB support for recorder.
That's why I had switched to mariadb back then
You could try to temporarily switch the recorder to the default just so you can start. Disable all your non-essential addons, then restore both HA and the MySQL addon to an earlier state.
Have you tried safe mode yet?
I'm in safe mode now
HA tends to get very slow when the database is bigger than 1G or so.
so comment out the configuration.yaml recorder entry and reboot? Then it will use the default (mysql?)
ah. ok. sigh. I wonder if I can recover some of the mariadb - maybe I can manually do a rollback?
Perhaps. I'm very rusty. Haven't played DBa in a while now.
If you restart MySQL the transactions should be invalidated so I don't think that's it. Like it says, a transaction has to succeed.
the mariadb log shows it does a check on startup and it seems to say all the tables are OK.
I set the days to keep to 14 - I'll try one more reboot in the hope that 1. mariadb is not actually corrupted and 2. HA will trim the DB back BEFORE trying to migrate it...
ah. OK, commenting out the recorder entry in config. I can still start mariadb and if I get ambitious I might try to salvage some data from it.
bdraco is the database wizard here but I don't recommend pinging them.
Rather search for the last error in the github organization's issues.
good idea. Thanks for sticking with me @copper sentinel !
cc: @lean reef in case you have any ideas...
OK, commented out recorder and rebooted. Now I have a shiny new empty historical DB :(
-rw-r--r-- 1 root root 4943872 Aug 17 14:21 home-assistant_v2.db
-rw-r--r-- 1 root root 32768 Aug 17 14:21 home-assistant_v2.db-shm
-rw-r--r-- 1 root root 4214792 Aug 17 14:21 home-assistant_v2.db-wal
That was just one step
well, it's growing like a banshee!
-rw-r--r-- 1 root root 181.0M Aug 17 15:20 home-assistant_v2.db
-rw-r--r-- 1 root root 32.0K Aug 17 15:20 home-assistant_v2.db-shm
-rw-r--r-- 1 root root 6.1M Aug 17 15:20 home-assistant_v2.db-wal
What the heck is it recording?
-rw-r--r-- 1 root root 287.2M Aug 17 15:57 home-assistant_v2.db
-rw-r--r-- 1 root root 32.0K Aug 17 15:56 home-assistant_v2.db-shm
-rw-r--r-- 1 root root 6.6M Aug 17 15:57 home-assistant_v2.db-wal
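To put a number on that growth, the three listings of home-assistant_v2.db can be turned into a rough rate. This is just back-of-the-envelope arithmetic on the sizes and timestamps quoted above; the helper name growth_rates is illustrative, and the -wal and -shm files are ignored.

```python
from datetime import datetime

# (time, size in MiB) read off the three listings; 4943872 bytes is about 4.7 MiB.
samples = [
    ("14:21", 4943872 / 1024**2),
    ("15:20", 181.0),
    ("15:57", 287.2),
]


def growth_rates(points):
    """Return MiB per minute between each pair of consecutive samples."""
    rates = []
    for (t0, s0), (t1, s1) in zip(points, points[1:]):
        elapsed = datetime.strptime(t1, "%H:%M") - datetime.strptime(t0, "%H:%M")
        minutes = elapsed.total_seconds() / 60
        rates.append((s1 - s0) / minutes)
    return rates


for rate in growth_rates(samples):
    print(f"{rate:.1f} MiB/min")
```

Both intervals work out to roughly 3 MiB per minute, which would be on the order of 4 GiB per day if it kept going at that pace.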
Check here: #general-archived message
Wow, DbStats is exactly what I needed! I can immediately see what the issue is. One of my home-brew devices (water usage monitor) is spamming the system!
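For anyone without DbStats, a similar per-entity count can be approximated directly against a copy of the SQLite file. This is a sketch, not how DbStats works internally. It assumes a recent recorder schema where states.metadata_id points at states_meta.metadata_id, and the entity names in the demo data are made up.

```python
import sqlite3


def noisiest_entities(conn: sqlite3.Connection, limit: int = 10):
    """Return (entity_id, row_count) pairs, busiest first.

    Assumes states.metadata_id references states_meta.metadata_id;
    older schemas kept entity_id on the states table itself.
    """
    query = """
        SELECT m.entity_id, COUNT(*) AS row_count
        FROM states AS s
        JOIN states_meta AS m ON m.metadata_id = s.metadata_id
        GROUP BY m.entity_id
        ORDER BY row_count DESC
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()


# Tiny in-memory stand-in so the sketch runs without a real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE states_meta (metadata_id INTEGER PRIMARY KEY, entity_id TEXT);
    CREATE TABLE states (state_id INTEGER PRIMARY KEY, metadata_id INTEGER, state TEXT);
    INSERT INTO states_meta VALUES (1, 'sensor.water_usage'), (2, 'sensor.outdoor_temp');
""")
conn.executemany(
    "INSERT INTO states (metadata_id, state) VALUES (?, ?)",
    [(1, str(n)) for n in range(50)] + [(2, "21.5"), (2, "21.6")],
)
print(noisiest_entities(conn))
```

Against real data, open a copy of home-assistant_v2.db with sqlite3.connect() rather than the live file, so the recorder is not disturbed while it is writing.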