Suddenly today I discovered that when I try "Check configuration", I only get the attached message.
Then checking around - access to the file system is dead and the log reports that the database is corrupt.
But HA itself runs as before and all automations and entities work as they should.
Restart, reboot, and shutdown all fail.
WireGuard and SSH&Terminal addons work, but I never set a username/pwd for SSH so I can't get in that way.
I'm only remotely connected to HA.
If I try to run HA from local LAN using HA-IP:8123 I get the logo but not the login screen.
But I can run HA via HA-Cloud.
I can remotely power cycle my Yellow, but I'm afraid to do so as I fear it will never come back up.
Any advice (please - no guesswork)?
#Yellow: File system and database corrupt, but HA still runs
Either the configuration.yaml has been deleted or moved by accident or the filesystem has been corrupted severely.
The fix for the first would be to recreate the configuration.yaml, either by moving it back, extracting it from a backup, or creating it from scratch (if you have not added anything to it).
For the second a full reinstall (not just a factory reset) and a restoration from a backup would be needed.
Certainly, but as I am in a different country with only remote access, none of that is possible as long as I cannot log in via SSH. My only option (as I see it) is to remotely power cycle the Yellow (which I can do), but what are the chances of it ever coming back up?
HA will likely stop entirely if you power cycle the Yellow. The configuration.yaml is needed to start.
It would at least need a basic configuration.yaml with the following content:
default_config:
automation: !include automations.yaml
script: !include scripts.yaml
scene: !include scenes.yaml
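If shell or file access does come back, recreating that file could be scripted roughly as below. This is only a sketch: the function names are made up for illustration, the directory is passed in rather than assumed, and it never overwrites an existing configuration.yaml. It also reports which of the three included files are absent, since the config above refers to them.

```python
from pathlib import Path

# The minimal configuration quoted above, verbatim.
MINIMAL_CONFIG = """\
default_config:
automation: !include automations.yaml
script: !include scripts.yaml
scene: !include scenes.yaml
"""

# Files referenced by the !include lines in MINIMAL_CONFIG.
INCLUDED_FILES = ["automations.yaml", "scripts.yaml", "scenes.yaml"]


def restore_minimal_config(config_dir: str) -> bool:
    """Write a minimal configuration.yaml only if none exists.

    Returns True if the file was written, False if one was already there.
    """
    target = Path(config_dir) / "configuration.yaml"
    if target.exists():
        return False  # never overwrite an existing configuration
    target.write_text(MINIMAL_CONFIG, encoding="utf-8")
    return True


def missing_includes(config_dir: str) -> list:
    """List the included files that are absent from config_dir."""
    directory = Path(config_dir)
    return [name for name in INCLUDED_FILES if not (directory / name).exists()]
```

Something like restore_minimal_config("/config") would be the call, assuming /config is where the configuration directory is mounted on your system.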
Ok, but I find it strange that HA and all its addons still run although the whole file system seems corrupt?
The configuration had already been loaded into memory before the issue occurred.
What makes you believe the filesystem is corrupted? Any errors in the logs?
For now it is just complaining about a missing configuration.yaml. Which might also have been deleted or moved by user accident.
No user accident (there are no other users but me), and this happened suddenly one night with no additional external events.
Is there any way I can determine whether the rest of the file system is corrupt or alive? (Regretfully, none of the addons are available via the web UI.)
The host logs might show I/O errors
If it is indeed corrupted, the only fix is a full re-flash of the OS and restoring a backup.
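If you do manage to get any host or kernel log text out of the box, a quick filter like the one below can pull out the suspicious lines. Treat it as a rough sketch: the patterns are just typical Linux kernel messages for disk trouble (assuming an ext4 data partition), not an exhaustive or official list, and the function name is made up for illustration.

```python
import re

# Typical kernel messages that point at storage trouble.
# Assumed patterns for illustration; extend them to match what your logs show.
IO_ERROR_PATTERNS = [
    re.compile(r"I/O error", re.IGNORECASE),
    re.compile(r"EXT4-fs error", re.IGNORECASE),
    re.compile(r"Remounting filesystem read-only", re.IGNORECASE),
]


def find_io_errors(log_text: str) -> list:
    """Return the log lines that match any of the patterns above."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in IO_ERROR_PATTERNS)
    ]
```

An empty result does not prove the disk is healthy. It only means none of these particular messages appeared in the text you fed in.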
As the image shows, the Host log can't be loaded.
Without a new configuration.yaml, power cycling is a gamble with poor odds.
With one it is still a gamble.
Without access to the serial console, chances are even slimmer.
Do you have a recent backup downloaded to another computer?
Exactly. I just don't understand how the Yellow/HA could corrupt the file system completely. I don't believe this to be a hardware disk failure, as I'm using a high-end Samsung M.2 NVMe and it's only 6 months old.
I do have regular backups to OneDrive, but again - remotely - I'm stuck.
Hardware malfunctions happen. I also had a defective Samsung 980 NVMe. After changing that to a 970 Evo Plus, I had no problems anymore.
Filesystem corruption can also happen if the Yellow is shut down improperly, for example by a power loss.
Yes, we did have power loss - twice - a couple of weeks back, but the Yellow rebooted properly after both incidents. It complained about the DB not being shut down properly, but nothing else seemed to be wrong.