I'm trying to resolve a really annoying issue: my Home Assistant instance seems to lose its network connection frequently and at random, then restore it just as randomly.
It doesn't appear to be a client side issue because multiple devices are impacted at the same time.
It appears to affect only the Home Assistant process, because the add-on containers are still reachable and the machine can be reached by other means (Samba shares are still up, and I can view the console via Proxmox).
The HA process doesn't seem to be restarting or frozen since I can watch the logs roll in showing the connections failing while this happens.
There seems to be no consistency in what causes these disconnections or in how long they last.
I have tried removing all custom components, creating a fresh instance and restoring from a backup and the issue still happens.
For what it's worth, I'm running HAOS on Proxmox. My gut says it's not Proxmox related, since this seems to be happening specifically to the hass process, and other sibling Docker processes (hass add-ons) on that VM are still able to connect.
#Home Assistant process loses network connection.
Can you share some of those logs? Does it happen in safe mode too?
If Proxmox is taxed for resources, it's not at all out of the question that it's causing your HA instance to ebb and flow. Have you watched the resource utilization on the Proxmox VM to see if it's using the full extent of its memory? What about the CPU cores? I've seen an overtaxed Proxmox try to balance resources across VMs in a way that would do precisely what you are describing.
I'm having exactly the same problem. I also tried removing my custom integrations, to no avail. There's no rhyme or reason... I'm also using HAOS on top of Proxmox. At the moment I have the VM by itself on the node to make sure it has nothing to do with the other VMs, but it's still having the same problem.
What are the logs we should be looking at?
Well they mentioned logs so I wanted to see them.
You generally want to check ha host logs, ha supervisor logs, ha core logs.
Note that you can use -vf as arguments for these to follow and get more information. ha supervisor logs -vf for example.
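One way to make those logs easier to skim, as a rough sketch: pipe them through a small grep filter. The `filter_suspicious` helper and its keyword list are made up for illustration; they are not an official set of Home Assistant log markers, so adjust them to whatever shows up around a disconnect.

```shell
# Hypothetical helper: keep only lines that mention common trouble keywords.
# The keyword list is a guess and is matched case-insensitively.
filter_suspicious() {
  grep -iE 'error|warning|timeout|timed out|disconnect'
}

# Usage on the HA host (needs the ha CLI, so it is left commented out here):
#   ha core logs | filter_suspicious
#   ha supervisor logs | filter_suspicious
#   ha host logs | filter_suspicious
```

Comparing the timestamps of whatever survives the filter against the moments the frontend dropped is probably the quickest way to see which of the three logs is worth reading in full.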
Does this show up as Connection Lost. Reconnecting... on the front end? If so I'm getting the same but I also have a notification on the desktop saying Login attempt with invalid authentication.... May have to remove my post from earlier today if this is indeed the same. The Supervisor log shows a timeout error but I'm not sure this is related
Yup. "Connection Lost. Reconnecting...". I don't have any notification about "Login attempt with invalid authentication....".
Ok thanks for clarifying. No idea how to troubleshoot this but probably start with the logs ... Still want to remove that login attempt error to remove this as a possibility
What would I look for in these logs?
I don't know. Something that sticks out as wrong.
Have you found anything @pastel sierra? I can check the various logs as suggested by @karmic path but have no idea what to look out for. The fact it happens both on the LAN and remotely (via the companion app) would make me believe it has nothing to do with Nabu Casa.
Actually I'm wondering whether we could use Wireshark to sniff the network but again I don't know the communication requirements to filter the traffic between the desktop and the server
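On the filtering question, a minimal sketch: on the LAN, the frontend normally talks to Home Assistant over TCP port 8123 (the default, assuming it has not been changed), so a capture filter on the server's address and that port should isolate the relevant traffic. The `capture_filter` helper and the address `192.0.2.10` are placeholders, not values from this thread.

```shell
# Hypothetical helper: build a pcap capture filter for traffic to and from
# the HA server. $1 = server IP, $2 = port (assumed default: 8123).
capture_filter() {
  printf 'host %s and tcp port %s\n' "$1" "${2:-8123}"
}

# Example: print a filter for a placeholder address.
capture_filter 192.0.2.10
# The printed string can be pasted into Wireshark's capture filter field,
# or passed to tcpdump on a Linux desktop (left commented out here):
#   tcpdump -i any -w ha.pcap "$(capture_filter 192.0.2.10)"
```

This only covers direct LAN access; traffic that goes through Nabu Casa remote access will not appear as a connection to port 8123 on the server's LAN address.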
If you're running Proxmox and have a certain Intel NIC, you should disable hardware offloading. Connect a monitor to your Proxmox host and check if you get something like en0 reset. If you do, follow the steps below. Remember to either do this at every reboot or put it in the post-up config for your iface. Everything is in the forum thread below. 🙂
https://forum.proxmox.com/threads/e1000e-reset-adapter-unexpectedly.87769/
Hello,
I have the latest Proxmox installed with an Intel 82579V Gb network controller.
Proxmox ui (and CT/VM) are not reachable for a few seconds.
This would happen up to several times a day, seemingly at random.
What am I supposed to do to solve that issue ?
Is it a driver or hardware issue ?
The reason why this does not seem to affect (LXC) containers is that the actual connection dropout time is very short (and for example smb can easily deal with this). Also, you can't see this in the Proxmox web shell, you need to actually connect a monitor/KVM to see that error.
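For anyone who does hit this, here is a hedged sketch of what disabling offloading with ethtool can look like. The interface name `eno1` is an assumption (check `ip link` for the real one), and the choice of `tso` and `gso` is one common combination, not a quote from the linked thread. The `disable_offload` wrapper is made up for illustration and only prints the command unless `DRY_RUN=0` is set.

```shell
# Hypothetical wrapper around ethtool. By default (DRY_RUN=1) it only prints
# the command so nothing on the host is changed by accident.
disable_offload() {
  iface="$1"
  # Assumption: turning off TCP segmentation offload and generic segmentation
  # offload is enough; other offloads may also need disabling on some NICs.
  cmd="ethtool -K $iface tso off gso off"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$cmd"
  else
    $cmd
  fi
}

# Dry run for an assumed interface name.
disable_offload eno1

# To apply for real, run as root on the Proxmox host:
#   DRY_RUN=0 disable_offload eno1
# To persist across reboots, the same command can go in the iface stanza of
# /etc/network/interfaces (the ethtool path may need to be absolute):
#   post-up ethtool -K eno1 tso off gso off
```

The setting made by a one-off `ethtool -K` call is lost on reboot, which is why the post-up line matters if the change turns out to help.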
Thanks. Unfortunately I'm not running that config. HA runs as a VM on Fedora Server KVM Hypervisor