#Nanoleaf Essentials Disconnecting
You can Dm it if you like, or put it on the aiohomekit github
I wanted to include from startup to point of failure and that was just massive so here it is in a gist: https://gist.githubusercontent.com/MrSaiclops/254e15972b3e162fa8ad59b66412d9b2/raw/b8af716ed41ddf246c8939ed142e45c54bf7b7bb/gistfile1.txt
ok looking at that i see a bunch of ENCRYPT counter=196 without a corresponding DECRYPT - which I think means there is no reply
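A rough way to check a debug log for this pattern is sketched below. It assumes the log lines contain the fragments `ENCRYPT counter=<n>` and `DECRYPT counter=<n>` (only the `ENCRYPT counter=196` form is quoted above; the `DECRYPT` form is a guess), and that the lines passed in all belong to one connection. `unanswered_encrypts` is a made-up helper, not part of aiohomekit.

```python
import re

# Assumed log fragments: "ENCRYPT counter=<n>" and "DECRYPT counter=<n>".
EVENT_RE = re.compile(r"\b(ENCRYPT|DECRYPT) counter=(\d+)")


def unanswered_encrypts(lines):
    """Return the counters of ENCRYPT events that saw no DECRYPT before the next ENCRYPT."""
    unanswered = []
    pending = None  # counter of the latest ENCRYPT still waiting for a reply
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue  # unrelated log line
        kind, counter = match.group(1), int(match.group(2))
        if kind == "ENCRYPT":
            if pending is not None:
                unanswered.append(pending)  # previous request never got a reply
            pending = counter
        else:
            pending = None  # a reply arrived, nothing outstanding
    if pending is not None:
        unanswered.append(pending)  # log ended while still waiting
    return unanswered
```

With several accessories in one log the events interleave, so it is probably worth filtering the lines down to one device before feeding them in.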
So the bulbs aren't talking back to HASS?
no
the ip and port don't change (at least according to HA)
have you restarted since?
sort of curious to see if you turn on the logs again what you see
Nope, I'll restart and let all the bulbs come back online, then send you the log
so if you search for id='70:ac:8f:ed:7e:a4' after a restart, do you see address='fd42:780f:2110:0:13c3:5ff6:67c7:bcc8'
It's 100% success when the server has first started up
i suspect the answer is that you will
in which case the only other clue is the counter
when you restart HA that will reset the counter
so 2 theories - (1) the bulb never replies if the encryption key is wrong. the encryption key changes on every transmission (based on the counter). so if there is any packet loss, the key will seem to be wrong, the device starts to ignore us
(2) the bulb crashes (something we know nanoleaf devices do). on restart, it doesn't remember its active encryption sessions. so it ignores our encrypted packets
we can't stop the bulb crashing, and we can't stop packet loss, so all we can do is try to recover if that happens. if we don't get a reply, we have to tear down the encryption session and start again.
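Theory (1) and the recovery idea can be illustrated with a toy model. This is only a sketch: `ToyPeer` is a hypothetical class, and an HMAC tag stands in for the real authenticated cipher. The one property it borrows from the discussion above is that each packet is bound to a counter that both sides advance in step.

```python
import hashlib
import hmac


class ToyPeer:
    """Hypothetical endpoint whose packets are only valid for the expected counter."""

    def __init__(self, key: bytes):
        self.key = key
        self.send_ctr = 0  # advanced on every packet we send
        self.recv_ctr = 0  # advanced on every packet we accept

    def _tag(self, counter: int, payload: bytes) -> bytes:
        nonce = counter.to_bytes(8, "little")
        return hmac.new(self.key, nonce + payload, hashlib.sha256).digest()

    def seal(self, payload: bytes) -> bytes:
        packet = self._tag(self.send_ctr, payload) + payload
        self.send_ctr += 1
        return packet

    def unseal(self, packet: bytes):
        tag, payload = packet[:32], packet[32:]
        if not hmac.compare_digest(tag, self._tag(self.recv_ctr, payload)):
            return None  # wrong counter: the packet is silently ignored
        self.recv_ctr += 1
        return payload


def demo():
    controller, bulb = ToyPeer(b"session-1"), ToyPeer(b"session-1")
    first = bulb.unseal(controller.seal(b"on"))    # delivered normally
    controller.seal(b"dim")                        # sent but lost on the network
    second = bulb.unseal(controller.seal(b"off"))  # counters now disagree
    # Recovery: throw the session away and start a fresh one on both sides.
    controller, bulb = ToyPeer(b"session-2"), ToyPeer(b"session-2")
    third = bulb.unseal(controller.seal(b"off"))
    return first, second, third
```

In this model one lost packet is enough to make every later packet look wrong to the receiver, and only a new session gets the two sides talking again.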
TBH i thought it already did something like that.
will see what i can do
don't suppose you know how to edit HA code in place?
could probably give you a 1 or 2 line fix to try out if you did
on the off chance that you can, it would be in aiohomekit/controller/coap/connection.py, in the function async def post_bytes(self, payload: bytes, timeout: int = 16.0):
Is this something I can do from the Visual Studio Add-On? Or would I need to nano within Advanced SSH and Terminal?
```diff
  async with self.lock:
      payload = self.encrypt(payload)

      try:
          request = Message(code=Code.POST, payload=payload, uri=self.uri)
          async with asyncio_timeout(timeout):
              response = await self.coap_ctx.request(request).response
      except (NetworkError, asyncio.TimeoutError):
+         if self.coap_ctx:
+             await self.coap_ctx.shutdown()
+             self.coap_ctx = None
          raise AccessoryDisconnectedError("Request timeout")

      if response.code == Code.NOT_FOUND:
          # maybe the accessory lost power or was otherwise rebooted
          logger.debug("CoAP POST returned 404, our session is gone.")
          await self.coap_ctx.shutdown()
          self.coap_ctx = None
      elif response.code != Code.CHANGED:
          logger.warning(f"CoAP POST returned unexpected code {response}")

      return await self._decrypt_response(response)
```
i don't know - not a HAOS user myself
you can do it with nano if you are somewhere where `which hass` returns a path
if you have an ssh that can do docker ps and see the HA container, that would work too
and then yeah, we just need to add the 3 lines i marked with +'s
(obviously without the +s)
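For anyone who wants to see the effect of those three lines in isolation, here is a self-contained simulation. Everything in it is hypothetical (`FakeContext`, `Connection`, the short timeout); it does not use aiohomekit or aiocoap, and reconnecting inside `post` is a simplification for the demo. It only shows the pattern: on a timeout, shut the context down and forget it, so the next request starts from a fresh session.

```python
import asyncio


class FakeContext:
    """Hypothetical stand-in for a CoAP context; may never answer."""

    def __init__(self, responsive: bool):
        self.responsive = responsive
        self.closed = False

    async def request(self, payload: bytes) -> bytes:
        if not self.responsive:
            await asyncio.sleep(3600)  # simulate a peer that ignores us
        return b"ok:" + payload

    async def shutdown(self) -> None:
        self.closed = True


class Connection:
    def __init__(self, contexts):
        self._contexts = iter(contexts)  # contexts handed out on each (re)connect
        self.ctx = None

    async def connect(self) -> None:
        self.ctx = next(self._contexts)

    async def post(self, payload: bytes, timeout: float = 0.05) -> bytes:
        if self.ctx is None:
            await self.connect()  # fresh session after an earlier teardown
        try:
            return await asyncio.wait_for(self.ctx.request(payload), timeout)
        except asyncio.TimeoutError:
            # No reply: tear the session down so the next call starts clean.
            await self.ctx.shutdown()
            self.ctx = None
            raise ConnectionError("Request timeout")


async def demo():
    dead, alive = FakeContext(responsive=False), FakeContext(responsive=True)
    conn = Connection([dead, alive])
    try:
        await conn.post(b"ping")
        first_failed = False
    except ConnectionError:
        first_failed = True
    reply = await conn.post(b"ping")  # goes out over the new context
    return first_failed, dead.closed, reply
```

Running `asyncio.run(demo())` exercises one timed-out request followed by a successful one over the replacement context.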
So Docker ps works from my SSH shell, but when I ls I don't see a directory for aiohomekit
Nevermind it was in /usr/local/lib/python3.11
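Instead of hunting with `ls`, Python can report where a package is installed. A small sketch, meant to be run with the same interpreter that Home Assistant uses (for example inside the HA container); `module_path` is just an illustrative helper name.

```python
import importlib.util


def module_path(name: str):
    """Return the file a top-level package would be imported from, or None if not installed."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Prints the path to aiohomekit's __init__.py, or None if it is not installed here.
    print(module_path("aiohomekit"))
```

The file to edit, `controller/coap/connection.py`, then sits under the directory that contains the reported `__init__.py`.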
Okay, I think it's done. Do I need to restart something?
Yeah you’ll need to restart
So far so good. I'll let it go overnight and see if it's broken by the morning!
Does it normally break fairly often?
Ah you said it was every 5 mins
Ok
If it’s stable in morning that’s a good sign then
I suspect there's some larger issue with my infrastructure because it happened again this morning where EVERYTHING popped off all at once, but then it slowly repaired itself, which had not been happening before. So the fix works!
Was reading the code and there is a case where the remote device returns a 404 Not Found which means the encryption session is invalid, and we handle that.
And we see that when a device restarts
So I’m confused about what happens that makes the device not return anything
And to all devices..
Very weird
There is a risk with this fix that it might cause some problems for battery powered devices
The underlying library does automatic retries
And I don’t know how they interact with the timeout that we are hitting here
I think it’s fine, but might have to back it out if any problems crop up in the beta
I think it may be my router bugging out? Generally everything on my internet connection stops for a couple of seconds and then when I finally get HASS to load up the lights are down
I have AdGuard Home running in an LXC through Proxmox, could DNS routing impact it at all?
It shouldn’t do
I ask because the internet connection issues I experience feel more like a DNS record taking a long time to finalize as opposed to a straight 404 connection
I've got 4 homepods and 2 apple TVs. I can check what the main BR is from HomeKit, right?
No
The primary hub in the Apple app is something else
You know how HA is like a server app ? Apple seem to have made something like that, that runs on the HomePods or AppleTV
But we don’t (and can’t) access that
We actually use a random BR
Are the Apple TVs both wired?
One of them is.
I just unplugged all of the HomePod Minis and woke the wired ATV. Hoping to force that to become the BR
Common complaint in HomeKit discussion that we can't select which device is the BR 🙄
In terms of BR, all of them are BR at once
And when a packet comes from the mesh to your wifi it should come out of the BR nearest the device that sent it
Oh okay so I don't need to worry if my non-wired HomePod Mini is showing as the main Home Hub?
In theory if all of your BRs have trel it doesn’t matter at all
(Look for _trel._udp in an mdns/zeroconf tool)
The HomePods should do if firmware up to date
I think so for Apple TVs too but slightly less certain
Yeah I've got 5 devices in that
What I’m interested in is whether your router can mess it up
Do you have switches and dedicated WiFi APs or just a combo router
I actually just did a writeup on my network infrastructure to ask advice for OpenWrt: https://forum.openwrt.org/t/worth-buying-a-new-router-for-my-stack/184776?u=mrsaiclops
Oh
Funnily enough my Unifi gateway died recently
I’m running openwrt
It feels like thread is behaving better with openwrt
But time will tell
Bought an N100 on a deal yesterday and as soon as it arrives I'll be using it as the router with everything else as dumb APs, so hopefully that helps!
Pretty much my setup
Already I'm pissed that TP-Link limits you at 64 address reservations
Reading the OpenWrt forums it seems like gigabit + homelabbing just requires a full computer as a router. Consumer grade and even entry level enterprise grade hardware is not necessarily built for the level of demand we're pushing
Without entering into the thousands of dollars, that is
I could saturate my fibre link with the unifi gateway but only if I turned features off
Ok if you are replacing the router anyway won’t dig too deep into the root cause just yet
But will try and get a PR open with the fix in in the next few hours
Yeah let me know if I can help identify battery related issues too. I've got Eve Thread items that are flawless on the network which helps me feel sane when Nanoleaf shits the bed
ah nice
what Eve devices?
first PR online here - https://github.com/Jc2k/aiohomekit/pull/358. will merge and tag this evening unless bdraco has anything to add.
I've got 3 BT Motion sensors, 2 Thread, and one door and window Thread Sensor
Yup!
ok
can you make one further change to your HA installation? if you look at https://github.com/Jc2k/aiohomekit/pull/358/files you'll see i've added a debug line
can you make yours look like that, restart ha, turn on the debug logs, then use the "identify" button on the eve devices and see if it (a) works and (b) check to see if that log message is in there
if the identify button works and we don't trigger that new log message we are good to go
Oh shit wait, I did upgrade the Motion Sensors to Matter. Does that impact what I'm about to test?
No I made that Matter too. How about an Onvis 5-Button Thread switch?
Wait, shit that's paired to HomeKit directly, not HASS
Damn, I guess I have less devices on thread than I thought
Lol
The onvis would probably do it if you could temporarily move it
I imagine bit of a pain with automations
Not that big of a pain. If you haven't found anyone else to do it I can try tonight. Don't want to mess around with it remotely from work
Entering toddler o clock here so you will probably be the one
Sounds good. I'll ping you tonight then
@nimble snow So sorry about the long delay! Would it still be valuable for me to test this?
Also, if I update my HASS will I need to redo the changes that you suggested?
They are in 2024.2.0 😱
So just hoping battery powered devices still work 😂
If you can test that would be great
Alright, does it have the debug you added here? https://github.com/Jc2k/aiohomekit/pull/358/files
I'm updating now, will do the connection dance with the Onvis button after
Yes it does 🙂