#Matter OTA Fails for Thread devices (Msg Retransmission to 1:0000000000000008 failure max retries:4)


daring pagoda
#

Hello!

I'm trying to update the firmware of two Matter-over-Thread HEIMAN sensors (https://www.heimantech.com/product/smart-human-infrared-detector-m1-series and https://www.heimantech.com/product/smart-door-magnetic-detector-d1-series). There's an update available from versions 1.0 and 0X10 to 1.2. The update is downloaded but fails to apply with the error shown in the title.

Here's the Matter Server Addon log:
https://dpaste.org/ctOYT

My setup:
Home Assistant Green
Home Assistant Skyconnect ZBT-1 as Thread Border Router
HEIMAN contact sensor as Thread End Device
HEIMAN motion sensor as Thread End Device
IPv6 enabled

I tried to update both sensors multiple times on different days, and the result is always the same. I also tried performing the update while keeping the sensors awake, with the same result.

Matter Server is on version 8.1.0
OTBR is on version 2.13.0

languid ocean
#

Unfortunately, a number of first-generation Matter devices have trouble performing updates correctly (a bit of a chicken-and-egg situation for your device). I have had better luck with other devices after bringing them closer to the border router and making sure connectivity is good enough, e.g. when a ping to the device typically gets a response in under 60 ms.

#

By the way: you get more specific logs when you SSH into Home Assistant and cd into the updates folder of the Matter add-on. There you will find a separate folder for each node that has had an update attempt (named after the node number). Inside each folder you'll find more fine-grained log information, e.g. showing chunks transmitted and retransmitted.

daring pagoda
#

Hmmm when I ping the device I have lots of missing packets... I'll try to move the border router and sensor to another area of the house and play with the channel to make sure I get a more stable connection! Thanks

Just an FYI: I'm getting pings from 500 ms to 2000 ms, which might be the issue.

dusky schooner
#

That probably means that adding more routing-capable devices to your network (permanently powered things like light bulbs and smart outlets) will improve your network quality.

languid ocean
# daring pagoda Hmmm when I ping the device I have lots of missing packets... I'll try to move t...

Yeah, sleepy end devices usually don't answer pings immediately, because they are sleepy. And I just realized that this is most probably the case with your motion sensor. You will probably see the pings answered in chunks, like so:

64 bytes from fd2e:d139:e716:x: icmp_seq=1 ttl=62 time=1603 ms
64 bytes from fd2e:d139:e716:x: icmp_seq=2 ttl=62 time=619 ms
64 bytes from fd2e:d139:e716:x: icmp_seq=3 ttl=62 time=2223 ms
64 bytes from fd2e:d139:e716:x: icmp_seq=4 ttl=62 time=1209 ms
64 bytes from fd2e:d139:e716:x: icmp_seq=5 ttl=62 time=193 ms
64 bytes from fd2e:d139:e716:x: icmp_seq=6 ttl=62 time=1625 ms
64 bytes from fd2e:d139:e716:x: icmp_seq=7 ttl=62 time=605 ms

The best you can do then is make sure that you're not dropping packets. Also, the ping behavior would probably change while the update is running. At least I guess that the devices would not sleep during the update.
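To check for dropped packets without reading the ping output by eye, something like the following could help. It is only a rough sketch in Python: it assumes the ping output has the same `icmp_seq=... ttl=... time=... ms` shape as the lines above, and the function name `summarize_ping` is made up for this example.

```python
import re
import statistics

# Matches reply lines such as:
#   64 bytes from fd2e:d139:e716:x: icmp_seq=3 ttl=62 time=2223 ms
PING_LINE = re.compile(r"icmp_seq=(\d+)\s+ttl=\d+\s+time=([\d.]+)\s*ms")


def summarize_ping(output: str) -> dict:
    """Summarize ping replies: which sequence numbers are missing, and RTT stats.

    Hypothetical helper for illustration; it only looks at reply lines and
    ignores everything else in the output.
    """
    replies = {}
    for line in output.splitlines():
        match = PING_LINE.search(line)
        if match:
            replies[int(match.group(1))] = float(match.group(2))

    if not replies:
        return {"received": 0, "missing": [], "median_ms": None, "max_ms": None}

    first, last = min(replies), max(replies)
    # Any gap in the sequence numbers between the first and last reply
    # is counted as a dropped packet.
    missing = [seq for seq in range(first, last + 1) if seq not in replies]
    return {
        "received": len(replies),
        "missing": missing,
        "median_ms": statistics.median(replies.values()),
        "max_ms": max(replies.values()),
    }
```

High round-trip times alone are expected for a sleepy device; a non-empty `missing` list is the part that points at real packet loss.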

daring pagoda
#

I tried adding more routers and border routers to the mix, and also tried a different Thread network using only Aqara's border routers. Through HA I still get the Msg Retransmission message.

During the update I could ping the devices, and the response times were much better (between 40 and 150 ms in some cases).

Looking at the logs in the update folder it seems that it just hangs at the end of the process: https://dpaste.org/nkO02

[1757703513.540035][272:272] CHIP:DIS: SRV record already actively processed.
[1757703516.472866][272:272] CHIP:EM: <<3 [E:45100r with Node: <000000000000000B, 1> S:936 M:216794548] (S) Msg Retransmission to 1:000000000000000B
[1757703516.473331][272:272] CHIP:EM: ??4 [E:45100r with Node: <000000000000000B, 1> S:936 M:216794548] (S) Msg Retransmission to 1:000000000000000B in 7643ms [State:Idle II:16800 AI:2200 AT:300]
[1757703524.117112][272:272] CHIP:EM: <<4 [E:45100r with Node: <000000000000000B, 1> S:936 M:216794548] (S) Msg Retransmission to 1:000000000000000B
[1757703524.117606][272:272] CHIP:EM: ??5 [E:45100r with Node: <000000000000000B, 1> S:936 M:216794548] (S) Msg Retransmission to 1:000000000000000B in 12162ms [State:Idle II:16800 AI:2200 AT:300]
[1757703536.280263][272:272] CHIP:EM: <<5 [E:45100r with Node: <000000000000000B, 1> S:936 M:216794548] (S) Msg Retransmission to 1:000000000000000B
[1757703536.280747][272:272] CHIP:EM: ??6 [E:45100r with Node: <000000000000000B, 1> S:936 M:216794548] (S) Msg Retransmission to 1:000000000000000B in 16795ms [State:Idle II:16800 AI:2200 AT:300]
[1757703552.810798][272:272] CHIP:DL: Select failed: src/system/SystemLayerImplSelect.cpp:714: OS Error 0x02000004: Interrupted system call
[1757703552.810949][272:272] CHIP:ZCL: Emitting ShutDown event 
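For anyone comparing several of these logs, here is a small Python sketch that pulls the scheduled retransmissions out of the text. It assumes the lines look exactly like the `CHIP:EM: ??N ... Msg Retransmission to ... in ...ms` lines in the excerpt above. Treating the number after `??` as a retry counter is my reading of the log, not something confirmed here, and the names `ScheduledRetry` and `scheduled_retries` are made up for this example.

```python
import re
from typing import NamedTuple


class ScheduledRetry(NamedTuple):
    timestamp: float  # epoch seconds from the first bracketed field
    counter: int      # number after "??" (assumed to be a retry counter)
    peer: str         # e.g. "1:000000000000000B"
    delay_ms: int     # announced wait before the next retransmission


# Only the "??N ... in <delay>ms" lines match; the "<<N" lines carry no delay.
RETRY_LINE = re.compile(
    r"^\[(\d+\.\d+)\]\[[^\]]*\] CHIP:EM: \?\?(\d+) "
    r".*Msg Retransmission to (\S+) in (\d+)ms"
)


def scheduled_retries(log_text: str) -> list[ScheduledRetry]:
    """Return every scheduled retransmission found in the log text, in order."""
    found = []
    for line in log_text.splitlines():
        match = RETRY_LINE.match(line.strip())
        if match:
            found.append(
                ScheduledRetry(
                    timestamp=float(match.group(1)),
                    counter=int(match.group(2)),
                    peer=match.group(3),
                    delay_ms=int(match.group(4)),
                )
            )
    return found
```

Summing `delay_ms` over the result gives a rough idea of how long the server kept waiting on one message before giving up.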
#

Unfortunately I can't test whether the sensor can be updated on other platforms. Apple Home never reports a new update, and I believe Aqara only updates its own devices and won't communicate with the DCL.

languid ocean
#

Interesting. To me this log seems relatively short. I would have expected a log showing 50 to 100 transmitted chunks, with a couple of log lines for each chunk. I have seen OTA files of 800k and larger.