#Upgraded my HA Yellow PoE from CM4 to CM5, it works for hours then shuts off

woven sparrow
#

I lose data from this instance in my graphs on an external device, or notice my Z-Wave devices aren't operational, and when I check on the Yellow it's unavailable/offline, with no LED indicator lights on the hardware.

  • I can cycle POE on the switch port or unplug/re-plug the cable and it comes back up.
  • I'm not seeing anything that stands out as a failure/error in the host/supervisor logs around the timestamp of the last data point in the graph.
  • Checked seating, CM5 is locked in and parallel to the yellow board, and the heat sink is seated and warm to the touch.
  • I swapped from a POE injector I'd been using without issue for year(s?) to direct connection to my POE switch that is 802.3at PoE+, but same result
  • CM5 is 4GB RAM, 16GB eMMC - CM5004016
  • NVMe drive and Zwave HAT are the only other things installed on the Yellow
  • Running Zigbee2MQTT and ZwaveJS UI to send to my external device via MQTT.
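
To pin down when each drop happened (so logs can be checked around the right timestamp), the gaps in the externally stored data points can be found mechanically. This is only a minimal sketch: `find_gaps`, the 5-minute threshold, and the sample times are hypothetical and not taken from this setup.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_interval=timedelta(minutes=5)):
    """Return (last_seen, next_seen) pairs where two consecutive
    samples are further apart than max_interval."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_interval]


# Hypothetical sample times, not real data from this thread.
samples = [
    datetime(2025, 1, 1, 10, 0),
    datetime(2025, 1, 1, 10, 1),
    datetime(2025, 1, 1, 10, 2),
    datetime(2025, 1, 1, 11, 30),  # first sample after the device came back
]

for last_seen, next_seen in find_gaps(samples):
    print(f"offline some time after {last_seen}, back by {next_seen}")
```

The first element of each pair is the last data point before the outage, which is the timestamp to search for in the host/supervisor logs.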

Anything else I can check before I put the CM4 back in?

strange bay
#

Does it work reliably if you power it with a compatible power supply via the barrel jack?

woven sparrow
#

I have a 12 V 3.5 A power supply, should that work for this? Answering my own question: 12 V / 2 A is in the specs, so 3.5 A is fine.
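
The arithmetic behind that, as a quick sketch. The 12 V / 2 A and 3.5 A figures are the ones quoted above; 25.5 W is the 802.3at (PoE+) maximum at the powered device. How much the Yellow actually draws with a CM5, and how much of the PoE budget reaches the board after conversion, are not stated in this thread, so this only compares nominal ratings.

```python
def watts(volts, amps):
    """Electrical power in watts: P = V * I."""
    return volts * amps


SPEC_W = watts(12, 2)       # 12 V / 2 A from the specs quoted above
SUPPLY_W = watts(12, 3.5)   # the 12 V 3.5 A barrel-jack supply
POE_PLUS_PD_W = 25.5        # 802.3at maximum at the powered device

# Nominal headroom only; actual draw of the board is an unknown here.
barrel_headroom_w = SUPPLY_W - SPEC_W
poe_headroom_w = POE_PLUS_PD_W - SPEC_W

print(f"spec {SPEC_W} W, barrel supply {SUPPLY_W} W, PoE+ {POE_PLUS_PD_W} W")
print(f"headroom: barrel {barrel_headroom_w} W, PoE+ {poe_headroom_w} W")
```

On paper the barrel-jack supply has far more margin than PoE+, which is why testing from the barrel jack is a reasonable way to rule the PoE path in or out.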

#

It has booted up, I'll let it run and see if it drops off. The most recent time it lasted ~36 hours before going offline. Will follow up.

woven sparrow
#

Just went off ~13 minutes ago.

strange bay
#

Restart it in safe mode (in the advanced options of the restart menu) - that disables all custom integrations and cards. If it is running fine in that mode, one of the custom things is causing the crashes.

woven sparrow
#

Thanks, I'll give that a try this weekend. I swapped it back to the cm4 after it dropped yesterday and so far stable

woven sparrow
#

following up-- just reinstalled the cm5 and have it restarted in safe mode. i'll let it bake and see how it behaves

woven sparrow
#

it lasted about 11.5 hours before it went offline in safe mode

strange bay
woven sparrow
#

will do, thanks for the assistance!

gloomy wyvern
#

coming in late here but could it be heat and its thermal shutoff?

woven sparrow
#

Would I see anything in the logs if it shut down for thermal issues?

gloomy wyvern
#

not necessarily. try leaving the top off and stick a fan next to it and see if it runs for longer

#

if it works then you may need to adjust your cooling for it

#

old school low tech "stick a desk fan next to it" test

woven sparrow
#

I have a ceiling fan but no desk fan handy. I'll order something small to blow directly at the Yellow and try that. I'll get some graphing of the temp as well to see if there are any obvious trends. Are there other sensors besides the CPU temp to watch?

gloomy wyvern
#

not that i can think of, but i am not sure what temp sensors the cm5 has/exposes

#

cm4 has cpu and composite i think, so there might be stuff on the cm5 too. cpu is gonna be the main one in any case
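
A rough sketch of logging the cpu temp so the last reading before a drop survives on disk. Big assumptions: that the standard Linux sysfs path `/sys/class/thermal/thermal_zone0/temp` is readable wherever this runs (on HAOS that may need host access), and that it reports millidegrees Celsius as usual. `read_celsius` and `log_temps` are made-up helper names.

```python
import os
import time
from datetime import datetime
from pathlib import Path

# Assumption: standard Linux thermal zone; the zone number may differ.
THERMAL_PATH = Path("/sys/class/thermal/thermal_zone0/temp")


def read_celsius(path=THERMAL_PATH):
    """sysfs reports the temperature as an integer in millidegrees C."""
    return int(Path(path).read_text().strip()) / 1000.0


def log_temps(out_path, samples, interval_s=10.0, path=THERMAL_PATH):
    """Append 'timestamp temperature' lines, syncing each one to disk
    so the final reading is still there after an abrupt power loss."""
    with open(out_path, "a") as f:
        for i in range(samples):
            f.write(f"{datetime.now().isoformat()} {read_celsius(path):.1f}\n")
            f.flush()
            os.fsync(f.fileno())
            if i < samples - 1:
                time.sleep(interval_s)


# Example call (not run here): log_temps("cpu_temp.log", samples=360)
```

If the last few lines before an outage show a steady climb, heat is a suspect; if they are flat, it points elsewhere.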

strange bay
#

The CM5 should throttle when it gets too hot.
But it produces more heat than the CM4.

gloomy wyvern
#

yeah it might well be something else. but sticking a fan on it is an easy thing to test

strange bay
#

Is it crashing with the enclosure on, or did you do the tests without it?

woven sparrow
#

I've only tested with the enclosure on so far. I can try again without the enclosure first then the fan when it arrives.

woven sparrow
#

didn't last long with case off, and temps stayed stable while it was on.

gloomy wyvern
#

probably not thermal then. that's not particularly high. was worth a shot

woven sparrow
#

it was worth ruling that out as a possible cause, thank you for the suggestion!

woven sparrow
#

I got a response pretty quickly to my request. I've sent over some log files and answered a few questions, still working on it. In the meantime my CM4 is back up and running without issue. I'm wondering if the CM5 is part of the problem and would like to test it without taking HA down. Think I'd have similar problems if I plugged it into an IO board and let it bake?

strange bay
#

The IO board design is different. Because of that, the Yellow has its own HAOS variant.
But if you have an IO board around, it might be worth a try.

woven sparrow
#

It's less expensive and more available than a CM5 🙂 I'll give it a try!

fickle hill
#

Hi @woven sparrow .. did you ever figure this out? I have a CM4 that seems to go away once a day but only during summer months ... so even though it might not be heat on the CM/CPU there still might be heat issues on the Yellow itself ... (I would assume the PoE hardware could do something like that ...)

woven sparrow
#

Sadly no, the response I got from support didn't get me to resolution with the CM5, so I reverted to the CM4.

Unfortunately I don't have any further ideas. I've been keeping an eye out on another issue with CM5 where the variant with 16GB RAM is only showing 8GB available in HA OS and these have also been accompanied by reports of it running hot. However, the CM5 runs hotter in general, and it could be that it requires a more active cooling solution for some use cases...

#

I bought a different IO board to test the CM5 with HA and the device stayed on without issue, so I'm assuming it's something with the Yellow board.