I have been using Node-RED with Home Assistant for several years now. I have 679 HA nodes via websocket and 12 via MQTT. HA itself has 362 devices and runs on a Raspberry Pi 5.
Recently the system has started to fail. The visible symptom is "Error 3 No websocket", but I suspect the real problem starts before that, because processor usage peaks at 25% (i.e. one core fully busy) and the temperature goes from around 55°C to a runaway 80°C.
In the logs, CronTab complains that the time has been changed by some busy process, and there are also mentions of unresolved async waits.
Obviously I've tried to isolate whether any one flow triggers this behaviour. For a while I moved one chunk of Node-RED flows onto my PC (yes, I was surprised that you could do that!) and that solved the problem. However, I'm not sure whether the bigger, faster PC would simply have run into the same problem eventually, perhaps after weeks or months. The Pi 5 hits the problem every few hours. In any case, leaving the big PC on 24x7 is not a long-term solution.
Attachment 1 is a screen grab of processor use and temperature. It also shows memory use, which does not indicate a memory leak.
Attachment 2 is a recent set of logs, which also gives version numbers and configuration details.
Questions:
I am aware of node.warn and node.log, but are there any other ways to track node performance?
The HA monitor aggregates in 5-minute chunks, so pinning down exactly when the spikes occur is difficult. If I could see the exact time a node started (and finished) and compare it to a fine-grained CPU %, that might help. A usage chart showing total time and number of calls for each node over, say, the last hour would also do.
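For the fine-grained CPU side, something like the following might do as a starting point. It is only a sketch, assuming Python 3 can be run on the Pi (for example from an SSH session) and that /proc/stat is readable there. The function names (parse_proc_stat, cpu_percent, log_samples) are made up for illustration and are not part of Node-RED or HA.

```python
import time
from datetime import datetime


def parse_proc_stat(text):
    """Return {cpu_name: (busy_ticks, total_ticks)} from /proc/stat content."""
    result = {}
    for line in text.splitlines():
        if not line.startswith("cpu"):
            continue  # skip intr, ctxt, btime, etc.
        fields = line.split()
        values = [int(v) for v in fields[1:]]
        # Field order: user nice system idle iowait irq softirq steal ...
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values[:8])  # guest time is already counted in user/nice
        result[fields[0]] = (total - idle, total)
    return result


def cpu_percent(prev, curr):
    """Percent busy per CPU between two parse_proc_stat() snapshots."""
    usage = {}
    for name, (busy, total) in curr.items():
        if name not in prev:
            continue
        prev_busy, prev_total = prev[name]
        delta_total = total - prev_total
        if delta_total > 0:
            usage[name] = 100.0 * (busy - prev_busy) / delta_total
        else:
            usage[name] = 0.0
    return usage


def log_samples(interval=1.0, count=60, path="/proc/stat"):
    """Print one timestamped line of per-core CPU % every `interval` seconds."""
    with open(path) as f:
        prev = parse_proc_stat(f.read())
    for _ in range(count):
        time.sleep(interval)
        with open(path) as f:
            curr = parse_proc_stat(f.read())
        usage = cpu_percent(prev, curr)
        stamp = datetime.now().isoformat(timespec="seconds")
        cores = " ".join(f"{k}={v:.0f}%" for k, v in sorted(usage.items()))
        print(stamp, cores)
        prev = curr
```

Calling log_samples(interval=1.0, count=3600) would give an hour of one-second samples per core. The idea would be to line those timestamps up against the timestamps on node.warn messages placed at the start and end of suspect nodes, to see which node is running when one core pegs.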
Am I being optimistic running so many devices on the Pi 5? (Almost all Zigbee, most local, little cloud.)
An uprated fan for the Pi is arriving soon, but I don't think this is a thermal problem to start with. The heat looks more like a consequence.
In the meantime I have added an HA automation which buzzes a wee siren in my office so I can restart the Node-RED app, hopefully giving me another 10 or 12 hours.
TIA