02_06_05 auto-update ALPHA, BETA

This maintenance release fixes a problem with uploading influxDB (1 and 2) during or after an interruption in upload caused by communication failure or server stoppage. The problem has existed since the introduction of 02_06_00.

Expect this to fast track to MINOR and MAJOR as it contains minimal changes since 02_06_02.

07/13/2021 ALPHA
07/19/2021 BETA
08/07/2021 MINOR

2 Likes

Hi there.

Yesterday I upgraded from 02_05_12 to 02_06_05 as it had been in MINOR for a while now and in the past 20 hours I have had 136 restarts where was previously almost all unattended restarts were the six week routine restart. I am have now switched back to MAJOR.

I am using the influxDB_v1 uploader to a local server, a script polling the status endpoint every 10 seconds, and in the last couple of works I have also set up the Home Assistant addon.

20 of the restarts have been logged as “Heap memory has degraded below safe minimum, restarting.” and the remaining 116 have a restart reason of “Exception” but these seem to be heap related too. Plotting the heap free from my custom logger shows it was sitting around 22k free before the update and post-update it is rapidly depleting. Heap starts around 25k free after the restart and drops linearly to ~10k over about 9 minutes. This then repeats.

There’s nothing standing out to me as unusual in the logs other than the restarts. Here’s logs for the 2 restarts before I downgraded:

** Restart **

SD initialized.
9/26/21 20:26:34z Real Time Clock is running. Unix time 1632687994 
9/26/21 20:26:34z Reset reason: Exception
9/26/21 20:26:34z Trace:  1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:1[2], 1:2[3], 9:0[3], 9:0, 9:1, 8:4, 8:6, 8:8, 8:9, 9:3, 9:5, 9:9, 1:2, 1:3, 10:13, 15:0[32], 15:3[32], 15:10, 15:14[32], 15:15[2], 15:16, 15:4
9/26/21 20:26:34z ESP8266 ChipID: 6920572
9/26/21 20:26:34z IoTaWatt 4.x, Firmware version 02_06_05
9/26/21 20:26:34z SPIFFS mounted.
9/26/21 20:26:34z Local time zone: +0:00
9/26/21 20:26:34z device name: IotaWatt
9/26/21 20:26:34z HTTP server started
9/26/21 20:26:34z influxDB_v1: Starting, interval:5, url:http://sentry.lan.backtick.org:8086
9/26/21 20:26:34z timeSync: service started.
9/26/21 20:26:34z statService: started.
9/26/21 20:26:34z dataLog: service started.
9/26/21 20:26:36z dataLog: Last log entry 09/26/21 20:26:30
9/26/21 20:26:39z historyLog: service started.
9/26/21 20:26:39z historyLog: Last log entry 09/26/21 20:26:00
9/26/21 20:26:39z WiFi connected. SSID=🍓, IP=192.168.14.229, channel=1, RSSI -50db
9/26/21 20:26:39z MDNS responder started for hostname IotaWatt
9/26/21 20:26:39z LLMNR responder started for hostname IotaWatt
9/26/21 20:26:39z Updater: service started. Auto-update class is NONE
9/26/21 20:27:19z influxDB_v1: Start posting at 09/26/21 20:26:30
9/26/21 20:35:18z Heap memory has degraded below safe minimum, restarting.

** Restart **

SD initialized.
9/26/21 20:35:19z Real Time Clock is running. Unix time 1632688519 
9/26/21 20:35:19z Reset reason: Software/System restart
9/26/21 20:35:19z Trace:  1:2, 1:3, 10:13, 15:0[32], 15:3[32], 15:10, 15:14[32], 15:15[2], 15:16, 15:4, 1:3, 1:1[9], 1:2, 9:0, 9:0, 8:4, 8:6, 8:8, 8:9, 1:2, 1:3, 1:3, 1:6[1], 1:6[2], 1:6[2], 1:6[3], 1:5[21], 1:6[4], 21:0, 21:1, 21:10, 21:10
9/26/21 20:35:19z ESP8266 ChipID: 6920572
9/26/21 20:35:19z IoTaWatt 4.x, Firmware version 02_06_05
9/26/21 20:35:19z SPIFFS mounted.
9/26/21 20:35:19z Local time zone: +0:00
9/26/21 20:35:19z device name: IotaWatt
9/26/21 20:35:19z HTTP server started
9/26/21 20:35:19z influxDB_v1: Starting, interval:5, url:http://sentry.lan.backtick.org:8086
9/26/21 20:35:19z timeSync: service started.
9/26/21 20:35:19z statService: started.
9/26/21 20:35:19z dataLog: service started.
9/26/21 20:35:21z dataLog: Last log entry 09/26/21 20:35:15
9/26/21 20:35:24z historyLog: service started.
9/26/21 20:35:24z historyLog: Last log entry 09/26/21 20:35:00
9/26/21 20:35:24z WiFi connected. SSID=🍓, IP=192.168.14.229, channel=1, RSSI -45db
9/26/21 20:35:24z MDNS responder started for hostname IotaWatt
9/26/21 20:35:24z LLMNR responder started for hostname IotaWatt
9/26/21 20:35:24z Updater: service started. Auto-update class is NONE
9/26/21 20:36:44z influxDB_v1: Start posting at 09/26/21 20:35:15
9/26/21 20:43:48z Heap memory has degraded below safe minimum, restarting.

I’ll wait for your thoughts before I do anything further. Happy to do testing such as trying other versions, disabling influx v1 and/or Home Assistant integration, etc. to see if that stops the issue.

Cheers,

Jason

The exceptions are heap related. 02_06_05 has been rock solid for a couple of months now, so there must be more to this. Heap degradation is almost always related to WiFi problems. Your WiFi looks solid, so I have no immediate answer.

Also, be advised that yesterday, 02_06_05 became the MAJOR release as well. Your unit should have 02_05_12 in it’s /download directory, and so you can always manually install it using:

http://iotawatt.local/update?release=02_05_12

But to take you up on your offer, I would like to see what happens if you stop influx. I’d also like to take a look at your config.txt file if you would upload it to me in a Private Message.

1 Like

Hi again,

tl;dr: I think there’s a leak somewhere in the HTTP connection handling.

Thanks for the tip on updating to a previously downloaded version. That could come in handy. Given this option I have updated to 02_06_05 again for some more testing.

I stopped the InfluxDB uploader and free heap still seems to be going down at roughly the same rate.

I stopped the InfluxDB uploader again after the IoTaWatt watchdog triggered and then disabled the Home Assistant integration too. No apparent difference.

Finally, that leaves me custom script which polls /config.txt and /status?stats=yes&state=yes&inputs=yes&outputs=yes every 10 seconds. Some debugging reveals this is opening a new HTTP connection for each request which is not ideal but these connections are closed immediately afterwards.

After disabling all 3 of these the heap loss seems to stop. I enabled the hidden heap logging to InfluxDB and restarted IoTaWatt with both InfluxDB and Home Assistant enabled.

With this setup it seems free heap is stabilising around 14k free for 10 minutes. Restarted and saw the same thing: Starts at ~25k free heap and goes down to ~15k over about 9 minutes and then remains stable.

It certainly looks like my Q&D script doing lots of HTTP connections is probably the cause. Is there a leak in the HTTP connection handling? Let’s do some more testing.

Requesting /config.txt with curl in a loop seems to be enough to trigger the the heap watchdog. It about 100 requests. Testing again after the restart and everything has stabilized shows 50 requests in a loop drops the free memory from ~15.3k to ~10.5k and then it stabilizes at ~12.6k. Based on those numbers, another 50 requests would bring it under the 10k safe level. That roughly matches what I was doing with my script: 2 requests every 10 seconds is about 100 requests in 8 minutes.

I have digest authentication enabled if that’s relevant. I’ll send you my config file via private message.

Cheers.

The ESP8266 always closes connections. It is possible to use WebSockets but that’s a whole different approach, and the number of open connections is limited to 4 I believe.

The status transactions should not be causing this. They are simple and quick and also the config app does them at 1 second intervals with no problems.

Reading the config file that often might be the cause. There is a lot of lower level code between the http stream and the SDcard routines. My home systems were going on week 5 with 02_06_05 a few days ago with no heap issues whatsoever. One of them runs all 4 uploaders. So I do think it’s the config.txt reads. Why are you doing that? maybe there is a better way to get the information you need.

Agreed. It’s definitely caused by many requests for data from the file system.

I was using the data in the config file to convert the indexes for input and outputs to names. The configured names certainly can and should be cached though.

Now that I know about the undocumented heap logging feature in the InfluxDB uploader and I have the newish Home Assistant addon feeding my automations, I no longer have any need for this custom script so I’ve now disabled it.

I expect loading the web UI many times will result in eventual heap exhaustion too. While this does seem to be a regression, it may not be worth spending the effort to fix it. Especially since the low heap check should trigger a restart when needed.

Thanks for the excellent support, as usual.