Iotawatt self-reboots often and then influxdb doesn't restart automatically (since latest MINOR release)


#1

Since release 02_03_20, I have iotawatt self-rebooting from time to time for “Heap memory has degraded below safe minimum, restarting”, and then additionally influxdb stops working, but emoncms continues working. Here the logs:

Here latest status message in Status/influxdb: (and that time is not correct, latest status was 1/20/2019 02:38:00.

Stopped, Last update 12/29/2018 12:00:00 AM

while EmonCMS status is running:

Running, Last update 1/20/2019 7:28:10 PM

Here the full logs since installation end of last year. Notice how since update, it self-restarts often, and how each time i have to re-save the influxdb settings to restart it (this time hit the Start button and it started fine, adding latest logs as second quotation):

`12/28/18 23:38:55 dataLog: service started.
12/28/18 23:38:55 dataLog: Last log entry 12/28/18 19:07:15
12/28/18 23:38:55 historyLog: service started.
12/28/18 23:38:55 historyLog: Last log entry 12/28/18 19:07:00
12/28/18 23:39:07 Updater: Auto-update is current for class MINOR.
12/29/18 16:05:46 EmonService: started. url=10.2.0.200:9092, node=IotaWatt, interval=10
12/29/18 16:05:47 EmonService: Node doesn’t yet exist.
12/29/18 16:05:47 EmonService: Start posting at 12/29/18 00:00:10
12/29/18 17:58:20 EmonService: HTTP response -4, retrying.
12/29/18 17:58:27 EmonService: Retry successful after 15 attempts.
12/29/18 18:03:53 EmonService: HTTP response -4, retrying.
12/29/18 18:03:53 EmonService: Retry successful after 3 attempts.
12/29/18 21:51:19 influxDB: started, url=10.2.0.200:8086, db=bruyere, interval=10
12/29/18 21:51:20 influxDB: Start posting at 12/29/18 00:00:10
12/30/18 23:53:13 EmonService: HTTP response -14, retrying.
12/30/18 23:57:00 influxDB: Post Failed: -4
12/31/18 00:04:04 EmonService: Retry successful after 49 attempts.
1/01/19 01:39:30 EmonService: HTTP response -4, retrying.
1/01/19 01:39:42 EmonService: Retry successful after 19 attempts.
1/01/19 02:33:40 EmonService: HTTP response -4, retrying.
1/01/19 02:34:26 EmonService: Retry successful after 34 attempts.
1/01/19 02:57:10 EmonService: HTTP response -4, retrying.
1/01/19 02:57:50 EmonService: Retry successful after 32 attempts.
1/01/19 22:13:22 EmonService: HTTP response -4, retrying.
1/01/19 22:14:47 EmonService: Retry successful after 7 attempts.
1/02/19 18:50:02 Updater: Update from 02_03_18 to 02_03_20
1/02/19 18:50:02 Updater: download 02_03_20
1/02/19 18:50:15 Updater: Release downloaded 13374ms, size 660192
1/02/19 18:50:24 Updater: Update downloaded and signature verified
1/02/19 18:50:30 Updater: firmware upgraded to version 02_03_20
1/02/19 18:50:30 Firmware updated, restarting.

** Restart **

SD initialized.
1/02/19 17:50:37z Real Time Clock is running. Unix time 1546451437
1/02/19 17:50:37z Version 02_03_20
1/02/19 17:50:37z Updater: Installing update files for version 02_03_20
1/02/19 17:50:37z Updater: Installing GRAPH.HTM
1/02/19 17:50:37z Updater: Installing GRAPH.JS
1/02/19 17:50:37z Updater: Installing INDEX.HTM
1/02/19 17:50:38z Updater: Installing TABLES.TXT
1/02/19 17:50:38z Updater: Installing CNFSTYLE.CSS
1/02/19 17:50:38z Updater: Installing EDIT.HTM
1/02/19 17:50:38z Updater: Installation complete.
1/02/19 17:50:38z Reset reason: Software/System restart
1/02/19 17:50:38z Trace: 1:2[2], 9:0[2], 9:0, 9:1, 8:4, 8:6, 8:8, 8:9, 9:3, 9:5, 9:9, 1:2, 1:3, 1:4, 1:5[19], 1:6, 1:1[2], 1:2[3], 9:0[3], 9:0, 9:1, 8:4, 8:6, 8:8, 8:9, 9:3, 9:5, 9:9, 1:2, 1:3, 1:4, 1:5[5]
1/02/19 17:50:38z ESP8266 ChipID: 8529514
1/02/19 17:50:39z SPIFFS mounted.
1/02/19 18:50:40 Local time zone: +1:00
1/02/19 18:50:40 device name: IotaWatt
1/02/19 18:50:40 MDNS responder started for hostname IotaWatt
1/02/19 18:50:40 LLMNR responder started for hostname IotaWatt
1/02/19 18:50:40 HTTP server started
1/02/19 18:50:40 WiFi connected. SSID=Bruyere, IP=10.1.0.219, channel=2, RSSI -79db
1/02/19 18:50:40 timeSync: service started.
1/02/19 18:50:40 statService: started.
1/02/19 18:50:40 Updater: service started. Auto-update class is MINOR
1/02/19 18:50:40 dataLog: service started.
1/02/19 18:50:40 dataLog: Last log entry 01/02/19 18:50:15
1/02/19 18:50:40 historyLog: service started.
1/02/19 18:50:40 historyLog: Last log entry 01/02/19 18:50:00
1/02/19 18:50:42 Updater: Auto-update is current for class MINOR.
1/02/19 18:50:45 EmonService: started. url=10.2.0.200:9092, node=IotaWatt, interval=10
1/02/19 18:50:45 influxDB: started, url=10.2.0.200:8086, db=bruyere, interval=10
1/02/19 18:50:46 EmonService: Start posting at 01/02/19 18:50:10
1/02/19 18:50:46 influxDB: Start posting at 01/02/19 18:50:10
1/12/19 22:51:42 EmonService: HTTP response -4, retrying.
1/12/19 22:52:00 influxDB: Post Failed: -4
1/13/19 02:18:23 EmonService: Retry successful after 501 attempts.
1/13/19 21:58:59 EmonService: HTTP response -11, retrying.
1/13/19 21:58:59 EmonService: Retry successful after 3 attempts.
1/14/19 02:27:25 EmonService: HTTP response -4, retrying.
1/14/19 02:28:50 EmonService: Retry successful after 7 attempts.
1/18/19 18:57:41 Heap memory has degraded below safe minimum, restarting.

** Restart **

SD initialized.
1/18/19 17:57:43z Real Time Clock is running. Unix time 1547834263
1/18/19 17:57:43z Version 02_03_20
1/18/19 17:57:43z Reset reason: Software/System restart
1/18/19 17:57:43z Trace: 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:4, 18:5, 1:6, 1:3, 1:4, 1:5[21]
1/18/19 17:57:43z ESP8266 ChipID: 8529514
1/18/19 17:57:43z SPIFFS mounted.
1/18/19 18:57:44 Local time zone: +1:00
1/18/19 18:57:44 device name: IotaWatt
1/18/19 18:57:44 MDNS responder started for hostname IotaWatt
1/18/19 18:57:44 LLMNR responder started for hostname IotaWatt
1/18/19 18:57:44 HTTP server started
1/18/19 18:57:44 WiFi connected. SSID=Bruyere, IP=10.1.0.219, channel=2, RSSI -92db
1/18/19 18:57:44 timeSync: service started.
1/18/19 18:57:44 statService: started.
1/18/19 18:57:44 Updater: service started. Auto-update class is MINOR
1/18/19 18:57:44 dataLog: service started.
1/18/19 18:57:44 dataLog: Last log entry 01/18/19 18:57:40
1/18/19 18:57:44 historyLog: service started.
1/18/19 18:57:44 historyLog: Last log entry 01/18/19 18:57:00
1/18/19 18:57:46 Updater: Auto-update is current for class MINOR.
1/18/19 18:57:49 EmonService: started. url=10.2.0.200:9092, node=IotaWatt, interval=10
1/18/19 18:57:49 influxDB: started, url=10.2.0.200:8086, db=bruyere, interval=10
1/18/19 18:57:55 EmonService: Start posting at 01/18/19 18:57:40
1/18/19 18:57:58 influxDB: Start posting at 01/18/19 18:57:00
1/18/19 19:03:54 Heap memory has degraded below safe minimum, restarting.

** Restart **

SD initialized.
1/18/19 18:03:56z Real Time Clock is running. Unix time 1547834636
1/18/19 18:03:56z Version 02_03_20
1/18/19 18:03:56z Reset reason: Software/System restart
1/18/19 18:03:56z Trace: 1:5[7], 7:0, 7:9, 1:6, 1:3, 1:4, 1:5[7], 7:0, 7:9, 1:6, 1:1[5], 1:2[6], 9:0[6], 9:0, 9:1, 8:4, 8:6, 8:8, 8:9, 9:3, 9:5, 9:9, 1:2, 1:3, 1:4, 1:5[7], 7:0, 7:9, 1:6, 1:3, 1:4, 1:5[21]
1/18/19 18:03:56z ESP8266 ChipID: 8529514
1/18/19 18:03:56z SPIFFS mounted.
1/18/19 19:03:57 Local time zone: +1:00
1/18/19 19:03:57 device name: IotaWatt
1/18/19 19:03:57 MDNS responder started for hostname IotaWatt
1/18/19 19:03:57 LLMNR responder started for hostname IotaWatt
1/18/19 19:03:57 HTTP server started
1/18/19 19:03:57 timeSync: service started.
1/18/19 19:03:58 statService: started.
1/18/19 19:03:58 dataLog: service started.
1/18/19 19:03:58 dataLog: Last log entry 01/18/19 19:03:50
1/18/19 19:03:58 historyLog: service started.
1/18/19 19:03:58 historyLog: Last log entry 01/18/19 19:03:00
1/18/19 19:04:02 EmonService: started. url=10.2.0.200:9092, node=IotaWatt, interval=10
1/18/19 19:04:02 influxDB: started, url=10.2.0.200:8086, db=bruyere, interval=10
1/18/19 19:05:08 WiFi connected. SSID=Bruyere, IP=10.1.0.219, channel=2, RSSI -94db
1/18/19 19:05:08 Updater: service started. Auto-update class is MINOR
1/18/19 19:05:12 EmonService: Start posting at 01/18/19 19:04:00
1/18/19 19:05:16 Updater: Auto-update is current for class MINOR.
1/18/19 19:05:45 influxDB: last entry query failed: -11
1/18/19 19:05:45 influxDB: Stopped. Last post 01/18/19 19:03:20
1/18/19 19:47:10 EmonService: HTTP response -4, retrying.
1/18/19 19:47:33 EmonService: Retry successful after 4 attempts.
1/18/19 19:49:18 EmonService: HTTP response -11, retrying.
1/18/19 19:49:18 EmonService: Retry successful after 3 attempts.
1/19/19 01:57:03 influxDB: started, url=10.2.0.200:8086, db=bruyere, interval=10
1/19/19 01:57:04 influxDB: Start posting at 01/18/19 19:03:30
1/20/19 02:41:06 Heap memory has degraded below safe minimum, restarting.

** Restart **

SD initialized.
1/20/19 01:41:08z Real Time Clock is running. Unix time 1547948468
1/20/19 01:41:08z Version 02_03_20
1/20/19 01:41:08z Reset reason: Software/System restart
1/20/19 01:41:08z Trace: 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:4, 18:5, 1:6, 1:3, 1:4, 1:5[21]
1/20/19 01:41:08z ESP8266 ChipID: 8529514
1/20/19 01:41:08z SPIFFS mounted.
1/20/19 02:41:09 Local time zone: +1:00
1/20/19 02:41:09 device name: IotaWatt
1/20/19 02:41:09 MDNS responder started for hostname IotaWatt
1/20/19 02:41:09 LLMNR responder started for hostname IotaWatt
1/20/19 02:41:09 HTTP server started
1/20/19 02:41:09 WiFi connected. SSID=Bruyere, IP=10.1.0.219, channel=2, RSSI -92db
1/20/19 02:41:09 timeSync: service started.
1/20/19 02:41:09 statService: started.
1/20/19 02:41:09 Updater: service started. Auto-update class is MINOR
1/20/19 02:41:09 dataLog: service started.
1/20/19 02:41:09 dataLog: Last log entry 01/20/19 02:41:05
1/20/19 02:41:09 historyLog: service started.
1/20/19 02:41:09 historyLog: Last log entry 01/20/19 02:41:00
1/20/19 02:41:14 EmonService: started. url=10.2.0.200:9092, node=IotaWatt, interval=10
1/20/19 02:41:14 influxDB: started, url=10.2.0.200:8086, db=bruyere, interval=10
1/20/19 02:41:20 EmonService: Start posting at 01/20/19 02:41:10
1/20/19 02:41:28 influxDB: last entry query failed: -11
1/20/19 02:41:28 influxDB: Stopped. Last post 12/29/18 00:00:00
1/20/19 02:41:32 Updater: Invalid response from server. HTTPcode: -4
1/20/19 03:01:13 EmonService: HTTP response -11, retrying.
1/20/19 03:01:17 EmonService: Retry successful after 3 attempts.
1/20/19 03:41:14 Updater: Auto-update is current for class MINOR.
`

1/20/19 19:34:59 influxDB: started, url=10.2.0.200:8086, db=bruyere, interval=10 1/20/19 19:35:00 influxDB: Start posting at 01/20/19 02:40:30

Any idea why it self-restarts so often ?

And why influxdb doesn’t auto-restart, while emonCMS does ?


#2

There are two issues here:

The restarts due to heap degradation is characteristic of a lot of WiFi error recovery. The memory leak is hard to nail down and I think it happens in a lower part of the IP stack that I have no control over. These things don’t happen with strong WiFi. Id suggest you use one of the mainstream channels 1, 6, 11 and sometimes 4 works OK.

The problem with influx stopping is caused by WiFi errors, but the recovery isn’t robust enough. The new release 02_03_21 due out in ALPHA this week has improvements in that area that I expect will resolve that if the WiFi can be improved somewhat.


#3

Thank you for your quick reply and planed improvement. The IotaWatt works awesome, other than this reboot and influxdb reconnect issue!

Wifi is indeed not very strong but not super-weak either in the basement. Will see if/how I can improve. An Ethernet connection or wired external USB-Ethernet adapter (as I understand no solution available at this stage?) would solve issue best in terms of connectivity and absence of electromagnetic radiations (electro-smog).