Several restarts after upgrade: "Reset reason: Exception"

Iotawatt device automatically upgraded itself to 2.06.02 9 days ago. Since then it has restarted three times, each time with the message: Reset reason: Exception.
I only noticed today. It is set to send data to local emoncms instance, and graphs weren’t showing up. I can now see that it is resending all historical data. Screenshot shows part of the “Status” screen.
Screenshot 2021-05-12 at 20.47.47

And here is the message log since the upgrade

5/03/21 19:41:28z Updater: Update from 02_05_12 to 02_06_02
5/03/21 19:41:28z Updater: download 02_06_02
5/03/21 19:41:34z Updater: Release downloaded 6374ms, size 830712
5/03/21 19:41:42z Updater: signature verified
5/03/21 19:41:51z Updater: firmware upgraded to version 02_06_02
5/03/21 19:41:51z Updater: Firmware updated, restarting.

** Restart **

SD initialized.
5/03/21 18:41:59z Real Time Clock is running. Unix time 1620067319 
5/03/21 18:41:59z Reset reason: Software/System restart
5/03/21 18:41:59z Trace:  8:4, 8:6, 8:8, 8:9, 1:2, 1:3, 1:4, 1:5[21], 21:0, 21:1, 21:10, 21:20, 21:21, 21:21[1], 21:30, 21:99, 1:6, 1:3, 1:4, 1:5[3], 3:0, 3:6, 1:6, 1:3, 1:4, 1:5[19], 1:6, 1:3, 1:4, 1:5[5], 5:0, 5:7
5/03/21 18:41:59z ESP8266 ChipID: 2428636
5/03/21 18:41:59z IoTaWatt 5.0, Firmware version 02_06_02
5/03/21 18:41:59z Updater: Installing update files for version 02_06_02
5/03/21 18:41:59z Updater: Installing GRAPH.JS
5/03/21 18:41:59z Updater: Installing GRAPH2.HTM
5/03/21 18:41:59z Updater: Installing GRAPH2.JS
5/03/21 18:41:59z Updater: Installing INDEX.HTM
5/03/21 18:41:59z Updater: Installing TABLES.TXT
5/03/21 18:41:59z Updater: Installing CNFSTYLE.CSS
5/03/21 18:42:00z Updater: Installing EDIT.HTM
5/03/21 18:42:00z Updater: Installing GRAPH.HTM
5/03/21 18:42:00z Updater: Installation complete.
5/03/21 18:42:00z SPIFFS mounted.
5/03/21 19:42:00z Local time zone: +0:00
5/03/21 19:42:00z Using Daylight Saving Time (BST) when in effect.
5/03/21 19:42:00z device name: energi
5/03/21 19:42:00z HTTP server started
5/03/21 19:42:00z emoncms: Starting, interval:10, url:http://emoncms.graabek.com
5/03/21 19:42:00z timeSync: service started.
5/03/21 19:42:00z statService: started.
5/03/21 19:42:00z dataLog: service started.
5/03/21 19:42:02z dataLog: Last log entry 05/03/21 19:41:30
5/03/21 19:42:02z historyLog: service started.
5/03/21 19:42:02z historyLog: Last log entry 05/03/21 19:41:00
5/03/21 19:42:05z WiFi connected. SSID=38ch, IP=172.16.1.9, channel=1, RSSI -71db
5/03/21 19:42:05z MDNS responder started for hostname energi
5/03/21 19:42:05z LLMNR responder started for hostname energi
5/03/21 19:42:05z Updater: service started. Auto-update class is MINOR
5/03/21 19:42:05z emoncms: Start posting at 05/03/21 19:41:20
5/03/21 19:42:07z Updater: Auto-update is current for class MINOR.
5/04/21 03:02:43z WiFi disconnected.
5/04/21 03:02:46z WiFi connected. SSID=38ch, IP=172.16.1.9, channel=6, RSSI -73db

** Restart **

SD initialized.
5/04/21 23:28:54z Real Time Clock is running. Unix time 1620170934 
5/04/21 23:28:54z Reset reason: Exception
5/04/21 23:28:54z Trace:  8:9, 9:3, 9:5, 9:9, 1:2, 1:3, 1:3, 1:6[1], 1:6[3], 1:5[31], 1:6[4], 31:0, 31:1, 31:2[6], 31:90, 21:110, 21:110, 31:91, 31:9, 31:1, 1:6[6], 1:3, 1:3, 1:6[1], 1:6[3], 1:5[31], 1:6[4], 31:0, 31:1, 31:2[4], 3:91, 15:0[32]
5/04/21 23:28:54z ESP8266 ChipID: 2428636
5/04/21 23:28:54z IoTaWatt 5.0, Firmware version 02_06_02
5/04/21 23:28:54z SPIFFS mounted.
5/05/21 00:28:54z Local time zone: +0:00
5/05/21 00:28:54z Using Daylight Saving Time (BST) when in effect.
5/05/21 00:28:54z device name: energi
5/05/21 00:28:54z HTTP server started
5/05/21 00:28:54z emoncms: Starting, interval:10, url:http://emoncms.graabek.com
5/05/21 00:28:54z timeSync: service started.
5/05/21 00:28:54z statService: started.
5/05/21 00:28:54z dataLog: service started.
5/05/21 00:28:55z dataLog: Last log entry 05/05/21 00:28:50
5/05/21 00:28:55z historyLog: service started.
5/05/21 00:28:56z historyLog: Last log entry 05/05/21 00:28:00
5/05/21 00:28:59z WiFi connected. SSID=38ch, IP=172.16.1.9, channel=6, RSSI -75db
5/05/21 00:28:59z MDNS responder started for hostname energi
5/05/21 00:28:59z LLMNR responder started for hostname energi
5/05/21 00:28:59z Updater: service started. Auto-update class is MINOR
5/05/21 00:28:59z emoncms: Start posting at 05/05/21 00:26:30
5/05/21 00:29:01z Updater: Auto-update is current for class MINOR.
5/05/21 03:00:21z WiFi disconnected.
5/05/21 03:00:24z WiFi connected. SSID=38ch, IP=172.16.1.9, channel=13, RSSI -73db
5/06/21 03:02:36z WiFi disconnected.
5/06/21 03:02:42z WiFi connected. SSID=38ch, IP=172.16.1.9, channel=13, RSSI -80db
5/10/21 03:01:25z WiFi disconnected.
5/10/21 03:01:26z WiFi connected. SSID=38ch, IP=172.16.1.9, channel=13, RSSI -75db

** Restart **

SD initialized.
5/10/21 09:50:44z Real Time Clock is running. Unix time 1620640244 
5/10/21 09:50:44z Reset reason: Exception
5/10/21 09:50:44z Trace:  8:9, 9:3, 9:5, 9:9, 1:2, 1:3, 1:3, 1:6[1], 1:6[3], 1:5[31], 1:6[4], 31:0, 31:1, 31:2[6], 31:90, 21:110, 21:110, 31:91, 31:9, 31:1, 1:6[6], 1:3, 1:3, 1:6[1], 1:6[3], 1:5[31], 1:6[4], 31:0, 31:1, 31:2[4], 3:91, 15:0[32]
5/10/21 09:50:44z ESP8266 ChipID: 2428636
5/10/21 09:50:44z IoTaWatt 5.0, Firmware version 02_06_02
5/10/21 09:50:44z SPIFFS mounted.
5/10/21 10:50:44z Local time zone: +0:00
5/10/21 10:50:44z Using Daylight Saving Time (BST) when in effect.
5/10/21 10:50:44z device name: energi
5/10/21 10:50:44z HTTP server started
5/10/21 10:50:44z emoncms: Starting, interval:10, url:http://emoncms.graabek.com
5/10/21 10:50:44z timeSync: service started.
5/10/21 10:50:44z statService: started.
5/10/21 10:50:44z dataLog: service started.
5/10/21 10:50:45z dataLog: Last log entry 05/10/21 10:50:40
5/10/21 10:50:45z historyLog: service started.
5/10/21 10:50:46z historyLog: Last log entry 05/10/21 10:50:00
5/10/21 10:50:49z WiFi connected. SSID=38ch, IP=172.16.1.9, channel=13, RSSI -75db
5/10/21 10:50:49z MDNS responder started for hostname energi
5/10/21 10:50:49z LLMNR responder started for hostname energi
5/10/21 10:50:49z Updater: service started. Auto-update class is MINOR
5/10/21 10:50:49z emoncms: Start posting at 05/10/21 10:48:20
5/10/21 10:50:51z Updater: Auto-update is current for class MINOR.
5/12/21 03:01:33z WiFi disconnected.
5/12/21 03:01:37z WiFi connected. SSID=38ch, IP=172.16.1.9, channel=1, RSSI -72db

** Restart **

SD initialized.
5/12/21 06:48:22z Real Time Clock is running. Unix time 1620802102 
5/12/21 06:48:22z Reset reason: Exception
5/12/21 06:48:22z Trace:  8:9, 9:3, 9:5, 9:9, 1:2, 1:3, 1:3, 1:6[1], 1:6[3], 1:5[31], 1:6[4], 31:0, 31:1, 31:2[6], 31:90, 21:110, 21:110, 31:91, 31:9, 31:1, 1:6[6], 1:3, 1:3, 1:6[1], 1:6[3], 1:5[31], 1:6[4], 31:0, 31:1, 31:2[4], 3:91, 15:0[32]
5/12/21 06:48:22z ESP8266 ChipID: 2428636
5/12/21 06:48:22z IoTaWatt 5.0, Firmware version 02_06_02
5/12/21 06:48:22z SPIFFS mounted.
5/12/21 07:48:22z Local time zone: +0:00
5/12/21 07:48:22z Using Daylight Saving Time (BST) when in effect.
5/12/21 07:48:22z device name: energi
5/12/21 07:48:22z HTTP server started
5/12/21 07:48:22z emoncms: Starting, interval:10, url:http://emoncms.graabek.com
5/12/21 07:48:22z timeSync: service started.
5/12/21 07:48:22z statService: started.
5/12/21 07:48:22z dataLog: service started.
5/12/21 07:48:24z dataLog: Last log entry 05/12/21 07:48:20
5/12/21 07:48:24z historyLog: service started.
5/12/21 07:48:24z historyLog: Last log entry 05/12/21 07:48:00
5/12/21 07:48:29z WiFi connected. SSID=38ch, IP=172.16.1.9, channel=1, RSSI -71db
5/12/21 07:48:29z MDNS responder started for hostname energi
5/12/21 07:48:29z LLMNR responder started for hostname energi
5/12/21 07:48:29z Updater: service started. Auto-update class is MINOR
5/12/21 07:48:29z emoncms: Start posting at 05/11/20 21:17:10
5/12/21 07:48:31z Updater: Auto-update is current for class MINOR.

Trying to sort this out. When your unit auto-updated on 5/3/21, the Emoncms update picked up where it left off prior to the update, losing about 30 seconds of upload. Since then, it has restarted three times on 5/5, 5/10 and 5/12.

The status display was at 20:47:45 today (5/12). The last log entry was 13 hours prior at 7:48:31. Those times are assuming you are in the UK or other UTC+0 time zone. Otherwise, they would need adjustment because your browser will be showing local times in the status display.

So it does appear to have been uploading to Emoncms, at least through 5/11 21:17:10. (UTC). Can you verify that?

I don’t know why the uploader status is saying 20/08/2020 Can you post the entire status display with all of the tabs exposed please?

I also see that you recently changed the WiFi channel and thus improved RSSI.

So:

  • Post full status display
  • verify time-zone
  • Determine where your Emoncms data upload stops.

Full status display:

Timezone:
I am indeed in the UK

The latest data visible in Emoncms is from today 7:47 am.

As can be seen from the status display now as compared with the earlier (partial) screenshot, has gotten through more data. With regards to the data upload, I’ve seen similar behaviour but that was caused by me. I don’t now recall why I did it but at the time there was a good reason. I removed the emoncms data upload configuration and then re-created it. Iotawatt then proceeded to upload all the historical data. Emoncms seemed to ignore the data it had already seen (only one dashboard couldn’t cope with the fact that data had now been uploaded a second time).
So I’m not too concerned about the fact I can’t see current data in Emoncms right now, If Iotawatt stays up I think it will catch up.
My concern is more with the three resets. Obviously, if another reset causes Iotawatt to start over, then it is even more of an issue.

The changed Wifi channel is caused by me having two access points (Ubiquiti), and the controller software occasionally changing the channel the APs are using. In hindsight I’m wondering if the occasionally changing Wifi channels and the two APs can have an influence on this. I turned the 2nd AP on April 15 (or thereabouts). However, the controller had been changing the Wifi channel of the one AP before then. I think the 2nd AP has actually caused the channels to be more stable. In other words, some of the channel jumping will be Iotawatt jumping to a different AP. Just thought I’d give you all potentially relevant information.

The restarts are related to WiFi issues. They are happening every few days and not really a big deal with respect to monitoring integrity. The heap issues that cause them are happening at a lower level than the IoTaWatt firmware. If you can get the IoTaWatt to stick to one AP, that might resolve it.

The issue of Emoncms upload restarts is a different issue. Yes, it’s aggravated by the restarts, but might also have something to do with the way your local instance uses the Reddis database cache.

When the Emoncms uploader starts, it queries the Emoncms inputs status to determine the time of last posting, with the intention of resuming from that point. Emoncms keeps that inputs database in memory using the reddis cache. If your Emoncms has restarted recently, that cache is lost, and IoTaWatt then sees the node as new and tries to upload from the earliest date in the current log, or the “upload history from” date in your IoTaWatt Emoncms setup. Whichever is most recent.

So the first thing I recommend for folks with a local Emoncms instance is to occasionally ratchet up that “upload history from” date to have a backstop.

The more effective way to deal with the issue is to add reddis persistance to the Emoncms reddis. Emoncms uses reddis to reduce SDcard wear that would result from updating the inputs database after every write. Reddis has several persistance modes, but the one I recommend makes an image of the database at regular intervals. I would recommend daily. With that, your restart would at most replay one day’s upload.

I don’t know exactly how to set the persistence, but it’s explained in the reddis site
here

.The Emoncms folks don’t setup their releases with reddis persistence because their products are real-time or near-time only. In the event of an extended communication or server outage, the data is simply lost and it resumes from the time of service restoration. Because IoTaWatt retains all of the historical data, it can take the extra step of querying Emoncms and resuming from the point of failure. Enterprise Emoncms (Emoncms.org) doesn’t have this reddit issue.