Rebooting every ~15 seconds - resolved

I have an Iotawatt that is at a remote location monitoring usage. I use it to calculate a meter reading to submit to the utility company every month as there’s rarely anyone there so no way to physically get a reading. Sometime in the last month the device started rebooting constantly. I just realized it tonight when trying to get this months reading. It also started re-uploading data to my emonCMS instance due to whatever corruption occurred as last month emon read ~9000kWh, and it’s currently reading over 22,000kWh which there’s no way that’s accurate. The 9000kWh is all that’s been used in the 13 months the Iota has been in place.

Is there anyway I can restore this without having to drive an hour and a half to the site? I suspect corruption occurred during a power outage a couple weeks ago. I don’t have the Iota connected to a battery backup as I read something that said it could be a hazard if the Iota was powered and there was no flow through the CTs? Is there a risk if I connect the Iota to a battery backup if there’s an hours long power outage? The site frequently has power flickers (it’s in the middle of the woods) and sometimes has outages that last 4+ hours. There isn’t much connected to the battery backup and it could easily last 2-3 hours during an extended outage.

Here is the log

SD initialized.
9/26/20 02:45:06z Real Time Clock is running. Unix time 1601088306
9/26/20 02:45:06z Reset reason: Software/System restart
9/26/20 02:45:06z Trace: 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:4, 18:5, 1:6, 1:3, 1:4, 1:5[21], 21:0, 21:1, 21:10, 21:10
9/26/20 02:45:06z ESP8266 ChipID: 2527501
9/26/20 02:45:06z IoTaWatt 5.0, Firmware version 02_05_09
9/26/20 02:45:06z SPIFFS mounted.
9/25/20 22:45:06 Local time zone: -5:00
9/25/20 22:45:06 Using Daylight Saving Time (BST) when in effect.
9/25/20 22:45:06 device name: IotaWatt
9/25/20 22:45:06 HTTP server started
9/25/20 22:45:06 timeSync: service started.
9/25/20 22:45:07 statService: started.
9/25/20 22:45:07 dataLog: service started.
9/25/20 22:45:09 dataLog: Last log entry 09/25/20 22:45:05
9/25/20 22:45:11 EmonService: started. url=HTTP://192.168.1.30:7080, node=MudRidge, interval=5, encrypted
9/25/20 22:45:12 Updater: service started. Auto-update class is MINOR
9/25/20 22:45:12 WiFi connected. SSID=Brandon, IP=192.168.128.105, channel=11, RSSI -66db
9/25/20 22:45:12 MDNS responder started for hostname IotaWatt
9/25/20 22:45:12 LLMNR responder started for hostname IotaWatt
9/25/20 22:45:13 EmonService: Start posting at 09/29/19 00:09:45
9/25/20 22:45:13 Updater: Auto-update is current for class MINOR.
9/25/20 22:45:16 Heap memory has degraded below safe minimum, restarting.

** Restart **

SD initialized.
9/26/20 02:45:17z Real Time Clock is running. Unix time 1601088317
9/26/20 02:45:17z Reset reason: Software/System restart
9/26/20 02:45:17z Trace: 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:4, 18:5, 1:6, 1:1[1], 1:2[2], 9:0[2], 9:0, 9:1, 8:4, 8:6, 8:8, 8:9, 9:3, 9:5, 9:9, 1:2, 1:3, 1:4, 1:5[21], 21:0, 21:1, 21:10, 21:10
9/26/20 02:45:17z ESP8266 ChipID: 2527501
9/26/20 02:45:17z IoTaWatt 5.0, Firmware version 02_05_09
9/26/20 02:45:17z SPIFFS mounted.
9/25/20 22:45:17 Local time zone: -5:00
9/25/20 22:45:17 Using Daylight Saving Time (BST) when in effect.
9/25/20 22:45:17 device name: IotaWatt
9/25/20 22:45:17 HTTP server started
9/25/20 22:45:17 timeSync: service started.
9/25/20 22:45:18 statService: started.
9/25/20 22:45:18 dataLog: service started.
9/25/20 22:45:20 dataLog: Last log entry 09/25/20 22:45:15
9/25/20 22:45:22 EmonService: started. url=HTTP://192.168.1.30:7080, node=MudRidge, interval=5, encrypted
9/25/20 22:45:23 Updater: service started. Auto-update class is MINOR
9/25/20 22:45:23 WiFi connected. SSID=Brandon, IP=192.168.128.105, channel=11, RSSI -66db
9/25/20 22:45:23 MDNS responder started for hostname IotaWatt
9/25/20 22:45:23 LLMNR responder started for hostname IotaWatt
9/25/20 22:45:24 EmonService: Start posting at 09/29/19 00:18:30
9/25/20 22:45:24 Updater: Auto-update is current for class MINOR.
9/25/20 22:45:27 Heap memory has degraded below safe minimum, restarting.

** Restart **

SD initialized.
9/26/20 02:45:29z Real Time Clock is running. Unix time 1601088329
9/26/20 02:45:29z Reset reason: Software/System restart
9/26/20 02:45:29z Trace: 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:4, 18:5, 1:6, 1:3, 1:4, 1:5[21], 21:0, 21:1, 21:10, 21:10
9/26/20 02:45:29z ESP8266 ChipID: 2527501
9/26/20 02:45:29z IoTaWatt 5.0, Firmware version 02_05_09
9/26/20 02:45:29z SPIFFS mounted.
9/25/20 22:45:29 Local time zone: -5:00
9/25/20 22:45:29 Using Daylight Saving Time (BST) when in effect.
9/25/20 22:45:29 device name: IotaWatt
9/25/20 22:45:29 HTTP server started
9/25/20 22:45:29 timeSync: service started.
9/25/20 22:45:30 statService: started.
9/25/20 22:45:30 dataLog: service started.
9/25/20 22:45:32 dataLog: Last log entry 09/25/20 22:45:25
9/25/20 22:45:34 EmonService: started. url=HTTP://192.168.1.30:7080, node=MudRidge, interval=5, encrypted
9/25/20 22:45:35 Updater: service started. Auto-update class is MINOR
9/25/20 22:45:35 WiFi connected. SSID=Brandon, IP=192.168.128.105, channel=11, RSSI -66db
9/25/20 22:45:35 MDNS responder started for hostname IotaWatt
9/25/20 22:45:35 LLMNR responder started for hostname IotaWatt
9/25/20 22:45:36 EmonService: Start posting at 09/29/19 00:27:15
9/25/20 22:45:36 Updater: Auto-update is current for class MINOR.
9/25/20 22:45:39 Heap memory has degraded below safe minimum, restarting.

** Restart **

SD initialized.
9/26/20 02:45:40z Real Time Clock is running. Unix time 1601088340
9/26/20 02:45:40z Reset reason: Software/System restart
9/26/20 02:45:40z Trace: 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:2, 18:3, 18:4, 18:5, 1:6, 1:3, 1:4, 1:5[21], 21:0, 21:1, 21:10, 21:10
9/26/20 02:45:40z ESP8266 ChipID: 2527501
9/26/20 02:45:40z IoTaWatt 5.0, Firmware version 02_05_09
9/26/20 02:45:40z SPIFFS mounted.
9/25/20 22:45:40 Local time zone: -5:00
9/25/20 22:45:40 Using Daylight Saving Time (BST) when in effect.
9/25/20 22:45:40 device name: IotaWatt
9/25/20 22:45:40 HTTP server started
9/25/20 22:45:40 timeSync: service started.
9/25/20 22:45:41 statService: started.
9/25/20 22:45:41 dataLog: service started.
9/25/20 22:45:43 dataLog: Last log entry 09/25/20 22:45:35
9/25/20 22:45:45 EmonService: started. url=HTTP://192.168.1.30:7080, node=MudRidge, interval=5, encrypted
9/25/20 22:45:46 Updater: service started. Auto-update class is MINOR
9/25/20 22:45:46 WiFi connected. SSID=Brandon, IP=192.168.128.105, channel=11, RSSI -66db
9/25/20 22:45:46 MDNS responder started for hostname IotaWatt
9/25/20 22:45:46 LLMNR responder started for hostname IotaWatt
9/25/20 22:45:47 EmonService: Start posting at 09/29/19 00:36:00
9/25/20 22:45:47 Updater: Auto-update is current for class MINOR.

Not a lot to go on here, but I do have some ideas. The restarts in the log above are caused by heap memory degradation. Most of the time that’s caused by issues with WiFi. Your RSSI is ok, so I wonder if it has to do with your Emoncms server.

There is an issue with emonpi in that the inputs are not preserved after restart. When IoTaWatt restarts, it queries Emoncms for the times of the last updates of the inputs. Emoncms returns zero. As a result, IoTaWatt starts uploading from the beginning of time. In your case it was last year. So IoTaWatt is uploading from then.

There are a few things you can try:

Set the upload history from date in the IoTaWatt Emoncms configuration to today.

If you can’t do that or it doesn’t fix the problem, try stopping your Emoncms if you can ssh into the pi.

If this stops the restarts, you can get the kWh usage directly from the IoTaWatt using Graph+.

If you get back to me with how these suggestions worked out, I might be able to better understand the problem and offer more informed advice.

As UPS is probably not the best solution.

I was going to try and stop the Emon service in between reboots but it appears full on corruption has occurred. It is no longer responding to pings and cannot access the web UI. I am running Emon in a docker container on a server, instead of a pi. Not sure if that makes a difference. I’m ok with losing the existing data and starting over, I assume I’ll have to reset the Iota to get it back to a working state.

Since it seems like I’m going to have to drive to the site, would it be worth while to connect the Iota to the UPS that is already there to prevent multiple reboots when the power flickers on and off? It regularly flickers on and off multiple times within a few seconds when there are issues here. I’m assuming the quick successive reboots are not good? Is there a danger in having it connected to a UPS?

I think the problem is with Emoncms. My guess is that when the Emoncms is stopped, the IoTaWatt restarts will stop. I’d like to know more about what is happening when the IoTaWatt sends data to Emoncms, but it doesn’t sound like that is possible.

This context issue with restarted Emoncms has been on my radar for a couple of months now. I talked to Trystan about it and the conclusion was that there is little they can do on their end to preserve the inputs state. It has to do with minimizing SD card wear.

Since you are running on a NAS and can write with impunity, there may be a setting that preserves the inputs. I’ll ask the question.

None of the OEM products attempt to upload history after failure. I think the Emonhub saves a limited amount of history in memory to bridge short communication failures, but not power failures. So this isn’t a problem that they have. You just get a hole in your data.

There is no known issue with frequent restarts of the IoTaWatt. If you want to plug the USB power into a UPS, that’s fine, but don’t put the AC adapter on a UPS or any kind of power conditioner. It needs the naked truth.

I don’t doubt you that it’s Emon, just seems odd to me that it sent to Emon for the last 10 months with no problems. I can’t access it at all now to try and stop the service to see if it stops rebooting. It won’t even respond to ping. I’ll probably have to drive down there and power cycle it. That’s good to hear on the frequent restarts. I won’t worry about moving the USB power to UPS then.

Here is the OEM thread with Trystan on the subject:

So the issue with restart of emoncms with IoTaWatt appears to be the Redis persistence, which in probably turned off in your copy. Here is a brief summary of the options. I leave it to you to discover where and how to set these in Emoncms/Docker.

I realize you may have bigger problems with your Emoncms right now, but this is an issue with resynching after power fail, and with HDD you can easily use one or both of the persistence methods. Note Trystan’s comment about RDB only and the undesirable side effect of inflating kWh feeds. So the AOF, in combination with RDB, is probably what you need.

I have no problem trashing my Emon data and starting over fresh. That’s easy enough. The problem I currently have is not being able to access the Iotawatt since last night when I discovered it was constantly rebooting. The Emon container runs in an ESXi cluster at my house and rarely ever loses power/connectivity. The Iota is the remote device and is currently not responding to Pings anymore nor is the webGUI accessible.

I will have to look at that thread though as after a while the reading I was getting would be slightly elevated from the actual meter. I always just assumed it was the Iota being slightly off in it’s reading though. Was never a problem, just meant every few months when I get an actual reading we end up with basically a $0 bill that month.

I can’t speak to what’s happening in Emoncms, but the IoTaWatt has that data and can report it out accurately with Graph+.

Here the time period is “last month”, but you can easily specify the period start date/time and end to match your utility reading. The data statistics show 677 kWh. Going forward IoTaWatt should be very close, although I don’t know the impact of these restarts on the last month. The average is average kWh/day as I’ve grouped “daily”.

To get the total, you also need to specify “accrue” for the wH plot:

Awesome, when I get back into the Iota I’ll have to compare the Graph+ output to what is in Emon.

Edit:

It was Emon. The device was connected to WiFi but didn’t have a valid IP, I was able to force a reconnect through the AP controller and it grabbed a proper IP from DHCP and when I connected it had an uptime of 9+ hours. Watching the Emon status, as soon as it started trying to upload it forced a reboot, I was able to get the process stopped and it’s been running stable the last few minutes. I’ll work on rebuilding my Emon feed and starting over. I assume it’s something corrupted in that feed and that’s the issue. If I create a fresh feed and have the Iota upload data from the past year it should work correctly?

1 Like

Is it possible for graph + to show me kWh when it exceeds 1000? Currently it’s converting automatically to mWh. When I hover on the line itself at the newest data point it’s showing 9.51mWh instead of 951x kWh. I typically take this kWh and manually add it to what the base meter reading is to get the new meter reading in kWh to submit to the utility. I’m also not sure why the data from when it was first setup is showing negative kWh?

Looks like the place has electric heat.

The units is Wh. It will show kWh in the scale when Wh exceeds 1,000. Likewise, it shows mWh when Wh exceeds 1,000,000. If you want to see the exact Wh, click the CSV data.

In your plot you are grouping auto which is every 24 hours. You can also reduce the size of the CSV by grouping weekly, monthly or all which shows just the total Wh.

That is odd. It also might have something to do with the rebooting when trying to upload data from Sept 2019.

It does have electric heat. It gets stranger, if I use the original graph the early data isn’t counted as negative and so the total is higher/I assume more accurate. It also gives the reading in kWh instead of converting the Wh to mWh. The device was setup Sept 1 2019 so I’m not sure why the line is decreasing to 0 before that. I assume because the data is missing? If I check the box for show missing that line prior to Sept 1 2019 disappears.