Incessant Restarts, Heap Memory Degraded

Log snippet:

** Restart **

SD initialized.
7/17/25 22:07:09z Real Time Clock is running. Unix time 1752790029 
7/17/25 22:07:09z Reset reason: Software/System restart
7/17/25 22:07:09z Trace:  1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:1[11], 1:2[12], 9:0[12], 9:0, 9:1, 8:4, 8:6, 8:8, 9:3, 9:5, 9:9, 1:3, 1:3, 1:6[1], 1:6[2], 1:6[3], 1:5[21], 1:6[4], 21:0, 21:1, 21:10, 21:10
7/17/25 22:07:09z ESP8266 ID: 15827424, RTC PCF8523 (68)
7/17/25 22:07:09z IoTaWatt 5.0, Firmware version 02_08_03
7/17/25 22:07:09z SPIFFS mounted.
7/17/25 18:07:09 Local time zone: -5:00, using DST/BST when in effect.
7/17/25 18:07:09 device name: IotaWatt
7/17/25 18:07:09 HTTP server started
7/17/25 18:07:09 timeSync: service started.
7/17/25 18:07:09 statService: started.
7/17/25 18:07:09 dataLog: service started.
7/17/25 18:07:12 dataLog: Last log entry 07/17/25 18:07:05
7/17/25 18:07:14 historyLog: service started.
7/17/25 18:07:15 historyLog: Last log entry 07/17/25 18:07:00
7/17/25 18:07:15 WiFi connected. SSID=NETGEAR22, IP=192.168.1.240, channel=11, RSSI -61db
7/17/25 18:07:15 Updater: service started. Auto-update class is MINOR
7/17/25 18:07:15 Heat: Started
7/17/25 18:07:16 Heat: Last log entry 07/17/25 18:07:05
7/17/25 18:07:17 Updater: Auto-update is current for class MINOR.
7/17/25 18:09:00 Heap memory has degraded below safe minimum, restarting.

** Restart **

SD initialized.
7/17/25 22:09:01z Real Time Clock is running. Unix time 1752790141 
7/17/25 22:09:01z Reset reason: Software/System restart
7/17/25 22:09:01z Trace:  9:9, 1:3, 1:3, 1:6[1], 1:6[2], 1:6[2], 1:6[3], 1:5[19], 1:6[4], 1:6[6], 1:1[4], 1:2[5], 9:0[5], 9:0, 9:1, 8:4, 8:6, 8:8, 9:3, 9:5, 9:9, 1:3, 1:3, 1:6[1], 1:6[2], 1:6[3], 1:5[21], 1:6[4], 21:0, 21:1, 21:10, 21:10
7/17/25 22:09:01z ESP8266 ID: 15827424, RTC PCF8523 (68)
7/17/25 22:09:01z IoTaWatt 5.0, Firmware version 02_08_03
7/17/25 22:09:01z SPIFFS mounted.
7/17/25 18:09:01 Local time zone: -5:00, using DST/BST when in effect.
7/17/25 18:09:01 device name: IotaWatt
7/17/25 18:09:01 HTTP server started
7/17/25 18:09:01 timeSync: service started.
7/17/25 18:09:01 statService: started.
7/17/25 18:09:01 dataLog: service started.
7/17/25 18:09:04 dataLog: Last log entry 07/17/25 18:09:00
7/17/25 18:09:06 historyLog: service started.
7/17/25 18:09:07 historyLog: Last log entry 07/17/25 18:09:00
7/17/25 18:09:07 WiFi connected. SSID=NETGEAR22, IP=192.168.1.240, channel=11, RSSI -62db
7/17/25 18:09:07 Updater: service started. Auto-update class is MINOR
7/17/25 18:09:07 Heat: Started
7/17/25 18:09:08 Heat: Last log entry 07/17/25 18:09:00
7/17/25 18:09:09 Updater: Auto-update is current for class MINOR.
7/17/25 18:09:27 Heap memory has degraded below safe minimum, restarting.

** Restart **

SD initialized.
7/17/25 22:09:28z Real Time Clock is running. Unix time 1752790168 
7/17/25 22:09:28z Reset reason: Software/System restart
7/17/25 22:09:28z Trace:  9:1, 8:4, 8:6, 8:8, 9:3, 9:5, 9:9, 1:3, 10:13, 1:3, 1:1[1], 1:2[2], 9:0[2], 9:0, 9:1, 8:4, 8:6, 8:8, 9:3, 9:5, 9:9, 1:3, 1:3, 1:6[1], 1:6[2], 1:6[3], 1:5[21], 1:6[4], 21:0, 21:1, 21:10, 21:10
7/17/25 22:09:28z ESP8266 ID: 15827424, RTC PCF8523 (68)
7/17/25 22:09:28z IoTaWatt 5.0, Firmware version 02_08_03
7/17/25 22:09:28z SPIFFS mounted.
7/17/25 18:09:28 Local time zone: -5:00, using DST/BST when in effect.
7/17/25 18:09:28 device name: IotaWatt
7/17/25 18:09:28 HTTP server started
7/17/25 18:09:28 timeSync: service started.
7/17/25 18:09:28 statService: started.
7/17/25 18:09:28 dataLog: service started.
7/17/25 18:09:31 dataLog: Last log entry 07/17/25 18:09:25
7/17/25 18:09:33 historyLog: service started.
7/17/25 18:09:34 historyLog: Last log entry 07/17/25 18:09:00
7/17/25 18:09:34 WiFi connected. SSID=NETGEAR22, IP=192.168.1.240, channel=11, RSSI -62db
7/17/25 18:09:34 Updater: service started. Auto-update class is MINOR
7/17/25 18:09:34 Updater: Auto-update is current for class MINOR.
7/17/25 18:09:34 Heat: Started
7/17/25 18:09:36 Heat: Last log entry 07/17/25 18:09:25 

I’m not uploading anything. I do keep a browser tab pointed at Graph+, but that’s been the case more-or-less forever, and I do the same with two other IoTaWatt units in different locations and don’t have this kind of issue with them.

Any suggestions would be appreciated.

Heap degradation is usually associated with WiFi issues and poor RSSI. As far as I can tell, there is a memory leak in the ESP8266 IP code. Usually it takes 15-20 minutes to deplete the heap to restart level, but you are restarting in 1 or 2 minutes and your RSSI is good. Nevertheless, my first suspect would be the WiFi. You don’t say if this unit is on the same WiFi as the other two. So I would try changing WiFi things, starting with the channel. Channels 1, 6, and 11 are non-overlapping and are preferred. Connecting it to another WiFi to see if the problem goes away or changes might shed some light on the issue.

The other possible cause would be the SD card. I can’t say memory leaks are a common failure mode there, but SD cards are the only other common cause of IoTaWatt issues, and it’s easy enough to replace one. I’d especially recommend that if your unit is more than 3 years old. Not that they wear out that fast, but that would put you into a period where old firmware bugs could have left you a time capsule.
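
To put numbers on how fast the heap is running out, the time between each "HTTP server started" line and the next heap-restart line can be pulled out of the message log. The following is only a minimal sketch: the function name `heap_restart_uptimes` is made up for illustration, and it assumes the log uses the same line format as the snippet posted above.

```python
import re
from datetime import datetime

# Matches local-time log lines such as "7/17/25 18:09:00 Heap memory has ...".
# UTC lines ("22:07:09z ...") do not match because of the trailing "z".
LINE = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$")

def heap_restart_uptimes(log_text):
    """Seconds from each 'HTTP server started' to the next heap restart."""
    uptimes = []
    started = None
    for raw in log_text.splitlines():
        m = LINE.match(raw.strip())
        if not m:
            continue  # skips "** Restart **", blank lines, and UTC lines
        stamp = datetime.strptime(m.group(1), "%m/%d/%y %H:%M:%S")
        msg = m.group(2)
        if msg.startswith("HTTP server started"):
            started = stamp
        elif msg.startswith("Heap memory has degraded") and started is not None:
            uptimes.append(int((stamp - started).total_seconds()))
            started = None
    return uptimes

# Four lines copied from the log posted above.
sample = """\
7/17/25 18:07:09 HTTP server started
7/17/25 18:09:00 Heap memory has degraded below safe minimum, restarting.
7/17/25 18:09:01 HTTP server started
7/17/25 18:09:27 Heap memory has degraded below safe minimum, restarting.
"""
print(heap_restart_uptimes(sample))  # [111, 26]
```

Uptimes of 111 and 26 seconds are far shorter than the 15-20 minutes a typical heap leak takes, which suggests something is consuming memory much faster than usual.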

My other units are at other locations, so they’re on different WiFi networks, although all of them use the same type of Netgear router. I’ve already tried changing from channel 11 to channel 1. No difference. There’s very little else on that WiFi network; at the moment just a couple of phones, a Roku, and the IoTaWatt. There are a few more things on the wired side of that network; an Enphase gateway, a Cummins generator, as well as another access point at the other end of the house, but all that stuff has been there “forever”.

I’ll try swapping out the WiFi router this weekend. If that doesn’t solve the problem, I’ll try swapping out the SD card.

FWIW:
Current Log: 1.504 GB
History Log: 184.6 MB
Heat Log: 193.0 MB

Is there any chance these logs are simply too big and need to be pruned somehow? I don’t really care about data that’s more than a few weeks old.

The logs can’t be pruned. The current log holds a year and yours is maxed out. The history log can grow, but only by about 135 MB/year. If your SD card is 4 GB or more (max allowed is 32 GB) you should be fine. I’d recommend an 8 GB card if you swap. SDHC class 10.
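
As a rough check on those numbers, here is a back-of-the-envelope sketch of how many years of history growth a given card leaves room for. The function name `years_of_headroom` is hypothetical, and the calculation assumes decimal units, ignores filesystem overhead, and treats the current log and the Heat log as fixed at the sizes reported above.

```python
def years_of_headroom(card_gb,
                      current_log_mb=1504.0,   # "Current Log: 1.504 GB" (already maxed out)
                      history_log_mb=184.6,    # "History Log: 184.6 MB"
                      other_logs_mb=193.0,     # "Heat Log: 193.0 MB", assumed fixed here
                      history_growth_mb_per_year=135.0):
    """Approximate years until the history log fills the remaining space."""
    used_mb = current_log_mb + history_log_mb + other_logs_mb
    free_mb = card_gb * 1000.0 - used_mb
    return free_mb / history_growth_mb_per_year

for size_gb in (4, 8):
    print(size_gb, "GB card:", round(years_of_headroom(size_gb), 1), "years")
```

Under these assumptions a 4 GB card has roughly 15 years of headroom and an 8 GB card roughly 45, so log size on its own is unlikely to be what is causing the restarts.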

I’m confident I’ve found the problem. I had port 80 on the router open, forwarding outside HTTP requests to the IoTaWatt. I shut that down about 30 minutes ago, and she’s been steady as a rock with free Heap at ~24k ever since. So it seems something external was trying to hack into the IoTaWatt server. Yes, it is password protected.

I also checked my other two (at other locations, via remote access). They both show Running Times in excess of two weeks and stable free Heaps. But clearly I need to find a better way to ward off unwanted access attempts than just having port 80 forwarding to an IoTaWatt.

I recommend setting up a local VPN server on each site’s router. Then, you can simply connect via a VPN client to each site and safely access all devices on those networks.

If you put a trace on external requests, I think you will quickly find bots pounding your IP address on every possible port, looking for any vulnerability. @ogiewon’s advice is well founded. That said, port 80 is a particularly bad choice if you want to use port forwarding.

Running time: 1d 23h 45m…
free Heap: 24350

I’m now VERY confident the issue has been identified.

Thinking out loud here: I understand a VPN would be the optimal way to handle this; however, there are a number of reasons why that would be rather tedious. For one thing, all three IoTaWatts are on three different 192.168.1.x subnets, so I’d have to reconfigure those. I’m also not sure a VPN client would handle three different connections simultaneously.

Maybe try something simple like using a nondescript port that forwards to the IoTaWatt port 80. The bots will give up quickly and move on whereas they will pound a port 80 with password attempts forever.

You may want to consider having all three IoTaWatt devices send their data to a cloud server to collect and store everything in a single place. This way, you would not need to have these devices exposed at all to incoming traffic. I use InfluxDB to consolidate data from my two IoTaWatt devices, so I can view everything using Grafana. This may or may not be practical for your needs/requirements, though. I do have the luxury of having all of my devices running locally on my LAN, including my InfluxDB and Grafana servers.

If I recall correctly, others have successfully used a cloud-hosted instance of InfluxDB to collect IoTaWatt data. :thinking: @overeasy would know for sure if that is possible.

Yes, I upload to both a local RPi instance of influxDB1 and the influxDB2 cloud service. I do this primarily as part of continuous testing, but I’ve found uploading my home IoTaWatt units to influx2 is a convenient way to both meld the data and make it accessible from anywhere.

Uploading to the cloud service requires secure HTTPS, so you need a proxy on the LAN to add TLS. The IoTaWatt supports using an NGINX proxy to do this, and the setup is described in the docs. It’s pretty straightforward and you can copy the required NGINX configuration directly from the docs.

While many use Grafana with influx, I just use the integrated influx dashboard system, which is simple to set up and suits my needs.

That’s the approach I’m taking for now; a port up in the 5xxxx range. I only turned this on a few hours ago and checked that it works as expected. We’ll see what happens.

I still have two other systems that I should do something with, but they’ve been running fine despite the routers they’re on simply forwarding port 80.

But this whole thing has me curious: There was no noticeable effect on the quality of the data while those incessant restarts were happening. How is that even possible? I would think data during a restart would be lost. Is there some kind of hardware buffer for the data? Or is a restart not really restarting the whole device, just some of the upper-level threads?

The datalog has 5-second resolution, and after a restart it only takes a couple of seconds to be sampling again. It takes a few more seconds to start the uploaders because that involves querying their destinations, but the uploads work from the datalog, which, as you guessed, is a sort of buffer.
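
The idea of uploading from the datalog rather than from live samples can be illustrated with a toy model. This is not the IoTaWatt firmware, just a hypothetical sketch: the `Uploader` class, the list standing in for the remote database, and the sample values are all invented for illustration.

```python
class Uploader:
    """Toy uploader that resumes from whatever the destination already has."""

    def __init__(self, datalog, remote):
        self.datalog = datalog  # list of (unix_time, value) pairs, oldest first
        self.remote = remote    # list standing in for the remote database

    def query_last_uploaded(self):
        # On startup, ask the destination for the newest timestamp it holds.
        return self.remote[-1][0] if self.remote else None

    def run(self):
        last = self.query_last_uploaded()
        # Send only the datalog entries newer than what the destination has.
        pending = [e for e in self.datalog if last is None or e[0] > last]
        self.remote.extend(pending)
        return len(pending)

datalog = [(0, 1.0), (5, 1.1), (10, 1.2)]  # 5-second entries
remote = []
sent_before = Uploader(datalog, remote).run()

# Simulated restart: the uploader object is lost, but the datalog persists
# on the SD card and keeps growing once sampling resumes.
datalog += [(15, 1.3), (20, 1.4)]
sent_after = Uploader(datalog, remote).run()

print(sent_before, sent_after, len(remote))  # 3 2 5
```

Because the new uploader starts from the destination's last timestamp, a restart causes neither gaps nor duplicates on the remote side, as long as the datalog itself was written.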
