Degraded heap while uploading to InfluxDB v2

After some guidance here. I’ve had InfluxDB v1 running for a while without any issue. However, since creating my new v2 instance, I’m not able to upload without the heap degrading and the upload restarting from scratch each time.

I’ve tried to turn on “heap” logging but every time I try to save the config in the editor I’m presented with the following message ERROR[0]:

The WiFi connection to the IoTaWatt is around RSSI: -46, so no issues there. Please find below a list of the measurements I’m trying to send.

Any assistance would be appreciated.

OK, so unpacking these problems:

That’s a pretty large upload. Can I see the whole influxDB2 setup? Is this the only uploader that you have running or are you also still running the influxDB1 uploader and possibly others? Heap leakage is associated with heavy WiFi activity. Yes, a poor connection exacerbates that, but just doing very high volume writing, such as uploading this huge amount of data from history, can cause it. I can’t speculate more without seeing the whole picture.

I don’t know why the upload would start from scratch each time. I’d need to see the message log for a few restart cycles to understand what is happening.

I’ve said before that while I’ve divulged the existence of this debug tool, it’s not a strictly supported feature. It needs to be turned on by editing the config file, which is a dangerous thing to begin with.

That said, and assuming you are adding valid Json, the protocol for uploading the config file has changed with release 02_06_00. The edited file must be uploaded as config+1.txt. The IoTaWatt will vet that file and if OK, will replace the existing config.txt with the new file, saving the old config.txt as config-1.txt. This new approach was enabled by recent support for renaming files that was absent from the core previously. It adds a layer of protection from a variety of timing windows and syntactic problems that previously could leave you without a valid config.
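Since the device only accepts valid Json, it can save a round trip to check the edited text locally before uploading it. The following is a minimal sketch, not part of IoTaWatt: the function names and the sample keys are hypothetical, and it only checks syntax and writes a local file named config+1.txt. The file still has to be uploaded to the IoTaWatt, which does its own vetting.

```python
import json
import tempfile
from pathlib import Path


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def prepare_config_upload(edited_text: str, out_dir: Path) -> Path:
    """Refuse invalid JSON; otherwise save the text as config+1.txt in out_dir."""
    json.loads(edited_text)  # raises json.JSONDecodeError if the edit broke the JSON
    target = out_dir / "config+1.txt"
    target.write_text(edited_text, encoding="utf-8")
    return target


# Hypothetical sample content, only to exercise the helper.
sample = '{"device": {"name": "IoTaWatt"}}'
with tempfile.TemporaryDirectory() as tmp:
    path = prepare_config_upload(sample, Path(tmp))
    print(path.name)
```

A syntax check like this catches a missing brace or a trailing comma, but it cannot tell whether the keys mean anything to the firmware.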

You might want to pare down the size of the data. Most folks are content with Watts in their influx database. Amps are useful at high resolution to determine circuit loading, but in general Watts is a good proxy as they correlate very highly with Amps. The Amps are available directly from the IoTaWatt.

Uploading kWh to influx at 5 or 10 second intervals is, IMO, useless. It is uploaded to three decimal places. A 1000W appliance will use .002778 kWh in 10 seconds. Uploading Wh is actually the same metric with better resolution. The same appliance will use 2.778 Wh in 10 seconds. Grafana and influx itself will scale that to kWh over time when reporting.
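The arithmetic behind those figures is easy to check. This is just a worked example of the numbers quoted above, with a hypothetical helper name; it shows how much is lost when the kWh value is cut to three decimal places.

```python
def energy_wh(power_watts: float, seconds: float) -> float:
    """Energy in watt-hours for a constant load over the given time."""
    return power_watts * seconds / 3600.0


wh = energy_wh(1000, 10)  # 1000 W for 10 seconds
kwh = wh / 1000.0

print(round(wh, 3))   # 2.778
print(round(kwh, 3))  # 0.003
```

At three decimals the Wh value keeps four significant digits, while the kWh value keeps only one.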

Another possibility is to let influx integrate Watts to get Wh. While you may not want to do that in your dashboard queries, you can add a continuous query to do the integration and add that metric to your database, thus relieving the IoTaWatt from uploading that volume of essentially redundant data.
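To illustrate what integrating Watts means here, below is a rough Python sketch of the calculation itself. It is not an influx query and not how influx implements it; the function name and the sample data are made up, and it uses the trapezoidal rule over (time, Watts) samples.

```python
def integrate_wh(samples):
    """Integrate (unix_seconds, watts) samples, sorted by time, into watt-hours.

    Uses the trapezoidal rule: average power of each adjacent pair of
    samples multiplied by the time between them.
    """
    watt_seconds = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        watt_seconds += (p0 + p1) / 2.0 * (t1 - t0)
    return watt_seconds / 3600.0


# Hypothetical data: a steady 1000 W load sampled every 10 seconds for one hour.
one_hour = [(t, 1000.0) for t in range(0, 3601, 10)]
print(integrate_wh(one_hour))
```

The point is that the Wh series can be derived entirely from the Watts series already in the database, so it does not need to be uploaded separately.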

Yes. I also have InfluxDB1 running though I’ve tried manually stopping v1 so v2 is the only one running, same problem. InfluxDB2 full setup below.

I believe if the upload could resume from the last data uploaded it would eventually catch up to the most recent data. Message log as requested below. I’m not sure why there is the error regarding InfluxDB2 being invalid?

Message Log
6/28/21 20:40:33z IoTaWatt 5.0, Firmware version 02_06_04
6/28/21 20:40:33z SPIFFS mounted.
6/29/21 06:40:33 Local time zone: +10:00
6/29/21 06:40:33 Using Daylight Saving Time (BST) when in effect.
6/29/21 06:40:33 device name: IoTaWatt
6/29/21 06:40:33 HTTP server started
6/29/21 06:40:33 influxDB_v1: Starting, interval:10, url:http://192.168.10.109:8086
6/29/21 06:40:33 influxDB_v2: Starting, interval:5, url:http://192.168.10.253:8086
6/29/21 06:40:34 timeSync: service started.
6/29/21 06:40:34 statService: started.
6/29/21 06:40:34 dataLog: service started.
6/29/21 06:40:36 dataLog: Last log entry 06/29/21 06:40:30
6/29/21 06:40:39 historyLog: service started.
6/29/21 06:40:39 historyLog: Last log entry 06/29/21 06:40:00
6/29/21 06:40:40 WiFi connected. SSID=WiFi, IP=192.168.10.196, channel=2, RSSI -41db
6/29/21 06:40:40 MDNS responder started for hostname IoTaWatt
6/29/21 06:40:40 LLMNR responder started for hostname IoTaWatt
6/29/21 06:40:40 Updater: service started. Auto-update class is ALPHA
6/29/21 06:40:42 Updater: Auto-update is current for class ALPHA.
6/29/21 06:40:42 influxDB_v2: Start posting 01/01/21 01:00:05
6/29/21 06:40:44 Heap memory has degraded below safe minimum, restarting.

** Restart **

SD initialized.
6/28/21 20:40:45z Real Time Clock is running. Unix time 1624912845 
6/28/21 20:40:45z Reset reason: Software/System restart
6/28/21 20:40:45z Trace:  20:7, 20:8, 20:9, 20:91, 1:6[6], 1:1, 1:2[1], 9:0[1], 9:0, 9:1, 8:4, 8:6, 8:8, 8:9, 9:3, 9:5, 9:9, 1:2, 1:3, 1:3, 1:6[1], 1:6[2], 1:6[2], 1:6[2], 1:6[2], 1:6[3], 1:5[21], 1:6[4], 21:0, 21:1, 21:10, 21:10
6/28/21 20:40:45z ESP8266 ChipID: 6196660
6/28/21 20:40:45z IoTaWatt 5.0, Firmware version 02_06_04
6/28/21 20:40:45z SPIFFS mounted.
6/29/21 06:40:45 Local time zone: +10:00
6/29/21 06:40:45 Using Daylight Saving Time (BST) when in effect.
6/29/21 06:40:45 device name: IoTaWatt
6/29/21 06:40:45 HTTP server started
6/29/21 06:40:45 influxDB_v1: Starting, interval:10, url:http://192.168.10.109:8086
6/29/21 06:40:45 influxDB_v2: Starting, interval:5, url:http://192.168.10.253:8086
6/29/21 06:40:46 timeSync: service started.
6/29/21 06:40:46 statService: started.
6/29/21 06:40:46 dataLog: service started.
6/29/21 06:40:48 dataLog: Last log entry 06/29/21 06:40:40
6/29/21 06:40:51 historyLog: service started.
6/29/21 06:40:51 historyLog: Last log entry 06/29/21 06:40:00
6/29/21 06:40:52 WiFi connected. SSID=WiFi, IP=192.168.10.196, channel=2, RSSI -42db
6/29/21 06:40:52 MDNS responder started for hostname IoTaWatt
6/29/21 06:40:52 LLMNR responder started for hostname IoTaWatt
6/29/21 06:40:52 Updater: service started. Auto-update class is ALPHA
6/29/21 06:40:53 influxDB_v2: Start posting 01/01/21 01:00:05
6/29/21 06:40:59 Updater: Auto-update is current for class ALPHA.
6/29/21 06:41:00 Heap memory has degraded below safe minimum, restarting.

** Restart **

SD initialized.
6/28/21 20:41:01z Real Time Clock is running. Unix time 1624912861 
6/28/21 20:41:01z Reset reason: Software/System restart
6/28/21 20:41:01z Trace:  31:120, 21:100[31], 31:1, 1:6[6], 1:3, 1:3, 1:6[1], 1:6[2], 1:6[2], 1:6[2], 1:6[3], 1:5[31], 1:6[4], 31:0, 31:1, 31:2[5], 31:120, 21:100[31], 31:1, 1:6[6], 1:3, 1:3, 1:6[1], 1:6[2], 1:6[2], 1:6[3], 1:5[21], 1:6[4], 21:0, 21:1, 21:10, 21:10
6/28/21 20:41:01z ESP8266 ChipID: 6196660
6/28/21 20:41:01z IoTaWatt 5.0, Firmware version 02_06_04
6/28/21 20:41:01z SPIFFS mounted.
6/29/21 06:41:01 Local time zone: +10:00
6/29/21 06:41:01 Using Daylight Saving Time (BST) when in effect.
6/29/21 06:41:01 device name: IoTaWatt
6/29/21 06:41:01 HTTP server started
6/29/21 06:41:01 influxDB_v1: Starting, interval:10, url:http://192.168.10.109:8086
6/29/21 06:41:02 influxDB_v2: Starting, interval:5, url:http://192.168.10.253:8086
6/29/21 06:41:02 timeSync: service started.
6/29/21 06:41:02 statService: started.
6/29/21 06:41:02 dataLog: service started.
6/29/21 06:41:04 dataLog: Last log entry 06/29/21 06:41:00
6/29/21 06:41:07 historyLog: service started.
6/29/21 06:41:07 historyLog: Last log entry 06/29/21 06:40:00
6/29/21 06:41:08 WiFi connected. SSID=WiFi, IP=192.168.10.196, channel=2, RSSI -42db
6/29/21 06:41:08 MDNS responder started for hostname IoTaWatt
6/29/21 06:41:08 LLMNR responder started for hostname IoTaWatt
6/29/21 06:41:08 Updater: service started. Auto-update class is ALPHA
6/29/21 06:41:09 influxDB_v2: Start posting 01/01/21 01:00:05
6/29/21 06:41:10 Updater: Auto-update is current for class ALPHA.
6/29/21 06:41:17 influxDB_v1: Start posting at 06/29/21 06:40:00
6/29/21 07:41:20 Heap memory has degraded below safe minimum, restarting.

** Restart **

SD initialized.
6/28/21 21:41:21z Real Time Clock is running. Unix time 1624916481 
6/28/21 21:41:21z Reset reason: Software/System restart
6/28/21 21:41:21z Trace:  9:9, 1:2, 1:3, 1:3, 1:6[1], 1:6[2], 1:6[2], 1:6[2], 1:6[2], 1:6[3], 1:5[31], 1:6[4], 31:0, 31:1, 31:2[5], 31:120, 21:100[31], 31:1, 1:6[6], 1:3, 1:3, 1:6[1], 1:6[2], 1:6[2], 1:6[2], 1:6[3], 1:5[21], 1:6[4], 21:0, 21:1, 21:10, 21:10
6/28/21 21:41:21z ESP8266 ChipID: 6196660
6/28/21 21:41:21z IoTaWatt 5.0, Firmware version 02_06_04
6/28/21 21:41:21z SPIFFS mounted.
6/29/21 07:41:21 Local time zone: +10:00
6/29/21 07:41:21 Using Daylight Saving Time (BST) when in effect.
6/29/21 07:41:21 device name: IoTaWatt
6/29/21 07:41:21 HTTP server started
6/29/21 07:41:21 influxDB_v1: Starting, interval:10, url:http://192.168.10.109:8086
6/29/21 07:41:22 influxDB_v2: Starting, interval:5, url:http://192.168.10.253:8086
6/29/21 07:41:22 timeSync: service started.
6/29/21 07:41:22 statService: started.
6/29/21 07:41:22 dataLog: service started.
6/29/21 07:41:24 dataLog: Last log entry 06/29/21 07:41:20
6/29/21 07:41:27 historyLog: service started.
6/29/21 07:41:27 historyLog: Last log entry 06/29/21 07:41:00
6/29/21 07:41:28 WiFi connected. SSID=WiFi, IP=192.168.10.196, channel=2, RSSI -42db
6/29/21 07:41:28 MDNS responder started for hostname IoTaWatt
6/29/21 07:41:28 LLMNR responder started for hostname IoTaWatt
6/29/21 07:41:28 Updater: service started. Auto-update class is ALPHA
6/29/21 07:41:29 influxDB_v2: Start posting 01/01/21 01:00:05
6/29/21 07:41:30 Updater: Auto-update is current for class ALPHA.
6/29/21 07:41:32 Heap memory has degraded below safe minimum, restarting.

** Restart **

SD initialized.
6/28/21 21:41:33z Real Time Clock is running. Unix time 1624916493 
6/28/21 21:41:33z Reset reason: Software/System restart
6/28/21 21:41:33z Trace:  20:9, 20:10, 20:12, 20:14, 1:6[6], 1:1[6], 1:2[7], 9:0[7], 9:0, 9:1, 8:4, 8:6, 8:8, 8:9, 9:3, 9:5, 9:9, 1:2, 1:3, 1:3, 1:6[1], 1:6[2], 1:6[2], 1:6[2], 1:6[2], 1:6[3], 1:5[21], 1:6[4], 21:0, 21:1, 21:10, 21:10
6/28/21 21:41:33z ESP8266 ChipID: 6196660
6/28/21 21:41:33z IoTaWatt 5.0, Firmware version 02_06_04
6/28/21 21:41:33z SPIFFS mounted.
6/29/21 07:41:33 Local time zone: +10:00
6/29/21 07:41:33 Using Daylight Saving Time (BST) when in effect.
6/29/21 07:41:33 device name: IoTaWatt
6/29/21 07:41:33 HTTP server started
6/29/21 07:41:33 influxDB_v1: Starting, interval:10, url:http://192.168.10.109:8086
6/29/21 07:41:34 influxDB_v2: Starting, interval:5, url:http://192.168.10.253:8086
6/29/21 07:41:34 timeSync: service started.
6/29/21 07:41:34 statService: started.
6/29/21 07:41:34 dataLog: service started.
6/29/21 07:41:36 dataLog: Last log entry 06/29/21 07:41:30
6/29/21 07:41:39 historyLog: service started.
6/29/21 07:41:39 historyLog: Last log entry 06/29/21 07:41:00
6/29/21 07:41:40 WiFi connected. SSID=WiFi, IP=192.168.10.196, channel=2, RSSI -41db
6/29/21 07:41:40 MDNS responder started for hostname IoTaWatt
6/29/21 07:41:40 LLMNR responder started for hostname IoTaWatt
6/29/21 07:41:40 Updater: service started. Auto-update class is ALPHA
6/29/21 07:41:41 influxDB_v2: Start posting 01/01/21 01:00:05
6/29/21 07:41:44 Updater: Auto-update is current for class ALPHA.
6/29/21 07:41:51 influxDB_v1: Start posting at 06/29/21 07:41:00
6/29/21 10:42:11 Heap memory has degraded below safe minimum, restarting.

** Restart **

SD initialized.
6/29/21 00:42:12z Real Time Clock is running. Unix time 1624927332 
6/29/21 00:42:12z Reset reason: Software/System restart
6/29/21 00:42:12z Trace:  1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:3, 1:6[1], 1:6[2], 1:6[2], 1:6[3], 1:5[21], 1:6[4], 21:0, 21:1, 21:10, 21:10
6/29/21 00:42:12z ESP8266 ChipID: 6196660
6/29/21 00:42:12z IoTaWatt 5.0, Firmware version 02_06_04
6/29/21 00:42:12z SPIFFS mounted.
6/29/21 10:42:12 Local time zone: +10:00
6/29/21 10:42:12 Using Daylight Saving Time (BST) when in effect.
6/29/21 10:42:12 device name: IoTaWatt
6/29/21 10:42:12 HTTP server started
6/29/21 10:42:12 influxDB_v1: Starting, interval:10, url:http://192.168.10.109:8086
6/29/21 10:42:13 influxDB_v2: Starting, interval:5, url:http://192.168.10.253:8086
6/29/21 10:42:13 timeSync: service started.
6/29/21 10:42:13 statService: started.
6/29/21 10:42:13 dataLog: service started.
6/29/21 10:42:15 dataLog: Last log entry 06/29/21 10:42:10
6/29/21 10:42:18 historyLog: service started.
6/29/21 10:42:18 historyLog: Last log entry 06/29/21 10:42:00
6/29/21 10:42:18 WiFi connected. SSID=WiFi, IP=192.168.10.196, channel=2, RSSI -41db
6/29/21 10:42:18 MDNS responder started for hostname IoTaWatt
6/29/21 10:42:18 LLMNR responder started for hostname IoTaWatt
6/29/21 10:42:19 Updater: service started. Auto-update class is ALPHA
6/29/21 10:42:20 influxDB_v2: Start posting 01/01/21 01:00:05
6/29/21 10:42:22 Updater: Auto-update is current for class ALPHA.
6/29/21 10:42:27 influxDB_v1: Start posting at 06/29/21 10:41:40
6/29/21 10:53:45 influxDB_v2: Config parse failed
6/29/21 10:53:45 influxDB_v2: Invalid configuration.
6/29/21 10:53:46 influxDB_v2: stopped, Last post 01/01/21 06:18:35

Ah, I see! I tried to read through the documentation to see if I was missing something but perhaps this isn’t documented yet?

That is something I have thought of doing to resolve the problem, though my InfluxDB1 config is only slightly smaller and it doesn’t have any issue uploading. I even stopped it for a few weeks while I was testing, and when I resumed it there was no issue uploading from the IoTaWatt.

InfluxDB1 Setup

I do like having all the data possible in Influx so I can graph it in Grafana. But fair point regarding uploading kWh. I’d ideally like to upload the Watts and Amps for each measurement.

Thanks for your assistance @overeasy.

@overeasy Just wondering if you are able to provide some feedback on my last reply.

My primary point was that it looked like the upload is just too big for the memory available. I understand that you ran without the influxV1 running for a few weeks, but the log that you posted has both running. So while that may not be the problem, I don’t know that and the symptoms are very indicative of simply using too much heap.

I understand that this appears to be very similar to your influx1 specification, but actually, it’s significantly different. There are 50% more measurements, and the inclusion of the units key increases the size of a single frame (timestamp) considerably.

IoTaWatt has uploaders for four different external servers. I don’t place any restrictions on the number that can be configured or the size of content to be uploaded. The way heap is used is complex and it’s not possible to predict what will happen with multiple uploaders. You are free to just try it and if it chokes, as yours appears to be doing, then it’s too much. I could avoid controversy and limit to one uploader running and even limit the number of measurements. 90% of users would be unaffected. But I don’t. You are free to push the limits, with the understanding that if you exceed the unit’s capability, it will fail and you need to back off.

I’ve suggested that IoTaWatt can natively supply the kind of data you are uploading, and metrics like Amps at 5 second resolution have little value in a cloud database when the data is available quickly and easily from the IoTaWatt itself.

With influxDB2, I may find it necessary to limit the number of measurements for another reason. The flux query to determine last upload time is a single query that essentially lists all of the measurements along with their related keys. In this case, the query alone is probably excessively long.

There is a lot of redundancy in your setup as well. You have the units encoded into the measurement name as well as in a key field. Is there a reason why you need both? If the measurement name is the same for different units, the units key is removed, and the field-key is specified as $units, then you would have Watts, Amps, and even Wh combined into a single measurement and reduce the frame size by nearly a factor of three.
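As a rough illustration of that redundancy, the sketch below builds influx line protocol text for one timestamp in both layouts and compares the byte counts. Everything in it is hypothetical: the circuit name, the values, the helper names, and the assumption that the combined layout ends up as one line with three fields. How the firmware actually serializes a frame may differ, so treat the sizes as indicative only.

```python
def separate_layout(circuit, readings, timestamp):
    """One measurement per unit, with the unit repeated in a 'units' tag."""
    return [
        f"{circuit}_{unit},units={unit} value={value} {timestamp}"
        for unit, value in readings.items()
    ]


def combined_layout(circuit, readings, timestamp):
    """One measurement per circuit, with the unit used as the field key."""
    fields = ",".join(f"{unit}={value}" for unit, value in readings.items())
    return [f"{circuit} {fields} {timestamp}"]


def frame_bytes(lines):
    """Size of the lines joined by newlines, as they would be sent in a write."""
    return len("\n".join(lines).encode("utf-8"))


# Hypothetical readings for a single circuit at a single timestamp.
readings = {"Watts": 1000.0, "Amps": 4.2, "Wh": 2.778}
before = separate_layout("Kitchen", readings, 1624912845)
after = combined_layout("Kitchen", readings, 1624912845)

print(frame_bytes(before), frame_bytes(after))
```

The saving comes from writing the measurement name and timestamp once per circuit instead of once per unit, and from dropping the units tag altogether.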

Hey @overeasy could this also cause the web interface to die?

I haven’t seen the logs mentioned, but it’s the same sort of situation, where influxV1 was fine but V2 crashes in as little as 2hrs.

How can I delete influx V1? I hit delete and it exits, but it is still there when I go to the status page or check uploaders.

If it’s not going away, then your updated config is not being accepted. Can you show the message log after you tried to delete influxDB1?

Hey Bob,

I tried to do it all through the web interface.

Do I have to delete via the file manager?

same here - had V1 up then switched to V2. V1 won’t delete via the web interface.

I will have to search, but does anyone have a link to the process for deleting manually? I had a look via the file manager and can’t see a save button, or does it auto save?

@victripper @dheatherly, see my post here:

That fixed the influxdb V1 configured issue, but even with influx V2 stopped the web interface seems to time out.

It’s still able to be pinged, but it seems the first ping is majorly delayed.

ping 192.168.1.15
PING 192.168.1.15 (192.168.1.15): 56 data bytes
64 bytes from 192.168.1.15: seq=0 ttl=255 time=457.032 ms
64 bytes from 192.168.1.15: seq=1 ttl=255 time=61.502 ms
64 bytes from 192.168.1.15: seq=2 ttl=255 time=23.181 ms
64 bytes from 192.168.1.15: seq=3 ttl=255 time=10.163 ms
64 bytes from 192.168.1.15: seq=4 ttl=255 time=33.357 ms
64 bytes from 192.168.1.15: seq=5 ttl=255 time=31.130 ms
^C
--- 192.168.1.15 ping statistics ---
6 packets transmitted, 6 packets received, 0% packet loss
round-trip min/avg/max = 10.163/102.727/457.032 ms

Once I ping, the device responds again. Checking arp before and after shows no change; the entry exists both times.

I can also still see the data flowing to home assistant fine

Down again. Ping recovered the device. Only happened since V2 has been installed.

I had influxdb and the homeassistant integration turned off. Not sure if I should look for a way to factory reset and start over?

SD initialized.
10/02/21 03:53:53z Real Time Clock is running. Unix time 1633146833
10/02/21 03:53:53z Reset reason: Software/System restart
10/02/21 03:53:53z Trace: 1:3, 1:1, 1:2[1], 9:0[1], 9:0, 9:1, 8:4, 8:6, 8:8, 8:9, 9:3, 9:5, 9:9, 1:2, 1:3, 1:3, 1:1[1], 1:2[2], 9:0[2], 9:0, 9:1, 8:4, 8:6, 8:8, 8:9, 9:3, 9:5, 9:9, 1:2, 1:3, 10:2, 10:3
10/02/21 03:53:53z ESP8266 ChipID: 6147440
10/02/21 03:53:53z IoTaWatt 5.0, Firmware version 02_06_05
10/02/21 03:53:53z SPIFFS mounted.
10/02/21 13:53:53 influxDB_v1: invalid URL
10/02/21 13:53:53 influxDB_v1: Invalid configuration.
10/02/21 13:53:53 Local time zone: +10:00
10/02/21 13:53:53 Using Daylight Saving Time (BST) when in effect.
10/02/21 13:53:53 device name: iotawatt
10/02/21 13:53:53 HTTP server started
10/02/21 13:53:53 influxDB_v2: Starting, interval:10, url:http://xxx xxx.xxx.xxx:8086
10/02/21 13:53:53 timeSync: service started.
10/02/21 13:53:53 statService: started.
10/02/21 13:53:53 dataLog: service started.
10/02/21 13:53:54 dataLog: Last log entry 10/02/21 13:53:50
10/02/21 13:53:57 WiFi connected. SSID=xxx, IP=192.168.1.15, channel=1, RSSI -52db
10/02/21 13:53:57 MDNS responder started for hostname iotawatt
10/02/21 13:53:57 LLMNR responder started for hostname iotawatt
10/02/21 13:53:57 Updater: service started. Auto-update class is MINOR
10/02/21 13:53:58 historyLog: service started.
10/02/21 13:53:58 historyLog: Last log entry 10/02/21 13:53:00
10/02/21 13:54:00 Updater: Auto-update is current for class MINOR.
10/02/21 13:54:00 influxDB_v2: stopped, Last post 10/01/21 13:46:50

This has all the characteristics of a WIFI/browser issue. Try fixing the IoTaWatt IP in your router and accessing the IoTaWatt using the IP address.

I thought the same thing, so I have moved the device from my dedicated 2.4GHz-only network to my main wifi SSID, which runs both 2.4 and 5GHz radios.

I have a static IP for the device which has been the same since installed. I have the same for a lot of my network and then have dhcp only outside the static range.

Weirdest part it all started when influxdb V2 was setup.

I have multiple PCs and phones that are attempting connection, but all fail while the issue is occurring. I can, however, ping the device from the PCs where I have quick access to a terminal.

Dunno what to tell you. You report the problem with influx stopped. Perhaps it has something to do with the underlying core being different. Are you saying that it was working fine under 02_06_05 with influxDB1?

Is it easy to back up the influx config, delete influx V2 and see what happens?

Does it still keep running while stopped, using up CPU or memory etc?

Edit: V2 deleted easily via the web interface. Will see if it plays better now.

Just stopping with the stop button in the status display will stop all related WiFi and release the substantial memory used for buffers.

Can I ssh in to the device to check more stats?