I have two IoTaWatt units which are doing the same thing.
The heap decreases with time over about a 25min period and then a heap too low message gets put in the log and they restart.
I have read the other threads on this issue but they do not seem to be directly applicable.
I am not communicating with any data uploaders or servers, only HA, on both of them, using the HA integration with its default setup. Neither has any integrators configured or anything else; they just provide the standard IoTaWatt outputs to HA.
The WiFi on both is good: -65 to -60 on one and -55 to -50 on the other.
WiFi does not seem to be the issue here. It has also been reported that communicating with HA at higher rates can cause issues, but that is not the case here either, so I am not sure what is creating this issue.
This has not always been the case. I am not sure when it started, but maybe an update on either side could have caused it.
Open to suggestions, and I can provide whatever information is required to help troubleshoot this issue.
Thanks
Since this happens every ~25 minutes, maybe you could simply disable the Home Assistant integration, restart both IoTaWatt devices, and then see if the problem persists or not. At least this way you will know for sure if HA is causing the issue or not.
Also, what version are your IoTaWatt devices running? What version of HA are you running?
Good points. I meant to add the versions but forgot.
Disabling the HA IoTaWatt integration does not change anything, as the heap errors still occur. So it seems to be in the IoTaWatt itself, and communicating with HA is not the issue.
For reference, the firmware on both IoTaWatts is 02_08_03.
HA core: 2025.4.4, HAOS: 15.2
So what could cause the heap degradation in both the IoTaWatts without any external influence?
Thanks
I am running the same version on both of my IoTaWatts.
Great question. Unfortunately, I don’t even have a hypothesis to offer at this time.
Here is what my busier IoTaWatt heap looks like over the past 90 days. As you can see, its heap is pretty stable. I do not have it connected to home assistant. I do have it uploading data to InfluxDB. I am also periodically (every 30s) querying it from my Hubitat Elevation home automation hub.
Hopefully @overeasy will have some suggestions for where to look…
Full message log might answer that. Please download from IoTaWatt and upload to forum.
I wish my heap looked that stable.
I just changed to a different access point before I extracted the message file to see if that had any effect, but it did not. This is from one of the two IoTaWatts that the heap is degrading on.
iotamsgs.txt (1.2 MB)
Thanks
The problem started on the morning of April 25, so about 8 days ago. It is solidly restarting about every 30 minutes. The unit appears to be working OK despite the 8-10 second lapse caused by each restart. For more than a year prior to April 25, it typically ran the full six weeks to the routine restart (needed to avoid issues with millisecond clock overflow). So something changed on April 25.
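As an aside on that routine restart, here is a rough sketch of the overflow arithmetic. It assumes the millisecond clock is a 32-bit unsigned counter, which is how the Arduino `millis()` value behaves; that is an assumption about the firmware, not something stated in this thread.

```python
# Assumption: the millisecond clock is a 32-bit unsigned counter.
MS_PER_DAY = 1000 * 60 * 60 * 24


def days_until_overflow(counter_bits: int = 32) -> float:
    """Days until a millisecond counter of the given width wraps back to zero."""
    return (2 ** counter_bits) / MS_PER_DAY


overflow_days = days_until_overflow()
routine_restart_days = 6 * 7  # the six-week routine restart mentioned above

print(f"32-bit millisecond counter wraps after {overflow_days:.1f} days")
print(f"A restart at {routine_restart_days} days leaves "
      f"{overflow_days - routine_restart_days:.1f} days of margin")
```

Under that assumption, a six-week restart lands comfortably before the counter would wrap.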
The trace indicates that the IoTaWatt threw an exception about that time while handling a query. It was in the middle of trying to allocate buffers for the response. From your description of the environment, the only source of that query would be the HA integration. However, you say that this problem persists even when the HA integration is suspended, so I believe that exception was more of an effect than a cause - i.e. the heap was degrading already and the buffer allocation failed because of degraded heap.
Stepping back, the only known problem related to heap degradation is poor WiFi. This is most likely caused by a memory leak in the LWIP code that is a black box to applications running on top. In other words, it can’t be fixed except to remedy the underlying cause of the errors.
In your case, while you state good WiFi, it’s interesting to note that your RSSI went from a typical value of -59 to around -68 to -70 on April 4. It ran OK at that level until April 25, when this restart cycle started. So the question is what happened on the morning of April 4 to cause the WiFi to lose 10 dB (which is a lot, dB is not a linear scale), and then what complicated it on the 25th to cause the WiFi errors that lead to heap memory leakage.
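To put a number on that non-linear scale, here is a small sketch that converts a change in dB to a linear power ratio. It assumes the RSSI readings are in dBm, and the -69 figure is simply the midpoint of the range quoted above.

```python
def db_to_power_ratio(delta_db: float) -> float:
    """Convert a change in dB to the corresponding linear power ratio."""
    return 10 ** (delta_db / 10)


# RSSI values quoted above (assumed to be dBm).
before_dbm = -59
after_dbm = -69  # midpoint of the -68 to -70 range

ratio = db_to_power_ratio(after_dbm - before_dbm)
print(f"Received power is {ratio:.2f}x what it was (about a {1 / ratio:.0f}x drop)")
```

So a 10 dB loss means the received signal power is roughly one tenth of what it was.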
Environment can be a big factor. The IoTaWatt WiFi operates in the 2.4GHz range. So does most other WiFi in the neighborhood, as well as many other IoT devices. It’s a pretty busy space. The first thing I would do is switch that AP to one of the non-overlapping channels: 1, 6 or 11. If that doesn’t help, you might want to consider relocating your AP to gain better RSSI. The RSSI is displayed in the WiFi tab of the status display, so you can do a “can you hear me now” test as you try new locations.
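For anyone wondering why it is 1, 6 and 11 specifically, here is a simplified sketch. It uses the standard 2.4GHz channel plan (centers 5 MHz apart starting at 2412 MHz) and models each channel as a block of spectrum centered on that frequency; the 20 MHz default width is an assumption about how the AP is configured.

```python
def center_mhz(channel: int) -> int:
    """Center frequency in MHz of a 2.4GHz WiFi channel (channels 1-13)."""
    if not 1 <= channel <= 13:
        raise ValueError("channel must be between 1 and 13")
    return 2412 + 5 * (channel - 1)


def channels_overlap(a: int, b: int, width_mhz: int = 20) -> bool:
    """True if two channels share spectrum in this simplified model.

    Each channel is treated as width_mhz of spectrum centered on its
    center frequency, so two channels overlap when their centers are
    closer together than one channel width.
    """
    return abs(center_mhz(a) - center_mhz(b)) < width_mhz


for a, b in [(1, 6), (6, 11), (1, 3), (6, 8)]:
    state = "overlap" if channels_overlap(a, b) else "do not overlap"
    print(f"Channels {a} and {b} {state}")
```

Channels 1, 6 and 11 sit 25 MHz apart, which is why they stay clear of each other at 20 MHz width while neighbors like 1 and 3 do not.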
More sophisticated APs have the ability to set the radio power level. You can look at that as well. But the bottom line is that I’m pretty confident it’s the WiFi. If you want to verify that, you can disconnect the WiFi for, say, an hour (TOOLS->WIFI->DISCONNECT), which is twice the MTBF, and see if it stays up. You can tell what happened after you reconnect by looking at the message log.
To reconnect, simply power cycle the IoTaWatt and you will get the RGG LED for three minutes, as with a new unit, indicating its configuration AP is active. Connect to it as per the new unit docs and then reenter your WiFi credentials. Be aware that the passkey to connect to the IoTaWatt AP is the device name, in this case IotaDown.
I ran the test you suggested using the disconnect button, and during the disconnect period no heap errors were produced. So when WiFi is connected it produces the heap errors, but when disconnected it does not.
I hear what you are saying about WiFi signal, but it is also a lot like voodoo.
You can move one item in the house and the signal will change so there needs to be a reasonable tolerance for what signal level is acceptable.
Another point: if you look at the end of the message file I sent, I changed temporarily to a different WiFi access point and the signal level was -55, yet the heap degradation persisted. So this did produce a much better WiFi signal, and based on what I have read in the forum -55 should be very good, but there was no improvement in the heap issue.
The second IoTaWatt unit, which we have not really addressed, has the same heap issue with a WiFi signal that is always -55 to -50, as I stated before.
So it seems that the WiFi communication is causing the heap issue, but even an improvement in WiFi signal has not resolved it.
Any other ideas on this to try?
Thanks
OK, I guess I misread the original problem. So it is occurring on both units. Given that disconnecting from WiFi on one unit stops the degradation, I think that is strong evidence that the problem is with the WiFi. I’m not a big believer in coincidence when it comes to technical problems. That it is occurring in both units is strong evidence that the problem is precipitated by some change in your WiFi environment. That may be a version change in your AP, or it may be the introduction of some other device in the airspace.
I wouldn’t call it voodoo, but despite the various applicable standards it is a kludge of independent efforts to build and supply technology to interconnect. It doesn’t always work, especially in the IOT arena where IoTaWatt lives. The IoTaWatt firmware has visibility to only the top of the stack, which is built on hardware and software that is adapted to the ESP8266 and supplied as part of the IDE.
Thousands of IoTaWatts around the world are coping with their WiFi environments without this problem. Yours did so for several years as well. I can only get you to the point of pinpointing the cause. Now it’s up to you to make changes to try to resolve the problem.
It’s worth pointing out that the IoTaWatt firmware you are running has not changed for more than a year.
Did the other unit also start doing this around April 25?
Has your AP firmware changed?
Are there any new IoT devices introduced around April 25 that may be misbehaving?
Have you tried my suggestion of switching to one of the non-overlapping WiFi channels 1, 6 or 11?
Do you have or can you borrow a different router to see if the problem goes away, which would suggest it is the router rather than another device in the airspace?
Just to try and add a hypothesis… I know that my IoTaWatt devices seem to be somewhat affected by high levels of broadcast traffic. I eventually found a misbehaving SiliconDust HDHomeRun device that was spewing a ton of mDNS multicast traffic on my UniFi home network. This was the source of many issues that I observed. Simply unplugging this device resolved all of the issues.
Thus, @antimatter, one possible troubleshooting step you can try is to fully power off about half of your home network attached Ethernet and WiFi devices. Then restart the pair of IoTaWatts and see if the heap degradation persists. If yes, power off/disconnect the other half of the devices and power back on the first group. Repeat this process until you can hopefully zero in on a misbehaving device on your home network.
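The halving procedure is essentially a bisection search, sketched below. This is only an illustration: it assumes a single misbehaving device, the device names are made up, and `heap_degrades` is a hypothetical stand-in for the manual step of leaving only the listed devices powered on, restarting the IoTaWatts, and watching the heap for an hour or so.

```python
from typing import Callable, Optional, Sequence


def find_culprit(
    devices: Sequence[str],
    heap_degrades: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Bisect a device list to find one device that triggers the problem.

    heap_degrades(powered_on) is a hypothetical placeholder for the manual
    test: True if the heap still degrades with only `powered_on` running.
    Assumes exactly one culprit; returns None if the full list is clean.
    """
    candidates = list(devices)
    if not candidates or not heap_degrades(candidates):
        return None
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Power on only the first half. If the problem remains, the culprit
        # is in that half; otherwise it is in the remaining devices.
        candidates = half if heap_degrades(half) else candidates[len(half):]
    return candidates[0]


# Simulated run with made-up device names and a pretend culprit.
devices = ["printer", "tv-tuner", "thermostat", "camera", "speaker"]
found = find_culprit(devices, lambda powered_on: "tv-tuner" in powered_on)
print(f"Suspect device: {found}")
```

With N devices this takes roughly log2(N) rounds, so even a few dozen devices need only five or six restarts of the IoTaWatts rather than one test per device.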
This is very good advice. I would add to it that you should also limit the 2.4GHz WiFi channel width to 20MHz. This ensures minimal interference from other 2.4GHz traffic.
If the problem persists on 2.4GHz WiFi channel 1, switch to channel 6, and then channel 11 if necessary to see if that helps.
What type of network hardware are you using for your access points? If Ubiquiti UniFi, I recall having some IoTaWatt issues on specific AP firmware versions a few years ago.
I have not added any new network devices and I can not think of any changes around that time. My network has been unchanged for quite a while…
Yes I have and there was no difference.
My hardware is Netgear WiFi 7 except for one older Netgear that serves as the access point for the IoTaWatts because unfortunately the IoTaWatts will not connect to the newer Netgear router.
I have not tried this yet but there could be something there based on my latest test.
What I did do was use my laptop to set up an ad hoc network that was isolated from my main network and connected only to one IoTaWatt.
So laptop to IoTaWatt only.
Doing this resulted in NO degradation of the heap.
I moved the laptop around to adjust the WiFi signal strength and got it as low as -78 with no issues or heap degradation. So under these conditions it was not sensitive to the WiFi signal strength and resulted in no heap degradation.
What does this mean? I am not sure, maybe a network traffic issue. The main router that is doing all the work is rated to handle a large number of clients, many more than I have on the network. The router that the IoTaWatts are connected to is lightly loaded, acts only as an access point, and is hardwired to the main router.
At this point the path forward may be the above suggestion and basically strip down the network to a minimum of devices and see what happens.
It would be nice if the IoTaWatts could connect directly to my main router but I know the only way that will work with an ESP8266 based device is to force the ESP8266 WiFi into G mode using this command “WiFi.setPhyMode(WIFI_PHY_MODE_11G)” but I am guessing there is no way to do that within the confines of the IoTaWatt code.
Any other suggestions are appreciated.
Thanks