No time update issue

Hi community,

I have had a problem that I have not seen mentioned in the forums since 2020. Some specifics about my IoTa: I got it in that “faulty” SD batch, but I had set updates to minor. Now the problem:

The IoTa went silent about the 16th of March. Checking the logs I note a series of problems.

  1. My IoTa seems to be constantly updating between version 02_07_05 and 02_08_02. This has happened at least 10 times through early March.
  2. Once it did stay on version 02_08_02, I started getting a “Heap memory has degraded below safe minimum, restarting” message multiple times a day. I saw that in earlier versions of the firmware this was caused by poor WiFi signal strength, but my signal is consistently in the -40 to -50 dB range.
  3. Then, starting on the 16th through to the 27th of March, I have been getting a “No time update in last 24 hours.” This issue also seems to not be discussed in the community since about 2020.

Now, is the SD card the cause of my problems? I had two identical IoTas (same batch, early Dec 2022); on one of them I had to change the SD card, and I have not had problems with it since. Also, if I go that route, is it possible to transfer my data (iotalog.log) onto the new SD card, or can you only copy over the config file?

iotamsgs.txt (170.7 KB)

Cheers,
Elias

Prior to about March 15, the IoTaWatt was connecting with IP 192.168.50.113, which is a typical local LAN IP address assigned by a DHCP server. After March 15, the IoTaWatt has been getting IP address 169.254.46.141. Addresses in this group are usually assigned when something is going wrong with the DHCP server. It may not be working, or it may not have any IP addresses left to assign, or something else. In any event, the problem is with the WiFi.

There were also a few instances of getting that IP prior to March 15:
3/10/23 17:36:42z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=7, RSSI -49db
3/10/23 17:30:22z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=7, RSSI -51db
3/10/23 17:24:13z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=7, RSSI -50db
3/05/23 11:50:11z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=9, RSSI -50db
3/05/23 11:42:25z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=9, RSSI -51db
3/05/23 11:36:05z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=9, RSSI -50db
3/05/23 11:29:46z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=9, RSSI -51db
3/05/23 11:22:01z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=9, RSSI -50db
3/05/23 11:15:42z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=9, RSSI -50db
and about a dozen more. All of these were followed within a few minutes by a low-heap restart.
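
For anyone wanting to scan a message log for the same symptom, here is a minimal sketch. It assumes the log lines look like the ones quoted above; the function name `link_local_connects` and the regular expression are just this example's own, not anything from the IoTaWatt firmware. It relies on Python's standard `ipaddress` module, which knows that 169.254.x.x is the link-local range.

```python
import ipaddress
import re

# Matches lines such as:
# 3/10/23 17:36:42z WiFi connected. SSID=..., IP=169.254.46.141, channel=7, RSSI -49db
WIFI_LINE = re.compile(r"^(?P<stamp>\S+ \S+)z WiFi connected\. .*?IP=(?P<ip>[\d.]+),")


def link_local_connects(lines):
    """Return (timestamp, ip) for every connect that got a 169.254.x.x address."""
    hits = []
    for line in lines:
        m = WIFI_LINE.match(line.strip())
        if m and ipaddress.ip_address(m.group("ip")).is_link_local:
            hits.append((m.group("stamp"), m.group("ip")))
    return hits


sample = [
    "3/10/23 17:36:42z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=7, RSSI -49db",
    "2/13/23 19:16:28z WiFi connected. SSID=Nanogrid_2.4G, IP=192.168.50.113, channel=9, RSSI -51db",
    "2/16/23 15:16:17z WiFi disconnected.",
]

for stamp, ip in link_local_connects(sample):
    print(f"{stamp}: connected without a DHCP lease ({ip})")
```

Running it over the full iotamsgs.txt (one line per list entry) would show how often the unit came up without a proper lease.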

At first glance it appears to start after the 02_08_02 upgrade on March 5, but looking back, it occurred on 02_07_05 as well:
3/04/23 01:25:00z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=9, RSSI -47db
3/01/23 09:56:16z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=9, RSSI -46db
2/13/23 19:16:28z WiFi connected. SSID=Nanogrid_2.4G, IP=192.168.50.113, channel=9, RSSI -51db
2/15/23 05:31:13z Updater: Invalid response from server. HTTPcode: -4
2/16/23 15:16:17z WiFi disconnected.
2/16/23 15:20:26z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=8, RSSI -49db
2/22/23 19:08:10z WiFi disconnected.
2/22/23 19:08:52z WiFi connected. SSID=Nanogrid_2.4G, IP=192.168.50.113, channel=7, RSSI -50db
2/25/23 02:50:34z timeSync: No time update in last 24 hours.
2/26/23 02:51:04z timeSync: No time update in last 24 hours.
2/27/23 06:28:57z timeSync: No time update in last 24 hours.
2/28/23 06:08:39z WiFi disconnected.
2/28/23 06:12:51z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=9, RSSI -47db
2/28/23 12:08:41z WiFi disconnected.
2/28/23 12:12:48z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=9, RSSI -52db

So, you have WiFi issues. It’s true that the ESP8266 is not as robust as other systems when it comes to dealing with WiFi problems, but it is rock solid when the WiFi is rock solid. I always recommend setting up the router to always assign the IoTaWatt the same IP, setting it to connect to the same AP if possible, and fixing the channel of that AP to 1, 6 or 11.

This doesn’t appear to be SD related, but if you have one of the units from the order number range, I would recommend changing it out. If you provide your order number, I will send you a replacement industrial card.

Thanks for this, I went ahead and swapped out the SD but the issue is persisting. I will try disconnecting from the wifi and connecting back up because the wifi connection has been quite intermittent. I presume the configuration is saved right?

Any thoughts on how to set up the WiFi so that it always assigns the same IP to the IoTa? A programmatic solution would be great, so as not to have to involve the residents of the smart house where I’ve installed them.

The problem isn’t so much getting assigned different IP addresses, it’s that it is not getting assigned any IP address. That’s done in the DHCP server in the router. It could be something as simple as the pool of available DHCP IP’s configured in the router is exhausted.

The IoTaWatt uses the DHCP protocol to get its IP address. That is, when the IoTaWatt connects, it gets assigned an IP by the router, and the router also passes other necessary network settings like the subnet mask and gateway IP. The old method of every device being able to declare what IP address it wants to use is obsolete and problematic, so IoTaWatt does not do that.

Another issue that I see is that the router is using various channels. Again, the channel can usually be fixed in the router configuration. The recommended channels are 1, 6 and 11 because they do not overlap bandwidth as the other channels do.
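
To illustrate why 1, 6 and 11 are the usual picks, here is a small sketch of the arithmetic. In the 2.4 GHz band, channels 1 to 13 have center frequencies 5 MHz apart starting at 2412 MHz. The 22 MHz channel width used below is an assumption (the classic 802.11b figure; newer modes are usually quoted at 20 MHz), and the function names are only for this example.

```python
CHANNEL_WIDTH_MHZ = 22  # assumption: classic 802.11b channel width


def center_mhz(channel):
    """Center frequency of a 2.4 GHz WiFi channel (1 to 13)."""
    if not 1 <= channel <= 13:
        raise ValueError(f"unsupported channel: {channel}")
    return 2412 + 5 * (channel - 1)


def overlaps(a, b, width=CHANNEL_WIDTH_MHZ):
    """True if two channels of the given width share any spectrum."""
    return abs(center_mhz(a) - center_mhz(b)) < width


recommended = [1, 6, 11]
seen_in_logs = [5, 7, 8, 9, 10]  # channels that appear in the logs in this thread

for ch in seen_in_logs:
    clashes = [r for r in recommended if overlaps(ch, r)]
    print(f"channel {ch} overlaps recommended channels {clashes}")
```

Channels 1, 6 and 11 sit 25 MHz apart, so they stay clear of each other, while every channel seen in the logs here overlaps two of them.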

Got it. My background is in mechanical engineering (primarily HVAC), so my understanding of networks is limited, and as I have deployed the system in a research house I wanted to see if there was a programming fix. In the end, I got access to the local router and fixed the IP. This seems to have fixed all the connectivity issues, as shown below:

4/13/23 20:55:46z WiFi disconnected.
4/13/23 20:55:51z WiFi connected. SSID=Nanogrid_2.4G, IP=192.168.50.115, channel=10, RSSI -51db
4/13/23 20:56:18z Updater: Invalid response from server. HTTPcode: -4
4/13/23 20:59:45z WiFi disconnected.
4/13/23 20:59:49z WiFi connected. SSID=Nanogrid_2.4G, IP=192.168.50.115, channel=5, RSSI -51db
4/13/23 21:45:37z influxDB_v2: stopped, Last post 01/01/70 00:00:00
4/13/23 22:32:19z influxDB_v2: stopped, Last post 01/01/70 00:00:00
4/13/23 22:32:50z influxDB_v2: stopped, Last post 01/01/70 00:00:00

** Restart **

SD initialized.
4/13/23 22:50:54z Real Time Clock is running. Unix time 1681426254
4/13/23 22:50:54z Reset Reason: Power-fail restart.
4/13/23 22:50:54z ESP8266 ID: 14650669, RTC PCF8523 (68)
4/13/23 22:50:54z IoTaWatt 5.0, Firmware version 02_08_02
4/13/23 22:50:54z SPIFFS mounted.
4/13/23 22:50:54z Local time zone: +0:00
4/13/23 22:50:54z device name: IotaWatr
4/13/23 22:50:57z Connecting with WiFiManager.
4/13/23 22:51:00z HTTP server started
4/13/23 22:51:00z influxDB_v2: Starting, interval:5, url:http://68.183.159.42:8086
4/13/23 22:51:00z WiFi connected. SSID=Nanogrid_2.4G, IP=192.168.50.115, channel=9, RSSI -48db
4/13/23 22:51:00z timeSync: service started.
4/13/23 22:51:00z statService: started.
4/13/23 22:51:00z Updater: service started. Auto-update class is MINOR
4/13/23 22:51:00z dataLog: service started.
4/13/23 22:51:00z dataLog: Last log entry 04/13/23 22:50:35
4/13/23 22:51:02z Updater: Auto-update is current for class MINOR.
4/13/23 22:51:02z influxDB_v2: Resume posting 04/04/23 19:58:35
4/13/23 22:51:05z historyLog: service started.
4/13/23 22:51:05z historyLog: Last log entry 04/13/23 22:50:00
4/13/23 23:51:26z Updater: Invalid response from server. HTTPcode: -4

Posting this, I see that I have not fixed the channel. However, I do not think that is the source of the new problem, which involves InfluxDB. Going through the source code for IoTaWatt, it seems that “HTTPcode: -4” is incorrect; there is no such HTTP code. The error I got from the influx server was: ts=2023-04-13T20:56:45.567982Z lvl=info msg=Unauthorized log_id=0gxQmRi0000 error="token required". However, the token hasn’t changed, and after rebooting the IoTaWatt, it is now sending messages to influx just fine. Any thoughts on this? I presume it’s some software issue on the IoTa side since no changes have been made to Influx.

Cheers,
Elias

I want to further follow this up with today’s log. The issue is persisting even though I have fixed the IP.

SD initialized.
4/14/23 16:52:57z Real Time Clock is running. Unix time 1681491177
4/14/23 16:52:57z Reset Reason: Power-fail restart.
4/14/23 16:52:57z ESP8266 ID: 14650669, RTC PCF8523 (68)
4/14/23 16:52:57z IoTaWatt 5.0, Firmware version 02_08_02
4/14/23 16:52:57z SPIFFS mounted.
4/14/23 16:52:57z Local time zone: +0:00
4/14/23 16:52:57z device name: IotaWatr
4/14/23 16:53:00z Connecting with WiFiManager.
4/14/23 16:55:20z Did not connect after power-fail. Restarting to reset WiFi.

** Restart **

SD initialized.
4/14/23 16:55:21z Real Time Clock is running. Unix time 1681491321
4/14/23 16:55:21z Reset reason: Software/System restart
4/14/23 16:55:21z Trace: 34:5, 34:6[1], 34:10[2], 34:5, 34:5, 34:5, 34:6[1], 34:10[3], 34:5, 34:5, 34:5, 34:6[1], 34:10[10], 34:5, 34:5, 34:5, 34:6[1], 34:10, 29:101, 29:101, 29:101, 29:101, 29:102, 29:102, 29:103, 29:103, 31:105, 31:105, 31:106, 11:50, 11:55, 11:70
4/14/23 16:55:21z ESP8266 ID: 14650669, RTC PCF8523 (68)
4/14/23 16:55:21z IoTaWatt 5.0, Firmware version 02_08_02
4/14/23 16:55:21z SPIFFS mounted.
4/14/23 16:55:21z Local time zone: +0:00
4/14/23 16:55:21z device name: IotaWatr
4/14/23 16:55:21z HTTP server started
4/14/23 16:55:21z influxDB_v2: Starting, interval:5
4/14/23 16:55:21z timeSync: service started.
4/14/23 16:55:21z statService: started.
4/14/23 16:55:21z dataLog: service started.
4/14/23 16:55:21z dataLog: Last log entry 04/14/23 15:47:05
4/14/23 16:55:25z WiFi connected. SSID=Nanogrid_2.4G, IP=192.168.50.115, channel=10, RSSI -49db
4/14/23 16:55:25z Updater: service started. Auto-update class is MINOR
4/14/23 16:55:26z historyLog: service started.
4/14/23 16:55:26z historyLog: Last log entry 04/14/23 15:47:00
4/14/23 16:55:26z influxDB_v2: Resume posting 04/05/23 07:11:30
4/14/23 16:55:27z Updater: Auto-update is current for class MINOR.
4/14/23 18:55:55z Updater: Invalid response from server. HTTPcode: -4

I am at a loss as to what is happening. I am certain there is no token typo, etc., since the IoTaWatt had been posting to the same location for months. The WiFi speed does not seem to be a problem.

I don’t see a problem. From the logs it looks like it is uploading, albeit very slowly. You have 5 second frames, so if you are uploading a lot of measurements, that’s a lot of data. The WiFi name suggests it’s a 4G hotspot. Could it be that it’s just real slow? What does the status display show in the uploaders tab?
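
As a rough check on how much there is to upload, the backlog can be estimated from the "Resume posting" lines in the two logs above. This is only a sketch: it treats `interval:5` as 5-second frames (as described above) and assumes one frame per interval; `backlog_frames` is a made-up helper name.

```python
from datetime import datetime

LOG_FORMAT = "%m/%d/%y %H:%M:%S"  # timestamp layout used in the message log


def backlog_frames(later, earlier, interval_s=5):
    """Whole frames of interval_s seconds between two log timestamps."""
    gap = datetime.strptime(later, LOG_FORMAT) - datetime.strptime(earlier, LOG_FORMAT)
    return int(gap.total_seconds()) // interval_s


# First restart: at 4/13/23 22:51:02 the uploader resumed from 04/04/23 19:58:35.
behind = backlog_frames("4/13/23 22:51:02", "04/04/23 19:58:35")

# Second restart: the resume point had only moved on to 04/05/23 07:11:30.
uploaded = backlog_frames("04/05/23 07:11:30", "04/04/23 19:58:35")

print(f"frames behind at first restart: {behind}")
print(f"frames uploaded between the two restarts: {uploaded}")
```

By this estimate the uploader was roughly nine days of data behind at the first restart and had worked through about eleven hours of it by the second.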

Hey Bob,

I don’t think that’s quite what’s happening here. Remember that neither the signal strength (dB) nor the process has changed over the last couple of months; I went through a whole winter’s testing without so much as an issue. The process happening right now is the following:

  1. The IoTaWatt connects to wifi (after let’s say a reboot).
  2. It starts uploading data for about one hour.
  3. Then the IoTa gives a wrong token message ( HTTP -4), but we know the token is working since it was just uploading a little while ago.

I am also unable to find the IoTas on the local network (though they have green lights). Once I did the IP change and fixed it, they were discoverable for about 5 to 6 hours, after which they went into this state (HTTP -4 and cannot be found on the local network).

Any thoughts?

Cheers,
Elias

HTTP code -4 is, I admit, poorly documented, but it simply means that the IoTaWatt is failing to connect to the endpoint. It doesn’t have anything to do with an invalid token.
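
A quick way to keep the two cases apart when reading logs: real HTTP status codes are positive three-digit numbers sent by the server, so a negative value means no response was received at all. The sketch below is only illustrative; `describe_code` and its wording are this example's own, not IoTaWatt's, and it uses Python's standard `http.HTTPStatus` table for the genuine statuses.

```python
from http import HTTPStatus


def describe_code(code):
    """Tell a server-sent HTTP status apart from a client-side failure code."""
    if code < 0:
        # Nothing came back from the server, so this cannot be a token problem.
        return "client-side failure: no HTTP response was received"
    try:
        return f"HTTP {code} {HTTPStatus(code).phrase}"
    except ValueError:
        return f"unrecognized code {code}"


print(describe_code(-4))   # what the IoTaWatt logged
print(describe_code(401))  # what a rejected token would look like
```

A rejected token would show up as a 401 from the server, which is a different situation from the -4 in the IoTaWatt log.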

Any proposed course of action? I am mostly surprised by the short period the fixed IP helped.

I have some important tests coming up this week, so I don’t want to remove them from the location for too long. I will try to speak with the WiFi provider of the house where I have deployed my IoTas to see if they can run some basic diagnostics. I will also power cycle a few more times to see if I can get the interface back online to manually download the data I want.

Cheers,
Elias

All I can offer is that from what you are telling me, the IoTaWatt does not appear to be connecting to the influx server. No idea how the two are connected to the WiFi network. It’s possible the whole problem could be with the influx server. I’m pretty much in the dark here.

The problem is definitely on the IoTa side, the server is hosted remotely (digitalocean) and I have multiple devices and sensors pushing data to it.

I am mostly trying to understand the connectivity issues. Why, on a network that multiple devices (PIs, PCs, etc.) are on with good connectivity, do the IoTas struggle? I am also interested in why this happened after the push of the newest version in March; until then I had very consistent operation from both of them. I am also surprised that the move to a dedicated IP worked like a charm for a couple of hours and then stopped.

Just change your auto-update class to MAJOR to revert to the prior release. I’d be interested to know if your problems vanish.

Me too. You really think that is significant?

So I’ll ask again for the third time, are you using a 4G hotspot for WiFi?

It really does sound like “what you have is a failure to communicate”. My experience with my Iotawatt devices has been good or better, but I have a lot of experience with wireless issues with other devices. It can be challenging. While it seems like the signal is at a decent level when Iotawatt connects, that doesn’t mean it stays that way. I record the signal level (RSSI) of many of my devices and they definitely vary by +/-10 dB or more over the course of hours/days. My environment is extremely clean (hundreds of meters to nearest other signal sources) and I can still see my neighbors’ Wi-Fi. If your neighborhood/site is more congested, it might be there is significant interference from other sources that are causing disconnects.
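
As a starting point, the RSSI values already in the message log can be summarized with a few lines of Python. This is a sketch assuming the `RSSI -49db` format shown in this thread; `rssi_summary` is a hypothetical helper. Keep in mind the log only records RSSI at the moment of connecting, so it will understate the variation between connects.

```python
import re

RSSI = re.compile(r"RSSI (-?\d+)db")


def rssi_summary(lines):
    """Count, min, max and spread of the RSSI values found in log lines."""
    values = [int(m.group(1)) for line in lines if (m := RSSI.search(line))]
    if not values:
        return None
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "spread": max(values) - min(values),
    }


sample = [
    "3/10/23 17:36:42z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=7, RSSI -49db",
    "3/01/23 09:56:16z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=9, RSSI -46db",
    "2/28/23 12:12:48z WiFi connected. SSID=Nanogrid_2.4G, IP=169.254.46.141, channel=9, RSSI -52db",
    "2/28/23 12:08:41z WiFi disconnected.",
]

print(rssi_summary(sample))
```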

I recently got a new water heater (same model as before, but 5 years of innovations from the manufacturer). I have a microcontroller talking to it and sending data to MQTT. This worked flawlessly on the old one. But, on the new one I was getting failures with similar frequency to yours. I did a site survey and discovered the signal was marginal. Funny thing is this is the same location as my Iotawatt devices. They continue working just fine. I added additional retry and reset code to my application and additional logging. It helped, but the real answer is going to be to improve the quality of the signal.
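
The retry and reset logic I mean is roughly the following, sketched here in Python rather than the actual microcontroller code. All names (`with_retries`, `flaky_publish`, `on_give_up`) are made up for the illustration, and the doubling delay is just one reasonable choice.

```python
import time


def with_retries(action, attempts=5, base_delay=1.0, on_give_up=None, sleep=time.sleep):
    """Call action(); on a network error, wait (doubling each time) and retry.

    If every attempt fails, call on_give_up (for example, a reset of the
    network interface) and return None.
    """
    for attempt in range(attempts):
        try:
            return action()
        except OSError as exc:  # ConnectionError and timeouts are OSError subclasses
            print(f"attempt {attempt + 1} failed: {exc}")
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    if on_give_up is not None:
        on_give_up()
    return None


# Demo: a publish that fails twice before succeeding.
calls = {"n": 0}


def flaky_publish():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("broker unreachable")
    return "published"


print(with_retries(flaky_publish, base_delay=0.01))
```

The extra logging matters as much as the retries: the count of failed attempts over time is what pointed me at the signal quality.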

I would attach a serial terminal and log the data coming from Iotawatt. I believe it will tell you something about what is happening.

Thanks for these! Funny that you mentioned a WH, because I have installed the IoTas in a residential building for the purpose of optimal controls, where I also do an MPC-based optimal controller for a WH and an HP. The HP connection worked seamlessly but I am also struggling with the WH communication being intermittent. My plan is to speak with the internet provider today and see what is happening with these intermittency issues.

It’s actually a heat pump water heater, so even funnier.

Hahaha same, we are testing both electric and HP operations for the WH. Go electrification.