InfluxDb issues

overeasy · October 8, 2018, 3:12pm

https://docs.influxdata.com/influxdb/v1.6/

The other thing you can do is change the id in the first tag and start over with a smaller set of measurements.

daniweb · October 8, 2018, 3:42pm

I did not expect delete from influxDB…
I created a new DB (also with a new id) and started reporting there … it stopped again at 10:40:40.

overeasy · October 8, 2018, 5:36pm

Sorry, this isn’t going well. I just reviewed the log you posted earlier today. I can see that Emoncms managed to get past 10:40:00 but it may be that there is current log damage causing the problem with influx.

Are you specifying a starting date of 10/8/18 in your influx configuration? If so, could you try deleting that database and restarting influx without a starting date? That should cause it to start posting as of the current time, unless I missed something.

If that fails to solve it, I may have more time this evening to look deeper into it. I would need the message log and the config.

daniweb · October 8, 2018, 6:45pm

I shortened all measurement names, removed the device ID and started to load everything from scratch. I will let you know…

daniweb · October 10, 2018, 8:08am

Just to understand: when you query for the last data, are you asking for all measurements in the database, or only the ones for this device?

daniweb · October 10, 2018, 9:09am

I had a look at the code, and it is probably important to put back the tag and unique value.

I think that if this is not done we may get all measurements, including the ones from other IoTaWatts or any other device reporting to this InfluxDB.

overeasy · October 10, 2018, 11:36am

That’s right. All measurements for the most recent time in the database. Since you looked up the query, you can try it using the CLI and see what comes back. It can be very large.

Sometime in the future I will rewrite that to do multiple queries of the actual measurements using the last() function.

daniweb · October 12, 2018, 7:59am

It is still re-loading all data… but sometimes it stops…

Have a look there

Could the low heap cause this error in the CLI string that was sent?

daniweb · October 12, 2018, 8:30am

And when it hangs like that I press Stop and this warning comes:
20180622_iotawatt_messageWhenStop
I press OK and then it reloads the page; from then on I can try to stop and restart.

daniweb · October 12, 2018, 9:46am

In some other cases you replace the inf… but I have the impression that this is not done when sending to InfluxDB.
It looks like inf is not allowed by InfluxDB.

I was looking a bit at the code…
In influxDB.cpp, line 315, you have … if(value == value). I do not know C++ very well, but is this useful? Isn't it always true?

overeasy · October 12, 2018, 1:53pm

No, this is caused by corruption in your datalog. That data is for August 23, 9:24am UTC. Not sure all of the ways to get an inf value, but I think divide by zero can generate that. It looks like you’re trying to post Amps, and that’s a factor of VA so IoTaWatt gets the value by dividing VA by voltage, so I suspect there is a zero voltage value in there somewhere. I can bulletproof that.

The app uses a hash to ensure that the config file you edited is the same as the one in IoTaWatt.

The comparison value == value is not true if the value is NaN.

daniweb · October 12, 2018, 1:58pm

If it is inf, what is the result?

Somehow an inf has arrived in the CLI…

overeasy · October 12, 2018, 5:05pm

I don’t know, I’ll need to research that.
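For reference, the behaviour of the value == value test can be checked against IEEE 754 rules. The following is a minimal sketch in Python, not the IoTaWatt C++ code, and it assumes Python floats follow the same comparison rules as C++ doubles. The helper names `passes_self_compare` and `is_postable` are hypothetical. NaN compares unequal to itself, while inf compares equal to itself, so the self-comparison alone would let an inf through.

```python
import math

def passes_self_compare(value: float) -> bool:
    # Mirrors the `value == value` idiom: False only when value is NaN.
    return value == value

def is_postable(value: float) -> bool:
    # Hypothetical stricter guard: rejects NaN as well as +inf and -inf.
    return math.isfinite(value)

for v in (1.5, math.nan, math.inf, -math.inf):
    print(v, passes_self_compare(v), is_postable(v))
```

A guard along the lines of `is_postable` is one possible way to keep inf out of the posted string; whether that matches the eventual firmware fix is not stated in this thread.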

daniweb · October 18, 2018, 8:56am

I’m interested in the PF of the different devices (sometimes the values are not what I expect).
The Amps are also interesting, but I have to make a choice as I cannot send W, Amps and PF for all channels…
And InfluxDB does not permit calculations between different measurements, so I cannot simply have it calculated.
One way would be to put all values in 1 measurement, but that is not the best for InfluxDB performance.

But in writing this I see 2 independent possible enhancements to the InfluxDB sending:

1) Permit selecting multiple units for a measurement
Currently you send the tag id, measurement name, timestamp and value for each measurement.
When I want to have Watts and Amps for the same input, I need to create 2 measurements, like this:
BUAABRP,id=1 value=0.544 1539851315
BUAABRW,id=1 value=35.43 1539851315

We could optimize this by permitting multiple units to be selected for a measurement.
For example, this needs less CLI string length and also sends the amps in addition:
BUAABR,id=1 w=35.43,pf=0.544,a=0.279 1539851315

Or, taken to the extreme, you also send the volts, amps and VA in less space:
BUAABR,id=1 w=35.43,pf=0.544,a=0.279,va=64.17,v=233.1 1539851315

Pro: considerably shorter CLI string, and it keeps a well-performing layout for InfluxDB
Cons: maybe a bit more implementation effort than 2)
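The length saving can be estimated with a small sketch. This is an illustration in Python rather than firmware code; `build_point` is a hypothetical helper, and it assumes names and values need no escaping. In line protocol, several fields in one point are separated by commas.

```python
def build_point(measurement, tags, fields, timestamp):
    # Tags are appended to the measurement name, each introduced by a comma.
    tag_part = "".join(f",{k}={v}" for k, v in tags.items())
    # Fields are comma-separated, with no spaces in between.
    field_part = ",".join(f"{k}={v}" for k, v in fields.items())
    # Assumption: no escaping of spaces or commas is needed for these names.
    return f"{measurement}{tag_part} {field_part} {timestamp}"

# Two measurements, one per unit, as sent today.
separate = "\n".join([
    build_point("BUAABRP", {"id": 1}, {"value": 0.544}, 1539851315),
    build_point("BUAABRW", {"id": 1}, {"value": 35.43}, 1539851315),
])

# One measurement carrying both units as fields.
combined = build_point("BUAABR", {"id": 1}, {"w": 35.43, "pf": 0.544}, 1539851315)

print(len(separate), len(combined))
```

The combined form repeats the measurement name, tag and timestamp only once, which is where the saving comes from.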

2) Report everything as 1 single measurement, using the tag id as the measurement name. Example reducing 4 measurements to 1:
Before:
BUAABRP,id=1 value=0.544 1539851315
BUAABRW,id=1 value=35.43 1539851315
CUIMENP,id=1 value=0.034 1539851315
CUIMENW,id=1 value=0.02 1539851315
After:
itw01 BUAABRP=0.544,BUAABRW=35.43,CUIMENP=0.034,CUIMENW=0.02 1539851315

Pro: considerably shorter CLI string, and I think it is a small implementation effort
Cons: not optimized for InfluxDB

I know that you have other development priorities, but please take this into consideration (even if with low priority).

overeasy · October 18, 2018, 1:04pm

This is effectively the same as sending three measurements:

BUAABR,id=1 w=35.43 1539851315
BUAABR,id=1 pf=0.544 1539851315
BUAABR,id=1 a=0.279 1539851315

Of course the problem is that the resulting post message is just too long. The solution as I see it is to use a compressed message which influx supports. As you can see, the above measurements have very little unique content, and so will compress very well. Unfortunately there is no compression capability available in the ESP8266, so I will have to write one. Years ago I did an LZW codec, and gzip is quite a bit simpler, so I’ll put that on the list for a future project.
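How well such a payload compresses can be checked on a desktop machine. This is a rough sketch using Python's standard gzip module, not something that runs on the ESP8266. The 100 sample lines are made up for illustration, and the `Content-Encoding: gzip` header is what the InfluxDB 1.x write endpoint expects for a compressed body.

```python
import gzip

# Made-up sample: 100 points that differ only in their timestamp.
lines = [
    f"BUAABR,id=1 w=35.43,pf=0.544,a=0.279 {1539851315 + 10 * i}"
    for i in range(100)
]
payload = "\n".join(lines).encode("utf-8")

# gzip-compress the whole request body in one go.
compressed = gzip.compress(payload)

# Header a client would send along with the compressed body.
headers = {"Content-Encoding": "gzip"}

print(len(payload), len(compressed))
```

Real posts vary more between lines than this sample does, so the ratio seen here should be read as an upper bound rather than a typical figure.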

daniweb · October 18, 2018, 1:43pm

Technically, my example BUAABR,id=1 w=35.43,pf=0.544,a=0.279 1539851315
is one measurement with 1 tag and 3 fields.

As for these three lines:
BUAABR,id=1 w=35.43 1539851315
BUAABR,id=1 pf=0.544 1539851315
BUAABR,id=1 a=0.279 1539851315

I do not know whether it is allowed, or what the result would be, to write to the same measurement multiple times with a different field and the same tag.

PS: anyway, at the moment you are always sending the field name value.

overeasy · October 18, 2018, 9:36pm

name: BUAABR
time                Amps  PF    Watts iota
1539897860000000000 0.014 0.004 0.01  demox
1539897870000000000 0.019 0.025 0.06  demox
1539897880000000000 0     0.031 0     demox
1539897890000000000 0     0.004 0     demox
1539897900000000000 0.035 0.02  0.08  demox
1539897910000000000 0.002 0.123 0.03  demox
1539897920000000000 0.019 0.091 0.21  demox
1539897930000000000 0.011 0.034 0.05  demox
1539897940000000000 0.007 0.048 0.04  demox
1539897950000000000 0.002 0.099 0.03  demox
1539897960000000000 0.009 0.069 0.08  demox
1539897970000000000 0.004 0.1   0.04  demox
1539897980000000000 0.004 0.101 0.05  demox
1539897990000000000 0.013 0.078 0.12  demox
1539898000000000000 0.003 0.158 0.05  demox
1539898010000000000 0.008 0.061 0.06  demox
1539898020000000000 0.002 0.125 0.03  demox
1539898030000000000 0.002 0.129 0.03  demox
1539898040000000000 0     0.01  0     demox
1539898050000000000 0.011 0.056 0.07  demox

This is a series where three separate measurements were sent for each time value with different field names and values. So it is allowed and has the desired result.
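The reason this works is that InfluxDB identifies a point by its measurement, tag set and timestamp, and a later write to the same point with a different field adds that field to the existing ones. The following is a toy model of that rule in Python, not InfluxDB code; `write_point` and the dictionary store are hypothetical, and the values are taken from the first row of the output above.

```python
def write_point(store, measurement, tags, fields, timestamp):
    # A point is keyed by measurement, tag set and timestamp.
    key = (measurement, tuple(sorted(tags.items())), timestamp)
    # Writing to an existing key merges the new fields into the old ones.
    store.setdefault(key, {}).update(fields)

store = {}
ts = 1539897860000000000
write_point(store, "BUAABR", {"iota": "demox"}, {"Watts": 0.01}, ts)
write_point(store, "BUAABR", {"iota": "demox"}, {"PF": 0.004}, ts)
write_point(store, "BUAABR", {"iota": "demox"}, {"Amps": 0.014}, ts)

print(store)
```

Note that in this model, as in InfluxDB, writing the same field name again for the same key overwrites the earlier value.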

Admittedly, the present design allows only a single output per measurement name, but that can be easily changed. The config utility requires that output names be unique, and if you enter a second output with the same name, the new output simply replaces the original. That rule can be changed to define unique as a different name or different units, so that multiple outputs could be entered, one for each unit.

With compression, that would not result in excessive message length. I’ll put that on the list.

Jam · October 21, 2018, 1:56pm

I also got this: influxDB: last entry query failed: -11
It looks like my server is too slow to answer that query (it takes 16s, CPU at 100%). One easy way to ‘fix’ it is raising the timeout (make it configurable?) for slow servers.
But when testing that query with an added time condition (where), it looks like there are performance problems with older data. In my case, data older than about 144h (something to do with the shard config?) seems to take those 16s, while newer data comes back immediately. Using the begdate option (influxBeginPosting) should be an easy way to control select performance. Here is a patch I wrote. I haven’t set up the environment for compiling the image yet, but it should work.

influx-time.diff.txt (1.1 KB)

overeasy · October 21, 2018, 6:16pm

I know it looks easy to just change the HTTP request timeout, but in the big picture, there are some unintended consequences. The ESP8266 has a lot of limitations, and IoTaWatt is up against the boundaries of a few of them. In particular, WiFi activity and the associated TCP/IP activity at lower levels is limited in capacity. IoTaWatt manages that by controlling the resources used by services in order to keep the amount of TCP activity below recommended limits. 20 seconds, or even 10 seconds, is a long time to leave those resources blocked. Better to look at the reason why a transaction takes 20 seconds and fix that.

I should have anticipated that the influx service would be overloaded and/or the influx server underpowered. As more and more IoTaWatt are put into service, every limit is tested. This last posting query works great for a relatively small number of measurements, but quickly grows out of range with increasing measurements.

So the solution suggested in the previous issue also would apply here. Rather than use a single query to check all measurements for the last posting, it would be better to issue multiple queries for individual measurements using the last() function. That’s the action item for a future release.
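As a rough illustration of that approach, the per-measurement queries could be built like this. The sketch is in Python rather than the firmware's C++. `last_query` is a hypothetical helper, the field name value and the tag key id follow the examples earlier in this thread, and the optional time bound reflects the begdate idea from the patch above.

```python
def last_query(measurement, tag_key, tag_value, since_epoch_s=None):
    # InfluxQL: identifiers in double quotes, string literals in single quotes.
    query = 'SELECT LAST("value") FROM "{}" WHERE "{}" = \'{}\''.format(
        measurement, tag_key, tag_value
    )
    if since_epoch_s is not None:
        # Optional lower time bound so older data need not be scanned.
        query += " AND time >= {}s".format(since_epoch_s)
    return query

# One small query per measurement instead of one query over everything.
for name in ("BUAABRW", "BUAABRP"):
    print(last_query(name, "id", "1", since_epoch_s=1539986400))
```

Each query touches a single measurement, so the cost should no longer grow with the total number of measurements in the database.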

Jam · October 21, 2018, 7:22pm

Yes, the best option is to make the queries fast; it is just that right now some people (like me) have an ongoing problem feeding Influx. Raising the timeout is an easy way to get the feed back up and running.
I got the code compiled and I don’t see any problems with just a bigger timeout, maybe because the query normally happens just once at boot. I also tested adding time to the where part and it looked like it was working OK, but when I opened the web page, it crashed with Exception 3 (it somehow corrupts memory??).

I ran a test that queried each measurement’s last time separately from InfluxDB (with wget), and it looks like 12 of those take 1.44s and the rest just 0.02s. Adding ‘where time >= 1539986400s’ drops all query times to 0.01s.

Anyway, for the time being I am just running custom firmware with a longer timeout.