influxDB support

matt · April 10, 2018, 1:07am

Great Thanks! This is a helpful tip; the error from iota at the time was actually that this query kept failing. This helped me resolve that.

During my experimentation I also somehow ended up with new series named…

There must be a concatenation bug somewhere…

This was a bit rhetorical – I intended to raise a question regarding why do we have a static value here that isn’t valuable in any queries. I also intended it to question you for other use cases that may not be clear to me (i.e. was there tooling that isn’t obvious already built elsewhere that required this…)

This comment makes me think we are talking past each other here. I don’t understand the issue your raising with cardinality (I’m only proposing tags for the sensor input names – those are strings and shouldn’t change).

Your concerns about storage are answered here (in which they recommend device identifiers are added as tags )

Setup background: I am running Influx in a Docker instance on an intel i5 quad (not an rPi!) consuming lots of local IoT metrics (not just IoTaWatt) at a rate of (100 events/s @ 2% cpu usage and only 5MB/12hr) so I agree that I’m not a normal user… never claimed to be.

I also acknowledge your point about the limits of the ESP device – that’s why I’d love to have all of this data in influx with as little CPU/BW as possible and avoid complicated script execution!! JFYI – to level-set context – I interact with global-scale Influx-like environments on a daily basis, I probably am making assumptions on shared-experiences that aren’t there, I’m just trying to steer these metrics with some lessons learned from a different life.

Anyway – let’s ground a use-case here.

My examples in the previous post were – bad – and lacked context. I have a lot of half-baked ideas for the IoTaWatt (this is why I love the OSS format, thank you again for all your work) but here are the ones I am currently working on (happy to contribute back if others are interested), that I’d love to see full support for with the new influx format.

Goals for influx data:

Calculate un-monitored components (W/A/VA/PF/AC Cycles on channel).
Calculate total usage across Right/Left of 220v input (W/VA/PF)
Integrate for KWH across specific components
Calculate VA/PF derivative across specific components/sensors
Estimate W shifts/error based on V shifts on non-shifting components (i.e. LED 100khz smps constant current pwr supply won’t shift W usage based on voltage changes so these effectively detect real IoTaWatt measurement errors).
Other Data-is-fun projects using ML which require lots of “signal” instead of just /watts/.

Setup:

I use 12 inputs at the moment, and I have a web-scraper that monitors some of the internal IoTaWatt metrics (from the json); side-bar - Publishing those cycle counts/total counts to variables that could make it to influx would be great.
2 inputs are on the 120v US split-220v-main; right and left.

Acknowledging our goals are different… and recognizing scripting could make this easier AND also to avoid talking past each other, lets try to solve my first goal – purely in InfluxQL with 15m increments (with the intent to graph in grafana) using the various data-models we have discussed and that are built today:

BETA InfluxQL:


In influx-client:

create continuous query combined_cq on iotawatt

RESAMPLE EVERY 15m

begin

select sum(“watts”) AS “total_watts” into combined from (

select integral(value, 15m) as “watts” from /^power_.*/ GROUP BY time(15m) fill(null)

) GROUP BY time(15m) fill(null)

end;
create continuous query combined_cqv on power

resample every 15m

begin

select mean(value) as “volts” into combined from voltage group by time(15m) fill(null)

END;
create continuous query combined_cq_other on power

resample every 15m

begin

select integral(value,15m) as “counted_watts” into combined from /^sensor_.*/ group by time(15m) fill(null)

END;
In Grafana:

Watts Graph

Calculate Unlabeled Power:

select total_watts - counted_watts from combined WHERE ($timeFilter)
Calculated Labeled Power:

select total_watts from combined where ($timeFilter)
Calculate Total Power:

select integral(value, 15m) as “watts” from /^power_.*/ where ($timeFilter) GROUP BY time(15m) fill(null)

Amps Graph:

Unlabeled:

select (total_watts - counted_watts)/volts from combined WHERE ($timeFilter)
Labeled:

select total_watts/volts from combined where ($timeFilter)
Total:

select integral(value, 15m)/volts from /^power_.*/ where ($timeFilter) GROUP BY time(15m) fill(null)

In Alpha


Watts Graph:


Unlabeled:

select (integral(power_right_w,15m)+integral(power_left_w,15m)-(integral(“sensor1”,15m)+integral(“sensor2”,15m)+…) from iotawatt WHERE ($timeFilter) GROUP BY time(15m)

a) List each sensor here in expansion … regexes not supported in integral function ;(
Labeled:

select (integral(power_right_w,15m)+integral(power_left_w,15m) from iotawatt WHERE ($timeFilter) GROUP BY time(15m)
Sum Labeled:

select integral(“sensor1”,15m)+integral(“sensor2”,15m)+… from iotawatt where ($timeFilter) GROUP BY time(15m) fill(null)

a) List each sensor here in expansion … regexes not supported in integral function ;(

Total: select (integral(power_right_w,15m)+integral(power_left_w,15m) from iotawatt WHERE ($timeFilter) GROUP BY time(15m)
Removes the need for CQ’s – this is great and a huge improvement above the current usage of series, still somewhat complicated queries that aren’t definable within Grafana (harder for someone to just use grafana to set this up) and requires maintenance/mapping manually.

Using tagging:

Where the output tag is the sensor_output name, and watt/amp/hz is a field name.


Watts Graph:


Unlabeled:

select integral(watt,15m) -

( select integral(watt,15m) from iotawatt where output=/^sensor_.*/ and ($timeFilter)  GROUP BY time(15) fill(0))

from iotawatt where output=/^main_.*/ and ($timeFilter)  GROUP  BY time(15) fill(null)


Labeled:

select integral(watt) from iotawatt where ($timeFilter) GROUP by time(15m) fill(null)

a) Each sensor: select integral(watt) from iotawatt where ($timeFilter) GROUP by output, time(15m) fill(null)


Sum Labeled:

select integral(watt) from iotawatt where output=/^sensor_.*/ and ($timeFilter)  GROUP by output, time(15m) fill(null)


Total:

select integral(watt,15m) from iotawatt where output=/^main_.*/ and ($timeFilter) GROUP BY time(15m) fill(null)

Almost fully data-driven – only one query is difficult to replicate (the subquery) within grafana. It can probably be achieved in another way with Grafana. Grafana’s alias by tagging is really nice combined with this tag grouping [$tag_output]

There may be easier ways to do each of these things, just trying to highlight the benefits of tagging when writing data-driven queries.

overeasy · April 10, 2018, 3:00am

First the easier stuff. There is a use case and I’ll try to explain it. There is interest in a lot of sectors (energy auditing, property management, solar PV networks) to monitor multiple installations. In that scenario, the measurements are the collection of power sensor outputs (fields) related to each installation, at each time interval. The individual sites are defined and categorized using tags. All of the data is uploaded to a single influxDB database. Depending on the industry the data is sliced and diced into region, user, organization, facility type, whatever. I have users who need this capability, so I’ve been leaning toward this relatively flat layout because it’s simple to understand, deploy, and use. Once the points are atomized by unique tags, these users would need to fall back on regular expressions, as you have, to accomplish what are simple tasks under the current scheme. With respect to influxDB I see the IoTaWatt as more a node in a network of measuring locations than a single home appliance.

I admit that I’m over my head but that’s my point. When it gets that complicated, it’s time to question if that’s what the device should be doing. As you say, that’s the beauty of OSS, you can fork it and rework the influxService to your liking.

So thinking out load about how that works… In the old implementation, the input name was the measurement name. IMO the IoTaWatt measures power every 5 seconds. I like to equate that to a measurement. So maybe each of the (14) inputs becomes a different point within the same measurement? as in:
power,output="main_left" value=932.61 152236784
That produces a tag output with a cardinality equal to the number of outputs, a relatively low value. But it also forces the data for one 5 second interval to be stored as 14 points (thanks for the technical references).

true that. Not going to dump my resume, but let’s just say I’m rusty. I’m wearing a lot of hats (all the hats) trying to get this in line and working reliably. I appreciate that you understand I have limitations. Right now I’m in the middle of sourcing accessories from Asia, regulatory and safety compliance testing, developing distribution channels, and organizing this influx/grafana component to make the device more useful. I’d love to just program and maybe port to the ESP32, but that’s not the job right now.

Thanks for all your input.

overeasy · April 10, 2018, 2:03pm

@matt,

I’ve been working with this some more and looking at it from a different perspective. Here’s what I’m thinking:

For each “output” from IotaWatt - produce a measurement that:

Has a measurement name of the output.
Has a field called “value” that is the output metric.
Contains the global tag-set specified for that IotaWatt.

This is pretty much like the original implementation except for the addition of the tag-set, which will allow for identification of nodes, regions, etc.

What drove me to this was my inability to get grafana to produce a composite graph from a multi value variable of fields. I could easily make a query to do it, but I couldn’t get grafana to do that. Their forum was unresponsive.

What I would like to know is if this would better accommodate your needs?

fredrike · April 10, 2018, 8:42pm

This sounds exactly what grafana is experiencing (and the way HomeAssistant is uploading data to influxdb).

May I suggest that the measurement part follow the unit of the measurement too (Watt, VA, PF, Wh, etc.).
I know that the bandwidth will increase with this format but it is apparently the “correct way”. (And if all measurements for all input could be sent as a long string that is newline separated it should not be that big issue.)

overeasy · April 10, 2018, 9:41pm

I was thinking of doing that. So you are saying instead of using the field key “value” use “watts” or “amps” etc? That makes sense. I was trying to come up with a way to specify multiple units for one measurement, so for instance you could ask for watts and pf for a circuit and the entry would have two entries in the field set for those units. In any event, the format leaves the door open for that in the future.

That’s how it works now. When it gets behind, or if bulk-send is greater than 1, it will aggregate the measurements into a larger HTTP transactions.

fredrike · April 11, 2018, 6:46am

According to this https://docs.influxdata.com/influxdb/v1/guides/write_data/ I think the following format should be perfect:

watts,input=1,name=MyFirstInput value=0.55 1422568543702900257
amps,input=1,name=MyFirstInput value=2.0 1422568543702900257

etc…

matt · April 12, 2018, 3:44am

@overeasy – @fredrike – I’m just happy we’re discussing this. I haven’t submitted any PR’s yet, just a happy user, will look into doing so in the future. (Would you prefer PR’s in GitHub over the forums here?)

Influx isn’t your only consumer and this data-model complexity may not be worth it (I can always reorganize locally). Just popped in since I saw the database format was changing. I think my queries previously capture the format/query goals I was trying to advocate.

If i had to come up with a UI for this, perhaps it would be a IoTaWatt input label + some check boxes:
INPUT_1_LABEL [ amps ] [ watts ] [ va ] [ pf ] ...
OUTPUT_X configured_output_unit [ enable ]

Which would then correspond to an influx “line protocol” of something very simlar to @fredrike’s proposed format:
<unit>,name=INPUT_1_LABEL,input=1 <configured value>=XX.XXf <time>
^^ i really liked the name and the input here – allows some improved mapping from the measurement to the database without double de-aliasing.

matt · April 12, 2018, 3:50am

Just to highlight the tags here – Grafana’s alias via tags is a very powerful feature that I’ve grown to love.

fredrike · April 12, 2018, 8:28am

matt:

<unit>,name=INPUT_1_LABEL,input=1 <configured value>=XX.XXf <time>
^^ i really liked the name and the input here – allows some improved mapping from the measurement to the database without double de-aliasing.

We should also support user custom tags here to uniquely identify each IoTaWatt unit.

Should we really have <configured value> should that not “always” just be value?

I created an issue on github a couple of days ago and in the comment I show how data from Home-Assistant is represented in influxdb vs. data from IoTaWatt (Better support for outputs · Issue #145 · boblemaire/IoTaWatt · GitHub), allowing grouping by (in the Home-Assistant case) friendly_name allows for really easy and powerful graphs in grafana.

Don’t forget the ability to use Variable syntax | Grafana documentation for selecting subset of “inputs”.

overeasy · April 12, 2018, 3:05pm

With some pressure to finalize this, I’ve had to make some decisions. My sense is that while it doesn’t seem possible to accommodate all of the suggestions above, the input has helped shape an approach that should be workable for everyone.

I take some comfort in seeing that the influData folks themselves have different approaches in the several examples that they use. For instance the basic CPU load measurement examples and the entomological census example. It all bears on how the data will be used.

So here’s what I have in first level testing:

As hinted at a few days ago, the basic line protocol measurement is:

<output-name>[,<1st-tag-key=1st-tag-value[,2nd-tag-key=2nd-tag-value[,....]]] value=<output-value> <time>

I liked the idea of using the measurement units instead on “value” in the field-set-key, but this doesn’t seem to play well with grafana in specifying aggregation functions with regular expressions. The basic configuration screen looks like this:

Notice that there are some new specifiers:

retention policy has been added. I’m not sure why this hasn’t been mentioned before by anybody, but it needed to added and if specified will be associated with the measurements. If left off, influx will use the default policy.
upload history from: If specified, establishes the starting point for uploading available history. If there are any entries in the database, qualified by the first tag, the time of the last entry will override. This is mostly useful when first setting up an influx server, or for cleaning existing data and uploading a modified data view.

The status display now has a tab for influxDB:

The influx service can now be started and stopped. This way, you don’t need to discard your configuration to suspend updating. The current running/stopped state is displayed, and the time of the most recent posted measurements is continuously displayed. When the service is stopped, it will query the database for most recent updates when restarted. So if you drop the most recent measurements, it’s possible to upload fresh measurements using a modified specification.

I hope to have this available for ALPHA by the weekend. The standard disclaimers apply to ALPHA subscribers.

fredrike · April 12, 2018, 6:41pm

I’m sorry to say but this is not a very good solution either. <output-name> should be a “tag” and the first parameter should be “unit” as this allows really sweet grouping in grafana.

<unit>,name=<output-name>[,1st-tag-key=1st-tag-value[,2nd-tag-key=2nd-tag-value[,....]]] value=<output-value> <time>

Should however be perfect.

Edit:
This allows for queries like this in grafana (where $temp_id is a variable set from a dropdown menu).

overeasy · April 12, 2018, 8:07pm

I was working with a different view:

I guess it depends on what you are trying to do. Your temperature example does reinforce the notion that these measurements can be introduced into a database including other forms of energy. There’s no perfect solution. If it’s any consolation, it would be possible to offer different influx schemes in the future. Id rather keep that in reserve than try to make one user interface do all things for all people.

Thanks for the input.

overeasy · April 13, 2018, 9:17pm

CCCAs I worked to finalize this today, I added the ability to override the defaults using variables. It’s possible that this will accommodate all of the schemes that have been advanced. I’ve defined three variables:

$device = the “Device name” in the setup/device section.
$name = the name of the current output.
$units = the units associated with the current output.

I’ve added two new specifications to the influxDB configuration:

measurement = The name to be used for each measurement. Default is ‘$name’
field-key = the string to be used as the key in the field set. Default is ‘value’

The new variables can be used in the following specifications:

measurement
tag value
field-key

So now, using the defaults, we get the following line protocol measurements:

<output name>[,tag-key1=tag-value1[,tag-key2=...]] value=<output value> <time>

If we specify measurement=$units and add a tag name=$name we get:

<output units>,name=<output-name> value=<output value> <time>

Watts,iota=test,name=input1 value=156.38 1523648480
Watts,iota=test,name=input2 value=467.56 1523648480
Watts,iota=test,name=input3 value=155.14 1523648480
Volts,iota=test,name=voltage value=118.96 1523648480

That should look familiar to @fredrike .

I think @matt case has measurement=iotawatt, a tag name=$name, and field-key set to $units:

iotawatt,name=<output name> Watts=<output value> <time>

iotawatt,iota=test,name=input1 Watts=155.46 1523648580
iotawatt,iota=test,name=input2 Watts=468.35 1523648580
iotawatt,iota=test,name=input3 Watts=154.72 1523648580
iotawatt,iota=test,name=voltage Volts=118.93 1523648580

In both examples there is an extra tag ‘iota’ which could be removed, just trying to make it easier to clean up after posting these examples to my database.

The rules are:

The variable names are case sensitive.
Variable names are replaced as they are encountered in a string, left to right. So if you have a tag-value of ‘$name.$units’, for output mains with units Watts the tag value would be ‘mains.Watts’ if the tag-value were ‘circuit.$name.02’ you would get ‘circuit.mains.02’.
All of the values that influx defines as strings can be specified in these fields with any characters except: white-space, " (double quote), ’ (single quote). They will be inserted into the relevant influx expressions unquoted.
If the $ is used outside of one of the variable names, it will be treated like any other character.

I’m thinking that this will allow complete flexibility in uploading data without injecting a lot of complexity. I’ll build dashboards around the default case, and it could end up that down the road contributed dashboards will require a specific data view. That’s something for anyone to consider before deviating from the defaults.

Here’s what I hope will be the final configuration page:

matt · April 15, 2018, 10:11pm

I look forward to trying this out in 05 sometime this week! Thanks for all of your work @overeasy!

philb · June 24, 2018, 9:05am

I am attempting to get uploads to InfluxDB working, but no luck yet. The logs show I am getting an 400 from ‘last entry query failed’. Logs attached, the multiple 400 blocks are me attempting various changes to the config. Also screenshot of config.

The database exists, I can send/query random data from the commandline from a machine elsewhere on the network so shouldn’t be privledges/firewall issues.

Any ideas what is going wrong here?

log.txt (2.1 KB)

overeasy · June 24, 2018, 4:43pm

I think so. The HTTP code 400 is in response to the initial query where IoTaWatt is trying to determine the time of the last successful post. I think it’s failing because of how you are specifying the first tag set in your configuration.

I’m sure you’ve read the influx section in the wiki, but maybe you want to take another look at the way variables are treated. I’m probably too close to it and maybe didn’t explain it that well.

You are using $name for the value of the first tag-set entry that is used to identify the influxDB series associated with this IoTaWatt (for cases where multiple IoTaWatt are posting to the same influx database). $name, by definition, resolves to the name associated with each particular output that is defined. So during the initial context query, there is no particular output and so $name is undefined. I’ll look into how that corrupts the query and how to make this problem more obvious, but right now, I want to get your issue resolved.

Looking at the approach you’ve taken to sending measurements, the grid and heatpump measurements will have the same measurement name (watts) and the same field name (value). So you seem to want to differentiate the two using unique tags, name=[grid, heatpump]. That’s fine, but the first tag set should be something like device=$device. That tag will be the same for all of your measurements from this IoTaWatt and that’s important because the initial context query will use that, along with wildcards, to produce an initial query that looks at all of the latest measurements from this IoTaWatt to determine the last posting time.

So to sum up, try changing the first tag set to be device=$device and adding your unique identifier name=$name as the second tag set.

fredrike · June 24, 2018, 8:15pm

I have another stange issue with influxdb.

Everything worked fine until my influxdb server crashed (had a disk failure) now I can’t get my iotawatt (running Firmware version: 02_03_08) to upload data. It’s stuck on Running. Last update 01/01/1970 01:00:00 no matter what I do.

	"influxdb": {
		"type": "influxdb",
		"revision": 72,
		"postInterval": 5,
		"bulksend": 2,
		"url": "http://192.168.1.2",
		"database": "IoTaWatt",
		"measurement": "$units",
		"tagset": [
			{
				"key": "device",
				"value": "$device"
			},
			{
				"key": "entity_id",
				"value": "$name"
			},
			{
				"key": "unit",
				"value": "$units"
			}
		],
		"outputs": [
			{
				"name": "Bathroom",
				"units": "Watts",
				"script": "@4"
			},
			{
				"name": "Bathroom.kWh",
				"units": "kWh",
				"script": "@4"
			},
			{
				"name": "Inverter",
				"units": "Watts",
				"script": "@7"
			},
			{
				"name": "Inverter.kWh",
				"units": "kWh",
				"script": "@7"
			},
			{
				"name": "Total",
				"units": "Watts",
				"script": "@1+@2+@3+@4+@7"
			},
			{
				"name": "Volt",
				"units": "VA",
				"script": "@0"
			},
			{
				"name": "Voltage",
				"units": "Volts",
				"script": "@0"
			},
			{
				"name": "VVB",
				"units": "Watts",
				"script": "@2+@3"
			},
			{
				"name": "VVB.kWh",
				"units": "kWh",
				"script": "@2+@3"
			},
			{
				"name": "Washer",
				"units": "Watts",
				"script": "@1"
			},
			{
				"name": "Washer.kWh",
				"units": "kWh",
				"script": "@1"
			}
		],
		"stop": false,
		"begdate": null
	}

overeasy · June 24, 2018, 8:32pm

Could you post the message log please?

fredrike · June 25, 2018, 5:10am

It does not seem like influx is running…
But I can’t click the stop button under Status, I just get a error alert saying that the configuration changed.

** Restart **

SD initialized.
6/24/18 18:42:59z Real Time Clock is running. Unix time 1529865779 
6/24/18 18:42:59z Power failure detected.
6/24/18 18:42:59z Version 02_03_08
6/24/18 18:42:59z Reset reason: External System
6/24/18 18:42:59z Trace:  254:177, 4:9, 246:4, 157:190, 243:216, 131:106, 172:223, 115:3, 206:89, 165:18, 93:199, 86:35, 35:171, 198:1, 86:222, 81:201, 45:242, 87:140, 238:206, 86:20, 166:137, 86:168, 254:107, 209:195, 36:187, 174:4, 7:147, 188:195, 168:173, 69:242, 255:124, 94:84
6/24/18 18:42:59z ESP8266 ChipID: 1847056
6/24/18 19:42:59 Local time zone: 1
6/24/18 19:42:59 device name: iotawatt, version: 3
6/24/18 19:43:02 Connecting with WiFiManager.
6/24/18 19:43:05 MDNS responder started
6/24/18 19:43:05 You can now connect to http://iotawatt.local
6/24/18 19:43:05 HTTP server started
6/24/18 19:43:05 dataLog: service started.
6/24/18 20:11:13 dataLog: Last log entry 2/6/6 07:28:15 
6/24/18 20:11:13 statService: started.
6/24/18 20:11:13 timeSync: service started.
6/24/18 20:11:13 WiFi connected. SSID: Fredrik_AP, IP: 192.168.1.22
6/24/18 20:11:13 Updater: service started. Auto-update class is ALPHA
6/24/18 20:11:14 updater: Auto-update is current for class ALPHA.

overeasy · June 25, 2018, 12:30pm

The problem appears to be that the datalog is not current. The message log suggests it is damaged. Without more information it’s not possible to determine what might have happened. You probably need to delete the current log:

HTTP://iotawatt.local/command?deletelog=current

The influx service will not start if the datalog isn’t opened and current.