Multiple IotaWatt data aggregation

dr.diesel · November 12, 2020, 9:35pm

Hello all,

I fully understand the complexity of what I’m about to ask, but it would make for a really slick new feature!

I have a main and 3 sub panels at my house and would like to suggest the following:

Input data aggregation from all IoTa’s to a single web interface. (I realize this can be done via Influx)
With the aggregated data flowing between all IoTa’s it would then be possible to create an Output from sensors across multiple devices. This would solve the issue of having to create complicated Influx/Graphing queries.

Many thanks to @overeasy for such a great device.

BR

frogmore · November 13, 2020, 2:42am

Yeah, I would like that too. I have 2 IotaWatts, but I suspect you and I are in the minority for having more than one.

I haven’t gotten around to setting them up to output to influx. But once I do I suspect it will give me what I need.

dr.diesel · November 13, 2020, 11:48am

I’m an .rpm Linux guy and Influx isn’t in the official Repos, so I’m really hesitant to go that route. Maybe I’ll try and use the API (wonder how hard I can hit this without causing issues?) to post to MariaDB which would give me the best scenario as I’m very comfortable and already have substantial DBs already in house.

A MySQL/MariaDB built in feature would also be handy. Searching around the forum I hoped to find someone that had already gone through it via python!

SandyB · November 13, 2020, 3:00pm

You could also consolidate into an EmonCMS instance - it’s intended to run on Linux/Apache/PHP and you can configure it to save feed data into MySQL/MariaDB (it needs one for it’s meta data anyway). The other question is how you are going to analyze/visualize the data, One of the main reasons for going down the Influx path is that it integrates to Grafana. EmonCMS has its own dashboard/graph modules.

promy · November 13, 2020, 3:22pm

I also have multiple (diy) iotawatt. I’m sending the info to a local emoncms (running on a Raspberry Pi) and also an influxDB/Grafana combo (on another RPi).
The only drawback i’m having is that the data from different Iotawatt is not “synced” in Iotawatt as far as i know.
I mean when you add different inputs from different iotawatt in emoncms and the Iotawatt is “only” sending it every 10 seconds you could be adding X_time0 withe Y_time10. Not a very big difference, but a slim one.
Not sure how grafana/influxDB is handeling this.

dr.diesel · November 13, 2020, 3:32pm

Thanks for all of the responses. I have extensive home built legacy automation stuff here that is all custom written for mariadb, then visualized with Grafana.

I have just written a very basic/simple python script to pull data and store via maria, please see the attached:

import requests, time
from bs4 import BeautifulSoup
import mysql.connector

while True:
	url_to_scrape = 'http://iotawatt/query?select=[Input_0.volts,Input_0.hz,MainPowerPhase1.watts,MainPowerPhase2.watts,AdditionPhase1.watts,AdditionPhase2.watts,ShopSubPhase1.watts,ShopSubPhase2.watts,FrontArray1.watts,FrontArray2.watts,LivingRoomHVAC.watts,DravensPC.watts,MomsOffice.watts,usedwatts.watts]&begin=s-10s&end=s-5s&group=5s&format=csv&header=no'
	plain_html_text = requests.get(url_to_scrape)
	soup = BeautifulSoup(plain_html_text.text, "html.parser")
	x = soup.prettify()
	y = x.strip('\n')
	y = y.split(", ")
	print(y)
	if y[0] != "null":
		mydb = mysql.connector.connect(
		  host="192.168.2.41",
		  user="ha",
		  password="blabblah",
		  database="homeautomation"
		)
		mycursor = mydb.cursor()
		sql = "INSERT INTO iotawatt1(voltage, hertz, mainsphase1, mainsphase2, additionphase1, additionphase2, shopsubphase1, shopsubphase2, frontarray1, frontarray2, livingroomhvac, dravenspc, momsoffice, usedwatts) VALUES(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
		val = (y[0], y[1], y[2], y[3], y[4], y[5], y[6], y[7], y[8], y[9], y[10], y[11], y[12], y[13])
		mycursor.execute(sql, val)
		mydb.commit()
		mydb.close()
		print("Success")
	time.sleep(5)

I will continue to improve this, just wanted to create a very basic shell to see how hard/often I could hit the API.

frogmore · November 13, 2020, 4:43pm

It seems like you have the ability to do what you want. The line protocol of influxdb is documented. Influxdb is open source also, as I recall. You probably aren’t generating enough data that it matters that much what you use to store it.

I have been using influxdb and like it for its simplicity. But it took me a little while of studying it to get this point. You already use grafana which I think is great with visualization. There is a difference between a general purpose database (mariadb) and a time series based one (influxdb), but there is also overlap in what they can do. Influxdb has support for downsampling data, which you might not need right now, but might want after you have a few years of 5s data. There is also the new 2.0 version that includes Flux which is supposed to allow math between measurements. I will be looking into that.

But, it sounds like you have found something that works for you.

dr.diesel · November 13, 2020, 4:51pm

Thanks @frogmore

I’m kinda old school and as with anything legacy it takes quite a bit of effort to move forward. But I’d love to keep this kind of dialog open as it makes me realize I need to investigate/keep my eyes open as new tech comes along.

I will look into influx and why it’s not in the official repos, perhaps not fully open, however it’s easy enough to add a parallel datasets for eval.

overeasy · November 13, 2020, 5:25pm

If you have an internet connection, IoTaWatt uses NTP to synchronize time to better than 1 second resolution. Uploads to influx are time stamped with UTC. So there should be no issue with synchronization. Have you tried monitoring the same circuit with two IoTaWatt and then plotting with grafana from influx?

ogiewon · November 15, 2020, 5:20pm

@dr.diesel - one of the great benefits of using IoTaWAtt’s InfluxDB integration is that it automagically handles store and forward of data when the network is down. So, if you take your server offline, or your WiFi network for any reason, the IoTaWatt will back fill all of the missing data once the IoTaWatt can reach the InfluxDB server again.

promy · November 15, 2020, 7:06pm

There is no problem within IotaWatt of influxdb. The problem is within Openenergymonitor.
The data itself is ok, but if you make calculations with 2 inputs from different Iotawatt, it will use the data available, which is not always the current value.

frogmore · November 25, 2020, 2:51am

OEM doesn’t use influxdb. Influxdb prior to Flux does not do calculations across measurements. OEM is an alternate solution that allows you to do math on data that is coming in. I have not tried to do cross measurement math with it. When I need to do that, I typically use Node-RED with an MQTT source. But, IotaWatt does not speak MQTT and this makes some sense based on its goals.

Using MANY channels and trying to do math on them is going to be a hard problem to solve correctly 100% of the time. If that is your requirement, I don’t have a solution. I am not making billing accuracy decisions with any of the data, so being mostly right is okay for my needs. I would love it to be better, but I am probably not willing to pay what it would cost to enable that

overeasy · November 25, 2020, 3:39am

Couple of comments. I just resolved an issue with Node-RED where it was unable to handle IoTaWatt’s query response. That will be coming out in the next release. While working on that, it occurred to me that a Node-RED flow running on a timer could query IoTaWatt and publish MQTT messages. Not only would it be easier than doing so in IoTaWatt, but could also publish securely using TLS.

Along the same lines, query can be used to extract synchronized data from multiple IoTaWatt either by Node-RED, processing and writing to wherever.

frogmore · November 25, 2020, 6:22pm

Yes, that could work. As long as you limit the frequency of the query, the load on the esp8266 would probably not be too much. This would increase data/utilization of the Wi-Fi network, but should not be significant (and only slightly more than if IotaWatt was doing it on its own). There would be an increase in latency, especially if the query frequency was really slow, but for most things it would probably be fine.

overeasy · November 25, 2020, 6:59pm

Never used Node-Red, but when I get a few minutes I’m going to give it a try. From what I saw, I could issue a show=series query to get the names of the inputs and outputs, then pump that into a query select to get the data in Json and then parse that and blow out the mqtt messages.

Overheadwise, I think it’s pretty trivial with a single 5 second query.

frogmore · December 2, 2020, 2:05am

I have a proof of concept working for aggregation of data from two IotaWatts. To get it to work it is necessary to use Influx 1.8 and enable Flux. Then have to get Grafana v7.3.4 and set up the Influxdb datasource to use flux (which is in beta, but seems to work). Here is an example of a query that adds two values together.

# FinalOuput gets Grafana to use nicer names for the legend
FinalOutput = ["_field", "_time", "_value"]
# Flux generated an error when I changed the value of Sensor later (to get the 2nd channel, which is from a different IotaWatt)
SensorA = "MainA"
# database is set in the Grafana data source, measurement is iotawatt
# v.timeRangeStart/Stop is from Grafana so it follows what is set in the dashboard
MainA = from(bucket: "iotawatt/autogen")
  |> range(start: v.timeRangeStart, stop:v.timeRangeStop)
  |> filter(fn: (r) =>
    r._measurement == "iotawatt" and
    r._field == "Watts" and
    r.channel == SensorA
  )
  |> aggregateWindow(every: v.windowPeriod, fn: mean)
  |> set(key: "_field", value: SensorA)
  |> keep(columns:FinalOutput)
# the last two lines set a nice name (for _field) and remove extra stuff to keep Grafana from using it

# Now do the same thing for the other channel
SensorB = "MainB"
MainB = from(bucket: "iotawatt/autogen")
  |> range(start: v.timeRangeStart, stop:v.timeRangeStop)
  |> filter(fn: (r) =>
    r._measurement == "iotawatt" and
    r._field == "Watts" and
    r.channel == SensorB
  )
  |> aggregateWindow(every: v.windowPeriod, fn: mean)
  |> set(key: "_field", value: SensorB)
  |> keep(columns:FinalOutput)

# this is where the magic happens (math across measurements, which has not been possible before)
join(tables: {MainA: MainA, MainB: MainB}, on: ["_time"])
  |> map(fn:(r) => ({
        _time: r._time,
        _value: r._value_MainA + r._value_MainB,
    }))
  |> set(key: "_field", value: "Main")
  |> keep(columns:FinalOutput)

This generates a set of data for my Main. I have one IotaWatt on one leg and another one on the other leg. Since this is generating a graph, I could have simply told Grafana to stack the MainA and MainB measurements and it would give a similar visual result. But, what about a single stat box to show the maximum usage?

FinalOutput = ["_field", "_time", "_value"]
MainA = from(bucket: "iotawatt/autogen")
  |> range(start: v.timeRangeStart, stop:v.timeRangeStop)
  |> filter(fn: (r) =>
    r._measurement == "iotawatt" and
    r._field == "Watts" and
    r.channel == "MainA"
  )
  |> aggregateWindow(every: v.windowPeriod, fn: max)
  |> set(key: "_field", value: "MainA")
  |> keep(columns:FinalOutput)

MainB = from(bucket: "iotawatt/autogen")
  |> range(start: v.timeRangeStart, stop:v.timeRangeStop)
  |> filter(fn: (r) =>
    r._measurement == "iotawatt" and
    r._field == "Watts" and
    r.channel == "MainB"
  )
  |> aggregateWindow(every: v.windowPeriod, fn: max)
  |> set(key: "_field", value: "MainB")
  |> keep(columns:FinalOutput)

join(tables: {MainA: MainA, MainB: MainB}, on: ["_time"])
  |> map(fn:(r) => ({
        _time: r._time,
        _value: r._value_MainA + r._value_MainB,
    }))

With this query I then tell Grafana to give me the maximum value it gets from the query.