Separating different data types #4

Open
hyperbart opened this issue Feb 12, 2017 · 22 comments

Comments

@hyperbart
Collaborator

More of a thought exercise: how should we distinguish data types in InfluxDB/Grafana? Temperature vs. lux vs. energy vs. humidity, etc.

  • Distinction by group address in the query in Grafana?
  • Storing the DPT in the measurements table, and thus storing redundant data?
  • Creating a "DPT"-like table and linking each measurement to a DPT type in a separate table? Populating this table by matching each GA to the DPT type described in the config file and then iterating over the DPT table?
@snowdd1

snowdd1 commented Feb 18, 2017

Hello @hyperbart,
Nice to "see" you again! Actually, I started a similar project some days ago, and now I found yours. I was planning to use a MariaDB database simply because I have one on my NAS, found that Grafana does not yet support time series data from MySQL/MariaDB, and started writing my own Ajax for Chart.js...

But here is my comment: I would store the DPT in the measurements table; it's not redundant, it's a key. Otherwise you cannot cleanly convert, e.g., a percentage from the byte value sent over the bus.

@hyperbart
Collaborator Author

Aha, likewise! Just to set the record straight: all the work and effort goes to FireDrunk; I am just the client 😄.

Back on topic: I have had the same discussion with FireDrunk, and one of his arguments was that the data stored in InfluxDB has already been processed and converted by a component before it reaches InfluxDB. I don't know the details, and I tend to take your side on this issue, but if the data is already converted, storing the DPT does seem a little bit redundant. I know a little about KNX, but not (yet) the whole technical story about DPTs and conversions.

@snowdd1

snowdd1 commented Feb 19, 2017

Thing is, you simply can't distinguish between types that have the same digital signature in the DPT; all 5.xxx types, for example, are 1 byte of payload. The interpretation "that's a percentage" vs. "that's a small integer" vs. "that's an arc degree" is up to the client. Can you provide the GA list to KNX.js so it can do the interpretation? I have never used that package, as I run knxd for hardware access.
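Since all the 5.xxx subtypes share the same one-byte wire format, the ambiguity is easy to demonstrate. A minimal runnable sketch (scaling factors follow the KNX DPT 5 definitions; not tied to any particular library):

```javascript
const raw = 0x80; // the same 1-byte payload from the bus (128)

// DPT 5.001 (percentage): 0..255 maps to 0..100 %
const asPercent = (raw * 100) / 255;

// DPT 5.010 (counter pulses / small unsigned int): the value as-is
const asUint8 = raw;

// DPT 5.003 (angle): 0..255 maps to 0..360 degrees
const asDegrees = (raw * 360) / 255;

console.log(asPercent.toFixed(1)); // "50.2"
console.log(asUint8);              // 128
console.log(asDegrees.toFixed(1)); // "180.7"
```

Identical bytes, three different readings; only the GA-to-DPT mapping tells the client which one is right.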

@hyperbart
Collaborator Author

That's the thing: the interpretation has already been done before the value gets stored in the InfluxDB.

In the environments JSON file you can specify which GA is which DPT.

Example:

https://github.com/FireDrunk/node-kninflux/blob/develop/example.environment.json
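The linked example file is authoritative; purely as an illustration of the idea, the mapping boils down to something along these lines (keys, names, and shape here are my assumptions, NOT the actual example.environment.json format):

```javascript
// Hypothetical shape of a GA -> DPT mapping; see the linked file for
// the real format used by node-kninflux.
const environment = {
  '1/3/1': { dpt: '9.001', name: 'temperature bedroom' },
  '2/4/3': { dpt: '9.001', name: 'temperature living room' },
};

// With such a map, the logger can look up the DPT for each telegram
// before converting and storing the value:
const dptFor = (ga) => environment[ga] && environment[ga].dpt;
console.log(dptFor('1/3/1')); // 9.001
```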

@snowdd1

snowdd1 commented Feb 19, 2017

Ah, that's cool. And da*n, I was just working on that too. So I don't need it any more, I guess. Maybe I'll start a fork to make that package work with knxd too; I don't want to have two multicast services on one device. Or I would need an InfluxDB on my Synology box. mmmmh.
BTW, what happens to DPT16 and other character telegrams?

@snowdd1

snowdd1 commented Feb 19, 2017

Just looked at the environment file. The only thing I didn't find is an automatic converter from an ETS5 GA export to an environment file, right?

I think I did that part yesterday for my own approach; with small changes it should be able to produce a file in this format too.

@hyperbart
Collaborator Author

Haha, no; that's something you would have to do manually. But by all means go ahead 😄.

@snowdd1

snowdd1 commented Feb 19, 2017

I hate "manually". I've always thought of getting this more automated for homebridge-knx as well, but apart from reading the GAs, the exports were no good (especially if the schema is messy). With ETS5 you can get a clean list of GAs with DPTs as XML, which didn't work well with ETS4 for me (might have been my fault, after all).
I'll upload it to GitHub in the next few days for inspection!

@snowdd1

snowdd1 commented Feb 19, 2017

I am still wondering how it tries to cast from DPST-9-1 to DPT9.001, because the .match(regexp) results are quite different. And

var m = 'DPST-9-1'.toUpperCase().match(/(\d+)(\.(\d+))?/)

has no m[3], which is required some lines later for the subtype ID, because the string doesn't contain the literal dot \. but a dash (two of them, in fact). That should effectively prevent all subtypes other than the default from being recognized when the name is in ETS format.

==> @FireDrunk ?!
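To make the mismatch concrete (runnable in Node; the "more tolerant" pattern at the end is only my sketch, not an upstream fix):

```javascript
// The library's pattern only expects the "9.001" style, so the ETS
// "DPST-9-1" form loses its subtype:
var m1 = 'DPST-9-1'.toUpperCase().match(/(\d+)(\.(\d+))?/);
console.log(m1[1]); // '9'
console.log(m1[3]); // undefined -- the subtype '1' is never captured

// A sketch of a more tolerant pattern (an assumption, not existing
// code) that accepts either '.' or '-' as the separator:
var m2 = 'DPST-9-1'.toUpperCase().match(/(\d+)[.-](\d+)/);
var normalized = m2[1] + '.' + m2[2].padStart(3, '0');
console.log(normalized); // '9.001'
```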

Are you sending READ telegrams at the interval for ALL addresses in the environment file? Wouldn't it be better to specify which ones to read in the file (or somewhere else)? Frankly, I don't dare to test; I don't have a test house available...

@hyperbart
Collaborator Author

I've always thought of getting this more automated for homebridge-knx as well, but apart from reading the GAs, the exports were no good (especially if the schema is messy).

I think the guys from Thinka facilitate that by allowing an upload of the ETS project file, then filtering out or renaming things in a web GUI, where you can also further specify the kind of device behind a certain GA.

Are you sending READ telegrams at the interval for ALL addresses in the environment file? Wouldn't it be better to specify which ones to read in the file (or somewhere else)? Frankly, I don't dare to test; I don't have a test house available...

We do have some problems with that: since the whole KNX traffic thing is asynchronous, it's not like TCP, where you know that segment 3 is a response to segment 2. So, yeah, basically we are now flooding the bus in one big spike by sending out requests for all the addresses listed in the JSON file. The timeout kninflux waits might also be too short, so we miss some responses (I have seen this in ETS, but if wanted I can test some things here, and I am pretty sure some guys at the Dutch tech site Tweakers.net would like to test this too).

@snowdd1

snowdd1 commented Feb 19, 2017

I wouldn't recommend waiting for a specific response. If you need a value, send out the request, and the answer will just happen. And try not to flood the bus with too many telegrams:

  • the bus speed (TP) is only 9600 baud
  • given that "normal traffic" is also present, flooding the bus will definitely suppress some of that normal traffic
  • flooding it will also prevent the answers from coming through! My bus has ~900 addresses; if I flooded it (which would probably take around 15 seconds at about 60 telegrams per second), I would probably not get any answers in those first 15 seconds, and would probably lose READ telegrams as well once their repetition counter expired, resulting in even more loss of (expected) answers.

I recommend

  • Defining the required read requests and their timing in a config somewhere
  • Sending the read requests without actively waiting for an answer to that specific read request. That's simply not the way KNX is designed.
  • Listen to all addresses in the environment regardless of READ requirements. Good KNX sensors actively send their values when something changes (configured as an active object in ETS)

Just things I have learned from tweaking my own house and trying to keep the WAF above ground zero 😄 for 6 years now.

BTW: I have pushed the stub at https://github.com/snowdd1/knx-ets5-ga-reader

@hyperbart
Collaborator Author

hyperbart commented Feb 19, 2017

@snowdd1: Yeah, I know this isn't best practice, but @FireDrunk had his reasons (which I am not able to explain here), and I had come up with the same arguments.

Offtopic: damn, what? 600 addresses? 😮 EDIT: just checked mine, around 300, which puts the 600 in perspective, since there are a lot of "to do's" here that will probably increase it too ;).

I think your third bullet point is the most important one: just listen on the bus for any value that passes by and put it in the DB. But we have arrived at the point where @FireDrunk should jump in to give his view on things.

Aha, the magical, all-deciding WAF 😉

@snowdd1

snowdd1 commented Feb 19, 2017

Offtopic: damn what? 600 addresses? 😮

Well, there's lighting, roller shutters, floor heating, window contacts, glass-break alarm, gas boiler, solar thermal buffer, fireplace w/ boiler, garage doors, motion detectors, and: I love to know the details, so my thermostats send a lot of statistics (target values, current values, valve percentages, etc.), as do the boilers.
I even built a small one-wire-to-KNX Raspberry to get more data from the water pipes of the heating ecosystem, and I am thinking of doing something similar for garden (soil) temperature and humidity.

@FireDrunk
Owner

FireDrunk commented Feb 20, 2017

Hi @snowdd1, nice to see someone interested!

But here is my comment: I would store the DPT in the measurements table, it's not redundant, it's a key. Otherwise you cannot convert % cleanly from the byte value sent over the bus, etc.

Not entirely true: the value is stored as the interpreted value, not the byte value, so you don't need the DPT type anymore for value interpretation; you only need it for visualization (like knowing whether a number represents a temperature or, in the case of humidity, a percentage).

Thing is, you simply can't distinguish between types that have the same digital signature in the DPT, like all 5.xxx types are 1 byte of payload. The interpretation "that's a percentage" and "that's a small integer" or "that's an arc degree" is up to the client. Can you provide the GA list to KNX.js to do the interpretation? I have never used that package as I run a knxd for hardware access.

From my understanding (which is limited), you should be able to know what type of integer comes from a device, given the DPT type. So in my honest opinion it's NOT up to the client...

I am still wondering how it tries to cast from DPST-9-1 to DPT9.001, because the .match(regexp) results are quite different. And

I honestly don't know; I'm not familiar with the internals of the KNX DPT lib.

Flooding it will also prevent the answers to come through! My bus has ~900 addresses, if I would flood it (would probably take around 15 secs to send at about 60 telegrams per second) I will probably not get any answers in that first 15 seconds, and probably loose READ telegrams as well as their repetition counter has expired, resulting in more loss of (expected) answers.

You are quite correct, but there's a catch: NodeJS and the KNX module are fully asynchronous, so reading the responses to the first requests happens in parallel with sending new ones.
So as long as the bus can keep up, it's not an issue.

To explore the subject further, I've been thinking about a 'load balancing' algorithm to spread the readings out over the specified interval.
The basic interval of 30 seconds is almost pointless, but it's good for testing; I think something like 5-10 minutes makes more sense.
The current flow, though, still creates all the read requests at the same time. It might be good to spread the creation of those requests across the total interval time.
But it's a tough nut to crack programming-wise.
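One way that spreading could be sketched: compute a pure schedule first, then fire one timer per entry. Function and field names here are mine, not kninflux's:

```javascript
// Compute when each group address should be read so the requests are
// spread evenly across the polling interval instead of fired at t=0.
function readSchedule(addresses, intervalMs) {
  const spacing = intervalMs / addresses.length;
  return addresses.map((ga, i) => ({ ga, atMs: Math.round(i * spacing) }));
}

// 300 addresses over a 5-minute interval -> one read telegram per second
const plan = readSchedule(
  Array.from({ length: 300 }, (_, i) => `1/1/${i}`),
  5 * 60 * 1000
);
console.log(plan[1].atMs);   // 1000
console.log(plan[299].atMs); // 299000

// Firing is then one setTimeout per entry (`sendRead` is a placeholder
// for the real bus write):
// plan.forEach(({ ga, atMs }) => setTimeout(() => sendRead(ga), atMs));
```

Keeping the schedule computation pure makes it easy to test without touching the bus.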

@snowdd1

snowdd1 commented Feb 25, 2017

That's the thing: the interpretation has already been done before the value gets stored in the InfluxDB.
In the environments JSON file you can specify which GA is which DPT.
Example:
https://github.com/FireDrunk/node-kninflux/blob/develop/example.environment.json

So I've done it. You can now create a JSON file from an ETS5 XML export. The format is not exactly the one you'd need here, but the code should be easily adaptable to the KNX.js environment format.

https://github.com/snowdd1/knx-ets5-ga-reader

I've also worked on a generic DPT parser that can read ANY DPT around (it uses templates from the ETS). The results are a bit different from a pure numeric parser's, as it also transports semantic meaning, which, of course, you do not need to put in a database. If you have the original telegram (the bytes) and know the DPT, you can get any information later at runtime.

And now it is published: https://www.npmjs.com/package/knx-dpt-parser

@snowdd1

snowdd1 commented Feb 25, 2017

Back to the original issue: It would be sufficient to store the group addresses in the database. I do not know enough about InfluxDB yet, but it should be feasible in any database system to create a view or query that combines the data table with the metadata table.

@FireDrunk
Owner

FireDrunk commented Feb 26, 2017

Theoretically yes, but I don't know if it's any good to make Influx 'add' the type dynamically to all results via relational logic. That's inefficient. It also amounts to storing data that aids the visual representation of the data inside the data itself, which is (a common) bad practice.

It's usually better to let the application that visualizes the data represent it in a manner that is properly understood and easily readable by the end user.

Perhaps we can introduce some form of 'grouping' for sensors, so that we can query all the data more easily by selecting an entire group. If a group contains only sensors with the same DPT type, it's easier to visualize all its data at once.
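In InfluxDB terms, such grouping could be done with tags on each point; tag and measurement names below are assumptions for illustration, not the project's actual schema:

```javascript
// Each point carries its GA plus group/DPT tags; tags are indexed in
// InfluxDB, so selecting a whole group stays cheap.
const point = {
  measurement: 'knx',
  tags: { ga: '1/3/1', dpt: '9.001', group: 'temperature' },
  fields: { value: 21.5 },
};

// Grafana could then pull the whole group with a single query, e.g.:
//   SELECT "value" FROM "knx" WHERE "group" = 'temperature' GROUP BY "ga"
console.log(point.tags.group); // temperature
```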

If you guys [@snowdd1 and @hyperbart] still think this is a good idea, can you please come up with a detailed, real-world example of how you would like to visualize the data?

@hyperbart
Collaborator Author

hyperbart commented Feb 28, 2017

@FireDrunk: I could "live" with just having to write a query that contains all the GAs I want visualized.
I am now completely with you on the concept that storing data purely to aid its visual representation is bad practice.

Furthermore: since each and every KNX project can and will be very different, I think there's no other way than picking each individual GA and putting it in the right graph/visualization by adding it to the WHERE clause of a query...

@snowdd1 : what do you think?

@FireDrunk: however, that does not mean I wouldn't like to have a copy of the raw data: basically an ingress table and another table with the processed data, just for occasional troubleshooting. It would also be a good "logbook" of EVERYTHING that passes on the bus...
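The per-GA approach described above boils down to building query strings like this (illustrative InfluxQL assembled in JS; the measurement and tag names are my assumptions):

```javascript
// Hand-picked GAs for one graph, OR-ed together in the WHERE clause:
const gas = ['1/3/1', '2/4/3'];
const where = gas.map((ga) => `"ga" = '${ga}'`).join(' OR ');
const query = `SELECT "value" FROM "knx" WHERE ${where}`;

console.log(query);
// SELECT "value" FROM "knx" WHERE "ga" = '1/3/1' OR "ga" = '2/4/3'
```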

@snowdd1

snowdd1 commented Feb 28, 2017

I haven't had the time to look into Grafana yet. How do I pull meta information to accompany the value data in the query? I am with you that storing metadata with each record is a waste of storage.

@hyperbart
Collaborator Author

hyperbart commented Feb 28, 2017

You don't... You open your ETS file and go, "Aha, right, GA 1/3/1 is the bedroom temperature, let me add that to my query." Next! "Aha, right, GA 2/4/3 is the living room temperature, let me add that to the query."

I have a SELECT * right now, since the only GAs I am storing in Influx are my temperature values...

@snowdd1

snowdd1 commented Feb 28, 2017

Buh, I thought one could just pump all values in there... probably FireDrunk is right then, and a separate table or measurement is required per type. Which, from what I've read about InfluxDB now, shouldn't be a problem nor a waste of storage, as InfluxDB is a schemaless database that does not use a fixed-column datastore.

@FireDrunk
Owner

Grafana is not a database; it's just visualization. It can use multiple data sources and has some nice logic for incorporating multiple queries into a single graph.

To be precise, my 'test' setup (which is basically @hyperbart's home :) ) has an installation of Grafana linked to InfluxDB. Influx takes care of all the storage, while Grafana queries Influx in real time.

I haven't played around with a 'type' table, but theoretically you could create a query in Grafana that selects all values of a given type.
This means you could create a "Temperature" graph collection and select all values (with the associated addresses) that have a specific DPT value.
