Working with time series data

These two articles are interesting.

These articles deal with two realities of time-series data in IoT systems:

  1. Devices with bandwidth-limited connections (LTE Cat-M) often report data at varying rates to conserve bandwidth and storage.
  2. When you graph data over a wide span of time, you often have more points than the graph is wide in pixels, so you need to compress the data. The simplest approach is to break the data into time windows and take the average of all points in each window. However, if you are trying to make sure the temperature for food storage stays within range, you are probably interested in the min or max values, and an average can hide exactly what you are looking for.
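The windowing idea in item 2 can be sketched in Python as follows. This is a minimal illustration, not code from any particular library: the `Bucket` type, the `downsample` function, and the temperature readings are all made up for the example. It keeps the min and max alongside the mean so a brief excursion is not averaged away.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Bucket:
    """Summary of one time window (hypothetical type for this example)."""
    start: float      # window start time, seconds
    mean: float
    min_value: float
    max_value: float
    count: int


def downsample(samples: Iterable[Tuple[float, float]], window: float) -> List[Bucket]:
    """Group (time, value) samples into fixed windows and summarize each one."""
    groups = {}
    for t, v in samples:
        start = (t // window) * window  # start of the window containing t
        groups.setdefault(start, []).append(v)
    return [
        Bucket(start, sum(vs) / len(vs), min(vs), max(vs), len(vs))
        for start, vs in sorted(groups.items())
    ]


# Made-up cooler temperatures (seconds, degrees C) with one brief spike.
readings = [(0, 3.0), (60, 3.0), (120, 9.0), (180, 3.0), (240, 3.0), (300, 3.5)]
for bucket in downsample(readings, window=300):
    print(bucket)
```

In the first window the mean is about 4.2, which looks harmless, while the max of 9.0 shows that the temperature did leave the safe range.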

I typically use InfluxDB to store time-series data, and it appears to have a time-weighted averaging function:

https://docs.influxdata.com/flux/v0.x/stdlib/universe/timeweightedavg/
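
To show why time weighting matters when the report rate varies, here is a minimal Python sketch. It assumes each value holds until the next sample arrives (sample-and-hold); the InfluxDB function may handle interpolation differently, so treat this as an illustration of the concept rather than a reimplementation. The function name `time_weighted_avg` and the sample data are made up for the example.

```python
from typing import List, Tuple


def time_weighted_avg(samples: List[Tuple[float, float]], end: float) -> float:
    """Average (time, value) samples, weighting each value by how long it held.

    Assumes `samples` is sorted by time and that each value stays constant
    until the next sample; `end` closes the last interval.
    """
    if not samples:
        raise ValueError("no samples")
    span = end - samples[0][0]
    if span <= 0:
        raise ValueError("end must be after the first sample")
    total = 0.0
    # Pair each sample with the time of the next one (or `end` for the last).
    for (t, v), (t_next, _) in zip(samples, samples[1:] + [(end, 0.0)]):
        total += v * (t_next - t)
    return total / span


# Made-up device: reports every 10 s while the value changes, then goes quiet.
samples = [(0, 10.0), (10, 20.0), (20, 30.0), (30, 10.0)]
print(sum(v for _, v in samples) / len(samples))  # plain mean of the points
print(time_weighted_avg(samples, end=120))        # weighted by duration
```

The plain mean of the four points is 17.5, but the value sat at 10.0 for most of the two minutes, so the time-weighted average is 12.5.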

However, we can do even better than a time-weighted average for variable sample rates. The device can still report at varying rates, but instead of reporting only instantaneous values, it can average the readings it takes during each reporting window and also record the min and max values seen in that window. This extra information (duration, min, max) is reported with each sample. If you are doing something like integrating total flow from flow rate (the area under the curve), you will get a more accurate result, because each reported value reflects everything the sensor saw during the window rather than a single instant. This is why the Simple IoT point data structure has Duration, Min, and Max fields. From this data, you can extract a more complete picture of what is going on with your sensor data.
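
The device-side averaging described above can be sketched in Python as follows. The `Point` class here is a hypothetical stand-in whose fields only mirror the Duration, Min, and Max fields mentioned in the text; it is not the actual Simple IoT definition, and the `summarize` and `total_volume` helpers and the flow readings are invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Point:
    """One reported sample (hypothetical, not the real Simple IoT type)."""
    time: float      # end of the reporting window, seconds
    value: float     # average over the window (here: flow rate in L/s)
    duration: float  # length of the window, seconds
    min: float       # lowest raw reading in the window
    max: float       # highest raw reading in the window


def summarize(raw: List[float], end_time: float, sample_period: float) -> Point:
    """Collapse evenly spaced raw readings into one reported point."""
    return Point(
        time=end_time,
        value=sum(raw) / len(raw),
        duration=len(raw) * sample_period,
        min=min(raw),
        max=max(raw),
    )


def total_volume(points: List[Point]) -> float:
    """Integrate flow rate: each point contributes average * duration."""
    return sum(p.value * p.duration for p in points)


# Made-up flow sensor read once per second. The device reports a short
# window during a burst and a long window while the flow is steady.
burst = summarize([0.0, 4.0, 8.0, 4.0], end_time=4, sample_period=1.0)
steady = summarize([1.0] * 56, end_time=60, sample_period=1.0)
print(burst)
print(total_volume([burst, steady]))
```

Because each point carries its own average and duration, the total (72 litres in this example) does not depend on how the receiver guesses what happened between reports, and the burst's peak of 8.0 is still visible in the `max` field.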