Currently reading through the section on Graph-like models. A few quotes:
For example, Facebook maintains a single graph with many different types of vertices and edges: vertices represent people, locations, events, checkins, and comments made by users; edges indicate which people are friends with each other, which checkin happened in which location, who commented on which post, who attended which event, and so on [35].
Property Graphs
In the property graph model, each vertex consists of:
- A unique identifier
- A set of outgoing edges
- A set of incoming edges
- A collection of properties (key-value pairs)
Each edge consists of:
- A unique identifier
- The vertex at which the edge starts (the tail vertex)
- The vertex at which the edge ends (the head vertex)
- A label to describe the kind of relationship between the two vertices
- A collection of properties (key-value pairs)
Some important aspects of this model are:
- Any vertex can have an edge connecting it with any other vertex. There is no schema that restricts which kinds of things can or cannot be associated.
- Given any vertex, you can efficiently find both its incoming and its outgoing edges, and thus traverse the graph—i.e., follow a path through a chain of vertices—both forward and backward. (That’s why Example 2-2 has indexes on both the tail_vertex and head_vertex columns.)
- By using different labels for different kinds of relationships, you can store several different kinds of information in a single graph, while still maintaining a clean data model.
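To make the quoted model concrete for myself, here is a minimal Go sketch of a property graph. All names (Vertex, Edge, Graph, Outgoing, Incoming) are my own assumptions for illustration and are not taken from the book or from any graph database. Instead of storing edge sets on each vertex, the sketch keeps two indexes on the graph, one keyed by tail vertex and one keyed by head vertex, which plays the same role as the two indexes mentioned in the quote.

```go
package main

import "fmt"

// Vertex and Edge mirror the fields listed in the quote. The names are
// illustrative assumptions, not a real graph database API.
type Vertex struct {
	ID         string
	Properties map[string]string
}

type Edge struct {
	ID         string
	Tail       string // ID of the vertex where the edge starts
	Head       string // ID of the vertex where the edge ends
	Label      string // kind of relationship
	Properties map[string]string
}

// Graph indexes edges by both tail and head vertex so that outgoing and
// incoming edges can each be found with a single map lookup.
type Graph struct {
	Vertices map[string]Vertex
	byTail   map[string][]Edge
	byHead   map[string][]Edge
}

func NewGraph() *Graph {
	return &Graph{
		Vertices: map[string]Vertex{},
		byTail:   map[string][]Edge{},
		byHead:   map[string][]Edge{},
	}
}

func (g *Graph) AddVertex(v Vertex) { g.Vertices[v.ID] = v }

// AddEdge records the edge in both indexes.
func (g *Graph) AddEdge(e Edge) {
	g.byTail[e.Tail] = append(g.byTail[e.Tail], e)
	g.byHead[e.Head] = append(g.byHead[e.Head], e)
}

// Outgoing returns the edges that start at the given vertex.
func (g *Graph) Outgoing(id string) []Edge { return g.byTail[id] }

// Incoming returns the edges that end at the given vertex.
func (g *Graph) Incoming(id string) []Edge { return g.byHead[id] }

func main() {
	g := NewGraph()
	g.AddVertex(Vertex{ID: "alice", Properties: map[string]string{"type": "person"}})
	g.AddVertex(Vertex{ID: "paris", Properties: map[string]string{"type": "location"}})
	g.AddEdge(Edge{ID: "e1", Tail: "alice", Head: "paris", Label: "lives_in"})

	// Forward traversal: follow edges leaving alice.
	for _, e := range g.Outgoing("alice") {
		fmt.Println("alice", e.Label, e.Head)
	}
	// Backward traversal: find edges arriving at paris.
	for _, e := range g.Incoming("paris") {
		fmt.Println(e.Tail, e.Label, "paris")
	}
}
```

Nothing in AddEdge restricts which vertices may be connected or which labels may be used, which matches the "no schema" point in the list.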
This is very similar to the Simple IoT data model. Below are the data structures used:
// Node represents the state of a device. UUID is recommended
// for ID to prevent collisions in distributed instances.
type Node struct {
	ID     string
	Type   string
	Points Points
}

// Edge is used to describe the relationship
// between two nodes
type Edge struct {
	ID     string
	Up     string
	Down   string
	Points Points
	Hash   []byte
}
In this case we are using an array of Points to represent the properties instead of key/value pairs.
// Point is a flexible data structure that can be used to represent
// a sensor value or a configuration parameter.
// ID, Type, and Index uniquely identify a point in a device
type Point struct {
	// ID of the sensor that provided the point
	ID string

	// Type of point (voltage, current, key, etc)
	Type string

	// Index is used to specify a position in an array such as
	// which pump, temp sensor, etc.
	Index int

	// Time the point was taken
	Time time.Time

	// Duration over which the point was taken. This is useful
	// for averaged values to know what time period the value applies
	// to.
	Duration time.Duration

	// Average OR
	// Instantaneous analog or digital value of the point.
	// 0 and 1 are used to represent digital values
	Value float64

	// Optional text value of the point for data that is best represented
	// as a string rather than a number.
	Text string

	// Statistical values that may be calculated over the duration of the point
	Min float64
	Max float64
}
The Point data structure is working out very well. Most points only use a few of the fields, but having the extra fields typically does not cost much with encodings like protobuf. The time field is especially critical for sensor data and synchronization. Even for configuration values, it is very handy to know when the value was last changed. And if your data properties are all expressed as points, it is very simple to capture all data changes in a historian like InfluxDB. The Point.Index field also allows us to express arrays in a property: simply use the same Point.Type for a series of points and increment Point.Index to express each point's position in the array. These points can be easily turned into an array when it comes time to use the data.
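As a rough illustration of that last step, here is a small Go sketch that turns a set of points into an array. The helper name valuesByType is hypothetical and not part of Simple IoT, and the Point type is trimmed down to the three fields the example needs. It assumes each Index appears at most once per Type; gaps in the Index sequence are simply skipped rather than filled.

```go
package main

import (
	"fmt"
	"sort"
)

// Point is a trimmed-down copy of the struct above with only the
// fields needed for this example.
type Point struct {
	Type  string
	Index int
	Value float64
}

// valuesByType is a hypothetical helper: it collects all points of the
// given type and returns their values ordered by Index.
func valuesByType(points []Point, typ string) []float64 {
	var matched []Point
	for _, p := range points {
		if p.Type == typ {
			matched = append(matched, p)
		}
	}

	// Points may arrive in any order, so sort by Index before extracting values.
	sort.Slice(matched, func(i, j int) bool {
		return matched[i].Index < matched[j].Index
	})

	values := make([]float64, 0, len(matched))
	for _, p := range matched {
		values = append(values, p.Value)
	}
	return values
}

func main() {
	// Made-up sample data: three temperature readings and one voltage.
	points := []Point{
		{Type: "temp", Index: 2, Value: 23.1},
		{Type: "voltage", Index: 0, Value: 12.0},
		{Type: "temp", Index: 0, Value: 21.5},
		{Type: "temp", Index: 1, Value: 22.0},
	}
	fmt.Println(valuesByType(points, "temp"))
}
```

Because the array position lives in each point, a single element can be updated by sending one new point with the same Type and Index, without resending the whole array.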
One thing I had not considered yet is adding a Label or Type field to the Edge struct. This would allow us to describe different types of relationships, which may be useful in the future.
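A minimal sketch of what that could look like, assuming a string Type field is added to Edge. The field, the edgesOfType helper, and the relationship names ("contains", "manages") are all hypothetical and only meant to show the idea; the Edge here is trimmed down and is not the current Simple IoT struct.

```go
package main

import "fmt"

// Edge is a trimmed-down copy of the struct above. The Type field is the
// hypothetical addition discussed here; it is not in the current struct.
type Edge struct {
	ID   string
	Up   string
	Down string
	Type string
}

// edgesOfType is a hypothetical helper that keeps only the edges
// whose Type matches typ.
func edgesOfType(edges []Edge, typ string) []Edge {
	var out []Edge
	for _, e := range edges {
		if e.Type == typ {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	// Made-up node IDs and relationship types for illustration.
	edges := []Edge{
		{ID: "e1", Up: "site", Down: "gateway", Type: "contains"},
		{ID: "e2", Up: "gateway", Down: "sensor", Type: "contains"},
		{ID: "e3", Up: "user", Down: "site", Type: "manages"},
	}
	for _, e := range edgesOfType(edges, "contains") {
		fmt.Println(e.Up, "->", e.Down)
	}
}
```

With a type on each edge, several kinds of relationships could share one graph, much like the labels in the property graph model quoted above.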