2018-04-14

graph data structures

data modeling experiments with a graph relations

note about deletion: possible rule: when a node is not in a relation then it can be deleted

assumed graph features

directed graph

relations can have labels and weight

nodes can store data

node data is distinct

notation

nodes: integers

node-data: "content"

relation: left right, left label right, left label weight right

node data

data attached to nodes directly. sometimes called node properties

example data types: numbers, arrays, character strings, custom data structures

node id array

data: node-id ...

benefit: when retrieving the data for the id-array node, the ids are immediately available as a sequence with possible repetition

downsides: if the sequence changes the whole node data likely has to be replaced. the id-array is not indexed by the contained ids via relations

record

data: custom

record type, field names, field types and field accessors can be stored within the accessing program or using a special data format for the node data

name

data: string

a node with a string identifier as data. the string can be used instead of the associated id, for example in program source code. the string can be translated to the id to be used in relations

benefit: usually there is no guarantee that the same id is associated with the same node data in different database instances (other database or recreated). string identifiers can help make code be valid across different database instances while serving as a symbolic identifier for humans

intermediate

data-less node used as a connector of sub-graphs. relates nodes without creating direct relations between them. this can be used for abstraction, to address and relate data structures of multiple nodes with one identifier

example: nodes -> id-node -> other-nodes

set

consists of a set identifier node and set element nodes related to it in one direction

relation: > set-id node-ids

example

set-id: 1

element-ids: 2 8 5

relations: 1 2, 1 8, 1 5

query: > set-id *

2 8 5

set operations

union

nodes in direction related to at least one of given set-identifiers

query: > (or set-id ...) *

intersection

nodes in direction related to all of given set-identifiers

query: > (and set-id ...) *

nested set

element ids can be set ids

ordered set

numbered

the relation weight or label becomes a sortable element position value

this works best when the data is actually physically stored in sorted order

inserting a new element might require the update of many position values

remarkable property: if nodes are read in order then set unions and intersections will eventually be read ordered by multiple set-ids and decreasing priority as specified

chained

relations: set-id "next" element-1, element-1 "next" element-2, element-2 "next" element-3

this only works if there can be multiple nodes with the same data, or if only distinct elements are used. otherwise one elements "next" label may identify multiple elements if there are multiple sets containing it

the nodes could carry array data to reduce the number of relations

this is not recommended for use in sph-dg, as there the same node data always has the same id

id array

node data: id ...

see the other id-array section above

manual sorting

relation: > set-id node-id

node data: position data ...

the sortable value can also be stored somewhere in the element data, and after elements of a set have been retrieved, be sorted in an extra step

this does not allow for indexed access. basically, all elements have to be read to reconstruct the sequence

dictionary

one node as an identifier for the collection with relations that have keys as labels to values

relation: > dictionary-id key value

example

dictionary-id: 22

22 "username" "tester"
22 "email" "test@example.com"

tags: start document guide computer data-structure sph-dg graph database