2023-02-27

dg filesystem layer

discontinued. design and implementation of a tagged userspace filesystem

design for a tag filesystem

features

  • instead of storing files in directory hierarchies, files can be in multiple directories, and queries can be made based on which files are included in which directories
  • works for many programs that expect traditional filesystem semantics, particularly for reading, browsing and writing
  • supports nested set relation queries in path names, as specified in "path-find"
  • copying files from other filesystems into the filesystem and copying from this filesystem into others
  • all regular-file properties of the underlying filesystem, for example permissions and other stat information

installation

it is part of sph-lib-dg. install sph-lib-dg and the "dg" command-line program. the "dg" program has a "mount" option

dg mount --help

semantics

browsing

entering a directory appends to the path, which selects all files that have every path-segment as tag
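
the selection rule can be sketched as a subset test. this is only an illustration and not the sph-lib-dg implementation: the in-memory "files" mapping and the "select_files" helper are hypothetical stand-ins for the database

```python
# Hypothetical in-memory stand-in for the database: file id -> set of tags.
files = {
    0x1A: {"music", "jazz"},
    0x1B: {"music", "rock"},
    0x1C: {"photo"},
}


def select_files(path, files):
    """Return the sorted ids of all files that carry every path segment as a tag."""
    segments = {s for s in path.split("/") if s}
    return sorted(fid for fid, tags in files.items() if segments <= tags)


print(select_files("/music", files))
print(select_files("/music/jazz", files))
```

each additional path segment can only narrow the selection, because a file has to carry all segments as tags.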

root

lists all existing tags, not files

sub-level

  • tags and files
  • tags are directories; all tags of the filtered files are shown
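
a minimal sketch of the two listing modes, assuming a hypothetical in-memory mapping of file id to tags. "list_directory" is an illustrative name, and the sketch only knows tags that are attached to at least one file, which may differ from what the database considers existing tags

```python
def list_directory(segments, files):
    """Return directory entries for a directory given as a list of tag names.

    Root (no segments): tags only, no files.
    Sub-level: all tags of the filtered files, followed by the files.
    """
    wanted = set(segments)
    matched = {fid: tags for fid, tags in files.items() if wanted <= tags}
    tags = sorted(set().union(*matched.values()))
    if not segments:
        return tags
    # Files are shown as a comma followed by a hexadecimal number.
    names = ["," + format(fid, "x") for fid in sorted(matched)]
    return tags + names


# Hypothetical example data: file id -> set of tags.
files = {0x1A: {"music", "jazz"}, 0x1B: {"music", "rock"}, 0x1C: {"photo"}}
print(list_directory([], files))
print(list_directory(["music"], files))
```

the sub-level listing also contains the tags that are already part of the path, since all tags of the filtered files are shown.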

paths

  • a path identifies a single file if it ends with a file name, otherwise it is a filter query
  • directories are not identified by paths with multiple elements
  • directory names are unique

file properties

  • files do not have persistent customisable names, they only have associated tags
  • the display format for regular files is a comma followed by a hexadecimal number
  • custom named files can be created, in which case the full given path to the file becomes a temporary alias for the id name. the custom named files are not listed because they would conflict with tags. the alias is removed with the next mount
  • a stat system call returns appropriate information, except the inode is currently the database identifier and not the underlying filesystem inode
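
the display format can be sketched as follows. the exact formatting is an assumption here: lowercase digits without prefix or padding are a guess, and both helper names are hypothetical

```python
import re

# Assumption: a comma followed by lowercase hexadecimal digits, nothing else.
_FILE_NAME = re.compile(r",([0-9a-f]+)\Z")


def file_display_name(file_id):
    """Format a database identifier as a displayed file name."""
    return "," + format(file_id, "x")


def parse_file_name(name):
    """Return the identifier for a name like ",1f", or None for anything else."""
    match = _FILE_NAME.match(name)
    return int(match.group(1), 16) if match else None


print(file_display_name(255))
print(parse_file_name(",ff"))
```

the leading comma is what keeps file names apart from tag names in this sketch: a name without it is never treated as a file.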

system calls

rmdir

/a

delete tag

/a/b

  • delete relation (> b (> a *))
  • deletes tag "b" from all files that have "a" and "b"
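
the two slash-separated cases above can be sketched on an in-memory mapping. this is an illustration under assumptions, not the actual implementation: "rmdir" here is a hypothetical helper, and removing the tag record itself from the database is not modelled

```python
def rmdir(path, files):
    """Remove the last path segment as a tag from all files matched by the path.

    "/a" removes tag "a" everywhere.
    "/a/b" removes tag "b" from all files that have "a" and "b".
    """
    *context, tag = [s for s in path.split("/") if s]
    required = set(context) | {tag}
    for tags in files.values():
        if required <= tags:
            tags.discard(tag)


# Hypothetical example data: file id -> set of tags.
files = {1: {"a", "b"}, 2: {"b"}, 3: {"a"}}
rmdir("/a/b", files)
print(files)
```

file 2 keeps its tag "b" because it does not also have "a". the files themselves are never deleted.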

/a.b.c

delete (> a b c)

/a/b.c

delete (> (> b c) (> a *))

mkdir

/a

create tag

/a/b

create relation

(> b (> a *))

adds tag "b" to all files that have "a"
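
a matching sketch for mkdir, again on hypothetical in-memory data. the "known_tags" set stands in for the tag records in the database and is an assumption made for illustration

```python
def mkdir(path, files, known_tags):
    """Create the last path segment as a tag and relate it to matched files.

    "/a" only creates the tag.
    "/a/b" creates "b" and adds it to all files that have "a".
    """
    *context, tag = [s for s in path.split("/") if s]
    known_tags.add(tag)
    if context:
        required = set(context)
        for tags in files.values():
            if required <= tags:
                tags.add(tag)


# Hypothetical example data: file id -> set of tags.
files = {1: {"a"}, 2: {"c"}}
known_tags = {"a", "c"}
mkdir("/a/b", files, known_tags)
print(files, known_tags)
```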

rename

/a /b

modify tag name

/a/b/c /a/b/d

relink

(> c (> (a b) *)) (> d (> (a b) *))

/a/b/c /d/e/f

  • replaces tags for files matched by the first path

    (> c (> (a b) *)) (> f (> (d e) *))

/a /b/c

  • replaces tags for files matched by the first path

    (> a *) (> (b c) *)

/a/b /b

  • "mv" will give an error about existence
  • removes tags from files matched by the first path
  • changes (> (a b) *) to (> b *)
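
one possible reading of the multi-element rename cases is: for every file matched by the first path, remove the tags of the first path and add the tags of the second. the sketch below implements only that reading on hypothetical in-memory data. it does not model the renaming of a tag record for the single-element case, and the real relation handling may differ

```python
def rename(source, target, files):
    """Replace the source path's tags with the target path's tags on matched files."""
    src = {s for s in source.split("/") if s}
    dst = {s for s in target.split("/") if s}
    for tags in files.values():
        if src <= tags:
            tags.difference_update(src)
            tags.update(dst)


# Hypothetical example data: file id -> set of tags.
files = {1: {"a", "b", "c", "x"}, 2: {"a", "b"}}
rename("/a/b/c", "/a/b/d", files)  # relink: "c" becomes "d" on file 1
rename("/a/b", "/b", files)        # removes "a" from files that have "a" and "b"
print(files)
```

tags that are not part of the first path, like "x" above, stay untouched.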

create

  • create file with given directory/tag relations
  • accepts arbitrary filenames - there is still a new unique file element created, but the custom named files stay accessible via the given file-path as long as the filesystem is mounted
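
the alias behaviour can be sketched with a small in-memory model. "TagStore" and its methods are hypothetical names for illustration, and sequential identifiers are an assumption. the point is only that the custom name maps to the new id until the next mount

```python
import itertools


class TagStore:
    """Hypothetical in-memory model of file creation with temporary aliases."""

    def __init__(self):
        self.files = {}    # file id -> set of tags
        self.aliases = {}  # full given path -> file id, lost on mount
        self._ids = itertools.count(1)

    def mount(self):
        # Aliases do not persist across mounts.
        self.aliases.clear()

    def create(self, path):
        """Create a file tagged with the directory segments of path."""
        *tags, _name = [s for s in path.split("/") if s]
        file_id = next(self._ids)
        self.files[file_id] = set(tags)
        self.aliases[path] = file_id
        return "," + format(file_id, "x")

    def resolve(self, path):
        """Return the file id for an alias or a ",hex" path, else None."""
        if path in self.aliases:
            return self.aliases[path]
        *tags, name = [s for s in path.split("/") if s]
        if not name.startswith(","):
            return None
        try:
            file_id = int(name[1:], 16)
        except ValueError:
            return None
        stored = self.files.get(file_id)
        if stored is not None and set(tags) <= stored:
            return file_id
        return None


store = TagStore()
print(store.create("/a/b/report.txt"))
print(store.resolve("/a/b/report.txt"))
```

after "store.mount()" the path "/a/b/report.txt" no longer resolves in this sketch, while the id name in the same directory still does.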

unlink

delete the target file completely

read

read from file

write

write to file

readdir

list the matching elements (tags at the root, tags and files at sub-levels)

parsed-path syntax

  • parsed-path: parsed-path-element ["/" parsed-path-element] ...
  • parsed-path-element: condition | tag | file
  • file: "," hexadecimal-number
  • tag: string
  • condition: (condition-name tag ...)
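
a minimal parser for this syntax could look like the following sketch. it handles only the flat condition form given above, not nested conditions, and assumes that conditions contain no "/". the function names and the condition name "cond" in the example are placeholders, not names known to the filesystem

```python
import re


def classify_element(element):
    """Classify one path element as a file, condition or tag."""
    if re.fullmatch(r",[0-9a-fA-F]+", element):
        return ("file", int(element[1:], 16))
    if element.startswith("(") and element.endswith(")"):
        # Flat form only: (condition-name tag ...)
        name, *tags = element[1:-1].split()
        return ("condition", name, tags)
    return ("tag", element)


def parse_path(path):
    """Split a path on "/" and classify each non-empty element."""
    return [classify_element(e) for e in path.split("/") if e]


# "cond" is a placeholder condition name used only for this example.
print(parse_path("/music/(cond jazz rock)/,1f"))
```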

caveats

  • be careful when deleting: only use rmdir to delete directories. file managers delete directories together with the contained files, yet those files could be part of other directories. if you want to delete all files, use find with maxdepth 2
  • tag permissions would have to be managed in the database by other means
  • tagged: filesystem-layer uses dg-tagged because sph-dg supports integer and binary values in interns, which may coexist with strings for tags

excluded

hierarchical tag relations

  • there has been an idea for using a dot as a part separator in addition to /, to designate a hierarchical relation. this has not been implemented for the following reasons:
  • currently too much work because of more complicated parsing and evaluation
  • while sph-dg supports hierarchical tags, benefits are not clear considering that tags are unique

parenthesised display format for files

  • "(identifier tag ...)"
  • round brackets would have to be quoted on typical shell command-lines

program support

programs that usually work well with this system

  • file managers (except recursive directory operations)
  • media players
  • editors and viewers

programs that can create problems

  • more complicated document editors like libreoffice that do not just save, but try to create backups, locks and whatnot in the same directory
  • versioning tools: symlinks and hardlinks are not supported, and neither are duplicate file names, should they occur

possible enhancements

  • exclude tags that every matched file has from the listing
  • tag permissions
  • setxattr support for setting relation weight

alternatives

  • using "dg link" to create direct tag-named links to the database files, though this only allows editing, and the list of files needs to be refreshed
  • tagsistant