2025-09-06

filesystem organization

examples for organizing large file collections

general observations

a directory name can represent a category, with its entries belonging to that category. directories can also be seen as graph nodes or relation labels.

for example, if there is a directory "text-files", it is simpler if it only contains text files. this way, the directory acts as a placeholder or entry point for a homogeneous category, instead of a mixed set that later requires extra differentiation.

a traditional filesystem forms a hierarchy of directories (nodes) and files (leaf nodes). there is a root path superior to all others. typically, no circular relationships or multiple edges between two nodes are allowed.

filesystem paths themselves can carry information. for example, the path "artist/artist.album/artist.album.1.flac" contains redundant data. it could be reduced to "artist/album/1.flac". the drawback is that parent directories must be considered to reconstruct the full context.
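
a minimal sketch of that reconstruction step, assuming the shortened "artist/album/1.flac" layout. the function name context_from_path and the field names are hypothetical, chosen only for illustration.

```python
from pathlib import PurePosixPath


def context_from_path(path, fields=("artist", "album", "track")):
    """map each path component to a field name.

    the file extension is dropped from the last component.
    the field names are an assumption, not a fixed convention.
    """
    p = PurePosixPath(path)
    parts = list(p.parts[:-1]) + [p.stem]
    if len(parts) != len(fields):
        raise ValueError(f"expected {len(fields)} components, got {len(parts)}")
    return dict(zip(fields, parts))


print(context_from_path("artist/album/1.flac"))
# {'artist': 'artist', 'album': 'album', 'track': '1'}
```

the file name alone ("1.flac") no longer identifies the track; the parent directories supply the rest.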

rating

files can be sorted by rating or quality using numeric directory names, where contained files inherit the rating of the directory. for example: "movies/1/good-movie.mkv", "movies/2/less-good-movie.mkv". this works well when files are frequently accessed by importance, such as favorite music or preferred movies/images.

nesting numeric directories like 1/3 or 2/1 loses clarity and approaches decimal-like ambiguity. subcategories may be more helpful. for unrated files, a directory named "0" is convenient, since it is also numeric.

custom commands in file managers can simplify rating changes, for example a command that moves "1/x/y/z" to "2/x/y/z" while preserving the rest of the path. an implementation of that is rate.
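
such a command could look roughly like the following. this is not the "rate" tool mentioned above, only a hypothetical sketch; the names rerated_path and move_to_rating and the rating_index parameter are assumptions.

```python
import os
import shutil
from pathlib import PurePosixPath


def rerated_path(path, new_rating, rating_index=0):
    """return path with the numeric rating component replaced.

    rating_index is the position of the rating directory in the path.
    """
    parts = list(PurePosixPath(path).parts)
    if not parts[rating_index].isdigit():
        raise ValueError(f"component {parts[rating_index]!r} is not a numeric rating")
    parts[rating_index] = str(new_rating)
    return str(PurePosixPath(*parts))


def move_to_rating(root, relative_path, new_rating, rating_index=0):
    """move a file below root to the same sub-path under another rating."""
    target = rerated_path(relative_path, new_rating, rating_index)
    target_abs = os.path.join(root, target)
    # create the mirrored parent directories before moving
    os.makedirs(os.path.dirname(target_abs), exist_ok=True)
    shutil.move(os.path.join(root, relative_path), target_abs)
    return target


print(rerated_path("1/x/y/z", 2))  # 2/x/y/z
print(rerated_path("movies/1/good-movie.mkv", 2, rating_index=1))  # movies/2/good-movie.mkv
```

the sketch leaves empty source directories behind; whether to remove them is a separate decision.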

examples for music files

this works well for a music collection of 80000 files.

rating > instrument-class > loudness/rhythm-class > artist > release-year.album > track-number.title
0 1 2 3 4
  electronic guitar orchestra jazz piano
    beat calm noisy other
      {artist}
        {release-year}.{album} other
          {track-number}.{title}
      various-artists
        {album}.{release-year}
          {track-number}.{artist}.{title}
music/1/electronic/calm/murcof/2007.cosmos/murcof.cosmos.03 cosmos i.flac
0
  unrated-album ...
1
  electronic
    beat
      drumnbass
      techno
      trance
      other
    calm
      murcof
        2002.martes
        2007.cosmos
      vangelis
        "opera sauvage"
    other
  guitar
  jazz
  orchestra
  piano
2
  electronic
  guitar
  jazz
other
  • instrument-class describes the general texture: music with guitar and band usually sounds fundamentally different from orchestral music or electronic music
  • loudness/rhythm describes how "driving" the piece of music is: music with a beat usually has a different effect on the listener than ambient droning
  • these characteristics could also serve as a basis for automatic classification
  • things like audiobooks and comedy albums i put into a different parent directory under "other" or into a separate directory outside the music collection
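
the layout above can be expressed as a small path builder. this is a sketch under assumptions: the Track fields, the function name track_path, and the two-digit zero padding of track numbers are not part of the notes above.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass
class Track:
    rating: int
    instrument_class: str
    rhythm_class: str
    artist: str
    release_year: int
    album: str
    track_number: int
    title: str
    extension: str = "flac"


def track_path(track, root="music"):
    """build rating/instrument/rhythm/artist/year.album/number.title.ext"""
    return str(PurePosixPath(
        root,
        str(track.rating),
        track.instrument_class,
        track.rhythm_class,
        track.artist,
        f"{track.release_year}.{track.album}",
        # zero padding keeps tracks sorted in listings (an assumption)
        f"{track.track_number:02d}.{track.title}.{track.extension}",
    ))


track = Track(1, "electronic", "calm", "murcof", 2007, "cosmos", 3, "cosmos i")
print(track_path(track))
# music/1/electronic/calm/murcof/2007.cosmos/03.cosmos i.flac
```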

example for video files

class > rating > title > season > episode
tv-show/1/futurama/s05/2.mkv
movie
  0
  1
  2
tv-show
  1
    curb your enthusiasm
    futurama
      s01
      s02
      s03
      s04
      s05
        1.mkv 2.mkv 3.mkv 4.mkv 5.mkv 6.mkv
    monkey dust
  2
other
  clips
  stand-up

alternatively, a naming scheme common on the internet is s05e01.mkv, where s stands for season and e for episode.
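
converting between the two schemes is mechanical. a sketch, assuming file names that contain a season/episode marker of the form s05e02; the function name episode_path and its parameters are hypothetical.

```python
import os
import re

# matches markers like s05e02 or S05E02 anywhere in the file name
EPISODE_PATTERN = re.compile(r"s(\d+)e(\d+)", re.IGNORECASE)


def episode_path(show, filename, rating=0, root="tv-show"):
    """turn an s05e02-style file name into class/rating/title/season/episode."""
    match = EPISODE_PATTERN.search(filename)
    if match is None:
        raise ValueError(f"no season/episode marker in {filename!r}")
    season, episode = int(match.group(1)), int(match.group(2))
    extension = os.path.splitext(filename)[1]
    return f"{root}/{rating}/{show}/s{season:02d}/{episode}{extension}"


print(episode_path("futurama", "s05e02.mkv", rating=1))
# tv-show/1/futurama/s05/2.mkv
```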

examples for software projects

exe license other readme.md src tmp
{project-name}
  exe
    compile
    compile-watch
    install
    test
    compiled/
  src
    {language-extension}
    sc
      foreign
      main
      test
    scm
      sph
      test
    precompiled
  modules
    sph
    test
      sph
  client
    {language-extension} ...
  server
    {language-extension} ...
  submodules
    {repository-name}
  other
    docs
    examples
  tmp
    lib.so
  license
  readme.md
  • "exe": executable files (files with the executable bit set). traditionally named "bin" (from /usr/bin, binary files) or "scripts"
  • "readme" and "license": casing not required
  • submodules: keep separate from source; copy or link them during build, so that checked-out submodule directories do not have to be excluded when selecting files from "src"
  • external files: place in a "foreign" directory to distinguish from project-maintained files and avoid synchronization issues
  • single-type projects: use a single "modules" directory with module files directly inside (e.g. scheme project modules)
  • hierarchy depth: only as deep as necessary; for example, if only one source language is used, "src/{language-name}" adds no benefit

home directory

/home/username
  exe
  mnt
  personal
  pp -> personal/projects/public
  p -> personal/projects/private
  • exe is in $PATH.
  • mnt for mounted directories, with a special script to mount there, for example mount-home. mounting there has the downside that recursive operations on the home directory might include other filesystems
projects/versioned/repository-name
projects/unversioned/customer-name
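
the downside of mounting below the home directory can be reduced by making recursive operations stop at mount points, similar in spirit to "find -xdev". a minimal sketch; the function name walk_one_filesystem is an assumption.

```python
import os


def walk_one_filesystem(top):
    """yield file paths below top without descending into mount points."""
    for directory, subdirectories, files in os.walk(top):
        # pruning the list in place stops os.walk from descending further
        subdirectories[:] = [
            name for name in subdirectories
            if not os.path.ismount(os.path.join(directory, name))
        ]
        for name in files:
            yield os.path.join(directory, name)


if __name__ == "__main__":
    for path in walk_one_filesystem(os.path.expanduser("~")):
        print(path)
```

with this, anything mounted under "mnt" is skipped, while the "mnt" directory itself is still visited.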

creative media

media
  portraits
    {year}
      {integer-score}
        {month}
    2025
      2
        08
        07
      1
        08
        07

other

backup
documents
  editable
  uneditable
download
personal
  projects
    private
    public
  foreign
text
  machine-readable
  plain
  programming
    {language-name}
video

emacs configuration

.emacs
.emacs.d
  lisp
    modes
    themes

about tagging filesystems

sometimes the same file belongs to multiple categories, making duplication across directories useful. this occurs with overlapping classifications, such as music genres where one album can fit several. tagging also enables access by multiple facets, not just a single path.

filesystems like tagsistant implement this, though not always with full posix compatibility. paths in tagging filesystems do not map 1:1 to posix paths. for example, a path representing files tagged with both tag-1 and tag-2 could appear as "music/tag-1/tag-2" or "music/tag-2/tag-1". this ambiguity creates many equivalent paths, making recursive search inefficient.
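
the growth in equivalent paths can be illustrated as follows: n tags can be ordered in n! ways, and a canonical order (here: sorted tag names) is one way to pick a single representative. this is an illustration only, not how tagsistant works; the function names are hypothetical.

```python
from itertools import permutations
from math import factorial


def equivalent_paths(root, tags):
    """all paths that select the same set of tags, one per tag order."""
    return ["/".join((root, *order)) for order in permutations(tags)]


def canonical_path(root, tags):
    """one representative path: unique tags in sorted order."""
    return "/".join((root, *sorted(set(tags))))


print(equivalent_paths("music", ["tag-1", "tag-2"]))
# ['music/tag-1/tag-2', 'music/tag-2/tag-1']
print(canonical_path("music", ["tag-2", "tag-1"]))
# music/tag-1/tag-2
print(factorial(5))
# 120 equivalent paths for five tags
```

a recursive search that follows every tag directory visits all of these orderings unless it deduplicates by canonical path.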

other issues arise with file operations: copying, programs creating files automatically (e.g. editor cache files), or creating symlinks. it is unclear how current tagging filesystems handle these cases; some may be read-only with separate interfaces for management.

see also

filesystem hierarchy standard