about indent-based tree structures

line indentation can be used to create a tree-like structure, where the amount of whitespace at the beginning of a line determines its nesting depth


an indent-tree can be parsed and interpreted in more than one way. the following are three possible interpretations of this text:

line-1
  line-2
line-3

denoted tree

one entry for each line with an integer for the indent-depth

((0 line-1) (1 line-2) (0 line-3))
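
the denoted form can be produced with a few lines of code. the following is a minimal sketch, not a reference implementation. the name denoted_tree and the indent_step parameter are assumptions made for illustration, and only space characters are counted as indent

```python
def denoted_tree(text, indent_step=2):
    """Return one (depth, content) pair per non-empty line.

    Assumes indentation uses spaces only, indent_step spaces per level.
    """
    entries = []
    for line in text.splitlines():
        content = line.lstrip(" ")
        if not content:
            continue  # skip empty lines
        width = len(line) - len(content)
        entries.append((width // indent_step, content))
    return entries


example = "line-1\n  line-2\nline-3"
print(denoted_tree(example))
# [(0, 'line-1'), (1, 'line-2'), (0, 'line-3')]
```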


indent-depth equals nesting-depth

(line-1 (line-2) line-3)
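
a sketch of this interpretation, assuming the input is a list of (depth, content) pairs as in the denoted tree. the name nest_by_depth is hypothetical. each line ends up inside as many lists as its indent-depth

```python
def nest_by_depth(entries):
    """Nest each line inside as many lists as its indent-depth."""
    root = []
    stack = [root]  # stack[d] is the list currently open at depth d
    for depth, content in entries:
        # close lists that are deeper than this line
        del stack[depth + 1:]
        # open new lists until the stack reaches this depth
        while len(stack) <= depth:
            child = []
            stack[-1].append(child)
            stack.append(child)
        stack[-1].append(content)
    return root


print(nest_by_depth([(0, "line-1"), (1, "line-2"), (0, "line-3")]))
# ['line-1', ['line-2'], 'line-3']
```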


((line-1 line-2) line-3)
(depth-0 (depth-0 depth-1 (depth-1 depth-2 depth-2) depth-1))

sub-list prefixes are the roots of sub-trees
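
the prefix interpretation can be sketched as a small recursive procedure over (depth, content) pairs. the name prefix_tree is an assumption, and so is the choice to treat an increase of several steps at once like a single step

```python
def prefix_tree(entries):
    """Turn (depth, content) pairs into nested lists.

    A line that has further indented lines after it becomes the first
    element (the prefix) of a sub-list that contains those lines.
    """
    def parse(index, depth):
        items = []
        while index < len(entries) and entries[index][0] >= depth:
            content = entries[index][1]
            index += 1
            children = []
            # every following deeper line belongs to the current line
            while index < len(entries) and entries[index][0] > depth:
                nested, index = parse(index, entries[index][0])
                children.extend(nested)
            items.append([content] + children if children else content)
        return items, index

    return parse(0, 0)[0]


print(prefix_tree([(0, "line-1"), (1, "line-2"), (0, "line-3")]))
# [['line-1', 'line-2'], 'line-3']
```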

multiple indent-steps at once

if nesting depth increases by multiple steps at once like in the following example


then line-2 could be interpreted as having no prefix

(line-1 ((line-2 line-2-1)) line-3)

operator application interpretation

spaces separate operators, commas and newlines separate arguments


operator arg1, arg2,
  argn, ...


operator operator arg1, arg2

with optional round brackets, or in other languages, this corresponds to

operator(operator(arg1, arg2))
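
one possible reading of this notation as code. this is only a sketch under assumptions: the name to_call_notation is hypothetical, all words before the last word of the first comma-separated segment are taken to be operators, and the arguments themselves are not parsed further

```python
def to_call_notation(text):
    """Rewrite 'op op arg1, arg2' as 'op(op(arg1, arg2))'."""
    # commas and newlines both separate arguments
    parts = [part.strip() for part in text.replace("\n", ",").split(",")]
    parts = [part for part in parts if part]
    if not parts:
        return ""
    # leading words are operators, the last word is the first argument
    *operators, first_argument = parts[0].split()
    result = ", ".join([first_argument] + parts[1:])
    # the innermost operator is the one closest to the arguments
    for operator in reversed(operators):
        result = f"{operator}({result})"
    return result


print(to_call_notation("operator operator arg1, arg2"))
# operator(operator(arg1, arg2))
```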

space for operators, and space and dot for continued arguments


operator arg1 arg2
  . argn ...


operator : operator arg1 arg2
  . a

here the dot acts as the identity function. it can also be extended to take multiple arguments. see wisp, which uses this

advantages of indent-based syntax

only space is needed to create a nesting structure and only the beginning of lines needs to be marked. the potential for variation in formatting is lower than for alternative tree notations like s-expressions or xml. the same structure notated by different authors, who otherwise tend to invent and use personal formatting styles for brackets, whitespace and nesting, will look very similar, especially without empty lines


indent alone cannot mark multiple sub-lists on the same line, like in this s-expression:

(+ (* 1 2) (/ 4 2))


long lines

long lines can be wrapped by continuing on the following lines at the same indent or at an indent increased by one step. naive line wrapping, where continuation lines start at the beginning of the line, can be more difficult to read

indent step

two spaces per indentation step is a widespread convention. use of the tab character is also common, but it brings all the complications associated with tab character usage: a second invisible whitespace character and therefore a possible incorrect mix of spaces and tabs, the designation of an extra character for text compression, the necessity for viewer and editor programs to render it, and the required configuration of all potential viewer and editor programs to show the tab character with an appropriate and preferred width. a tab is usually rendered as up to 8 spaces, aligned to the next equidistant tab stop counted from the beginning of the line, which isn't how people usually want to indent. indent isn't hard to recognise, and viewers could display it in the user's preferred width regardless of whether spaces or tabs are used
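
to illustrate the last point, a viewer could normalise indent before display. a minimal sketch, assuming that one tab or source_step spaces count as one indentation step. the function name reindent and both parameters are hypothetical

```python
def reindent(text, source_step=2, display_step=4):
    """Re-render leading whitespace with display_step spaces per level.

    Assumes one tab, or source_step spaces, equals one indentation step.
    """
    lines = []
    for line in text.splitlines():
        content = line.lstrip(" \t")
        if not content:
            lines.append("")  # keep empty lines empty
            continue
        prefix = line[:len(line) - len(content)]
        depth = prefix.count("\t") + prefix.count(" ") // source_step
        lines.append(" " * (display_step * depth) + content)
    return "\n".join(lines)


print(reindent("a\n  b\n\tc\n    d"))
```

here the lines b and c are both shown at four spaces and d at eight, whether the source used spaces or a tab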

some languages that use indent for code structure

for note taking

here is an indent-based, machine- and human-readable text format for titled, separated parts of text or notes: "indent tree packet notation", itpn. the words of a prefix line can be tags or make up a headline. nested structures can be created in the content, but do not need to be parsed. if the words are tags, then note lists can be processed to extract, merge or analyse notes by tag. an itpn management utility is part of sph-script

word word


  • packet: [prefix content] ...
  • prefix: word [" " word] ...
  • content: ["\n" indent any-character ...] ...
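
the grammar above can be read by a very small parser. the following sketch is not the sph-script utility. the names parse_itpn and with_tag are hypothetical, and it assumes that one indent is two spaces and that empty lines can be ignored

```python
def parse_itpn(text, indent="  "):
    """Split text into packets of (prefix words, content lines)."""
    packets = []
    for line in text.splitlines():
        if line.startswith(indent) and packets:
            # indented lines belong to the content of the current packet;
            # deeper nesting is kept as text and not parsed
            packets[-1][1].append(line[len(indent):])
        elif line.strip():
            # an unindented line starts a new packet
            packets.append((line.split(), []))
    return packets


def with_tag(packets, tag):
    """Select the packets whose prefix contains the given word."""
    return [packet for packet in packets if tag in packet[0]]


notes = "todo work\n  write report\n    by friday\nidea\n  indent trees"
print(with_tag(parse_itpn(notes), "todo"))
# [(['todo', 'work'], ['write report', '  by friday'])]
```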

for document markup

here is a generic, indentation-based syntax for structured documents: "indent tree markup language", itml. it includes forms that can be evaluated by custom procedures to create output like lists, tables and more

expression properties


  • inline: start and end somewhere on a line
  • indent: include all immediately following further indented lines
  • line: from their start to the end of the line

content interpretation

  • scm: start with # and arguments have to be valid scheme syntax
  • text: start with ## and arguments are plaintext

evaluation phase

  • ascend: itml expressions in arguments have been evaluated
  • descend: itml expressions in arguments have not been evaluated

inline expressions


#(identifier scheme-expression ...)


#identifier scheme-expression ...
  scheme-expression ...


##(identifier plaintext/itml-expression ...)


##identifier plaintext/itml-expression ...
  plaintext/itml-expression ...

line-scm, line-text

#identifier: scheme-expressions ...
##identifier: plaintext/itml-expressions ...


###identifier plaintext ...

the text is passed as a parsed tree without any nested expressions evaluated. this can be used, for example, to create block escaping


a line before increased indent becomes a heading

this is a heading
  this is content
  and more example text
  a sub-heading
    more content
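
the heading rule can be sketched as follows. this is not the itml implementation. the name mark_headings is hypothetical, indent is assumed to consist of spaces, and empty lines are ignored for simplicity

```python
def mark_headings(text):
    """Label each non-empty line as 'heading' or 'content'.

    A line counts as a heading when the next line is indented further.
    """
    lines = [line for line in text.splitlines() if line.strip()]

    def width(line):
        return len(line) - len(line.lstrip(" "))

    result = []
    for index, line in enumerate(lines):
        has_next = index + 1 < len(lines)
        is_heading = has_next and width(lines[index + 1]) > width(line)
        result.append(("heading" if is_heading else "content", line.strip()))
    return result


document = (
    "this is a heading\n"
    "  this is content\n"
    "  and more example text\n"
    "  a sub-heading\n"
    "    more content"
)
for kind, line in mark_headings(document):
    print(kind, line)
```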

line breaks

each empty line, that is two consecutive newlines, creates one line break in the output

example text

more text after empty line


inline expression prefixes, colons and backslashes can be escaped with a backslash


block escapes