2018-07-02

about indent-based syntax

line indentation can be used to create a tree-like structure

the amount of whitespace at the beginning of a line determines the nesting depth of the content that follows on that line

also known as the off-side rule

interpretation

an indent-tree can be parsed and interpreted in more than one way

following are three possible interpretations for this text:

line-1
  line-2
line-3

parsed-indent-tree

one entry for each line with an integer for the indent-depth

((0 line-1) (1 line-2) (0 line-3))
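
a minimal sketch in python of how such a list could be produced. the name parse_indent_lines and the indent_step parameter are made up for illustration and are not taken from any existing library. it assumes a fixed indent-step of two spaces and that empty lines carry no structure

```python
def parse_indent_lines(text, indent_step="  "):
    """Return one (indent-depth, content) pair per non-empty line."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption: empty lines are ignored
        depth = 0
        rest = line
        # count how many times the indent-step repeats at the line start
        while rest.startswith(indent_step):
            rest = rest[len(indent_step):]
            depth += 1
        entries.append((depth, rest))
    return entries


example = "line-1\n  line-2\nline-3"
print(parse_indent_lines(example))
# prints [(0, 'line-1'), (1, 'line-2'), (0, 'line-3')]
```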

tree

indent-depth equals nesting-depth

(line-1 (line-2) line-3)
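
this interpretation could be sketched in python as follows. the input is a list of (indent-depth, content) pairs, one per line, and python lists stand in for the parenthesised lists. the function name entries_to_tree is hypothetical

```python
def entries_to_tree(entries):
    """Nest each content string as deep as its indent-depth."""
    root = []
    stack = [root]  # stack[d] is the list that collects content of depth d
    for depth, content in entries:
        # close lists that are deeper than the current line
        while len(stack) - 1 > depth:
            stack.pop()
        # open one new list per missing level
        while len(stack) - 1 < depth:
            child = []
            stack[-1].append(child)
            stack.append(child)
        stack[-1].append(content)
    return root


print(entries_to_tree([(0, "line-1"), (1, "line-2"), (0, "line-3")]))
# prints ['line-1', ['line-2'], 'line-3']
```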

prefix-tree

((line-1 line-2) line-3)
(depth-0 (depth-0 depth-1 (depth-1 depth-2 depth-2) depth-1))

the first element of each sub-list, its prefix, is the root of a sub-tree, and the remaining elements are its children

this corresponds to the typical structure of abstract syntax trees
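
a possible python sketch of the prefix-tree interpretation, again taking (indent-depth, content) pairs as input. the name entries_to_prefix_tree is made up for illustration. as a simplification, a line that is indented by several steps at once is treated like a line indented by one step, which is only one of several possible choices

```python
def entries_to_prefix_tree(entries):
    """Turn each line with children into [line, child, ...]; keep leaves as strings."""

    def build(index, depth):
        # collect the siblings of at least `depth`, starting at entries[index]
        nodes = []
        while index < len(entries) and entries[index][0] >= depth:
            content = entries[index][1]
            children, index = build(index + 1, depth + 1)
            nodes.append([content] + children if children else content)
        return nodes, index

    return build(0, 0)[0]


print(entries_to_prefix_tree([(0, "line-1"), (1, "line-2"), (0, "line-3")]))
# prints [['line-1', 'line-2'], 'line-3']
```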

multiple indent-steps at once

if nesting depth increases by multiple steps at once like in the following example

line-1
    line-2
      line-2-1
line-3

then line-2 could be interpreted as not having a prefix

(line-1 ((line-2 line-2-1)) line-3)
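
because such jumps are open to interpretation, a parser might want to detect them before deciding what to do. a small python sketch, assuming the input is a list of (indent-depth, content) pairs with one pair per line. the name find_indent_jumps is hypothetical

```python
def find_indent_jumps(entries):
    """Return (line-index, previous-depth, depth) for every increase of more than one step."""
    jumps = []
    previous = 0  # assumption: the text starts at depth zero
    for index, (depth, _content) in enumerate(entries):
        if depth > previous + 1:
            jumps.append((index, previous, depth))
        previous = depth
    return jumps


entries = [(0, "line-1"), (2, "line-2"), (3, "line-2-1"), (0, "line-3")]
print(find_indent_jumps(entries))
# prints [(1, 0, 2)]
```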

advantages of indent-based syntax

only one designated character is needed to create a tree structure, and only the beginning of lines needs to be marked. this makes it in many respects simpler than other tree notations like s-expressions or xml

the potential variability of formatting is relatively low. a structure written by different authors, who otherwise tend to invent and use personal formatting styles for brackets, whitespace and nesting, will look almost the same. formatting variability is further reduced if empty lines are disallowed

downsides

indent alone cannot mark multiple sub-trees on the same line, like in this s-expression:

(+ (* 1 2) (/ 4 2))
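
to illustrate, the following python sketch writes a prefix-tree as indented lines. the function name prefix_tree_to_indent is made up, python lists stand in for the parenthesised lists, and the first element of every list is assumed to be a plain value rather than another list. the one-line expression becomes seven lines

```python
def prefix_tree_to_indent(tree, indent_step="  ", depth=0):
    """Return the lines of an indent-tree for a list of prefix-tree nodes."""
    lines = []
    for node in tree:
        if isinstance(node, list):
            # the first element is the prefix, the rest are its children
            head, *children = node
            lines.append(indent_step * depth + str(head))
            lines.extend(prefix_tree_to_indent(children, indent_step, depth + 1))
        else:
            lines.append(indent_step * depth + str(node))
    return lines


expression = [["+", ["*", 1, 2], ["/", 4, 2]]]
print("\n".join(prefix_tree_to_indent(expression)))
# prints:
# +
#   *
#     1
#     2
#   /
#     4
#     2
```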

formatting

long lines

if long lines are acceptable, they can be wrapped so that the continuation starts on the next line at the same indent

they could also be split and continued on the next line after an additional indent-step

indent-step

two spaces per indentation step is a common convention

also common is the use of the tab character. this brings all the complications associated with tabs: spaces and tabs can be mixed accidentally, a single character stands in for a variable amount of space, viewers and editors have to render it at some width, and every viewer and editor program has to be configured to the user's preference
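
an accidental mix of spaces and tabs can be detected mechanically. a small python sketch with a hypothetical function name, which reports the one-based numbers of all lines whose indent contains both characters

```python
def lines_with_mixed_indent(text):
    """Return the 1-based numbers of lines whose indent mixes spaces and tabs."""
    mixed = []
    for number, line in enumerate(text.splitlines(), start=1):
        # the indent is everything before the first non-space, non-tab character
        indent = line[:len(line) - len(line.lstrip(" \t"))]
        if " " in indent and "\t" in indent:
            mixed.append(number)
    return mixed


print(lines_with_mixed_indent("a\n \tb\n\tc\n  d"))
# prints [2]
```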

parsing

example parsers as well as code for converting between different tree representations can be found in (sph lang indent-syntax)

there is also itml, which is a markup language with support for inline code evaluation that can compile for example to html or plaintext

popular languages with indent-based syntax

coffeescript

wisp

yaml

python

terminology

indent: whitespace at the beginning of a line

indent-step: a character or a string of characters whose repetition increases indent-depth by one

indent-depth: number of indent-step repetitions in indent

nesting-depth: the number of levels at which an element is nested in a tree

indent-tree: a string that uses indent at the beginning of lines to specify nesting-depth


tags: overview start q1 document guide syntax computer structured-text indentation