2025-12-07

option parsing

unix short

single hyphen followed by one letter. multiple flags may cluster, such as -abc meaning -a -b -c. a value may attach (-ofile) or follow as the next token (-o file), depending on the utility.
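a minimal sketch of cluster expansion in python. the function name expand_short and the takes_value set are assumptions made for illustration and do not come from any particular utility.

```python
def expand_short(token, takes_value):
    """Expand a short-option token such as '-abc' into (letter, value) pairs.

    takes_value: set of letters that require a value (hypothetical input).
    Flags map to True. A value option maps to the attached text, or to None
    when the value must be taken from the next argv token by the caller.
    """
    result = []
    letters = token[1:]
    for i, letter in enumerate(letters):
        if letter in takes_value:
            attached = letters[i + 1:]
            result.append((letter, attached or None))
            break  # everything after a value option is its value
        result.append((letter, True))
    return result
```

with this sketch, expand_short("-abc", set()) yields three flags, while expand_short("-ofile", {"o"}) yields a single option o with the value file.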

gnu long

double hyphen followed by a long name, such as --output. a value may attach with an equals sign (--output=file) or follow as the next token (--output file). short options coexist with long ones.
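the equals-sign form can be split with a single partition. this is a sketch with a hypothetical helper name, split_long. a value of None signals that the value, if required, has to come from the next token.

```python
def split_long(token):
    """Split '--name=value' into (name, value); value is None without '='."""
    name, sep, value = token[2:].partition("=")
    return name, (value if sep else None)
```

note that --output= yields an empty string rather than None, which lets a caller distinguish an explicitly empty value from a missing one.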

bsd short-only

single hyphen followed by one letter. no long options. utilities built on getopt accept clusters such as -abc; utilities with hand-written parsers may reject them unless explicitly defined. parsing usually stops at the first non-option token.

windows style

slash-prefixed names, such as /v or /verbose. value attachment rules vary by command. some tools allow colon-delimited values (/o:fn.txt), others accept no values at all.

sph-options

single-letter options only. no clustering. all options must occur first in argv. a standalone -- stops option parsing. an unknown option also stops parsing: it and the remainder are treated as positional arguments, so arguments that start with - can be passed without --.
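the rules above can be sketched as follows. this is an illustration in python, not the sph-options implementation. the function name, the flags and valued sets, and the choice to stop parsing when a value is missing are all assumptions.

```python
def parse_leading_options(argv, flags, valued):
    """Parse leading single-letter options; return (options, positional)."""
    options = {}
    i = 0
    while i < len(argv):
        token = argv[i]
        if token == "--":
            i += 1  # consume the terminator itself
            break
        # only "-x" forms are options; anything else ends option parsing
        if len(token) != 2 or token[0] != "-":
            break
        letter = token[1]
        if letter in flags:
            options[letter] = True
        elif letter in valued and i + 1 < len(argv):
            i += 1
            options[letter] = argv[i]
        else:
            break  # unknown option or missing value: remainder is positional
        i += 1
    return options, argv[i:]
```

for example, with flags {"v"} and valued {"o"}, the tokens -v -o out -5 file produce the options v and o, and -5 file is returned as positional.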

structural questions

addresses how options and positional arguments coexist and how the parser should treat ordering and grouping.

options vs positional arguments

define the boundary between flags and data. for example, in cp -r dir1 dir2, -r is an option and the following two tokens are positional arguments.

mixing options with arguments

some parsers allow options to appear after positional arguments, such as tar cf out.tar -v directory. others forbid this for performance or simplicity.
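python's standard getopt module offers both behaviors, which makes the difference easy to observe. the argv contents here are made up for illustration.

```python
import getopt

argv = ["-v", "dir1", "-r", "dir2"]

# getopt.getopt stops at the first non-option token
strict_opts, strict_rest = getopt.getopt(argv, "vr")

# getopt.gnu_getopt keeps scanning and collects options that appear later
# (unless the POSIXLY_CORRECT environment variable is set)
mixed_opts, mixed_rest = getopt.gnu_getopt(argv, "vr")
```

in the first call -r stays in the positional list, in the second it is recognized as an option.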

termination with --

a literal -- indicates that all following tokens are positional, even if they begin with hyphens. example: grep -- -pattern file.

required and optional option values

some flags require a value (-o file), others may accept an optional value (--level[=n]). the parser must determine whether to consume the next token.
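one common way to resolve this, which gnu getopt_long follows for long options, is to accept an optional value only in attached form and to consume the next token only for required values. a sketch with hypothetical names (read_value, kind):

```python
def read_value(argv, i, kind):
    """Return (value, next_index) for the long option at argv[i].

    kind is 'required' or 'optional' (labels assumed for this sketch).
    """
    name, sep, attached = argv[i].partition("=")
    if sep:
        return attached, i + 1
    if kind == "required":
        if i + 1 >= len(argv):
            raise ValueError(f"{name} requires a value")
        return argv[i + 1], i + 2
    # optional and not attached: leave the next token alone
    return None, i + 1
```

under this rule --level 3 leaves 3 as a positional argument, while --level=3 sets the level.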

required and optional options

a command may declare certain options as mandatory (for example pack --target t). others may be optional with defaults.

option value parsing

values may need conversion to numbers, paths, or domain-specific types. example: --implicit-compression=8 interpreted as integer.

commands and subcommands

commands such as git add introduce a verb hierarchy, each with its own option and argument rules.
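dispatch on the leading token can be sketched as a dictionary lookup. the command names and handlers here are placeholders, and each handler would normally run its own option parser on the remaining tokens.

```python
def dispatch(argv, commands):
    """Look up argv[0] in commands and pass it the remaining tokens."""
    if not argv:
        raise ValueError("missing command")
    name, rest = argv[0], argv[1:]
    if name not in commands:
        raise ValueError(f"unknown command {name}")
    return commands[name](rest)

# hypothetical handlers that just echo what they received
COMMANDS = {
    "add": lambda rest: ("add", rest),
    "commit": lambda rest: ("commit", rest),
}
```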

semantic rules

defines the meaning of parsed structures once the raw tokens are consumed.

precedence of commands vs options

a system must decide whether a leading token like content is a command name or an argument. command-first systems reserve certain leading tokens for dispatch.

ordering constraints

some tools require options to precede commands, while others accept interleaving. example: program command -v vs program -v command.

repetition and cardinality

rules for whether an option may appear multiple times. example: -I path can be repeated in compilers.
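a sketch of one possible cardinality rule: repeatable options accumulate into a list, any other option may appear at most once. the function name and the repeatable set are assumptions.

```python
def collect(pairs, repeatable):
    """Fold (name, value) pairs into a dict, enforcing cardinality."""
    result = {}
    for name, value in pairs:
        if name in repeatable:
            result.setdefault(name, []).append(value)
        elif name in result:
            raise ValueError(f"option {name} given more than once")
        else:
            result[name] = value
    return result
```

other policies exist, for example letting the last occurrence win or counting occurrences of a flag such as -v.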

typed values

values may be interpreted as integer, string, or other types. misinterpretation is avoided by declaring expected types.

converters and validators

transform raw values into internal forms (for example number parsing) and reject invalid ones early.
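a converter can combine both steps by returning the converted value or raising on invalid input. the range check below is a made-up example of a validator.

```python
def to_int_in_range(low, high):
    """Build a converter that parses an integer and checks its range."""
    def convert(raw):
        value = int(raw)  # raises ValueError on non-numeric input
        if not low <= value <= high:
            raise ValueError(f"{value} not in {low}..{high}")
        return value
    return convert
```

a parser would attach such a converter to an option and call it as soon as the raw value is read, so that a bad value is reported before any work starts.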

multi-name options

many tools treat -h and --help as the same option. the parser maintains aliases that map to the same key.

implicit defaults

if an option is omitted, the parser may synthesize a value. example: no --compression implies compression=0.
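defaults can be applied by merging the parsed result over a table of defaults. the names below are illustrative.

```python
DEFAULTS = {"compression": 0}

def with_defaults(parsed, defaults=DEFAULTS):
    """Return parsed options with missing keys filled from defaults."""
    return {**defaults, **parsed}
```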

context sensitive options

some programs interpret options only in particular phases or after certain arguments appear. the option meaning depends on prior context rather than on a global flat namespace.

for example, in find -type f -exec cmd {} +, the tokens after -exec up to the terminating ; or + belong to the executed command and are not interpreted as find primaries. imagemagick resembles a pipeline: settings such as -gravity persist for the operations that follow until they are changed, while operators such as -resize act on the images already read at that point.

a context sensitive parser maintains an internal mode that changes as certain tokens are consumed. each mode exposes a different option set. this enables compact grammars but requires sequential interpretation rather than isolated option lookup.
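a small mode-switching sketch, loosely modeled on find's -exec. it is a simplification: real find only treats + as a terminator when it directly follows {}, and the function name is hypothetical.

```python
def split_exec(argv):
    """Separate expression tokens from the tokens of each -exec command."""
    mode = "expression"
    expression, commands = [], []
    for token in argv:
        if mode == "expression":
            if token == "-exec":
                mode = "command"  # following tokens belong to the command
                commands.append([])
            else:
                expression.append(token)
        elif token in (";", "+"):
            mode = "expression"  # terminator: back to the outer option set
        else:
            commands[-1].append(token)
    return expression, commands
```

in command mode a token such as -f is passed through untouched, even though it would be looked up as a primary in expression mode.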

parser behaviors

early termination

the parser may stop scanning when encountering a non-option or an explicit --. what follows becomes positional.

unknown option handling

unknown flags may cause immediate failure or may end option parsing.

a common example is string matchers such as grep, which require -- before a pattern that starts with a hyphen. otherwise the pattern is parsed as options and the tool typically exits with an error about an unknown option.

missing value handling

if an option requires a value but none is present, the parser either reports an error or stops option parsing. example: -o as the last token, without a filename.

ambiguity resolution

determining whether a token like -abc refers to clustered short options or a literal argument.

performance constraints

constraints on memory and cycles. example: sph-options scans argv once, uses a 256-entry table, and avoids allocations.

memory model

specifies who owns parsed strings and how long they remain valid. many parsers return pointers into argv for zero-copy behavior.

parsing techniques

approaches used to implement an option parser.

getopt

state machine defined by posix. handles clustered short options and required values according to an option string. optional values (a double colon in the option string) are a gnu extension.
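python's standard getopt module mirrors the c interface closely enough to show the option string in use, although it returns all results at once instead of one option per call. in "abo:" the colon marks o as requiring a value. the argv contents are made up.

```python
import getopt

# -ab is a cluster, -ofile is an attached value, -o other is a detached value
opts, rest = getopt.getopt(["-ab", "-ofile", "-o", "other", "input"], "abo:")
```

opts then holds one (option, value) pair per recognized option, with an empty string as the value of plain flags, and rest holds the remaining tokens.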

getopt long

gnu extension (getopt_long) that supports long names and equals-sign attachment. it returns the resolved option, reports the index of the next argument through optind, and can store the index of the matched long option.
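the same python module accepts a list of long names, where a trailing = marks a required value. this shows the two attachment forms side by side. it is the python interface, not the c function, and the option names are invented.

```python
import getopt

long_opts, long_rest = getopt.getopt(
    ["--output=a.txt", "--output", "b.txt", "-v", "in"],
    "vo:",
    ["output=", "verbose"],
)
```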

ad-hoc scanning

manual iteration over argv implementing custom rules. used for minimal ad-hoc parsers: read token, check first character, dispatch or break.

fold-based parsing

process argv with a left fold: each token updates an accumulating parser state. srfi-37 uses this to uniformly handle flags and arguments.
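a sketch of the fold idea using functools.reduce. the state is a tuple of options, operands, and a flag that records whether -- has been seen. this only illustrates the shape of the technique and is not a port of srfi-37, which threads seed values through separate option and operand procedures.

```python
from functools import reduce

def step(state, token):
    """Fold step: return the new state after consuming one token."""
    options, operands, done = state
    if done or not token.startswith("-") or token == "-":
        return options, operands + [token], done
    if token == "--":
        return options, operands, True
    return options + [token], operands, done

def fold_args(argv):
    options, operands, _ = reduce(step, argv, ([], [], False))
    return options, operands
```

because every token goes through the same step function, flags, operands, and the terminator are handled uniformly and the state is never mutated in place.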

table-driven parsing

a specification table defines each option: names, value requirements, and handlers. the parser dispatches operations by table lookup.
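a minimal table-driven sketch. each entry maps a name to a pair of value requirement and handler. the table layout and all names are assumptions chosen for brevity.

```python
def parse_with_table(argv, table):
    """table maps option name -> (takes_value, handler(state, value))."""
    state, operands = {}, []
    i = 0
    while i < len(argv):
        token = argv[i]
        if token in table:
            takes_value, handler = table[token]
            if takes_value:
                if i + 1 >= len(argv):
                    raise ValueError(f"{token} requires a value")
                i += 1
                handler(state, argv[i])
            else:
                handler(state, None)
        else:
            operands.append(token)
        i += 1
    return state, operands

# two names share one handler, so -v and --verbose set the same key
set_verbose = lambda state, _: state.update(verbose=True)
TABLE = {
    "-v": (False, set_verbose),
    "--verbose": (False, set_verbose),
    "-o": (True, lambda state, value: state.update(output=value)),
}
```

adding an option then means adding a table row, with no change to the scanning loop.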

configuration models

ways to specify an interface so the parser can operate generically.

minimal configuration

flat arrays or tables indicating which options exist and whether they require values.

declarative specifications

structured definitions including commands, options, and argument patterns. example: sph-cli with (cli-create #:options ... #:commands ...).

keyword-argument configuration

configuration passed as named parameters, often in scheme or dynamic languages. enables fine-grained tuning per option.

diagnostics

reporting for user-visible errors and metadata.

help generation

automatic formatting of commands, options, and argument patterns into structured help text.

interface descriptions

machine-readable dumps describing options, argument patterns, and command structures.

missing arguments

structured messages such as 1 missing argument (facets content-id ...).

unsupported options

errors when encountering unknown flags, such as unsupported option xyz.

design variants

alternative philosophies for how strictly the system interprets tokens.

strict mode

reject unknown or misformatted tokens immediately; enforce full specification adherence.

permissive mode

treat unknown tokens as positional arguments; delay errors until deeper stages.
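the two modes can differ by a single branch. in this sketch the strict parameter and the function name are hypothetical, and the error text follows the unsupported option message form used above.

```python
def classify(argv, known, strict=True):
    """Split argv into known options and positional tokens."""
    options, positional = [], []
    for token in argv:
        if token in known:
            options.append(token)
        elif strict and token.startswith("-") and token != "-":
            raise ValueError(f"unsupported option {token}")
        else:
            positional.append(token)  # permissive: pass it along
    return options, positional
```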

pattern-first parsing

attempt to match positional patterns before recognizing options; suitable for pattern-heavy interfaces.

option-first parsing

scan options greedily from the start, then hand the rest to command or argument processing.

possible extensions

future adjustments that enhance expressiveness.

one-or-more repetition

allow patterns that require at least one instance of an argument, distinct from zero-or-more.

extended type predicates

add richer validations such as file-exists or directory-writable.

links