2023-02-27

error handling

about dealing with unexpected, or unusual, return states of subroutines

usually subroutines have a main purpose. for example, dividing two numbers, or writing to a file. sometimes the main purpose can not be fulfilled, for example because:

  • external resources are unavailable, like inaccessible filesystem resources or error responses from webservices
  • the provided subroutine arguments are not supported: invalid data types, unhandled cases, division by zero, missing data

states like this tend to be considered errors.

it can be useful to think in terms of operations that can not fail (for example, addition; as long as the hardware and compiler work) and operations that can fail (for example, memory allocation)

ways to communicate errors

types designated as errors

for example

  • boolean false
  • strings with error descriptions
  • null value types

values designated as errors

for example

  • negative or positive number on error
  • true on success and false on error

pro

  • type checks are usually fast, and subroutines often return a limited number of different types, leaving room for error types

con

  • problematic when all types can be valid results
  • the caller has to know which types a subroutine designates for errors
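a minimal sketch of a value designated as error, assuming a hypothetical helper named index_of. -1 can never be a valid array index, so it is free to signal failure:

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical helper: returns the index of the first element equal to
   value, or -1 if there is none. -1 is the value designated as error */
static int index_of(const int *a, size_t len, int value) {
  for (size_t i = 0; i < len; i += 1) {
    if (a[i] == value) return (int)i;
  }
  return -1;
}

/* usage: the caller has to know that -1 means "not found" */
static void index_of_example(void) {
  int a[3] = {4, 5, 6};
  assert(index_of(a, 3, 5) == 1);
  assert(index_of(a, 3, 9) == -1);
}
```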

error type

for example

  • a distinct type with fields for error identification, error group, description, and possibly more
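such a type could look like the following sketch. all names (example_error, validate_divisor, the "example-math" group) are made up for illustration and not taken from any library:

```c
#include <assert.h>
#include <string.h>

/* hypothetical error type */
typedef struct {
  int id;                  /* error identification, 0 means no error */
  const char *group;       /* error group, for example a library name */
  const char *description; /* human readable details */
} example_error;

static const example_error example_error_none = {0, "", ""};

/* returns an error value that describes what is wrong with the
   argument, or example_error_none if the divisor is usable */
static example_error validate_divisor(int divisor) {
  if (divisor == 0) {
    example_error e = {1, "example-math", "division by zero"};
    return e;
  }
  return example_error_none;
}

static void example_error_usage(void) {
  example_error e = validate_divisor(0);
  assert(e.id == 1);
  assert(strcmp(e.group, "example-math") == 0);
  assert(validate_divisor(2).id == 0);
}
```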

default values

for example

  • zero for failed parsing of a string to number
  • empty array when data could not be loaded

the result is usually of a valid return type, so the caller can proceed, and it is less likely to be incompatible with subsequent computation.

pro

  • checks can be simple, for example for length or zero
  • most useful when the default value is not a possible successful result

con

  • ambiguous results - parsed zero or failed parsing? no entries loaded or failed processing?
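the ambiguity can be seen with the standard c function strtol, which returns zero when no conversion could be performed. parse_or_zero and parse_checked are hypothetical wrappers, and the second one is only one possible way to tell the two cases apart:

```c
#include <assert.h>
#include <stdlib.h>

/* returns the parsed number, or the default value 0 if parsing failed */
static long parse_or_zero(const char *s) {
  return strtol(s, NULL, 10);
}

/* variant without the ambiguity: strtol sets end to the first unparsed
   character, so end == s means that nothing was parsed.
   returns 1 on success and 0 on failure */
static int parse_checked(const char *s, long *result) {
  char *end;
  *result = strtol(s, &end, 10);
  return end != s;
}

static void parse_example(void) {
  long n;
  assert(parse_or_zero("0") == 0);   /* parsed zero */
  assert(parse_or_zero("abc") == 0); /* failed parsing, same result */
  assert(parse_checked("0", &n) == 1 && n == 0);
  assert(parse_checked("abc", &n) == 0);
}
```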

multiple return values

for example

  • return one error status value and additional values for results

the additional values still have to be unpacked, which requires syntax for creating and accepting multiple return values. the position of the error value among the output values has to be agreed on between different libraries for consistency. in this respect it is similar to returning a single compound value or a special error type. after every subroutine call, code has to be added that checks the error status and decides whether to continue. this becomes a kind of manual flow control and interrupts subroutine composition
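c has no multiple return values, so the following sketch uses a struct as a stand-in, which is the compound value variant. the names are hypothetical. it shows the status check that is needed between two calls:

```c
#include <assert.h>

typedef struct {
  int status; /* 0 on success, non-zero on error */
  int value;  /* only meaningful if status is 0 */
} divide_result;

static divide_result divide(int a, int b) {
  divide_result r = {0, 0};
  if (b == 0) r.status = 1;
  else r.value = a / b;
  return r;
}

/* computes (a / b) / c. divide(divide(a, b), c) can not be written
   directly, the status has to be checked before the second call */
static divide_result divide_twice(int a, int b, int c) {
  divide_result r = divide(a, b);
  if (r.status) return r;
  return divide(r.value, c);
}

static void divide_twice_example(void) {
  divide_result r = divide_twice(12, 2, 3);
  assert(r.status == 0 && r.value == 2);
  assert(divide_twice(12, 0, 3).status == 1);
}
```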

with output arguments

the error status can be passed as the subroutine result or output argument (or additional value). compare the following two signatures

input-value ... output-value ... output-status -> value
input-value ... output-value ... -> status

the latter should be simpler, because there is always a status value, and only one, and the value is not mixed in between other output values
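a sketch of the second signature in c, with hypothetical names. the status is the return value and the result is written through an output argument:

```c
#include <assert.h>

/* inputs first, then the output argument. returns the status:
   0 on success, 1 if the divisor is zero */
static int divide(int a, int b, int *out) {
  if (b == 0) return 1; /* *out is left untouched on error */
  *out = a / b;
  return 0;
}

static void divide_example(void) {
  int result = 0;
  assert(divide(6, 3, &result) == 0);
  assert(result == 2);
  assert(divide(6, 0, &result) == 1);
  assert(result == 2); /* still the value of the earlier call */
}
```

because the status is always the return value, a call can be placed directly in a condition, as in if (divide(a, b, &result)) { ... }.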

continuation passing

with error argument

  • call another subroutine with error object and result values as arguments
  • somewhat similar to multiple return values, except the continuation is given explicitly as a subroutine and unpacking happens via parameters
  • is common with asynchronous computation in node.js

con

  • signature of error handling routine has to be agreed on between different libraries for consistency
  • increased subroutine nesting, especially with anonymous functions

with subroutine selection

  • passing alternative subroutines, one of them being called on success and the other on error
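both variants can be sketched in c with function pointers. all names are hypothetical, and the continuations here are small named functions because c has no anonymous functions:

```c
#include <assert.h>

/* variant with error argument: one continuation receives both */
typedef int (*continuation)(int error, int value);

static int divide_cps(int a, int b, continuation next) {
  if (b == 0) return next(1, 0);
  return next(0, a / b);
}

static int value_or_minus_one(int error, int value) {
  return error ? -1 : value;
}

/* variant with subroutine selection: one subroutine per outcome */
typedef int (*on_success)(int value);
typedef int (*on_error)(int error);

static int divide_select(int a, int b, on_success ok, on_error fail) {
  if (b == 0) return fail(1);
  return ok(a / b);
}

static int identity(int value) { return value; }
static int negate(int error) { return -error; }

static void continuation_example(void) {
  assert(divide_cps(6, 3, value_or_minus_one) == 2);
  assert(divide_cps(6, 0, value_or_minus_one) == -1);
  assert(divide_select(6, 3, identity, negate) == 2);
  assert(divide_select(6, 0, identity, negate) == -1);
}
```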

local jumps

not exiting the subroutine but continuing at a place at the end of the subroutine where errors are handled. cleanup, like deallocation before exit, can be done at this place. a local variable can be used to save error details where the error occurs. local jumps can also exit loops

pro

  • fast
  • easy to predict execution flow
  • less duplication compared to handling errors at each source as all errors can be handled at one place
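in c, local jumps are usually written with goto. the following is a sketch with a hypothetical subroutine and made-up status numbers. it treats negative input as unsupported only to have an error source inside the loop:

```c
#include <assert.h>
#include <stdlib.h>

/* writes the sum of the squares of a to *out.
   returns 0 on success or a non-zero status */
static int sum_of_squares(const int *a, size_t len, long *out) {
  int status = 0; /* local variable that saves the error details */
  long sum = 0;
  long *squares = NULL;
  if (len == 0) { status = 1; goto exit; }
  squares = malloc(len * sizeof(long));
  if (!squares) { status = 2; goto exit; }
  for (size_t i = 0; i < len; i += 1) {
    if (a[i] < 0) { status = 3; goto exit; } /* also leaves the loop */
    squares[i] = (long)a[i] * a[i];
    sum += squares[i];
  }
  *out = sum;
exit:
  free(squares); /* cleanup in one place. free(NULL) does nothing */
  return status;
}

static void sum_of_squares_example(void) {
  int valid[3] = {1, 2, 3};
  int invalid[2] = {1, -2};
  long result = 0;
  assert(sum_of_squares(valid, 3, &result) == 0 && result == 14);
  assert(sum_of_squares(invalid, 2, &result) == 3);
  assert(sum_of_squares(valid, 0, &result) == 1);
}
```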

non-local jumps

error information can be collected and subroutines exited in the call stack upwards using non-local jumps until a catcher is met that handles the error. caller/callee return semantics are disregarded. this corresponds to exception handling using throw and catch. with exception handling, typically a backtrace is collected when an error occurs and a default catcher that exits the program is installed. exception handling can also be implemented with delimited continuations

cleanup (heap-memory, file handles or similar) might still be necessary on every non-local exit point. garbage collection helps with the memory management at least

pro

  • errors that might originate from multiple different subroutines can be handled at once
  • routine composition is not interrupted by status checks and can therefore be simpler in notation

con

  • at some place, any errors might still need to be handled anyway
  • normally subroutines return to the point of their call, or the whole process is ended. with non-local jumps, hidden control flow paths might emerge
  • non-local jumps do not help if something needs to happen in context at every error source
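a sketch of non-local jumps with setjmp and longjmp from the c standard library. the single global catcher and all other names are assumptions made to keep the example short. there is no backtrace, and no cleanup would run in the subroutines that are skipped:

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf catcher; /* hypothetical single catcher */

static int divide(int a, int b) {
  if (b == 0) longjmp(catcher, 1); /* "throw" with error id 1 */
  return a / b;
}

/* composition without status checks between the calls */
static int compute(int a, int b, int c) {
  return divide(divide(a, b), c);
}

/* installs the catcher, then runs the computation */
static int compute_or_default(int a, int b, int c, int fallback) {
  if (setjmp(catcher) == 0) return compute(a, b, c);
  return fallback; /* reached when longjmp was called */
}

static void compute_example(void) {
  assert(compute_or_default(12, 2, 3, -1) == 2);
  assert(compute_or_default(12, 0, 3, -1) == -1);
  assert(compute_or_default(12, 2, 0, -1) == -1);
}
```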

global variable

a global variable that is set to specific values on error. the variable needs to be thread local for multi-threaded error handling. the value persists between routine calls and needs to be reset. this might be harder for the compiler to optimize since the value can be changed from many places
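a sketch with a thread local variable, assuming a compiler that supports the c11 keyword _Thread_local. last_error and divide are hypothetical names:

```c
#include <assert.h>

/* each thread gets its own copy. 0 means no error */
static _Thread_local int last_error = 0;

static int divide(int a, int b) {
  if (b == 0) {
    last_error = 1;
    return 0;
  }
  return a / b; /* success does not touch last_error */
}

static void last_error_example(void) {
  last_error = 0; /* reset before the call of interest */
  divide(6, 0);
  assert(last_error == 1);
  divide(6, 3);
  assert(last_error == 1); /* the value persists until it is reset */
}
```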

other considerations

if multiple libraries use integer values for error ranges then error values of different libraries might overlap and be ambiguous. for this reason, it can become necessary to keep additional information with error objects, for example a string identifier for the library it belongs to (using enums would again be too likely to conflict)

reporting

error reporting is collecting details that may be helpful to find the cause and making this information available to the user. for example, a message written to standard output or a log file

error details

information that can be useful when handling an error:

error-name
error-description
source-routine-name
source-line-number
source-file
source-module-name
backtrace
literal-irritant-expression

examples

glibc

most library functions return a special value to indicate that they have failed. the special value is typically -1, a null pointer, or a constant such as EOF that is defined for that purpose. but this return value tells you only that an error has occurred. to find out what kind of error it was, you need to look at the error code stored in the variable errno
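a sketch of this pattern with fopen, which returns a null pointer on failure and, on glibc and other posix systems, sets errno. try_open is a hypothetical helper, and the path in the usage example is assumed not to exist:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* tries to open a file for reading. returns 0 on success,
   otherwise the errno value that fopen left behind */
static int try_open(const char *path) {
  FILE *f = fopen(path, "r");
  if (f == NULL) {
    int e = errno; /* save it, later library calls may overwrite errno */
    fprintf(stderr, "%s: %s\n", path, strerror(e));
    return e;
  }
  fclose(f);
  return 0;
}

static void try_open_example(void) {
  /* assumes that this path does not exist on the system */
  assert(try_open("/nonexistent-directory/file") == ENOENT);
}
```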

sph-sc-lib status

uses a struct with an integer error identifier and a string group. has helpers that take a subroutine result, check the returned status, and jump to the exit label on error status

int main() {
  status_declare;
  // code ...
  status_require(test());
  // more code ...
exit:
  return status.id;
}