digital signal processing

"signals processed in this manner are a sequence of numbers that represent samples of a continuous variable in a domain such as time, space, or frequency"

facets of what i deem important knowledge for sound creation



digital sound representation

a common format for storing signals digitally is sample values of the magnitude of a vibration over time, like discrete positions of a membrane between two bounding values

sample formats

  • the data type used for sample values can for example be floating point (float), fixed point or integer
  • floating point can be slower than fixed point on hardware without a floating point unit, and a float has fewer significant bits than a fixed point value of the same size, but it can represent a very large range of values directly
  • calculation with floats is not trivial. for example, summing many floats can quickly accumulate large rounding errors if no error compensation is used
  • the larger the underlying bit size of float values, the smaller the rounding errors
  • when integers are used, operations such as division can still produce fractions that have to be rounded
  • digital to analog converters work on integer sample values, so a conversion to exact integers happens somewhere before playback
  • if care is taken that all samples are created as integers and never divided, the result would be the most precise, but this is not suitable for general purpose signal processing
  • the more distinct values a data type can store, the wider the range or the finer the steps a sample can represent
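
the point about summing floats can be sketched as follows. the example compares plain summation with kahan (compensated) summation, which is one common error compensation technique and not necessarily the one meant above. the function names are made up for this illustration.

```python
import math


def naive_sum(values):
    """sum left to right; every addition may round"""
    total = 0.0
    for value in values:
        total += value
    return total


def kahan_sum(values):
    """compensated summation: carries the rounding error of each step forward"""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for value in values:
        corrected = value - compensation
        new_total = total + corrected
        # (new_total - total) is what was actually added; the rest was lost
        compensation = (new_total - total) - corrected
        total = new_total
    return total


samples = [0.1] * 1_000_000
reference = math.fsum(samples)  # accurately rounded sum from the standard library
naive_error = abs(naive_sum(samples) - reference)
kahan_error = abs(kahan_sum(samples) - reference)
```

with these inputs the naive error is small but clearly non-zero, while the compensated sum stays much closer to the reference.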

sampling rate

  • also called sample rate
  • describes how many samples represent one second of sound
  • with a higher sampling rate, higher frequencies can be represented
  • the maximum representable frequency in hertz is half the sampling rate (the nyquist frequency), because one full cycle needs at least two samples. any higher frequency would complete a cycle in less time than two subsequent sample values span
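
a small sketch of what happens above half the sampling rate, assuming a deliberately tiny sample rate of 8 to keep the numbers readable. the helper name sine_samples is hypothetical. a 7 hz sine sampled at 8 hz produces the same values as an inverted 1 hz sine, so the higher frequency is not representable as such.

```python
import math


def sine_samples(frequency_hz, sample_rate, count):
    """sample a sine of the given frequency at the given sample rate"""
    return [math.sin(2 * math.pi * frequency_hz * n / sample_rate) for n in range(count)]


sample_rate = 8
nyquist = sample_rate / 2  # 4.0 hz, the highest representable frequency

below = sine_samples(1, sample_rate, 16)  # 1 hz, below the limit
above = sine_samples(7, sample_rate, 16)  # 7 hz, above the limit

# the 7 hz sine yields the same sample values as an inverted 1 hz sine,
# so after sampling the two cannot be told apart
is_alias = all(math.isclose(a, -b, abs_tol=1e-9) for a, b in zip(above, below))
```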


multiple separate sound channels like stereo channels are typically stored in one of two ways:

  • non-interleaved: each channel is stored in a separate sample array. for example for three channels: 1 1 1, 2 2 2, 3 3 3
  • interleaved: the samples for one index in all channels are stored together in packets. for example for three channels: 1 2 3 1 2 3 ...
  • non-interleaved can be easier to process. interleaved can be more robust to playback interruptions because, for example, a lag tends to affect all channels at the same time
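
the two layouts can be converted into each other. a minimal sketch, assuming all channels have the same length; the function names are chosen for this example only.

```python
def interleave(channels):
    """turn [[1, 1, 1], [2, 2, 2], [3, 3, 3]] into [1, 2, 3, 1, 2, 3, 1, 2, 3]"""
    # zip(*channels) yields one packet per sample index across all channels
    return [sample for packet in zip(*channels) for sample in packet]


def deinterleave(samples, channel_count):
    """split an interleaved array back into one array per channel"""
    # every channel_count-th sample, starting at the channel index
    return [samples[channel::channel_count] for channel in range(channel_count)]


separate = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
packed = interleave(separate)
restored = deinterleave(packed, 3)
```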


radians and hertz

  • two pi radians are one full sine cycle regardless of sample rate
  • a full cycle of a one hertz sine spans as many samples as the sample rate, because hertz is defined as cycles per second
  • if the sample rate is even then the maximum representable frequency in hertz is an integer. the same is not true for radians, where the maximum is pi radians per sample
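
the relation between hertz and radians can be written down as two small conversions. this is a sketch with hypothetical function names, where radians are counted per sample.

```python
import math


def hertz_to_radians_per_sample(frequency_hz, sample_rate):
    """phase advance per sample; two pi radians are one full cycle"""
    return 2 * math.pi * frequency_hz / sample_rate


def samples_per_cycle(frequency_hz, sample_rate):
    """number of samples that one full cycle spans"""
    return sample_rate / frequency_hz


sample_rate = 48000
one_hertz_cycle = samples_per_cycle(1, sample_rate)  # equals the sample rate
nyquist_hz = sample_rate // 2  # an integer because the sample rate is even
nyquist_radians = hertz_to_radians_per_sample(nyquist_hz, sample_rate)  # pi
```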

samples or seconds for time

  • sample count is an exact integer measure for a progressing time value and can be used for durations or signal widths for example
  • sample count depends on the sample rate. one sample can be of varying duration depending on the sample rate. a signal of a specific number of samples would be of different frequency with different sample rates
  • seconds do not depend on the sample rate
  • the duration in seconds of a single sample might be an inexact number that is more difficult to calculate with precisely. for example 1s / 44100hz = 0.000022675736961451248s and 1s / 48000hz = 0.000020833333333333333s
  • calculations in seconds easily produce inexact time values that fall between two samples and have to be rounded, whereas a sample count is always sample exact
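
a sketch of converting between the two time measures, assuming that rounding to the nearest sample is acceptable. the function names are made up for this example.

```python
def seconds_to_samples(seconds, sample_rate):
    """round a duration in seconds to the nearest whole sample
    (python's round resolves exact ties to the even integer)"""
    return round(seconds * sample_rate)


def samples_to_seconds(sample_count, sample_rate):
    """duration in seconds of a number of samples; may be inexact as a float"""
    return sample_count / sample_rate


exact = seconds_to_samples(0.01, 44100)  # 441 samples, no rounding needed
rounded = seconds_to_samples(0.0001, 44100)  # 4.41 samples becomes 4
actual_seconds = samples_to_seconds(rounded, 44100)  # slightly less than 0.0001
```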


  • stateless: needs only parameter values that do not have to be kept between calls
  • stateful: needs parameters for values that have to be kept between function calls. for example, carryover values that fall outside the currently processed range
  • one sample to one sample: each input sample becomes one output sample. this can affect multiple samples only with state, and only with a delay after accumulating samples
  • segment to segment: processors might depend on preceding or following values
  • one to many: output creates a longer signal
  • many to one: a longer signal becomes a shorter one
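
the difference between stateless and stateful processing can be sketched with a gain and a delay that work on segments. both are hypothetical examples. the delay keeps the samples that fall outside the current segment as carryover for the next call.

```python
def gain(segment, factor):
    """stateless: the output depends only on the input and the parameter"""
    return [sample * factor for sample in segment]


def make_delay(delay_samples):
    """stateful: returns a segment processor that delays its input by delay_samples.

    samples that would land beyond the end of the current segment are kept
    and emitted at the start of the next call
    """
    carryover = [0.0] * delay_samples  # state kept between calls

    def process(segment):
        nonlocal carryover
        combined = carryover + list(segment)
        output = combined[:len(segment)]
        carryover = combined[len(segment):]
        return output

    return process


delay = make_delay(2)
first = delay([1, 2, 3])  # the last two input samples are carried over
second = delay([4, 5, 6])  # starts with the carryover of the first call
```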

digital music making tools

  • analog hardware devices for synthesizers with keyboards or sequencing
  • software sequencers where instruments, synthesizer sounds, recordings and effects can be laid out and played back
  • software environments like fruity loops, reaktor, reason and others
  • lv2 or vst plugins as effects and synthesizers
  • taking audio recordings and preparing their ordered playback in a program like ardour or cubase
  • connecting hardware devices with midi and puredata, and using the user interfaces that can be created with it
  • preparing sample arrays using a general purpose programming language and writing to sound files
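
a minimal sketch of the last approach using only the python standard library. it assumes mono output stored as 16 bit signed integers; the function names and the file name are chosen for this example only.

```python
import math
import struct
import wave


def sine_wave(frequency_hz, duration_samples, sample_rate, amplitude=0.5):
    """sample array with values between -amplitude and amplitude"""
    step = 2 * math.pi * frequency_hz / sample_rate  # radians per sample
    return [amplitude * math.sin(step * n) for n in range(duration_samples)]


def write_wav(path, samples, sample_rate):
    """write mono float samples in the range -1..1 as 16 bit signed integers"""
    frames = bytearray()
    for sample in samples:
        limited = max(-1.0, min(1.0, sample))  # clip to the valid range
        frames += struct.pack("<h", round(limited * 32767))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))


if __name__ == "__main__":
    # one second of a 440 hz sine at a sample rate of 48000
    write_wav("sine.wav", sine_wave(440, 48000, 48000), 48000)
```

the duration is given as a sample count here, which keeps the length of the file sample exact.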


  • harmonic: a harmonic is a wave with a frequency that is a positive integer multiple of the frequency of the original wave
  • time series: a series of data points indexed (or listed or graphed) in time order. most commonly, a time series is a sequence taken at successive equally spaced points in time
  • a sound program is a potential for sounds to occur. elements can appear and disappear like entering and exiting a dimension. there is always some kind of seed, for example the literal arguments in program code
  • amplitude: the magnitude of the difference between a variable's extreme values. the amplitude of a periodic variable is a measure of its change over a single period
  • clipping: limiting values that exceed a maximum or fall below a minimum to that limit
  • digital image processing is a subcategory of digital signal processing
  • envelope: a path of loudness transition
  • every linear time-invariant system can be represented by a convolution
  • frequency: number of occurrences of a repeating event per unit time
  • low frequency: less change between samples
  • phase: relative shift or progression of a repeating event
  • sequencer: lays out the starting points and durations of sounds
  • vocoder: splits a signal into frequency bands, measures the level of each band and typically applies these levels to the bands of another signal
  • wave < instrument < composition. interval, duration, song or note relative offset
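
two of the terms above, clipping and convolution, can be sketched in a few lines. this is a direct and slow form of convolution for illustration, with hypothetical function names. the impulse responses used are simple examples of linear time-invariant systems: a two point average and a one sample delay.

```python
def clip(samples, limit):
    """limit every sample to the range -limit..limit"""
    return [max(-limit, min(limit, sample)) for sample in samples]


def convolve(signal, impulse_response):
    """direct convolution; the output has len(signal) + len(impulse_response) - 1 samples"""
    output = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, sample in enumerate(signal):
        for j, weight in enumerate(impulse_response):
            # every input sample adds a scaled, shifted copy of the impulse response
            output[i + j] += sample * weight
    return output


clipped = clip([-2, 0.5, 3], 1)
averaged = convolve([1, 0, 1, 0], [0.5, 0.5])  # two point average
delayed = convolve([1, 2, 3], [0, 1])  # delay by one sample
```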