sound synthesis

see also analysis and resynthesis. related to digital signal processing


thought models for thinking about sound

  • physically: sound is made of longitudinal vibrations of matter in three-dimensional space. humans receive and analyse changes in air pressure through two ears and can discern the source direction to a limited extent. regularity in the received signal is of particular significance and can be described by frequency
  • time/magnitude: one channel of magnitude samples over time, representing the vibration of a directionless sound. the values of separate contributing events are summed. the samples can drive a membrane directly to reproduce the sound
  • frequency/phase/time/magnitude: imagine a kind of piano roll that, instead of being limited to piano keys and note onsets or durations, holds information for finely spaced sine frequencies and detailed amplitude envelopes of each. this is a type of split representation of time/magnitude data
  • sound sources and instruments: sound, being the excitation of matter, begins with the excitation of matter. air pressed through a small opening can start to oscillate at high frequencies. traditional musical instruments usually have interfaces that let human motion excite matter, parameterised for example by relative striking position or intensity
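
a minimal python sketch of how the frequency/time/magnitude view maps back to time/magnitude samples: each partial is a sine frequency with an amplitude envelope, and all partials are summed into one channel. the names render_partials and linear_fade_out and the two example partials are made up for illustration.

```python
import math

def render_partials(partials, duration, sample_rate):
    """Sum sine partials into one channel of time/magnitude samples.

    partials: list of (frequency_hz, amplitude_at) pairs, where
    amplitude_at maps a time in seconds to an amplitude.
    """
    count = int(duration * sample_rate)
    samples = [0.0] * count
    for frequency, amplitude_at in partials:
        for i in range(count):
            t = i / sample_rate
            samples[i] += amplitude_at(t) * math.sin(2 * math.pi * frequency * t)
    return samples

def linear_fade_out(peak, duration):
    """Hypothetical envelope: starts at peak and falls linearly to zero."""
    return lambda t: peak * max(0.0, 1.0 - t / duration)

# two made-up partials: a 220 hz tone and a quieter octave above it
partials = [
    (220.0, linear_fade_out(0.6, 0.5)),
    (440.0, linear_fade_out(0.3, 0.5)),
]
samples = render_partials(partials, 0.5, 8000)
```

the result is a plain list of magnitudes, which is the form the later filtering and post-processing notes operate on.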


simple periodic wave forms

  • square: amplitude alternates between only two values, with the same duration at each. spectrum contains only odd harmonic frequencies. for values -1 and 1 the harmonic amplitudes are (4 / (pi * harmonic_odd_n)), which is (2 / (pi * harmonic_odd_n)) for a peak-to-peak range of 1. its stochastic counterpart is a two-state trajectory
  • sawtooth: each cycle is shaped like a right-angled triangle, a linear ramp followed by an instant jump back. "its spectrum contains both even and odd harmonics of the fundamental frequency. because it contains all the integer harmonics, it is one of the best waveforms to use for subtractive synthesis of musical sounds, particularly bowed string instruments like violins and cellos, since the slip-stick behavior of the bow drives the strings with a sawtooth-like motion"
  • triangle: linear change with abrupt change of direction. spectrum contains only odd harmonics. for values between -1 and 1 the amplitudes are (8 / ((pi ** 2) * (harmonic_odd_n ** 2))), with the sign alternating from one odd harmonic to the next
  • rectangle: like a square wave but without the requirement of the same duration at every value. minimum and maximum values could also vary
  • trapezoid: clipped triangle
  • sine wave

    • single-frequency spectrum, no harmonics

    • does not change angular direction abruptly

    • the only periodic waveform that retains its wave shape when added to another wave of the same form and frequency with arbitrary phase and magnitude

    • one full cycle in (2 * pi) radians

    • every sound can theoretically be created by a sum of a possibly infinite number of sines

    • samples can be taken from

      • the sin() function of a standard math library, which typically combines range reduction with a polynomial approximation related to the taylor series

      • a lookup table where elements are pre-calculated samples. this is a faster way to get sine values. values are pre-calculated for a specific sampling rate and frequency, taking other frequencies might need interpolation or re-calculation

      • less precise sine approximation functions
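
a minimal python sketch of the lookup table option. as an assumption that differs slightly from the note above, the table holds one full cycle and is read with a phase step, so that one table can serve any frequency and sampling rate; values between table entries are linearly interpolated. TABLE_SIZE, table_sine and oscillator are illustrative names.

```python
import math

TABLE_SIZE = 1024
# one full cycle (2 * pi radians) of pre-calculated sine samples
SINE_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def table_sine(phase):
    """Sine for a phase given in cycles (1.0 is one full cycle)."""
    position = (phase % 1.0) * TABLE_SIZE
    index = int(position)
    fraction = position - index
    a = SINE_TABLE[index % TABLE_SIZE]
    b = SINE_TABLE[(index + 1) % TABLE_SIZE]  # wraps around at the table end
    return a + fraction * (b - a)

def oscillator(frequency, sample_rate, count):
    """Generate count samples by stepping the phase through the table."""
    phase = 0.0
    step = frequency / sample_rate
    out = []
    for _ in range(count):
        out.append(table_sine(phase))
        phase = (phase + step) % 1.0
    return out

tone = oscillator(440.0, 44100, 100)
```

a larger table or a better interpolation reduces the error further; whether the lookup is actually faster than sin() depends on the platform.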


noise

  • typically created from samples of a random number generator passed through a filter bank to remove frequencies
  • there do not seem to be many other methods to create noise. summing many random sines is possible but computationally intensive
  • amplitude changes tend to sound like water, bandwidth changes tend to sound like wind
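
a minimal python sketch of filtered random samples. as a simplification, a single one-pole low-pass stands in for the filter bank mentioned above; the function names, the seed and the coefficient are illustrative assumptions.

```python
import random

def white_noise(count, seed=0):
    """Uniform random samples between -1 and 1; seeded to be repeatable."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(count)]

def one_pole_low_pass(samples, coefficient):
    """coefficient in (0, 1]; smaller values remove more high frequencies."""
    out = []
    previous = 0.0
    for x in samples:
        previous += coefficient * (x - previous)
        out.append(previous)
    return out

noise = white_noise(10000)
dark_noise = one_pole_low_pass(noise, 0.1)
```

changing the coefficient over time changes the bandwidth, and multiplying the result by a slowly varying envelope changes the amplitude.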

frequency filtering


low-pass

  • low frequencies remain
  • windowed-sinc

    • best for precise frequency removal
    • high computational effort because it typically uses convolution
    • the convolution impulse response kernel is a sinc function that a window lets decay to zero in both directions
    • the longer the impulse response kernel, the smaller the transition bandwidth
    • addition of kernels: equals running both filters in parallel and summing their outputs. a frequency remains if either filter passes it
    • convolution of kernels: equals running both filters in series. a frequency remains only if both filters pass it
    • take values of the sinc function, window it with a blackman window which performs well in this context, and use the result as an impulse response of convolution with an input signal
    • when parameters change, the kernel has to be adjusted
  • moving average

    • best for preserving time domain properties because a centered moving average creates no shift between input and output signal
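
a minimal python sketch of the windowed-sinc steps above: sample the sinc function, multiply it with a blackman window, normalise, and convolve the kernel with the input. the cutoff is assumed to be given as a fraction of the sample rate, and the function names are illustrative.

```python
import math

def windowed_sinc_low_pass(cutoff, length):
    """cutoff as a fraction of the sample rate (0 to 0.5); length odd and > 1."""
    center = (length - 1) / 2
    kernel = []
    for i in range(length):
        x = i - center
        if x == 0:
            sinc = 2 * cutoff  # limit of sin(2 pi fc x) / (pi x) at x = 0
        else:
            sinc = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = (0.42
                  - 0.5 * math.cos(2 * math.pi * i / (length - 1))
                  + 0.08 * math.cos(4 * math.pi * i / (length - 1)))
        kernel.append(sinc * window)
    total = sum(kernel)
    return [k / total for k in kernel]  # unity gain at 0 hz

def convolve(signal, kernel):
    """Direct convolution; output length is len(signal) + len(kernel) - 1."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

kernel = windowed_sinc_low_pass(0.1, 101)
filtered = convolve([1.0, -1.0] * 150, kernel)
```

the direct convolution costs one multiplication per input sample and kernel element, which is the computational effort noted above; fft-based convolution is a common way to reduce it.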


high-pass

  • high frequencies remain
  • can be created from a windowed-sinc low-pass with a spectrally inverted or spectrally reversed impulse response
  • other options

    • subtract from current values the result of applying a moving average on the values

    • subtract from each value the preceding value. two values that follow each other with similar intensity will tend to cancel each other out. variant: set the current output value to the previous output value plus the current input value minus the previous input value, with the sum scaled by a factor slightly below 1 so that constant offsets decay
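
a minimal python sketch of the two simpler high-pass options. the scaling factor of 0.95 in the difference variant and the handling of the signal edges in the moving average are assumptions of this sketch.

```python
def moving_average(samples, width):
    """Centered moving average; width should be odd. Edges use fewer samples."""
    half = width // 2
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out

def high_pass_by_subtraction(samples, width):
    """Subtract the smoothed signal so that only fast changes remain."""
    smoothed = moving_average(samples, width)
    return [x - s for x, s in zip(samples, smoothed)]

def high_pass_by_difference(samples, factor=0.95):
    """y[n] = factor * (y[n-1] + x[n] - x[n-1]); factor is an assumed value."""
    out = []
    previous_in = 0.0
    previous_out = 0.0
    for x in samples:
        previous_out = factor * (previous_out + x - previous_in)
        previous_in = x
        out.append(previous_out)
    return out

constant = [1.0] * 200
alternating = [1.0, -1.0] * 100
```

a constant input ends up near zero with both functions, while a signal that alternates every sample passes mostly unchanged.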

band-pass and band-reject

  • frequencies inside or outside a range remain
  • can be created from a combination of windowed-sinc low-pass and high-pass impulse response kernels. band-pass: convolution of the kernels. band-reject: addition of the kernels
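
a minimal python sketch of combining kernels, assuming blackman-windowed sinc kernels of odd length and cutoffs given as fractions of the sample rate. the high-pass is made by spectral inversion. all function names are illustrative, and gain_at is only a helper to inspect the result.

```python
import math

def low_pass_kernel(cutoff, length):
    """Blackman-windowed sinc with unity gain at 0 hz; length odd and > 1."""
    center = (length - 1) / 2
    kernel = []
    for i in range(length):
        x = i - center
        sinc = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = (0.42 - 0.5 * math.cos(2 * math.pi * i / (length - 1))
                  + 0.08 * math.cos(4 * math.pi * i / (length - 1)))
        kernel.append(sinc * window)
    total = sum(kernel)
    return [k / total for k in kernel]

def spectral_invert(kernel):
    """Low-pass to high-pass: negate all values, add 1 at the center."""
    inverted = [-k for k in kernel]
    inverted[len(kernel) // 2] += 1.0
    return inverted

def convolve(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def band_pass_kernel(low, high, length):
    """Series combination: passes only what both filters pass."""
    high_pass = spectral_invert(low_pass_kernel(low, length))
    return convolve(low_pass_kernel(high, length), high_pass)

def band_reject_kernel(low, high, length):
    """Parallel combination: passes what either filter passes."""
    low_pass = low_pass_kernel(low, length)
    high_pass = spectral_invert(low_pass_kernel(high, length))
    return [a + b for a, b in zip(low_pass, high_pass)]

def gain_at(kernel, frequency):
    """Magnitude response at a frequency given as a fraction of the sample rate."""
    re = sum(k * math.cos(2 * math.pi * frequency * n) for n, k in enumerate(kernel))
    im = sum(k * math.sin(2 * math.pi * frequency * n) for n, k in enumerate(kernel))
    return math.hypot(re, im)

band_pass = band_pass_kernel(0.1, 0.3, 101)
band_reject = band_reject_kernel(0.1, 0.3, 101)
```

the addition only works as intended when both kernels have the same length, so that their delays line up.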

filter bank

  • multiple band-pass filters. low-pass and high-pass can be used at the edges

parametric equaliser

  • a filter bank can be used to create a parametric equaliser where the frequency bands are mixed after gain adjustment
  • band center and width as control parameters
  • if a control is unchanged, do not apply the associated filter
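
a much simplified python sketch of the gain-and-mix idea, assuming only two bands with a fixed split: a centered moving average as the low band and the remainder as the high band. it has no band center or width controls, so it is not a full parametric equaliser. the function names and the default width are illustrative.

```python
def split_two_bands(samples, width):
    """Low band: centered moving average. High band: input minus low band."""
    half = width // 2
    low = []
    for i in range(len(samples)):
        window = samples[max(0, i - half):i + half + 1]
        low.append(sum(window) / len(window))
    high = [x - l for x, l in zip(samples, low)]
    return low, high

def two_band_equaliser(samples, low_gain, high_gain, width=9):
    """Scale each band by its gain and mix the bands back together."""
    if low_gain == 1.0 and high_gain == 1.0:
        return list(samples)  # controls unchanged: skip the filtering
    low, high = split_two_bands(samples, width)
    return [low_gain * l + high_gain * h for l, h in zip(low, high)]

example = [0.0, 1.0, -1.0, 0.5, 0.25, -0.75]
bass_only = two_band_equaliser(example, 1.0, 0.0)
```

because the two bands add up to the input, equal gains scale the whole signal without changing its shape.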

other time/magnitude data post-processing to shape sounds

  • delay: input samples are put out at a later time
  • reverberation: can be created with many short delays that repeat delayed output and change amplitude and frequency with each repetition
  • basic operations

    • multiplication: shapes the amplitude of signals
    • division: shapes the amplitude of signals. dividing a signal by itself yields a constant 1 wherever the signal is not zero
    • subtraction: remove signals from each other
    • addition: adds signals to each other
    • convolution: multiplies the spectra of two signals. "under suitable conditions the fourier transform of a convolution of two signals is the pointwise product of their fourier transforms"
  • convergence: mixes two signals so that the output falls in between them. for example, with (b - a) taken as 100%, the midpoint is (a + 0.5 * (b - a))
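
a minimal python sketch of delay, a repeating delay and convergence. the repeating delay only changes amplitude per repetition, not frequency, so it is an echo and at most a building block for reverberation. the function and parameter names are illustrative.

```python
def delay(samples, delay_samples):
    """Put the input out delay_samples later; the start is padded with silence."""
    return [0.0] * delay_samples + list(samples)

def feedback_delay(samples, delay_samples, feedback, tail):
    """Repeating delay: each repetition is scaled by feedback (below 1)."""
    out = list(samples) + [0.0] * tail
    for i in range(delay_samples, len(out)):
        out[i] += feedback * out[i - delay_samples]
    return out

def converge(a, b, amount):
    """amount 0 gives a, 1 gives b, 0.5 the midpoint: a + amount * (b - a)."""
    return [x + amount * (y - x) for x, y in zip(a, b)]

echo = feedback_delay([1.0], 2, 0.5, 6)
midpoint = converge([0.0, 10.0], [100.0, 20.0], 0.5)
```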

grain processing

a signal split into small chunks that are then processed

  • repetition or removal of pieces can create a time stretching effect that does not change pitch as much as removing or duplicating individual samples would
  • the repetition of larger grains can create interesting rhythmical effects similar to a repeated delay
  • example operations: randomise, repeat, reduce, effects on selected grains, swap. example parameters: grain_size, repetition.
  • the total output of a grain processor can be shorter or longer than the input. this might lead to a growing length difference between the unprocessed and processed signal and require buffering
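
a minimal python sketch of two grain operations, using the grain_size and repetition parameters named above; keep_every is an added illustrative parameter. grains are cut without overlap or fading, which is an assumption of this sketch and can cause clicks at grain boundaries.

```python
def split_grains(samples, grain_size):
    """Cut the signal into consecutive grains; the last one may be shorter."""
    return [samples[i:i + grain_size] for i in range(0, len(samples), grain_size)]

def repeat_grains(samples, grain_size, repetition):
    """Stretch in time by playing each grain repetition times in a row."""
    out = []
    for grain in split_grains(samples, grain_size):
        for _ in range(repetition):
            out.extend(grain)
    return out

def reduce_grains(samples, grain_size, keep_every):
    """Shorten in time by keeping only every keep_every-th grain."""
    grains = split_grains(samples, grain_size)
    return [s for grain in grains[::keep_every] for s in grain]

stretched = repeat_grains([1, 2, 3, 4], 2, 2)
shortened = reduce_grains([1, 2, 3, 4, 5, 6], 2, 2)
```

the output lengths differ from the input length, which is the buffering concern described in the last point above.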


synthesis techniques

  • additive vs subtractive processes: add vs subtract parts to create a desired result. for example summing sinusoids vs filtering
  • amplitude modulation: changing amplitude based on the values of another signal. slow changes are heard as tremolo, fast changes create additional frequencies (sidebands)
  • frequency modulation: changing frequency based on the values of another signal. slow changes are heard as vibrato, fast changes create additional frequencies (sidebands). "in the context of audio coding, fm synthesis can be considered a "lossy compression method" for additive synthesis"
  • granular synthesis: grain processing. additive synthesis: summing sines to create a desired signal
  • transformation of recorded sound vs synthesis from formulas alone

    • getting sound parameters from analysing recorded sounds, while necessarily lossy, gives access to a wide variety of sounds with complex content
    • purely synthesised sounds, while being an exact representation of their original intent and possibly creating what cannot be recorded or analysed precisely, require much work to be as complex as real world recorded sounds
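
a minimal python sketch of frequency and amplitude modulation with sine signals. as is common for fm synthesis, the modulator is added to the phase of the carrier (phase modulation), which is an assumption of this sketch. the function names and the index and depth parameters are illustrative.

```python
import math

def fm_samples(carrier_hz, modulator_hz, index, sample_rate, count):
    """index controls how strongly the modulator bends the carrier phase."""
    out = []
    for i in range(count):
        t = i / sample_rate
        modulator = math.sin(2 * math.pi * modulator_hz * t)
        out.append(math.sin(2 * math.pi * carrier_hz * t + index * modulator))
    return out

def am_samples(carrier_hz, modulator_hz, depth, sample_rate, count):
    """depth 0 leaves the carrier unchanged, depth 1 lets the gain reach zero."""
    out = []
    for i in range(count):
        t = i / sample_rate
        modulator = math.sin(2 * math.pi * modulator_hz * t)
        gain = 1.0 - depth * 0.5 * (1.0 + modulator)
        out.append(gain * math.sin(2 * math.pi * carrier_hz * t))
    return out

bell_like = fm_samples(440.0, 616.0, 3.0, 44100, 1000)
tremolo = am_samples(440.0, 5.0, 0.5, 44100, 1000)
```

with a modulator of a few hz the results correspond to vibrato and tremolo; with a modulator in the audible range the added frequencies change the timbre instead.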