2025-12-28

array representations

address continuation

  • contiguous

    • elements occupy a single continuous address interval
    • address computation is purely arithmetic

      • address(i) = base + i * stride
    • no metadata required beyond base and stride
    • relocation requires copying the entire region
  • discontinuous

    • elements are stored in multiple, non-adjacent regions

    • address computation requires at least one indirection

    • typical realizations

      • pointer arrays
      • block directories
      • stable slot tables
    • enables growth and removal without relocating existing elements

    • trades locality and simplicity for stability and flexibility

stability contracts

  • pointer stable

    • once an element address is observed, it remains valid until explicit destruction
    • prohibits moving live elements in memory
    • required when external code caches element pointers
    • severely restricts resizing strategies in contiguous layouts
  • index stable

    • once an index is assigned, it permanently identifies the same logical element

    • index becomes an identity rather than a position

    • requires indirection, tombstones, or fixed slots to prevent shifts

    • enables removal without invalidating references by index

size model

  • static

    • capacity fixed at initialization
    • no reallocation or structural mutation of storage
    • size changes only by modifying a usage counter
    • simplest model with strongest predictability
  • dynamic

    • capacity may change during lifetime

    • growth strategy defines amortized cost and fragmentation

    • shrink strategy determines memory return behavior

    • resizing may or may not preserve pointer stability depending on contract

element type and layout

  • single type

    • one interpretation of memory, one stride
    • simplest aliasing and correctness model
  • nested or multidimensional

    • arrays represent higher dimensional index spaces
    • flat representation

      • one contiguous allocation
      • index mapping function computes linear offset
    • hierarchical representation

      • arrays of pointers to subarrays
      • enables ragged shapes at cost of indirection
  • mixed type

    • memory region accessed through multiple type interpretations

    • requires explicit control over layout and aliasing

    • common mechanisms

      • structs to define composite layout
      • byte-wise access for reinterpretation
    • incorrect aliasing leads to undefined behavior

operations and stability impact

  • resize

    • changes capacity and possibly physical location
    • contiguous dynamic arrays may relocate storage
    • relocation breaks pointer stability but preserves index ordering
  • append

    • adds element at logical end
    • contiguous layouts preserve relative ordering
    • amortized cost depends on growth policy
  • insert

    • adds element at arbitrary logical position
    • contiguous layouts require shifting or gaps
    • shifting breaks pointer and index stability beyond insertion point
  • removal

    • deletes an element and frees its slot

    • strategy defines semantic cost

      • shift removal preserves order, breaks stability

      • swap removal preserves density, breaks identity

      • tombstones preserve identity, reduce density

format choices

  • array of structs

    • each element stores all fields of a logical record
    • one base pointer, one stride
    • optimal for per-record access
    • less efficient for field-wise bulk processing
  • struct of arrays

    • each field stored in a separate array

    • multiple base pointers

    • optimal for bulk and vectorized operations

    • per-record access requires multiple loads

minimal representation patterns

  • static contiguous

    • fixed buffer plus usage count
    • no allocation after initialization
  • dynamic contiguous

    • buffer with explicit capacity and used counters
    • growth policy defines relocation behavior
  • stable slots contiguous

    • contiguous slot array

    • slot identity is stable

    • removal creates empty slots

    • reuse via free list

    • iteration must test slot occupancy

discontinuous designs

  • fixed blocks

    • storage divided into equal-sized contiguous blocks
    • blocks never moved once allocated
    • index mapping computes block index and offset
    • preserves pointer stability within blocks
  • unequal block sizes

    • blocks grow geometrically
    • reduces allocation frequency
    • requires cumulative size metadata for indexing
  • per-element allocation

    • each element allocated independently
    • maximal pointer stability
    • minimal spatial locality
    • high allocator metadata overhead
  • stable slots

    • logical index refers to a slot, not a position

    • slot identity is preserved across removals

    • empty slots accumulate unless reused

    • iteration must skip empty slots

    • reuse is mediated by a free list

free list designs

  • stack-based free list

    • free slots are recorded externally in a separate stack or array
    • removal pushes slot index onto the stack
    • insertion pops a slot index from the stack
    • properties

      • slot metadata does not contaminate element storage
      • free list state is centralized
      • requires separate storage proportional to number of free slots
      • removal and insertion are constant time
    • consequences

      • element storage can remain type-pure
      • free list capacity must be managed independently
  • embedded free list

    • free slots store linkage metadata in place of element data
    • each free slot contains a reference to the next free slot
    • a single head index tracks the free list
    • properties

      • no external free list storage required
      • slot storage is type-polluted while free
      • free list state is distributed across slots
    • consequences

      • requires a representation for empty versus occupied slots
      • element type must tolerate overlay or union-based encoding
      • iteration must distinguish live elements from linkage metadata
  • comparison

    • stack-based

      • clearer separation of data and bookkeeping
      • higher auxiliary memory cost
    • embedded

      • minimal auxiliary memory

      • tighter coupling between storage and allocator logic

example: dynamic growth-only with removals and index stability

  • contract

    • indices represent permanent element identities
    • pointer stability not guaranteed across growth
  • design

    • contiguous slot array with embedded free list
    • growth only increases capacity
  • insertion

    • reuse freed slots before extending used range
  • removal

    • mark slot empty
    • link slot into free list
  • iteration

    • scan full used range
    • skip empty slots
  • compaction

    • optional operation

    • requires emitting index remapping

example: 3d point lists

  • array of structs

    • struct { float x; float y; float z; }
    • spatially local per-point access
    • inefficient for operations on a single coordinate
  • struct of arrays

    • separate x, y, z arrays
    • efficient for coordinate-wise transforms
    • higher cost for per-point access
    • dimensions can be optional by using a null pointers for arrays