2021-01-02

c programming

subtopics

single compilation unit

for relatively small projects, a single main file that includes all necessary code might be preferable to makefiles, headers and separate objects.

the more common way is to compile parts of the application separately into machine code objects, maintain header files with declarations for each object and then use a linker to connect the code in object files. this may save time in the development of big projects when most objects have previously been compiled and only a few have changed and need to be recompiled.

further information: wikipedia article. sqlite is distributed in this style, as a single amalgamated source file. the style is similar to javascript in html without module systems, where all source files and dependencies are included at the beginning of an html file before use.

benefits

  • only one object has to be compiled and only a single call to the compiler is needed to compile the main file
  • no complicated makefiles have to be written and maintained
  • application parts are not split into many header files
  • potential for more automatic code optimisation, such as inlining across file boundaries, because the compiler sees all code at once

downsides

  • all included bindings, including macros and static functions, are defined for all the following code. name conflicts become more likely because c has no namespacing feature
  • all code has to be recompiled if one source file changes, which can take a long time

tips and tricks, hints

  • using offsetof to get a pointer to the structure from only a pointer to the struct field
  • function calls are expressions so they can be used to wrap more complicated code and statements into an expression
  • set output values only on success
  • for functions to accept a variable number of arguments, often the best solution in c is to define _n suffix variants that take a specific number of arguments. for example, list_3(2, 3, 4)
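  • whether plain char, and with it the elements of string literals, is signed is implementation-defined. it is signed on many common platforms. gcc and clang accept -funsigned-char to make it unsigned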
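  • floor from math.h can be slow, particularly where it is not compiled to a single instruction. for values known to be non-negative and within the range of the target type, a cast to an integer type truncates to the same result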
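  • assigning one struct to another of the same type copies all members automatically, including nested structs and arrays whose memory is declared as part of the structure. pointer members are copied as pointers, the memory they point to is not copied
  • a struct can not be set to zero by assigning 0, because a struct does not have a single value. a compound literal like (struct_type){0} or memset can be used instead, for example to reset a struct that is an element of an array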
  • with #ifndef, files can be created that support macro variables set before they are included, where the macro variables only get defined inside if undefined. this can be used for configuring inside code. the files can be included multiple times with different macro variable values set before
  • pattern: init variable to zero, heap allocation, final function cleanup checks if it has been allocated and frees if necessary
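  • a macro can not define other macros, because a macro expansion can not contain preprocessor directives like #define. names of existing macros can be constructed from macro arguments with ## token pasting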
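
a few of the tips above in one sketch. the types struct point and struct shape and the function names are made up for illustration. container_of is a commonly used name for the offsetof trick, it is not part of standard c.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical example types */
struct point { int x; int y; };
struct shape { int id; struct point center; int samples[4]; };

/* offsetof: get the enclosing struct from a pointer to one of its fields */
#define container_of(ptr, type, member) \
  ((type*)((char*)(ptr) - offsetof(type, member)))

/* returns 0 on success. *out is only written on success.
   pattern: init to zero, allocate, one cleanup section at the end.
   free accepts null pointers, so no explicit check is needed */
int make_copy(const char* text, char** out) {
  int status = -1;
  char* a = 0;
  char* b = 0;
  size_t size = strlen(text) + 1;
  a = malloc(size);
  if (!a) goto cleanup;
  b = malloc(size);
  if (!b) goto cleanup;
  memcpy(a, text, size);
  memcpy(b, a, size);
  *out = b;
  b = 0; /* ownership moved to the caller */
  status = 0;
cleanup:
  free(a);
  free(b);
  return status;
}

/* returns 1 if struct assignment, container_of and zeroing behave as described */
int struct_demo(void) {
  struct shape a = {1, {2, 3}, {4, 5, 6, 7}};
  struct shape b;
  b = a;            /* copies the embedded struct and array too */
  a.samples[0] = 0; /* does not affect b */
  struct point* p = &b.center;
  struct shape* owner = container_of(p, struct shape, center);
  int ok = (owner == &b) && (b.samples[0] == 4);
  b = (struct shape){0}; /* compound literal, all members become zero */
  return ok && b.id == 0 && b.samples[3] == 0;
}
```

the caller of make_copy owns the result and frees it. on failure the output variable keeps whatever value it had before the call.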

functions that work on memory buffers can work without change on file content with memory maps

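/* file_descript is an open file descriptor and file_size the size of the file in bytes */
file_buffer = mmap(0, file_size, PROT_READ, MAP_SHARED, file_descript, 0);
if (file_buffer == MAP_FAILED) { /* handle the error, errno is set */ }
MD5((unsigned char*) file_buffer, file_size, result);
munmap(file_buffer, file_size);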
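
a more complete sketch of the same idea for posix systems. byte_sum is a made-up stand-in for any function that takes a buffer and a size, and file_byte_sum is a hypothetical wrapper name. error handling is kept minimal.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* works on any memory buffer and knows nothing about files */
unsigned long byte_sum(const unsigned char* data, size_t size) {
  unsigned long sum = 0;
  for (size_t i = 0; i < size; i += 1) sum += data[i];
  return sum;
}

/* maps the file at path read-only and applies byte_sum to its content.
   returns 0 on success and sets *result only then */
int file_byte_sum(const char* path, unsigned long* result) {
  struct stat info;
  void* buffer;
  int fd = open(path, O_RDONLY);
  if (fd < 0) return -1;
  if (fstat(fd, &info) < 0) { close(fd); return -1; }
  if (info.st_size == 0) { /* mmap rejects a length of zero */
    close(fd);
    *result = 0;
    return 0;
  }
  buffer = mmap(0, (size_t)info.st_size, PROT_READ, MAP_SHARED, fd, 0);
  close(fd); /* the mapping stays valid after the descriptor is closed */
  if (buffer == MAP_FAILED) return -1;
  *result = byte_sum(buffer, (size_t)info.st_size);
  munmap(buffer, (size_t)info.st_size);
  return 0;
}
```

byte_sum itself needs no change to work on file content, only the wrapper deals with the file.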

flat multidimensional array vs array of arrays

[1 2 3 1 2 3] vs [[1 2 3] [1 2 3]]
  • it is possible to allocate one long array in which each run of n consecutive elements represents one sub-array, or alternatively to use an array of pointers to separate arrays of length n
  • flat

    • easier to allocate as only a single memory region is required
    • sub arrays are fixed size
    • c multidimensional arrays declared like type a[count][size] use this format
    • if not declared as such a multidimensional array

      • access with [sub_i * sub_size + i]
      • deeper nesting makes the indexing more complicated
    • a variant is to store sub-arrays interleaved like [1 1 2 2 3 3], basically this changes only the indexing
  • nested

    • easier to access because sub arrays can be iterated with a simple incremented index variable, without having to use the sub array size

    • can be used immediately with array operations like sorting, if these are built to work on consecutive array elements

    • the outer array and each sub-array have to be allocated and later freed separately
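
both layouts as a minimal sketch for int elements. all function names here are made up for illustration.

```c
#include <stdlib.h>

/* flat: a single allocation for count sub-arrays of sub_size elements */
int* flat_new(size_t count, size_t sub_size) {
  return calloc(count * sub_size, sizeof(int));
}

/* flat: element i of sub-array sub_i is at index sub_i * sub_size + i */
int* flat_at(int* a, size_t sub_size, size_t sub_i, size_t i) {
  return &a[sub_i * sub_size + i];
}

/* nested: frees the first count sub-arrays and then the outer array */
void nested_free(int** a, size_t count) {
  for (size_t i = 0; i < count; i += 1) free(a[i]);
  free(a);
}

/* nested: one allocation for the pointers plus one per sub-array */
int** nested_new(size_t count, size_t sub_size) {
  int** a = calloc(count, sizeof(int*));
  if (!a) return 0;
  for (size_t i = 0; i < count; i += 1) {
    a[i] = calloc(sub_size, sizeof(int));
    if (!a[i]) {
      nested_free(a, i); /* free what has been allocated so far */
      return 0;
    }
  }
  return a;
}
```

the flat array is released with a single free, while the nested one is accessed as a[sub_i][i] without needing the sub-array size.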

structs that store arrays with their size

pro:

  • don't have to declare an additional size variable, as the size is available with the struct variable
  • don't have to pass the array size as an extra argument to functions, especially when the function takes multiple arrays

con:

  • separate size and data variables are easier to adjust for selecting ranges, for example by passing data + n and size - n to a function
  • separate size variable can be shared for same size arrays
  • sometimes a count of items to be processed that is equal to the array size is passed to functions. the size in the struct then duplicates this value. for example make_array(count, struct-size-and-data)
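
a minimal sketch of both calling styles. int_array, sum_struct and sum_separate are made-up names for illustration.

```c
#include <stddef.h>

/* hypothetical array type that carries its size */
typedef struct {
  size_t size;
  int* data;
} int_array;

/* struct style: one argument per array */
long sum_struct(int_array a) {
  long result = 0;
  for (size_t i = 0; i < a.size; i += 1) result += a.data[i];
  return result;
}

/* separate style: ranges can be selected directly at the call site */
long sum_separate(const int* data, size_t size) {
  long result = 0;
  for (size_t i = 0; i < size; i += 1) result += data[i];
  return result;
}
```

with separate arguments, skipping the first two elements is sum_separate(values + 2, size - 2). with the struct, a second struct value has to be built for the range first.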

c vs higher-level languages

the biggest slowdown in development i have experienced with c comes from having to be more specific:

  • specific with types - each function implementation works only for the exact type combinations it was defined for. uint32 and uint64 need two separate implementations, and float and uint functions may not be easily templated because of low-level details. templating with preprocessor macros needs distracting line escaping
  • specific in what is done - more low-level options, more possible variation, more performance implications
  • interleaving memory allocation code and complicated ownership and cleanup tracking
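
a minimal sketch of templating one function for several types with the preprocessor. define_sum and the generated function names are made up for illustration. every line of the macro body except the last needs a trailing backslash.

```c
#include <stddef.h>
#include <stdint.h>

/* defines a function that sums an array of the given element type */
#define define_sum(name, type) \
  type name(const type* a, size_t size) { \
    type result = 0; \
    for (size_t i = 0; i < size; i += 1) result += a[i]; \
    return result; \
  }

/* one separate implementation per type is generated */
define_sum(sum_u32, uint32_t)
define_sum(sum_u64, uint64_t)
define_sum(sum_f64, double)
```

this works here because the body is identical for all three types. as soon as overflow handling or rounding differs per type, the template has to be split again.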