2022-12-02

c programming

subtopics

single compilation unit

for relatively small projects, a single main file that includes all necessary code might be preferable to makefiles, headers and separate objects.

the more common way is to compile parts of the application separately into machine code objects, maintain header files with declarations for each object and then use a linker to connect the code in object files. this may save time in the development of big projects when most objects have previously been compiled and only a few have changed and need to be recompiled.

further information: wikipedia article. sqlite uses a single compilation unit. the style is similar to javascript in html without module systems where all source files and dependencies are included at the beginning of an html file before use.

benefits

  • only one header file is needed by code that links against the result
  • no complicated makefiles have to be written and maintained
  • only one object has to be compiled
  • only a single call to the compiler is needed
  • increased potential for automatic code optimizations

downsides

  • all included bindings are defined for all the following code and conflicts become likely, because c has no namespacing feature
  • all code has to be recompiled if one source file changes, which can take a long time

tips and tricks, hints

  • using offsetof to get a pointer to a structure given only a pointer to a struct field
  • function calls are expressions, so they can be used to wrap more complicated syntax into an expression
  • consider setting output variables only on success
  • for functions to accept a variable number of arguments, the preferred solution in c is often to define _n suffix variants that take a specific number of arguments. for example, list_3(2, 3, 4)
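  • char, and with it the elements of string literals, is signed by default on many platforms. this is implementation-defined. gcc and clang accept -funsigned-char to make char unsigned
  • floor from math.h can be slow, for example on targets where it becomes a library call rather than a single instruction. measure before using it in hot loops
  • assigning a large struct copies all of its content automatically, even nested array types, as long as the memory is declared as stored in the structure itself and not behind a pointer
  • setting a struct to zero may be more complicated because a struct does not have a single value. an initializer like {0} or memset over sizeof the struct can be used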
  • with #ifndef, files can be created that support macro variables set before they are included, where the macro variables only get defined inside if undefined. this can be used for configuration. the files can be included multiple times with different macro variable values set before
  • pattern: initialize variable to zero, allocate heap memory, final function cleanup checks if allocated and frees if necessary
  • it is not usually possible to pass macro names to macros and construct other macro names from them

functions that work on memory buffers can work without change on file content using memory maps

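/* file_descript is an open file descriptor, file_size the file size in bytes */
file_buffer = mmap(0, file_size, PROT_READ, MAP_SHARED, file_descript, 0);
if (file_buffer == MAP_FAILED) { /* handle the error */ }
MD5((unsigned char*) file_buffer, file_size, result);
munmap(file_buffer, file_size);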

flat multidimensional array vs array of arrays

[1 2 3 1 2 3] vs [[1 2 3] [1 2 3]]
  • it is possible to allocate one long array in which each run of n consecutive elements represents a sub array, or alternatively to use an array of pointers to actual arrays of length n
  • flat

    • easier to allocate because only a single memory region is required
    • sub arrays are fixed size
    • the c type[][] arrays are of this format
    • if not using standard type[][] arrays

      • access with [sub_i * sub_size + i]
      • deeper nesting makes the indexing more complicated
    • a variant is to store sub arrays interleaved, for example [1 1 2 2 3 3]. this changes the indexing
  • nested

    • easier to access because sub arrays can be iterated with a simple incremented index, without having to incorporate the sub array size in the indexing calculation
    • easier to use with generic array operations, for example sorting
    • one array and every sub array has to be allocated separately and later freed

structs that store arrays and size

pro:

  • no need to declare an additional size variable, as it is available with the struct variable
  • no need to pass array size as an extra argument to functions, especially when the function takes multiple arrays

con:

  • separate size and data variables can be used more easily in calculations for selecting ranges (for example, passing data + n to a function)
  • a separate size variable can be shared with other arrays of the same size
  • sometimes only a count of items to process has to be passed to a function, and the size stored in the struct may be unnecessary in this case. for example make_array(count, struct_size_and_data)

c vs higher-level languages

the biggest slowdown i have experienced when programming in c versus other languages comes from having to be more specific:

  • specific with types - each function implementation works only for the exact type combinations it was defined for. functions that process uint32 and uint64 need two separate definitions. float and uint functions may not be easily macro templated because of low-level details, and macro templating with preprocessor syntax needs distracting line escaping
  • specific in what is done - more low-level options, more possible variation, more performance implications
  • distraction by interleaved memory allocation code, as well as complicated ownership and cleanup semantics
  • more has to be done. for example, declaration, allocation, deallocation, initialization, etc, but this fine grained control may be worth the cost in principle