2020-08-20

possibly useful features missing in c

namespaces: controlling scope of bindings

  • all bindings declared at the top level of a file exist for all following code, even when the declaring file is included
  • included library code cannot hide helper code and internal code like type definitions and macros. it is difficult to build shareable, independent utility libraries as c code because among type declarations, macros, variable names and function names, something is going to conflict or shouldn't be exported
  • bindings can't be renamed on import, for example when they are named too generically
  • only alternative is prefixing, which leads to long identifiers
  • any code before inclusion may affect any following included library
  • tiny libraries are less practical because of the overhead of building shared libraries

renaming bindings is not possible because any alias requires the aliased binding to be in scope. if c had namespaces, there might be less need for binary modules, as c code could be included without conflict. this would be similar to other languages like javascript, where modules are just included code.
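a minimal sketch of the prefixing workaround and of why an alias does not help. the library name strutil and all identifiers are hypothetical. the helper is hidden from the linker with static, but code that includes the file as source still sees it, and the short alias only exists next to the long name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* hypothetical tiny library, written as if it were an included source file.
   every top-level name carries the prefix "strutil_" to avoid conflicts */

/* internal helper: static hides it from the linker, but any code that
   includes this file textually can still see and call it */
static size_t strutil_internal_min(size_t a, size_t b) {
  return a < b ? a : b;
}

/* exported function: length of the common prefix of two strings */
size_t strutil_common_prefix_length(const char* a, const char* b) {
  assert(a && b);
  size_t limit = strutil_internal_min(strlen(a), strlen(b));
  size_t i = 0;
  while (i < limit && a[i] == b[i]) i += 1;
  return i;
}

/* user code: a shorter alias can be added, but only next to the long name,
   which stays in scope and can still conflict */
#define common_prefix strutil_common_prefix_length
```

with this, common_prefix("namespace", "names") and strutil_common_prefix_length("namespace", "names") are the same call, and strutil_internal_min remains callable by the including code.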

current options

  • no option, wait till one day it is added to the c standard
  • compile as c++ and use its namespace syntax. see also dotc
  • parse c and its preprocessor and add namespace syntax by rewriting identifier names at definitions and places of use: hide unexported bindings and optionally rename exported bindings
  • compile a shared library and use a header file and the linker to use it in other code. this is the currently common practice for modularising c code. clang modules also rely on header files. limiting the exports of a shared library needs an extra exports file or code annotations. all exports, including exposed types, have to be declared in the header file, and users can't rename them without changing the source. this does not fully solve the problem because everything in the header file can still conflict
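a sketch of the code annotation variant for limiting shared library exports, assuming gcc or clang. the names mylib_sum_of_squares, mylib_internal_square and MYLIB_EXPORT are hypothetical.

```c
/* assumption: gcc or clang. other compilers use different annotations */
#if defined(__GNUC__)
#define MYLIB_EXPORT __attribute__((visibility("default")))
#else
#define MYLIB_EXPORT
#endif

/* internal helper: static limits it to this translation unit,
   so it is never part of the exported symbols */
static int mylib_internal_square(int a) { return a * a; }

/* exported: stays visible even when the library is built with
   -fvisibility=hidden. the declaration is repeated in the header file */
MYLIB_EXPORT int mylib_sum_of_squares(int a, int b) {
  return mylib_internal_square(a) + mylib_internal_square(b);
}
```

with gcc or clang, a build along the lines of "cc -shared -fPIC -fvisibility=hidden mylib.c -o libmylib.so" would then export only the annotated function. the exported name still has to be unique among everything the user links and includes.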

memory ownership semantics

declaring and tracking which part of the program is supposed to allocate needed memory (caller or callee) and which part is supposed to free it, especially when pointers are passed between functions. ownership can be passed on and received, and memory can be lent.
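without language support, ownership can only be documented. a minimal sketch of one possible convention, where the function names and comments are hypothetical and nothing is checked by the compiler:

```c
#include <stdlib.h>
#include <string.h>

/* hypothetical convention, expressed only through comments and naming */

/* callee allocates, ownership passes to the caller.
   returns NULL on allocation failure */
char* text_copy_new(const char* source) {
  size_t size = strlen(source) + 1;
  char* result = malloc(size);
  if (result) memcpy(result, source, size);
  return result;
}

/* memory is lent: the callee may read it but must not free or keep it */
size_t text_count(const char* borrowed, char c) {
  size_t count = 0;
  for (; *borrowed; borrowed += 1) {
    if (*borrowed == c) count += 1;
  }
  return count;
}

/* ownership passes to the callee.
   the caller must not use the pointer afterwards */
void text_free(char* owned) { free(owned); }
```

a caller would take ownership from text_copy_new, lend the memory to text_count and give it up with text_free. using the pointer after text_free compiles without complaint, which is the gap that ownership semantics in the language would close.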

more

  • the preprocessor cannot generate multiple expressions from variable-length arguments
  • macros that don't need line escaping
  • assignment of multiple values to heap arrays from literals. int* a = malloc(3 * sizeof(int)); a = {1, 2, 3};
  • keyword arguments: particularly useful for optional arguments. some scheme implementations have lambda* and javascript uses its object notation to the same effect
  • defining arrays as types that can be assigned, passed and returned by value. the current option is to wrap the array in a struct, which can add padding and is then not as space efficient when stored in arrays
  • to prevent a file from being included more than once, the whole file content has to be enclosed in a preprocessor conditional (an include guard) or alternatively preceded by #pragma once. the latter might be the easiest solution but is perhaps less portable
  • anonymous functions: to pass procedural information as an argument, like abstracting the inside of a for-loop, or for functions used via function pointers generally. currently one has to define a separate global function and use a function pointer, or use less portable compiler extensions with limited features, such as gcc nested functions or clang blocks
  • macros could be more useful if they could use variable names that cannot conflict with the surrounding code they are used in. one current alternative is to use unusual variable name prefixes
  • a fractional type for exact number representations like 14/3
  • #if is not possible inside #define
  • symbols: literal character based identifiers. string literals need string comparison and number variables need extra declaration. enums are probably the next best thing but still need declaration
  • names for values and shadowing: variables are associated with memory space, but there is no really simple way to associate a name with an expression just to avoid repeating it. there is the preprocessor, but it has a different syntax, each definition needs its own line, it doesn't shadow variables, and definitions have to be undefined later to prevent conflicts
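for the heap array item above, the closest current equivalent seems to be copying from a c99 compound literal. a minimal sketch, where the function name make_array is hypothetical:

```c
#include <stdlib.h>
#include <string.h>

/* returns a heap array with the elements 1, 2, 3 or NULL on failure.
   the caller is responsible for freeing it */
int* make_array(void) {
  int* a = malloc(3 * sizeof(int));
  if (!a) return NULL;
  /* (int[]){1, 2, 3} is a compound literal, an unnamed array
     whose content is copied into the allocated memory */
  memcpy(a, (int[]){1, 2, 3}, 3 * sizeof(int));
  return a;
}
```

this works, but the element count has to be repeated by hand and the values are copied instead of written in place.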
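keyword arguments can be approximated with designated initializers and a variadic macro. this is a sketch under assumptions: the names box_options_t, box_area_with_options and box_area are hypothetical, and the value zero is treated as "not given".

```c
/* hypothetical options type. members that are not named at the
   call site are initialized to zero */
typedef struct {
  int width;
  int height;
  int border;
} box_options_t;

int box_area_with_options(box_options_t options) {
  /* zero means "not given", so fall back to a default of 10 */
  int width = options.width ? options.width : 10;
  int height = options.height ? options.height : 10;
  return (width + 2 * options.border) * (height + 2 * options.border);
}

/* wraps the arguments in a compound literal with designated initializers.
   at least one argument has to be given */
#define box_area(...) box_area_with_options((box_options_t){__VA_ARGS__})
```

a call then looks like box_area(.width = 2, .border = 1). the limits are visible: an explicit zero cannot be told apart from an omitted argument, and every function needs its own struct and macro.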
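the portable substitute for anonymous functions mentioned above, a separate named function plus a function pointer, could look like the following sketch. the names each, add_to_sum and sum are hypothetical.

```c
#include <stddef.h>

/* calls f for each element of a. state is passed through to f unchanged */
void each(const int* a, size_t size, void (*f)(int, void*), void* state) {
  for (size_t i = 0; i < size; i += 1) f(a[i], state);
}

/* the loop body has to be a separate named top-level function.
   local variables of the caller are only reachable through state */
static void add_to_sum(int element, void* state) {
  *(int*)state += element;
}

int sum(const int* a, size_t size) {
  int result = 0;
  each(a, size, add_to_sum, &result);
  return result;
}
```

the loop body is defined away from where it is used, and the void pointer stands in for the variable capture that an anonymous function could provide.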