2022-12-02

possibly useful features missing in c

namespaces: controlling scope of bindings

  • currently, all bindings declared at the top level of a file are visible to all following code, even when the declaring file was only included
  • included library code cannot hide helper and internal code such as declarations, type definitions and macros. it is difficult to build independent utility libraries that can be shared as c source code, because some name is going to conflict or some binding should not have been exported
  • any code before inclusion may affect, or conflict with, any following included library
  • small libraries become less practical because of the overhead of building shared library objects and managing header files
  • bindings cannot be renamed on import. for example, if they are named too generically, it is not possible to add a prefix
  • prefixing names on definition leads to long identifiers and does not scale, because names have to become longer with every level of nesting
  • it is not possible to rename bindings after inclusion either, because any alias requires the aliased binding to remain in scope
  • if c had namespaces, there might be less need for binary modules, as c code could be included without conflict

current options

  • wait until it is added to the c standard
  • compile as c++ and use its namespace syntax. see also dotc
  • parse c and its preprocessor, and add namespace syntax by rewriting identifier names at definition and at places of use: hide unexported bindings and optionally rename exported bindings
  • compile a shared library object for each separate namespace, and use header files and the linker to use it from other code. this hides only what is not part of the header. it is the current common practice for modularizing c code, and clang modules build on the same header and library model. limiting what a shared library actually exports needs an extra exports file or code annotations

memory ownership semantics

specifying which part of the program is supposed to allocate needed memory (caller or callee) and which part is supposed to deallocate it, especially when pointers are passed between functions. ownership can be passed on or received, and memory can be lent

more

  • preprocessor:

    • using macros to generate multiple expressions from variable length arguments
    • macros that do not need line escaping
    • macros might be more useful if they can internally declare variable names that are guaranteed to not conflict with the surrounding code they are used in. one current alternative is to use unusual variable name prefixes inside the macro
    • #if cannot be used inside #define
  • literal assignment of multiple values to heap-allocated arrays, for example: int* a = malloc(3 * sizeof(int)); a = {1, 2, 3};
  • keyword arguments. particularly useful for optional arguments. scheme has lambda*, and javascript uses its object notation to the same effect
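  • arrays as first-class types. typedef can name an array type, but arrays still decay to pointers and cannot be assigned, passed or returned by value. currently, the only option is to wrap the array in a struct, which is more verbose and may contain padding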
  • to prevent a file from being included more than once, the whole file content has to be enclosed in a preprocessor conditional (an #ifndef include guard) or alternatively preceded by #pragma once. the latter might be the easiest solution but is not part of the c standard and therefore perhaps less portable
  • a fractional type for exact number representations like 14/3
  • symbols: literal character-based identifiers. string literals need string comparison, and number variables need extra declaration. enums are probably the next best thing but still need declaration
  • names for values and shadowing: variables are associated with memory space, but there is no really simple way to associate just a name with an expression to avoid repeating it. there is the preprocessor, but it has a different syntax with directives that must each start on their own line, it does not shadow variables, and a definition is valid for the rest of the code and has to be removed later with #undef to prevent conflicts
  • reflection features to infer the pointer target type, to allow macros like this: "allocate_memory(my_t_pointer)" -> "my_t_pointer = malloc(sizeof(my_t))"
  • anonymous functions: to pass procedural information as an argument, like abstracting the inside of a for-loop, or to pass to functions that accept function pointers. currently, one has to define a separate global function and use a function pointer, or use less portable compiler extensions with limited features, such as gcc nested functions or clang blocks