2018-05-11

choosing a programming language

there are many programming languages - thousands. and the number grows with each year

informally "programming language" often describes syntax, semantics, compiler implementations, available libraries, extensions and the community all in one. but the basis of a language is its syntax and semantics, because the core utility of a language for a developer is to describe what a computer should do. the implementations and the software written in the language can be seen as just following from that. that is why i see syntax and semantics as the main criteria for choosing a programming language. learning a language and getting to know its detailed semantics and associated environment well is a big investment, because it takes many years to become really good with it

people usually learn the languages their company uses, the ones they are told to learn, the ones that seem the most popular or the ones whose marketing reached them, and then stick with them for life. how languages become popular is an interesting topic by itself. often a language becomes popular because people write programs in it that become popular, but that does not say much about the qualities of the language

the initial choice is often incidental, but the choice of tool for the next problem often depends on the tool previously used: say someone spent 2 weeks learning language x and a new program is to be written - why start with a new language and need significantly more time for the task at hand because of the unfamiliarity, instead of just re-using the previous tool and the associated knowledge and getting it done? the dependency grows quickly. why ever use a different tool, when things can be done with the known one at a known effort?

then there is the fun in figuring things out: a task might be fundamentally uncomplicated while the tool creates issues that would never need to be solved with other tools, but that is usually not obvious, and it is exciting to find solutions, so the investment continues

the time when a programming language was invented does not matter much. languages can evolve, and old designs can prevail by being superior. the state of the implementations is more important

with programming languages it seems more likely that popular bad things are improved with much effort until they work just well enough than that people switch to fundamentally better things

questionable arguments

"choose the most popular, what your friends use, or what you have been first taught" - does not actually look much at the language, avoids an informed decision, appeal to popularity

"it does not matter - choose what you feel comfortable with" - ineffectively emotional, just careless if it becomes a choice by habit

"every programming language exists as a tool for a specific purpose it excels for" - simply false. most languages are pretty similar. sometimes the available libraries make a difference for a purpose (like a domain specific language), but only in rare cases the core syntax and semantics

"choose the one that has libraries for things you want to do, or the one with the most libraries" - pragmatic, but not a fundamental choice of syntax and semantics. also, using a language creates a network effect that benefits the propagation of the language

things you can do

look at the license of the compilers or the language specification. do not use languages or interpreters/compilers that are not free software. not only might you have to pay (more) money for programming, and possibly regularly; proprietary languages are also usually controlled by central entities that would rather let the language die with the copyright than allow others to fix it, or that could suddenly enact any kind of new restriction they see fit, putting programmers at a disadvantage

look at the syntax and naming scheme. does it make sense? does it seem to follow a coherent style, orderly, where you pay as you go and different patterns have a good cost-benefit ratio, or does the design seem to have been made ad-hoc? search for syntax references and compare between languages, or compare on hyperpolyglot or similar sites

where is the specification? is there a document that formally defines the language, or is one implementation the reference?

try to find out the reasons why people choose to use or recommend a language. do they have a need that you would share with them? do not accept reasons that do not match with yours

does it require a specific code editor? i would advise against such language environments. one would always be hooked and dependent on the same editor application, its features and its development future

download the interpreter/compiler and try it out. does the effort to write in it, every single character, seem to be there for a good reason? and if not, could it perhaps be easily improved in the future? an implementation is more likely to change than a specification, and the specification is where things like the requirement for semicolons as expression terminators are fixed

some features that seem user-friendly to beginners might actually become limiting as programs grow

readability

a note about the word readability, which is sometimes used to describe programming language syntax. the word seems to be easily confused with "ease of thought". many things surely are easy once they have been trained for a few years. notice that reading is a process of extracting information, for example from a visual pattern. this process can be divided into smaller processes, little actions that are necessary: for example having to look back to the beginning of a line to read the next one, which takes time, or recognizing something surrounded by whitespace characters as a separate thing, or remembering that a mix of agglutinated characters is associated with a concept, et cetera. humans are not computers and costs may differ, but too much micro-effort over time might be stressful and limit intellectual capacity. how easy something is to read is a question of the amount of required non-transferable knowledge and of the effort that remains after the knowledge has been acquired and the action trained

notation can allow different degrees of variability, where different formattings look significantly different but have the same meaning. for example, whitespace is usually variable, and significantly different code formatting styles emerge from different authors, yet in many cases it is ignored by the compiler and does not influence the meaning of the program. but different formattings can make code harder to work with
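a minimal sketch of this variability, using python only because its standard library exposes the parser. the two snippets and the names in them (total, items, price, count) are made up for illustration. both are parsed and the resulting structures compared, with position information left out

```python
import ast

# two hypothetical snippets with the same meaning but very different formatting
compact = "total=sum([price*count for price,count in items])"
spread_out = """
total = sum(
    [
        price * count
        for price, count in items
    ]
)
"""

def structure(source):
    # ast.dump omits line and column attributes by default,
    # so only the parsed structure is compared, not the layout
    return ast.dump(ast.parse(source))

same_meaning = structure(compact) == structure(spread_out)
print(same_meaning)  # True
```

the text differs in almost every line, but the parser sees one and the same program. a reader does not get that normalisation for free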

language ranking

while the following list does not go far beyond opinion, i only rate languages in which i have written at least one non-trivial program from scratch, with at least 16 hours of use

tier 1

scheme

the simplest and most flexible language. syntax and semantics follow an ideal of unusually high pragmatic and auto-didactical efficiency. it is almost as simple as possible to learn, parse and translate
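to give an idea of how little is needed to parse the basic notation, here is a rough sketch of a reader for s-expressions, written in python. it is not how any scheme implementation does it, the function names are made up, and string literals, comments, quoting and error handling for unbalanced brackets are left out

```python
def tokenize(text):
    # round brackets are the only structural characters, so padding them
    # with spaces and splitting on whitespace is enough for this sketch
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    # consumes tokens from the front and returns one expression:
    # either an atom (a string) or a list of expressions
    token = tokens.pop(0)
    if token == "(":
        expression = []
        while tokens[0] != ")":
            expression.append(parse(tokens))
        tokens.pop(0)  # drop the closing bracket
        return expression
    return token

tree = parse(tokenize("(define (square x) (* x x))"))
print(tree)  # ['define', ['square', 'x'], ['*', 'x', 'x']]
```

the result is a nested list that mirrors the bracket structure of the source one to one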

it allows unobstructed functional abstraction that scales. any programming paradigm is possible and a matter of applying the fundamental building blocks - metaprogramming, anything goes

gives freedom that other languages take away: for example having a large range of allowed characters for custom identifiers. and it does not use much needless/noisy syntax except just round brackets - no semicolons, no commas, braces, brackets, colons

s-expression notation for scopes/lists and prefix notation is generic and consistent. many do not realise how multi-functional and straightforward this notation is, and the freedom it opens up to focus on more important things. its homoiconic syntax allows for uncomplicated structural editing and simple code generation (macros)
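a small sketch of what code generation on such a notation can look like, again in python and under assumptions: code is represented as nested lists, the rewrite handles only a simplified unless form with exactly one body expression, and safe-div is a made-up example procedure. real scheme macros work on the language's own data structures and handle hygiene, which this does not

```python
def expand_unless(expression):
    # rewrites (unless test body) into (if (not test) body) everywhere
    # in the tree; only the one-body form is handled in this sketch
    if not isinstance(expression, list):
        return expression
    expanded = [expand_unless(part) for part in expression]
    if len(expanded) == 3 and expanded[0] == "unless":
        _, test, body = expanded
        return ["if", ["not", test], body]
    return expanded

def to_text(expression):
    # writes the tree back out as s-expression text
    if isinstance(expression, list):
        return "(" + " ".join(to_text(part) for part in expression) + ")"
    return expression

# hypothetical input: code represented as plain nested lists
code = ["define", ["safe-div", "a", "b"],
        ["unless", ["zero?", "b"], ["/", "a", "b"]]]
generated = to_text(expand_unless(code))
print(generated)  # (define (safe-div a b) (if (not (zero? b)) (/ a b)))
```

because the code is just a list, the transformation is ordinary list processing and needs no separate parser or template language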

scheme uses a lot of round brackets - but notice that it is only round brackets and nothing else, for structuring. the regular self-similar style of the notation, combined with the usually relatively indicative and consistent plain english identifier naming, gives an important didactic quality, an auto-didactic one, relevant each time the code is read again. programming is often about managing complexity, and having less superfluousness in a language, syntactically and semantically, reduces the complexity that has to be managed. i would argue that the complexity created through the use of extra character based syntactic patterns grows not linearly but faster

scheme has standards documents that formally specify the language, which is an unusual thing considering that many other languages are implemented without a plan so to say, sometimes in an ad-hoc way where only the hardcoded implementation defines how the language is supposed to work

scheme allows me to create more, with far fewer errors, in the same time compared to other languages

as soon as you know for example this scheme syntax overview, you basically know all the syntax and can start building big things

side note: the lisp curse

if you want something without many brackets see wisp (srfi-119)

sph-sc

c written with s-expressions

c

relatively easy to predict expression outcomes (principle of least surprise), low-level with static typing, manual memory management and direct memory access. compilers that create highly optimised machine code are available for a lot of different platforms. lacks features like a basic module (dependency) system or hygienic macros

tier 2

sescript: javascript written with s-expressions. the compiled output is completely browser and nodejs compatible. it can use every javascript library. it is javascript, just written differently

coffeescript: javascript without the clutter. indent-based syntax, can use every javascript library, nodejs compatible. my favorite syntax after scheme's. it is smart about reducing ambiguity and making well recognizable patterns out of a small set of different ones. the syntax is also useful for configuration files, as it is simpler and more intuitive than for example yaml, ini, json, xml

tier 3

javascript: quite expressive, but a bit noisy syntactically (semicolons, which are optional, but people use them nevertheless, c-style, square/curly/round brackets mix, high formatting variability). implicit type conversions, truthiness and falsiness, automatic global scope, many syntactically different ways to create different kinds of objects. with first-class functions it bears some similarity to scheme

ruby: relatively clean design. flexible syntax. specialised for object-oriented programming (everything is an object). other strengths are one-liner scripts, string processing, metaprogramming using strings, reflection. designed as an alternative to perl and python, with influences from smalltalk. uses begin/end keywords instead of indent syntax for scopes unfortunately

python: quite similar to ruby, in some ways cleaner and in some ways less clean, older. pro: indent syntax, con: the rest of syntax and semantics have more irregularities, exceptional elements, limits and special character patterns (self, conventional underscore prefixes, everything is not an object, some cryptic forms)
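a short sketch of two of the patterns named above, the explicit self and the underscore prefixes. the class and attribute names are made up for illustration

```python
class Account:
    def __init__(self, owner):   # self has to be spelled out in every method
        self.owner = owner       # no prefix: public by convention
        self._balance = 0        # one underscore: "internal", by convention only
        self.__pin = "0000"      # two underscores: the name is rewritten (mangled)

    def deposit(self, amount):
        self._balance += amount
        return self._balance

account = Account("alice")
account.deposit(10)
print(account._balance)            # 10, still reachable from outside
print(hasattr(account, "__pin"))   # False, the attribute is stored under another name
print(account._Account__pin)       # 0000, the mangled name
```

none of this is enforced access control. the meaning is carried by character patterns in the names that have to be learned separately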

clojure: more similar to common-lisp than to scheme. gets many things mostly right, like hashtable literals or preferring immutable data-structures. runs on and compiles for the multi-platform java virtual machine. so it is basically java and can use every java library. eclipse license

discouraged

shell, bash: while shell is very useful because it is the language on many command-lines, and is fine for small scripts, everything bigger than 30 lines... the syntax is a cryptic hodgepodge. there is so much formatting variability, and the keywords and built-in binding names are too meaningless and problematic (if/fi, variable substitution and quoting, composition). posix shell should be preferred to bash/zsh and others for portability

visual basic: superseded by for example ruby. historic misfeatures and missing features (characters). ruby's syntax is in many ways similar

common-lisp: total bloat in comparison to scheme. design and naming scheme is far less consistent and particularly arbitrary (defun, defvar, progn) with few english language words

java: business bloat wreck. many people start with this language because it is often taught in schools. specialised for object-orientation (files are classes, classes required). verbosity and repetition are the tenet here. uppercase characters, camelcase. has a vm and works on many different platforms. somewhat similar to perl in what it does to the thinking process - making oneself adjust to be accepting of overcomplicated solutions for example. a language that needs more comments than code. it is self-similar in the sense that third-party libraries are like the core features: there are always many variants to choose from, but often none are good. you will probably rewrite a lot of basic stuff regularly, and have to rewrite it because there are no features for abstracting it, invent new patterns to pass variable/class/method references around, and generally just have to write a lot and create big files in which few things happen

php: relatively predictable. not very expressive. big and useful integrated library, which is one of the biggest benefits since it comes compiled with many features. requires dollar signs for each local variable, particularly irregular naming scheme and syntax and generally lacking design, noisy (semicolons, dollar signs, backslashes in namespaces), object-orientation features, more complex scoping rules, implicit type conversions, often no warnings when a variable used is not defined, no (good) first class functions. uses "function" as the keyword to define routines which are rarely functions. also a language that often needs more comments than code. there is a site that documents some parts that are considered bad: phpsadness

bad

perl: a bit crazy (context sensitive expressions, dynamic scoping, shifting arguments, variable prefixes, scalar context, blessing, "my", 1; at the end of modules, proliferation of operators, difficult to install modules with the project, ...). its syntax is relatively easy to use and the interpreter forgiving but the code is still odd. you want to learn this stuff unnecessarily? it is kind of esoteric. bash is similar. everything that perl is deemed good for ruby can do better. but the community is huge and there are lots of modules, many well maintained

c++: not the successor of c. the extensions that c++ makes to c are for the most part unnecessary, overcomplicated or seriously questionable (object-orientation, exception handling). stl, templates and a hierarchical type system (think structs that can inherit from each other) are often quoted as the biggest benefits

c#: is non-free - it belongs to and is developed by the company microsoft for its closed source proprietary windows platform. has some focus on object-orientation. much more like java than c. internet searches for c lead to c# results being included, and like c++ it falsely suggests being an official successor to the c language - deceptive marketing

objective-c: non-free - belongs to the company apple and is explicitly and only targeted to its closed source proprietary ios and osx operating systems. unusual syntax that uses keyword arguments a lot (which i think is a good thing)

swift: successor to objective-c, same issues

neutral

haskell: functional perl. resembles traditional mathematical notation (in a way). over 700M install size. more functional than other languages and lazily evaluated (which is not necessarily better). cryptic code. cannot be learned intuitively by knowing a few patterns, not homoiconic, probably a lot of work to write a parser for. its main package manager is the worst i have worked with. the language is still better than most other languages because of its academic grounding

lua: may be used in programs as an extension language, but it is like ruby with fewer of its features. supposedly small and easy to integrate

scala: functional features, c-style syntax, based on java and the java virtual machine like clojure

emacs-lisp: works only in emacs. it is not that bad, but there could be better


tags: programming start document guide computer opinion choice language