Semantic density

From APL Wiki
m (Mark as an essay, as it's mainly Stephen Taylor writing about his own term. If we want an article on the concept, it should be a new page.)
Revision as of 09:49, 19 July 2020

Semantic density is a measure of how readable a program is to a domain expert who is not a programmer.

Programs work with representations of some domain. Every program must thus be read in two ways:

  1. as describing changes in the computer
  2. with reference to the domain

The programmer must understand enough of the first reading to have the computer animate the representational scheme adequately for the needs of the domain expert. The domain expert can participate in this process most closely when able to follow the domain logic in the program.

This is possible when a sufficiently high proportion of the tokens (e.g. names of variables or functions) is drawn from the vocabulary of the reader. (Writers of natural languages, under a general injunction to write with their readers in mind, will find nothing surprising in this.)

Leaving aside any familiarity with programming, the minimum threshold appears to vary little between readers, and is in all cases high. Even a low proportion of ‘foreign’ terms degrades readability.

Exceptions to this are:

  • control structures: readers can follow them up to two levels of nesting;
  • characters other than the Roman alphabet or Arabic numerals: the reader either parses them as punctuation or mathematics (e.g. 2÷3), or ignores them.
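
The essay gives no formula, so the following is only a rough sketch of one way the proportion might be estimated. It assumes a token is any run of Roman letters and that the reader's vocabulary is supplied as a set of lowercase words. The function name `semantic_density` and the sample vocabulary are illustrative assumptions, and the control-structure exception is not modelled.

```python
import re


def semantic_density(source: str, vocabulary: set[str]) -> float:
    """Fraction of Roman-alphabet tokens in `source` found in `vocabulary`.

    Hypothetical helper: the essay does not define a formula.
    """
    # Only runs of Roman letters count as tokens. Digits and symbols such
    # as ÷ or ← are skipped, following the second exception above.
    words = re.findall(r"[A-Za-z]+", source)
    if not words:
        # Assumption: a text with no alphabetic tokens has nothing foreign.
        return 1.0
    known = sum(1 for word in words if word.lower() in vocabulary)
    return known / len(words)


# Illustrative vocabulary of a domain expert (an assumption).
vocabulary = {"interest", "principal", "rate"}

dense = semantic_density("interest ← principal × rate ÷ 100", vocabulary)
mixed = semantic_density("double interest = principal * rate / 100", vocabulary)
sparse = semantic_density("function calc(p, r) { return p * r / 100 }", vocabulary)
```

Here the first expression uses only names from the reader's vocabulary, the second adds one foreign token (`double`), and the third contains no familiar names at all.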

Two common features of programming languages obstruct this effect:

  • tokens that cannot be omitted, such as void or function;
  • Roman-alphabet names for most primitive functions.

Certain writing techniques facilitate it:

  • assigning names only once; homonyms are confusing enough in natural languages;
  • naming only objects that correspond to terms in the reader's vocabulary;
  • using (in Dyalog, NARS2000, ngn/apl, dzaima/APL, GNU APL) anonymous dfns (lambdas) to avoid assigning other names;
  • using tacit programming (in Dyalog, NARS2000, ngn/apl, dzaima/APL) to avoid using argument names in expressions.

Further reading

Kenneth Iverson used APL in the 1960s to develop readable, executable models of key processes in different scientific fields.

Writing software is more like drafting legislation or writing a screenplay than it is like engineering. Paper presented at XP2006, Oulu, June 2006.

Why write specifications when you can collaborate with the users on executable code? Introduces the concept of 'semantic density' in constructing Domain-Specific Notations.