Readability: Difference between revisions

From APL Wiki
Keeping APL readable can take special effort. It is easy to write very dense code, and the mathematical look of APL can encourage the use of single-letter names. Since [[Phil Abrams]] used the term at the [[APL '73]] conference,<ref>Abrams, Phil. ''Program Writing, Rewriting and Style''. APL Conference 73. Canadian Printco Limited. 1973.</ref> APLers have traditionally used ''pornography'' to describe code that is hard to read, or uses unusual constructs. [[Alan Perlis]] countered: ''But as we all know, being people of the world, pornography thrives!''<ref>Perlis, Alan. [https://www.jsoftware.com/papers/perlis78.htm Almost Perfect Artifacts Improve only in Small Ways: APL is more French than English]. [[APL '78]].</ref>
 
== Causes and mitigation ==
[[Code golf]] often results in pornographic code, as does the practice of cramming a whole algorithm into a single line, forming a [[one-liner]]. When computer memory was very limited, such code golf was often a necessary evil.
 
With the advent of [[dfn]]s, it became possible to define a full function or operator on a single line. Since APL [[comment]]s begin at the comment symbol (<syntaxhighlight lang=apl inline>⍝</syntaxhighlight>) and continue until the end of the line, it is impossible to comment a one-liner dfn except outside the source. This, coupled with the inability of a [[wikipedia:debugger|debugger]] to meaningfully trace through a one-liner (unless it is capable of [[primitive]]-by-primitive tracing), makes such code hard for a human reader to follow.
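As a minimal sketch of the difference, assuming Dyalog APL (the function name <syntaxhighlight lang=apl inline>Mean</syntaxhighlight> and its local names are chosen here purely for illustration), a multi-line dfn leaves room for a comment on every statement, while the equivalent one-liner can only be annotated after its closing brace:
<syntaxhighlight lang=apl>
Mean←{(+⌿⍵)÷≢⍵}   ⍝ one-liner: any comment has to sit outside the braces

Mean←{
    sum←+⌿⍵       ⍝ total along the leading axis
    count←≢⍵      ⍝ number of major cells
    sum÷count     ⍝ result: the arithmetic mean
}
</syntaxhighlight>
Both definitions compute the same result; the second merely trades density for statements that can be traced and commented one at a time.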
 
APL containing user-defined names is generally not statically parseable, as the [[name class]], and thus the syntactic role, of such names isn't known until runtime.<ref>[[John Scholes|Scholes, John]]. [https://dfns.dyalog.com/n_kk.htm Kind Koloring of d-fnop named ⍵.] Dfns workspace. [[Dyalog Ltd.]]</ref>
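The following session sketch, assuming Dyalog APL and using illustrative names only, shows two expressions of identical shape that parse differently purely because of the name class of the middle name. <syntaxhighlight lang=apl inline>⎕NC</syntaxhighlight> reports class 2 for an array and class 3 for a function:
<syntaxhighlight lang=apl>
      a←1 ⋄ c←3
      b←2              ⍝ b names an array
      g←+              ⍝ g names a function
      a b c            ⍝ three arrays: a strand
1 2 3
      a g c            ⍝ array, function, array: a dyadic application
4
      ⎕NC 2 1⍴'bg'     ⍝ name classes of b and g
2 3
</syntaxhighlight>
A reader (or a syntax colourer) who sees only <syntaxhighlight lang=apl inline>a b c</syntaxhighlight> cannot tell which of these readings applies without knowing how the names were defined.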
 
Much can be done to improve readability of code, and of APL in particular. Breaking code up into meaningful statements, avoiding one-liners, using descriptive names, and supplying plentiful comments are a good start. Adherence to a well-defined style guide can also help; [[Adám Brudzewsky#Publications|Adám Brudzewsky's style guide]] is one example.
 
== Examples ==
=== Gilman & Rose ===
In [[Books#APL_.E2.80.95_An_Interactive_Approach|APL ― An Interactive Approach]], the authors describe the following code, which computes the correlation coefficient, as “almost pornographic”:
:<syntaxhighlight lang=apl inline>r←(+/x×y)÷((+/(x←x-(+/x)÷⍴x)*2)×+/(y←y-(+/y)÷⍴y)*2)*.5</syntaxhighlight>
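By splitting the expression into even a moderate number of pieces, a symmetry is revealed. The deviations from the mean are named as well, because the one-liner relies on <syntaxhighlight lang=apl inline>x</syntaxhighlight> and <syntaxhighlight lang=apl inline>y</syntaxhighlight> having been overwritten with their deviations before <syntaxhighlight lang=apl inline>+/x×y</syntaxhighlight> is evaluated:
<syntaxhighlight lang=apl>yDev←y-(+/y)÷⍴y
xDev←x-(+/x)÷⍴x
yVar←+/yDev*2
xVar←+/xDev*2
r←(+/xDev×yDev)÷(xVar×yVar)*0.5</syntaxhighlight>
This also avoids reusing variable names, and thus ensures that the code can be rerun from any point. The additional variable names are still short, but indicate what they signify (deviation from the mean, and the sum of squared deviations on which the [[wikipedia:variance|variance]] is based). Finally, the <syntaxhighlight lang=apl inline>.5</syntaxhighlight> is expanded to <syntaxhighlight lang=apl inline>0.5</syntaxhighlight> which helps to clarify that this is a decimal number and not an [[inner product]].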
 
A more modern approach breaks out the symmetry into utility function [[train]]s, and uses [[leading axis theory]] combined with [[operator]]s and reordering of terms to avoid parentheses (which would otherwise require a mental stack to understand). Finally, the correlation coefficient is defined as a stand-alone function, using [[inner product]] to combine [[sum]]mation with [[multiply|multiplication]]:
<syntaxhighlight lang=apl>Dev←⊢-+⌿÷≢
Var←+/2*⍨Dev
R←+.×⍥Dev÷0.5*⍨×⍥Var
r←x R y</syntaxhighlight>
Note that <syntaxhighlight lang=apl inline>+⌿÷≢</syntaxhighlight> is an [[idiom]] (common phrase) and is read as ''average'' by even moderately experienced APL programmers.
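For instance, naming the idiom (the name <syntaxhighlight lang=apl inline>Avg</syntaxhighlight> is hypothetical) and applying it to a vector and to a matrix shows how it follows the leading axis. The session below is a sketch assuming Dyalog APL with <syntaxhighlight lang=apl inline>⎕IO←1</syntaxhighlight>:
<syntaxhighlight lang=apl>
      Avg←+⌿÷≢
      Avg 3 1 4 1 5
2.8
      Avg 2 3⍴⍳6       ⍝ averages down the columns of a 2-row matrix
2.5 3.5 4.5
</syntaxhighlight>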


=== IBM ===
The [[idiom]] list included with [[APL2]] includes the following entry:<ref>Cason, Stan. [ftp://ftp.software.ibm.com/ps/products/apl2/info/APL2IDIOMS.pdf#page=3 APL2 IDIOMS Library], Assignment Algorithms. IBM.</ref>
:{| style=width:100%
|<syntaxhighlight lang=apl inline>X←'line1',0⍴Y←'line2'</syntaxhighlight>||<syntaxhighlight lang=apl inline>⍝ Pornography. Combining two lines into one.</syntaxhighlight>
|}
This was once a common technique, even though it fails when the value to the left of <syntaxhighlight lang=apl inline>,0⍴</syntaxhighlight> isn't a vector, as in the following example where <syntaxhighlight lang=apl inline>X</syntaxhighlight> becomes a 1-element [[vector]] instead of the intended [[scalar]]:
<syntaxhighlight lang=apl>
      X←'l',0⍴Y←'line2'
      Y∘.=X
1
0
0
0
0
</syntaxhighlight>
With the addition of [[Left]] (<syntaxhighlight lang=apl inline>⊣</syntaxhighlight>) to the language, this type of hack became entirely obsolete:
:<syntaxhighlight lang=apl inline>X←'line1' ⊣ Y←'line2'</syntaxhighlight>
The primitive leaves its left argument unmodified:
<syntaxhighlight lang=apl>
      X←'l' ⊣ Y←'line2'
      Y∘.=X
1 0 0 0 0
</syntaxhighlight>
The Diamond [[statement separator]] (<syntaxhighlight lang=apl inline>⋄</syntaxhighlight>) provides an alternative means of inlining multiple statements:
:<syntaxhighlight lang=apl inline>Y←'line2' ⋄ X←'line1'</syntaxhighlight>
Note that in all the above, <syntaxhighlight lang=apl inline>Y</syntaxhighlight> is assigned first.
=== Morten Kromberg ===
[[Morten Kromberg]] asked one of his colleagues to “Please avoid this kind of pornography:”
:<syntaxhighlight lang=apl inline>
ns(⍎container.⎕NS)←⍬
</syntaxhighlight>
Avoiding the unusual [[modified assignment]] (using the 2-[[train]] <syntaxhighlight lang=apl inline>⍎⎕NS</syntaxhighlight> as modifying function) helps:
:<syntaxhighlight lang=apl inline>
ns←ns container.(⍎⎕NS) ⍬
</syntaxhighlight>
Finally, splitting the 2-train apart makes it even clearer:
:<syntaxhighlight lang=apl inline>
ns←⍎ns container.⎕NS ⍬
</syntaxhighlight>
A new [[namespace]], with the original value of <syntaxhighlight lang=apl inline>ns</syntaxhighlight> as its name, is created inside <syntaxhighlight lang=apl inline>container</syntaxhighlight>, and the character representation <syntaxhighlight lang=apl inline>'#.container.ns'</syntaxhighlight> is returned from <syntaxhighlight lang=apl inline>⎕NS</syntaxhighlight> to <syntaxhighlight lang=apl inline>⍎</syntaxhighlight>, which evaluates the name to a reference that in turn replaces the previous value of <syntaxhighlight lang=apl inline>ns</syntaxhighlight>. Note that <syntaxhighlight lang=apl inline>⎕NS</syntaxhighlight> returns the fully qualified path to the newly created namespace, and thus it doesn't matter in which namespace <syntaxhighlight lang=apl inline>⍎</syntaxhighlight> is called.
=== Honeywell ===
[[Honeywell]] <ref>Honeywell. [http://www.softwarepreservation.org/projects/apl/Books/198512_Multics%20APL%20Users%20Guide_AK95-02.pdf#page=106 Multics APL User's Guide] (AK95-02), 3-16. December 1985.</ref> used a more specific definition:
:In APL, <u>pornography</u> is defined informally as the dependence upon undefined evaluation order for the successful or correct evaluation of an APL statement.
This refers to things like
<syntaxhighlight lang=apl>
a←4
(a←3)×a
</syntaxhighlight>
where it is undefined whether the initial value for <syntaxhighlight lang=apl inline>a</syntaxhighlight> is used at all in the second line, yielding 12, or whether the second assignment is done before [[times]] gets its right argument, and thus the result is 9.
 
Similarly, in
<syntaxhighlight lang=apl>
i←2
(2 1⍴10 20)[i;i←1]
</syntaxhighlight>
the evaluation order of the statements in the [[bracket indexing]] is undefined. If <syntaxhighlight lang=apl inline>i</syntaxhighlight> is evaluated before <syntaxhighlight lang=apl inline>i←1</syntaxhighlight> then the result is 20, otherwise it is 10.
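One way to remove the dependence, sketched here under the assumption that the old value is the one intended (the names <syntaxhighlight lang=apl inline>old</syntaxhighlight> and <syntaxhighlight lang=apl inline>row</syntaxhighlight> are hypothetical), is to capture the needed value before the reassignment, so that each statement has only one possible reading:
<syntaxhighlight lang=apl>
a←4
old←a                ⍝ keep the initial value explicitly
a←3
a×old                ⍝ 12 under any evaluation order

i←2
row←i                ⍝ row index taken from the old value of i
i←1
(2 1⍴10 20)[row;i]   ⍝ 20 under any evaluation order
</syntaxhighlight>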
 
The Multics APL manual goes on to use the terms ''monstrosity'' and ''eyesore'' for code published in an APL newsletter, such as:
:<syntaxhighlight lang=apl inline>Z[B←(C∧X∊D)/⍳⍴X;]←(2 4⍴' Y9  X9 ')[(C←(~≠\''''=X)∧A≤⍴D)/A←(D←'⍵⍺')⍳X;]</syntaxhighlight>
The manual suggests that this code should be split into the following expressions:
<syntaxhighlight lang=apl>
D←'⍵⍺'
A←D⍳X
C←(~≠\''''=X)∧A≤⍴D
B←(C∧X∊D)/⍳⍴X
Z[B;]←(2 4⍴' Y9  X9 ')[C/A;]
</syntaxhighlight>
 
== See also ==
* [[Semantic density]]
* [[Function-operator overloading]]
== References ==
<references/>
{{APL syntax}}[[Category:Culture]]

Latest revision as of 22:28, 10 September 2022

Keeping APL readable can take special effort. It is easy to write very dense code, and the mathematical look of APL can encourage the use of single-letter names. Since Phil Abrams used the term at the APL '73 conference,[1] APLers have traditionally used pornography to describe code that is hard to read, or uses unusual constructs. Alan Perlis countered: “But as we all know, being people of the world, pornography thrives!”[2]

Causes and mitigation

Code golf often results in pornographic code, as does the practice of cramming a whole algorithm into a single line, forming a one-liner. When computer memory was very limited, such code golf was often a necessary evil.

With the advent of dfns, it became possible to define a full function or operator on a single line. Since APL comments begin at the comment symbol (⍝) and continue until the end of the line, it is impossible to comment a one-liner dfn except outside the source. This, coupled with the inability of a debugger to meaningfully trace through a one-liner (unless it is capable of primitive-by-primitive tracing), makes such code hard for a human reader to follow.

APL containing user-defined names is generally not statically parseable, as the name class, and thus the syntactic role, of such names isn't known until runtime.[3]

Much can be done to improve readability of code, and of APL in particular. Breaking code up into meaningful statements, avoiding one-liners, using descriptive names, and supplying plentiful comments are a good start. Adherence to a well-defined style guide can also help; Adám Brudzewsky's style guide is one example.

Examples

Gilman & Rose

In APL ― An Interactive Approach, the authors describe the following code, which computes the correlation coefficient, as “almost pornographic”:

r←(+/x×y)÷((+/(x←x-(+/x)÷⍴x)*2)×+/(y←y-(+/y)÷⍴y)*2)*.5

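By splitting the expression into even a moderate number of pieces, a symmetry is revealed. The deviations from the mean are named as well, because the one-liner relies on x and y having been overwritten with their deviations before +/x×y is evaluated:

yDev←y-(+/y)÷⍴y
xDev←x-(+/x)÷⍴x
yVar←+/yDev*2
xVar←+/xDev*2
r←(+/xDev×yDev)÷(xVar×yVar)*0.5

This also avoids reusing variable names, and thus ensures that the code can be rerun from any point. The additional variable names are still short, but indicate what they signify (deviation from the mean, and the sum of squared deviations on which the variance is based). Finally, the .5 is expanded to 0.5 which helps to clarify that this is a decimal number and not an inner product.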

A more modern approach breaks out the symmetry into utility function trains, and uses leading axis theory combined with operators and reordering of terms to avoid parentheses (which would otherwise require a mental stack to understand). Finally, the correlation coefficient is defined as a stand-alone function, using inner product to combine summation with multiplication:

Dev←⊢-+⌿÷≢
Var←+/2*⍨Dev
R←+.×⍥Dev÷0.5*⍨×⍥Var
r←x R y

Note that +⌿÷≢ is an idiom (common phrase) and is read as average by even moderately experienced APL programmers.

IBM

The idiom list included with APL2 includes the following entry:[4]

X←'line1',0⍴Y←'line2' ⍝ Pornography. Combining two lines into one.

This was once a common technique, even though it fails when the value to the left of ,0⍴ isn't a vector, as in the following example where X becomes a 1-element vector instead of the intended scalar:

      X←'l',0⍴Y←'line2'
      Y∘.=X
1
0
0
0
0

With the addition of Left (⊣) to the language, this type of hack became entirely obsolete:

X←'line1' ⊣ Y←'line2'

The primitive leaves its left argument unmodified:

      X←'l' ⊣ Y←'line2'
      Y∘.=X
1 0 0 0 0

The Diamond statement separator (⋄) provides an alternative means of inlining multiple statements:

Y←'line2' ⋄ X←'line1'

Note that in all the above, Y is assigned first.

Morten Kromberg

Morten Kromberg asked one of his colleagues to “Please avoid this kind of pornography:”

ns(⍎container.⎕NS)←⍬

Avoiding the unusual modified assignment (using the 2-train ⍎⎕NS as modifying function) helps:

ns←ns container.(⍎⎕NS) ⍬

Finally, splitting the 2-train apart makes it even clearer:

ns←⍎ns container.⎕NS ⍬

A new namespace, with the original value of ns as its name, is created inside container, and the character representation '#.container.ns' is returned from ⎕NS to ⍎, which evaluates the name to a reference that in turn replaces the previous value of ns. Note that ⎕NS returns the fully qualified path to the newly created namespace, and thus it doesn't matter in which namespace ⍎ is called.

Honeywell

Honeywell [5] used a more specific definition:

In APL, pornography is defined informally as the dependence upon undefined evaluation order for the successful or correct evaluation of an APL statement.

This refers to things like

a←4
(a←3)×a

where it is undefined whether the initial value for a is used at all in the second line, yielding 12, or whether the second assignment is done before times gets its right argument, and thus the result is 9.

Similarly, in

i←2
(2 1⍴10 20)[i;i←1]

the evaluation order of the statements in the bracket indexing is undefined. If i is evaluated before i←1 then the result is 20, otherwise it is 10.

The Multics APL manual goes on to use the terms monstrosity and eyesore for code published in an APL newsletter, such as:

Z[B←(C∧X∊D)/⍳⍴X;]←(2 4⍴' Y9  X9 ')[(C←(~≠\''''=X)∧A≤⍴D)/A←(D←'⍵⍺')⍳X;]

The manual suggests that this code should be split into the following expressions:

D←'⍵⍺'
A←D⍳X
C←(~≠\''''=X)∧A≤⍴D
B←(C∧X∊D)/⍳⍴X
Z[B;]←(2 4⍴' Y9  X9 ')[C/A;]

See also

  * Semantic density
  * Function-operator overloading

References

  1. Abrams, Phil. Program Writing, Rewriting and Style. APL Conference 73. Canadian Printco Limited. 1973.
  2. Perlis, Alan. Almost Perfect Artifacts Improve only in Small Ways: APL is more French than English. APL '78.
  3. Scholes, John. Kind Koloring of d-fnop named ⍵. Dfns workspace. Dyalog Ltd.
  4. Cason, Stan. APL2 IDIOMS Library, Assignment Algorithms. IBM.
  5. Honeywell. Multics APL User's Guide (AK95-02), 3-16. December 1985.