Array notation

From APL Wiki
Revision as of 22:57, 5 November 2020 by Adám Brudzewsky

Array notation is a way to write most arrays literally, with no or minimal use of primitive functions, possibly over multiple code lines. While APL has had at least simple numeric strand notation since APL\360, no major APL had implemented native support for an extended notation as of 2020.

Medium-sized array constants are often needed in code. Due to the lack of a native multi-line notation, programmers have resorted to various ad hoc methods of approximating one, usually at the cost of reduced readability. A very common technique is repeated concatenation:

poss←1 2⍴'fns'  ((0 1)(0.7 0)(0.7 0)×size)
poss⍪←   'fnd'  ((0 1)(0   0)(0   0)×size)
poss⍪←   'lines'((0 0)(0.7 0)(0.7 0)×size)
poss⍪←   'lnd'  ((0 0)(0   0)(0   0)×size)

History

Array notation in NARS.

1981

When NARS introduced nested array theory, the need for a way to represent complex structures was already recognised. Though a formal notation was never adopted, much less implemented, the NARS reference manual included round parentheses to delimit nested arrays.[1]

2015

At Dyalog '15, Phil Last explained that he considered the lack of such a notation a big hole in APL notation and gave a suggestion for one. He presented a model using square brackets to indicate collections of major cells of rank 1 or higher, delimited by line breaks and/or diamonds; for example, [1 2 3 ⋄ 4 5 6] would be equivalent to 2 3⍴1 2 3 4 5 6. He also proposed that if the delimited expressions were assignments, then the notation would instead declare members of an anonymous namespace, for example [a←3 ⋄ b←6]. He pointed out that this overloading of the symbols meant that the array notation could only represent constants, as allowing general expressions would lead to ambiguity. He also mentioned that doubled symbols or Unicode brackets could be used instead.[2]

After the presentation, Phil Last had a conversation with Adám Brudzewsky who had recently joined Dyalog Ltd., the language developer of Dyalog APL, and who was inspired to begin an internal Dyalog research project on the matter. Meanwhile, Acre Desktop, a project manager that Last co-develops, moved from storing APL items in component files to storing them in text files, necessitating a literal notation for arrays, and his notation for arrays was adopted. Acre stores unscripted namespaces as directories, so the need for a literal namespace notation only arises when a namespace is an element in a larger array, something that is quite unlikely for application constants.

Array notation at Dyalog '17.

2017

At Dyalog '17, Adám Brudzewsky proposed an alternative notation using round parentheses to indicate collections of major cells of any rank, thus allowing the notation to express nested vectors through scalar major cells; for example, (1 2 3 ⋄ 4 5 6) would be equivalent to (1 2 3)(4 5 6). This notation had a striking similarity to the informal notation used in the NARS reference manual over 35 years prior. For namespaces, he proposed using colon (:) to delimit name-value pairs, inspired by JSON, in which colon is used in the same manner, despite assignment being denoted by = in JavaScript, from which JSON was derived. This distinction allowed arbitrary expressions in arrays, opening the possibility of full integration into the language, while also allowing a namespace with no members to be denoted (). Last's proposal required [:] to distinguish it from bracket indexing into a vector while eliding the indices, a technique used to address all elements.

In addition to the main array notation, Brudzewsky also proposed allowing line breaks between quotes in strings to represent a vector of character vectors (with leading and trailing spaces trimmed).[3] While not included in the live presentation, Brudzewsky's slide deck included a discussion of whether expressions resulting in a scalar should be treated as singleton vectors or not. It concluded that if they were treated as vectors, then an alternative notation in the form of a line continuation character would be necessary to allow writing large vectors over multiple lines of code.[4]

Array notation at Dyalog '18.

2018

At Dyalog '18, Adám Brudzewsky returned with a solution to the issue of whether scalars should be regarded as 1-element vectors (thus increasing the rank of the containing array) or left as scalars (thus forming a vector). He reintroduced square brackets as collections of major cells of rank 1 or higher, repurposing round parentheses as vectors.

The namespace notation remained as before, using round parentheses so the empty namespace could be written in a consistent manner, but he presented formalised scoping rules for the value expressions, namely that these would run in the surrounding namespace, but within their own scope, so any assignment done during such an expression would stay local to that expression. For example, (a:b,b←1 2) would neither populate the new namespace with a member b, nor create such a variable in the global scope.[5] Acre quickly adopted this notation.
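
The forms of the 2018 proposal can be sketched side by side as follows. This is only an illustration, assuming an interpreter that implements the proposal as described above (no major APL did as of 2020); the names mat, col, vec, ns and empty are hypothetical, and each comment gives a conventional expression that should yield the same array.

```apl
⍝ Square brackets: each statement is a major cell of rank 1 or higher
mat←[1 2 3 ⋄ 4 5 6]      ⍝ same as 2 3⍴1 2 3 4 5 6
col←[1 ⋄ 2 ⋄ 3]          ⍝ scalars are treated as 1-element vectors: same as 3 1⍴1 2 3

⍝ Round parentheses: each statement is one element of a vector
vec←(1 2 3 ⋄ 4 5 6)      ⍝ same as (1 2 3)(4 5 6)

⍝ Namespaces: name:value pairs; () denotes a namespace with no members
ns←(a:3 ⋄ b:6)
empty←()
```

The choice of delimiter thus decides how a scalar statement is treated: in square brackets it becomes a row of the resulting array, while in round parentheses it stays a scalar element of the resulting vector.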

2020

In the spring of 2020, dzaima/APL adopted the proposed array notation with the exception of forcing the result of statements in square brackets to rank 1 or higher.[6]


References

  1. Cheney, Carl M. APL*PLUS Nested Arrays System (reference manual). 1.1 What are nested arrays? STSC. 1981.
  2. Last, Phil. APL Array Notation (transcript). Dyalog '15.
  3. Brudzewsky, Adám. Literal Notation for Arrays and Namespaces. Dyalog '17.
  4. Brudzewsky, Adám. Literal Notation for Arrays and Namespaces (slides). Dyalog '17.
  5. Brudzewsky, Adám. Array Notation Mk III. Dyalog '18.
  6. Stack Exchange user dzaima. dzaima/APL. Git commit "[1 23 4], ⎕AV,". GitHub.