mNo edit summary |
m (→Description) |
||
(9 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
[[File:Array notation syntax.png|thumb|right|[[wikipedia:Railroad diagram|Railroad diagram]] for the array notation syntax.]] | |||
'''Array notation''' is a way to write most [[array]]s literally, with no or minimal use of [[primitive function]]s, possibly over multiple code lines. While APL has had at least simple numeric [[strand notation]] since [[APL\360]], no major APL has implemented native support for an extended notation as of 2020. | '''Array notation''' is a way to write most [[array]]s literally, with no or minimal use of [[primitive function]]s, possibly over multiple code lines. While APL has had at least simple numeric [[strand notation]] since [[APL\360]], no major APL has implemented native support for an extended notation as of 2020. | ||
Line 8: | Line 9: | ||
poss⍪← 'lnd' ((0 0)(0 0)(0 0)×size) | poss⍪← 'lnd' ((0 0)(0 0)(0 0)×size) | ||
</source> | </source> | ||
Using the array notation described in this article, the array could for example be written as: | |||
<source lang=apl> | |||
poss←['fns' ((0 1)(0.7 0)(0.7 0)×size) | |||
'fnd' ((0 1)(0 0)(0 0)×size) | |||
'lines'((0 0)(0.7 0)(0.7 0)×size) | |||
'lnd' ((0 0)(0 0)(0 0)×size)] | |||
</source> | |||
The array notation can also be used to express the inner vectors of vectors: | |||
<source lang=apl> | |||
poss←['fns' ((0 1 ⋄ 0.7 0 ⋄ 0.7 0)×size) | |||
'fnd' ((0 1 ⋄ 0 0 ⋄ 0 0)×size) | |||
'lines'((0 0 ⋄ 0.7 0 ⋄ 0.7 0)×size) | |||
'lnd' ((0 0 ⋄ 0 0 ⋄ 0 0)×size)] | |||
</source> | |||
== Description == | |||
The notation is added to the language by giving meaning to previously invalid statements. The added syntax consists of three constructs that are currently [[SYNTAX ERROR]]s: | |||
* ''broken'' round parentheses | |||
* ''broken'' square brackets | |||
* empty round parentheses: <source lang=apl inline>()</source> | |||
where ''broken'' means interrupted by one or more [[diamond]]s (<source lang=apl inline>⋄</source>) or line breaks (outside of [[dfn]]s). | |||
* A ''broken'' round parenthesis creates a [[namespace]] if every diamond/line break-separated statement is a ''name-value pair''. | |||
* A ''broken'' round parenthesis creates a [[vector]] if every diamond/line break-separated statement is a value expression. In that case, every such statement forms an [[element]] in the resulting vector. | |||
* A ''broken'' square bracket creates an [[array]] where every diamond/line break-separated statement forms a [[major cell]] in the resulting array. | |||
* <source lang=apl inline>()</source> is equivalent to <source lang=apl inline>(⎕NS 0⍴⊂'')</source> | |||
* A ''name-value pair'' consists of a valid APL identifier, followed by a <source lang=apl inline>:</source> and a value expression. | |||
=== Formal syntax === | |||
The array notation can be described using [[wikipedia:Extended Backus–Naur form|Extended Backus–Naur form]], where an <code>expression</code> is any traditional APL expression: | |||
<pre> | |||
value ::= expression | list | block | space | |||
list ::= '(' ( ( value sep )+ value? | ( sep value )+ sep? ) ')' | |||
block ::= '[' ( ( value sep )+ value? | ( sep value )+ sep? ) ']' | |||
space ::= '(' sep? ( name ':' value ( sep name ':' value )* )? sep? ')' | |||
sep ::= [⋄#x000A#x000D#x0085]+ | |||
</pre> | |||
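The rules and grammar above can be illustrated with a small classifier. This is only a sketch in Python, not part of any APL implementation: the function name <code>classify</code>, the result labels, and the simplified identifier pattern are assumptions made for illustration, and the input is assumed to contain no nested brackets, strings, or dfns that themselves contain separators.

```python
import re

# Assumption: a simplified pattern for "identifier : value"; real APL names
# admit more characters than this sketch recognises.
NAME_VALUE = re.compile(r"[A-Za-z_∆⍙][A-Za-z0-9_∆⍙]*\s*:\s*\S")

def classify(text):
    """Classify one bracketed construct such as '(a:1 ⋄ b:2)' or '[1 2 ⋄ 3 4]'.

    Only the outermost level is inspected (hypothetical helper for illustration).
    """
    opener, inner = text[0], text[1:-1]
    parts = re.split(r"[⋄\n]", inner)            # split on diamonds and line breaks
    statements = [p.strip() for p in parts if p.strip()]
    broken = len(parts) > 1                      # at least one separator present

    if opener == "(":
        if not statements:
            return "namespace"                   # () denotes an empty namespace
        pairs = [bool(NAME_VALUE.match(s)) for s in statements]
        if all(pairs):
            return "namespace"                   # every statement is name:value
        if any(pairs):
            return "invalid"                     # mixing pairs and plain values
        return "vector" if broken else "traditional"

    # Square brackets: array notation only when broken and non-empty.
    if not broken:
        return "traditional"                     # e.g. indexing or function axis
    return "array" if statements else "invalid"
```

For instance, <code>classify("(1 2 ⋄ 3 4)")</code> yields <code>"vector"</code>, while <code>classify("(1 2)")</code> yields <code>"traditional"</code>, because an unbroken parenthesis keeps its existing meaning.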
== History == | == History == | ||
[[File:Nested Arrays System array notation.png|thumb|right|Array notation in [[NARS]].]] | [[File:Nested Arrays System array notation.png|thumb|right|Array notation in [[NARS]].]] | ||
=== 1981 === | === 1981 === | ||
When [[NARS]] introduced [[nested array theory]], the need for a way to represent complex structures was already recognised. Though a formal notation was never adopted, much less implemented, the NARS reference manual included round parentheses to delimit nested arrays.<ref>Carl M | When [[NARS]] introduced [[nested array theory]], the need for a way to represent complex structures was already recognised. Though a formal notation was never adopted, much less implemented, the NARS reference manual included round parentheses to delimit nested arrays.<ref>Cheney, Carl M. ''APL*PLUS Nested Arrays System'' (reference manual). [http://www.sudleyplace.com/APL/Nested%20Arrays%20System.pdf#page=7 1.1 What are nested arrays?] [[STSC]]. 1981.</ref> | ||
=== 2014 === | |||
At [[Dyalog '14]], [[Morten Kromberg]] said: | |||
:''The emphasis on using scripts to store source code means that it's probably time for us to come up with a notation for constants in the language so that in your script you can declare matrices and so on in a nice readable fashion.'' | |||
Although no concrete proposal was made at the time, he set the expectation of this being the subject of a presentation the following year.<ref>[[Morten Kromberg|Kromberg, Morten]]. [https://dyalog.tv/Dyalog14/?v=rRRyDWaU1fA Technical Road Map]. [[Dyalog '14]].</ref> | |||
=== 2015 === | === 2015 === | ||
Line 17: | Line 62: | ||
After the presentation, Phil Last had a conversation with [[Adám Brudzewsky]] who had recently joined [[Dyalog Ltd.]], the [[language developer|language developer]] of [[Dyalog APL]], and who was inspired to begin an internal Dyalog research project on the matter. Meanwhile, Acre Desktop, a project manager that Last co-develops, moved from storing APL items in [[component file]]s to storing them in text files, necessitating a literal notation for arrays, and his notation for arrays was adopted. Acre stores unscripted namespaces as directories, so the need for a literal namespace notation only arises when a namespace is an element in a larger array, something that is quite unlikely for application constants. | After the presentation, Phil Last had a conversation with [[Adám Brudzewsky]] who had recently joined [[Dyalog Ltd.]], the [[language developer|language developer]] of [[Dyalog APL]], and who was inspired to begin an internal Dyalog research project on the matter. Meanwhile, Acre Desktop, a project manager that Last co-develops, moved from storing APL items in [[component file]]s to storing them in text files, necessitating a literal notation for arrays, and his notation for arrays was adopted. Acre stores unscripted namespaces as directories, so the need for a literal namespace notation only arises when a namespace is an element in a larger array, something that is quite unlikely for application constants. | ||
=== 2016 === | |||
Phil Last published a more formal proposal in the [[Vector Journal]]. Again, the notation was only described as a serialisation format; not as an integral part of the language. He added escape sequences to [[string]]s, further distancing the notation from compatibility with existing APL code.<ref>Last, Phil. [http://archive.vector.org.uk/art10501450 A Notation for APL array Embedding and Serialization]. Vector Journal, Volume 26, number 4. [[British APL Association]]. 2016.</ref> | |||
[[File:D11 Literal Notation for Arrays and Namespaces - Summary of notations.png|thumb|right|Array notation at [[Dyalog '17]].]] | [[File:D11 Literal Notation for Arrays and Namespaces - Summary of notations.png|thumb|right|Array notation at [[Dyalog '17]].]] | ||
Line 34: | Line 82: | ||
At [[Dyalog '20]], Adám Brudzewsky presented the notation as ''Release Candidate 1'' and showed how [[Dyalog APL 18.0]]'s updated version of [https://github.com/Dyalog/link/wiki Link] (a simple interface for using source code in text files, synchronising the file system and the [[workspace]]) includes experimental support for the array notation, including a facility to use multi-line array notation inside functions. He estimated that [[Dyalog APL 20.0]] will include native interpreter support for the notation in 2022. | At [[Dyalog '20]], Adám Brudzewsky presented the notation as ''Release Candidate 1'' and showed how [[Dyalog APL 18.0]]'s updated version of [https://github.com/Dyalog/link/wiki Link] (a simple interface for using source code in text files, synchronising the file system and the [[workspace]]) includes experimental support for the array notation, including a facility to use multi-line array notation inside functions. He estimated that [[Dyalog APL 20.0]] will include native interpreter support for the notation in 2022. | ||
== Design considerations == | |||
In creating the notation's specification, various alternatives were considered. The following requirements were proposed:<ref>[[Adám Brudzewsky]]. Internal documents. [[Dyalog Ltd.]] 30 Jun 2017.</ref> | |||
# No new [[glyph]]s | |||
# Reusing existing glyphs for similar purposes | |||
# Similarity to other languages ([[K]], [[wikipedia:JSON|JSON]], [[wikipedia:CSS|CSS]]) | |||
# Visual attractiveness | |||
# Intuitive syntax | |||
# As little [[wikipedia:syntactic sugar|syntactic sugar]] as possible | |||
=== Glyphs === | |||
The design requirement for no new glyphs was contentious, and both [[bi-glyph]] and non-ASCII brackets were considered. Bi-glyphs were rejected out of readability concerns, especially when nested. For example, <source lang=apl inline>1 1 3⍴2</source> could have been written as <source lang=apl inline>[[[[2 2 2]]]]</source>. Non-ASCII brackets were rejected for font and keyboarding reasons, as well as to make it easier for non-APL systems to generate APL data. For example, <source lang=apl inline>⟦</source>…<source lang=apl inline>⟧</source> was proposed to denote a collection of [[major cells]], forming a new array of rank one higher than the rank of the highest-[[rank]] constituent [[cell]]. However, few [[fonts]] support these glyphs. | |||
The eventual choice was to go with existing symbols, and this had important implications for the specifics of the notation. While ideally, a notation would have been introduced for a collection of major cells, thereby handling both vectors and higher-rank arrays, a problem presents itself with [[axis|axes]] of length 1, because both square brackets and round parentheses already have meaning when surrounding a single statement (namely [[function axis]]/[[bracket indexing]] and [[precedence]]/[[function train]]s). Thus, while <source lang=apl inline>2 ⟦3⟧</source> could have denoted the [[nested array]] <source lang=apl inline>2 (1⍴3)</source>, this isn't viable with <source lang=apl inline>2 [1⍴3]</source> because this already denotes indexing <source lang=apl inline>2</source> using the indices <source lang=apl inline>1⍴3</source>. To disambiguate, at least one statement separator or line break must be present in each level of array notation brackets and parentheses. | |||
=== Minimum rank of major cells === | |||
While <source lang=apl inline>⟦⟦3⟧⟧</source> could denote <source lang=apl inline>1 1⍴3</source> using non-ASCII glyphs, an equivalent ASCII scheme instead would have required <source lang=apl inline>[[3⋄]⋄]</source> where the inner bracket creates a vector, and the outer creates a [[matrix]]. Using line breaks instead of diamonds, it was found to be counter-intuitive that <source lang=apl>[ | |||
3 | |||
5 | |||
]</source> was to denote a two-[[element]] vector while <source lang=apl>[ | |||
3 4 | |||
5 6 | |||
]</source> would be a two-row matrix. Therefore, a special rule was added to the effect that in such collections of major cells, every cell would be considered to have a rank of at least 1, even if it was a [[scalar]]. | |||
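The effect of this rule on the resulting shape can be sketched with a small model. The following Python is an illustration only: arrays are represented as rectangular nested lists, the helper names <code>shape</code> and <code>block</code> are hypothetical, and the sketch assumes all cells have the same shape after promotion.

```python
def shape(a):
    """Shape of a rectangular nested-list array; a bare number is a scalar."""
    dims = []
    while isinstance(a, list):
        dims.append(len(a))
        a = a[0] if a else None   # stop descending at an empty axis
    return tuple(dims)

def block(cells):
    """Model of a broken square bracket: each statement becomes a major cell.

    Every scalar cell is treated as a one-element vector, so that each
    cell has a rank of at least 1.
    """
    promoted = [c if isinstance(c, list) else [c] for c in cells]
    # Assumption: this sketch does not model cells of differing shapes.
    if len({shape(c) for c in promoted}) != 1:
        raise ValueError("this sketch requires cells of equal shape")
    return promoted

column = block([3, 5])            # models the bracket containing 3 and 5
matrix = block([[3, 4], [5, 6]])  # models the bracket containing 3 4 and 5 6
```

Under this model, <code>shape(column)</code> is <code>(2, 1)</code> and <code>shape(matrix)</code> is <code>(2, 2)</code>, so both forms produce a matrix with one row per statement.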
In turn, this choice introduced the need for a separate notation to allow vectors to be written over multiple lines, and therefore the round parenthesis was extended from its traditional use in [[strand notation]] to also denote a collection of [[enclose]]d elements. | |||
=== Name-value pairs === | |||
As a notation for [[namespace]]s, several details were debated: | |||
# Whether to use <source lang=apl inline>⋄</source> or <source lang=apl inline>;</source> to separate [[wikipedia:name-value pair|name-value pair]]s (in addition to line breaks) | |||
# Which enclosure glyphs to use, <source lang=apl inline>(</source>…<source lang=apl inline>)</source> or <source lang=apl inline>[</source>…<source lang=apl inline>]</source> | |||
# Which glyph should separate the name from the value, <source lang=apl inline>:</source> or <source lang=apl inline>←</source> | |||
# In which scope the value expressions should be evaluated | |||
The <source lang=apl inline>⋄</source> was chosen to separate name-value pairs, as it is generally exchangeable with a line break, while <source lang=apl inline>;</source> is not, even though it is used to separate names (without values) in [[Defined_function_(traditional)#Semi-colons|headers]] and in [[locals lines]]. Furthermore, it was seen as natural that the values would be computed in reading order (left-to-right) just like multiple statements are, and while <source lang=apl inline>⋄</source> would imply this, <source lang=apl inline>;</source> wouldn't. Indeed, in the statement <source lang=apl inline>A[B;C]</source>, expression <source lang=apl inline>C</source> is evaluated before expression <source lang=apl inline>B</source>. It was briefly considered to have values computed from the right, just like stranding is, but this was rejected because replacing the semi-colons with line breaks would then require evaluation beginning with the last line and working upwards! | |||
Round parentheses were chosen because namespaces are seen as (unordered) lists, and so are more similar to vectors than higher-rank arrays. Furthermore, <source lang=apl inline>[]</source> already had meaning (indexing all elements of a vector) while <source lang=apl inline>()</source> didn't have any existing use, and so could be used to denote a new empty namespace, equivalent to <source lang=apl inline>⎕NS 0⍴⊂''</source>. | |||
While initially, <source lang=apl inline>←</source> was seen as the obvious choice to separate the name and the value, it was soon discovered that a namespace with only one member would be indistinguishable from a parenthesised [[assignment]]. Furthermore, it was noted that value expressions could contain intermediary assignments, and that such assignments were of a fundamentally different nature from the name-value declaration. The intermediary assignments would happen in a temporary scope, with any created variables disappearing once the namespace member value was established. | |||
Value expressions could be evaluated in the newly established namespace (similar to expressions in <source lang=apl inline>:Namespace</source> scripts), or in the surrounding scope (similar to inline expressions in [[wikipedia:JavaScript|JavaScript]]'s object notation). It was envisioned that a main usage of the literal notation would be to collect existing values into a namespace, and evaluating inside the new namespace would force the use of <source lang=apl inline>##.</source> to fetch values in the surrounding scope. In a departure from JavaScript, it was found most natural that such intermediate assignments be local to the value expression, similar to assignments in dfns. Global assignment is still available using <source lang=apl inline>⎕THIS.name←value</source>, just as in dfns. | |||
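The scoping behaviour described above can be modelled roughly as follows. This Python sketch is an analogy rather than a description of any interpreter: the function <code>build_namespace</code>, the example names <code>width</code>, <code>quarter</code> and <code>half</code>, and the use of a copied dictionary as the temporary scope are all assumptions made for illustration.

```python
from types import SimpleNamespace

def build_namespace(pairs, surrounding):
    """Model a namespace literal made of (name, value_expression) pairs.

    Each value_expression is a callable that receives a scratch scope.
    It can read surrounding values from it, and any assignments it makes
    into it are discarded once the member value has been computed.
    """
    members = {}
    for name, value_expression in pairs:   # evaluated in reading order
        scratch = dict(surrounding)        # fresh temporary scope per expression
        members[name] = value_expression(scratch)
    return SimpleNamespace(**members)

def quarter_area(scope):
    scope["half"] = scope["size"] / 2      # intermediate assignment, local only
    return scope["half"] * scope["half"]

surrounding = {"size": 10}
ns = build_namespace(
    [("width", lambda scope: scope["size"]), ("quarter", quarter_area)],
    surrounding,
)
```

After this runs, <code>ns</code> has only the members <code>width</code> and <code>quarter</code>; the intermediate name <code>half</code> appears neither in <code>ns</code> nor in <code>surrounding</code>.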
== References == | == References == | ||
<references/> | <references/> | ||
{{APL syntax}}[[Category:APL syntax]][[Category:Nested array model]] | {{APL syntax}}[[Category:APL syntax]][[Category:Nested array model]] |