Array notation: Difference between revisions

10,218 bytes added ,  22:03, 26 May 2021
Line 1: Line 1:
[[File:Array notation syntax.png|thumb|right|[[wikipedia:Railroad diagram|Railroad diagram]] for the array notation syntax.]]
'''Array notation''' is a way to write most [[array]]s literally, with no or minimal use of [[primitive function]]s, possibly over multiple code lines. While APL has had at least simple numeric [[strand notation]] since [[APL\360]], no major APL has implemented native support for an extended notation as of 2020.


Line 8: Line 9:
poss⍪←  'lnd'  ((0 0)(0  0)(0  0)×size)
</source>
Using the array notation described in this article, the array could for example be written as:
<source lang=apl>
poss←['fns'  ((0 1)(0.7 0)(0.7 0)×size)
      'fnd'  ((0 1)(0  0)(0  0)×size)
      'lines'((0 0)(0.7 0)(0.7 0)×size)
      'lnd'  ((0 0)(0  0)(0  0)×size)]
</source>
The array notation can also be used to express the inner vectors of vectors:
<source lang=apl>
poss←['fns'  ((0 1 ⋄ 0.7 0 ⋄ 0.7 0)×size)
      'fnd'  ((0 1 ⋄ 0  0 ⋄ 0  0)×size)
      'lines'((0 0 ⋄ 0.7 0 ⋄ 0.7 0)×size)
      'lnd'  ((0 0 ⋄ 0  0 ⋄ 0  0)×size)]
</source>
== Description ==
The notation is added to the language by giving meaning to previously invalid statements. The added syntax consists of three constructs that are currently [[SYNTAX ERROR]]s:
* ''broken'' round parentheses
* ''broken'' square brackets
* empty round parentheses: <source lang=apl inline>()</source>
where ''broken'' means interrupted by one or more [[diamond]]s (<source lang=apl inline>⋄</source>) or line breaks (outside of [[dfn]]s).
* A ''broken'' round parenthesis creates a [[namespace]] if every diamond/line break-separated statement is a ''name-value pair''.
* A ''broken'' round parenthesis creates a [[vector]] if every diamond/line break-separated statement is a value expression. In that case, every such statement forms an [[element]] in the resulting vector.
* A ''broken'' square bracket creates an [[array]] where every diamond/line break-separated statement forms a [[major cell]] in the resulting array.
* <source lang=apl inline>()</source> is equivalent to <source lang=apl inline>(⎕NS 0⍴⊂'')</source>
* A ''name-value pair'' consists of a valid APL identifier, followed by a <source lang=apl inline>:</source> and a value expression.
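
As a rough illustration of the rules above, each construct can be paired with a traditional expression that builds the same value. The names <source lang=apl inline>v</source>, <source lang=apl inline>m</source>, <source lang=apl inline>ns</source> and <source lang=apl inline>e</source> are hypothetical, and the array notation itself appears only in comments so that the block can run in an interpreter without native support:
<source lang=apl>
v←(1 2)(3 4)      ⍝ (1 2 ⋄ 3 4)       vector: each statement is an element
m←2 2⍴1 2 3 4     ⍝ [1 2 ⋄ 3 4]       matrix: each statement is a major cell
ns←⎕NS 0⍴⊂''      ⍝ (x:10 ⋄ y:'abc')  namespace: each statement is a name-value pair
ns.x←10
ns.y←'abc'
e←⎕NS 0⍴⊂''       ⍝ ()                empty namespace
</source>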
=== Formal syntax ===
The array notation can be described using [[wikipedia:Extended Backus–Naur form|Extended Backus–Naur form]], where an <code>expression</code> is any traditional APL expression:
<pre>
value    ::= expression | list | block | space
list     ::= '(' ( ( value sep )+ value? | ( sep value )+ sep? ) ')'
block    ::= '[' ( ( value sep )+ value? | ( sep value )+ sep? ) ']'
space    ::= '(' sep? ( name ':' value ( sep name ':' value )* )? sep? ')'
sep      ::= [⋄#x000A#x000D#x0085]+
</pre>
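To make the productions more concrete, the comments below show a few forms that this grammar accepts, each beside a traditional expression for the value it would denote under the rules in the Description. This is only a sketch: the variable names are hypothetical, and the stated equivalences are a reading of this article rather than output from an implementation.
<source lang=apl>
one←,7             ⍝ (7 ⋄)                list:  ( value sep )+ with no final value
two←(1 2)(3 4)     ⍝ (⋄ 1 2 ⋄ 3 4)        list:  ( sep value )+ with no final sep
mat←2 2⍴1 2 3 4    ⍝ [(1 ⋄ 2) ⋄ (3 ⋄ 4)]  block: both values are themselves lists
ns←⎕NS 0⍴⊂''       ⍝ (⋄ a:1 ⋄)            space: optional sep on either side of the pairs
ns.a←1
</source>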
== History ==
[[File:Nested Arrays System array notation.png|thumb|right|Array notation in [[NARS]].]]
=== 1981 ===
When [[NARS]] introduced [[nested array theory]], the need for a way to represent complex structures was already recognised. Though a formal notation was never adopted, much less implemented, the NARS reference manual included round parentheses to delimit nested arrays.<ref>Cheney, Carl M. ''APL*PLUS Nested Arrays System'' (reference manual). [http://www.sudleyplace.com/APL/Nested%20Arrays%20System.pdf#page=7 1.1 What are nested arrays?] [[STSC]]. 1981.</ref>
 
=== 2014 ===
At [[Dyalog '14]], [[Morten Kromberg]] said:
:''The emphasis on using scripts to store source code means that it's probably time for us to come up with a notation for constants in the language so that in your script you can declare matrices and so on in a nice readable fashion.''
Although no concrete proposal was made at the time, he set the expectation of this being the subject of a presentation the following year.<ref>[[Morten Kromberg|Kromberg, Morten]]. [https://dyalog.tv/Dyalog14/?v=rRRyDWaU1fA Technical Road Map]. [[Dyalog '14]].</ref>


=== 2015 ===
Line 17: Line 62:


After the presentation, Phil Last had a conversation with [[Adám Brudzewsky]] who had recently joined [[Dyalog Ltd.]], the [[language developer|language developer]] of [[Dyalog APL]], and who was inspired to begin an internal Dyalog research project on the matter. Meanwhile, Acre Desktop, a project manager that Last co-develops, moved from storing APL items in [[component file]]s to storing them in text files, necessitating a literal notation for arrays, and his notation for arrays was adopted. Acre stores unscripted namespaces as directories, so the need for a literal namespace notation only arises when a namespace is an element in a larger array, something that is quite unlikely for application constants.
=== 2016 ===
Phil Last published a more formal proposal in the [[Vector Journal]]. Again, the notation was only described as a serialisation format, not as an integral part of the language. He added escape sequences to [[string]]s, further distancing the notation from compatibility with existing APL code.<ref>Last, Phil. [http://archive.vector.org.uk/art10501450 A Notation for APL array Embedding and Serialization]. Vector Journal, Volume 26, number 4. [[British APL Association]]. 2016.</ref>


[[File:D11 Literal Notation for Arrays and Namespaces - Summary of notations.png|thumb|right|Array notation at [[Dyalog '17]].]]
Line 29: Line 77:
The namespace notation remained as before, using round parentheses so the empty namespace could be written in a consistent manner, but he presented formalised scoping rules for the value expressions, namely that these would run in the surrounding namespace, but within their own scope, so any assignment done during such an expression would be local to it. For example, <source lang=apl inline>(a:b,b←1 2)</source> would neither populate the new namespace with a member <source lang=apl inline>b</source>, nor create such a variable in the global scope.<ref>Brudzewsky, Adám. [https://dyalog.tv/Dyalog18/?v=GAdQuOtPcfM Array Notation Mk III]. [[Dyalog '18]].</ref> Acre quickly adopted this notation.


[[File:D09 Array Notation RC1 - Questions.png|thumb|right|Array notation at [[Dyalog '20]]]]
=== 2020 ===
In the spring of 2020, [[dzaima/APL]] adopted the proposed array notation with the exception of forcing the result of statements in square brackets to rank 1 or higher.<ref>Stack Exchange user [https://codegolf.stackexchange.com/users/59183/dzaima dzaima]. [https://github.com/dzaima/APL dzaima/APL]. Git commit "[https://github.com/dzaima/APL/commit/dfebe5de3699b2e3f838a60f72c6b9a9f66317e7 <source lang=apl inline>[1 2⋄3 4]</source>,  <source lang=apl inline>⎕AV</source>,]". GitHub.</ref>


At [[Dyalog '20]], Adám Brudzewsky presented the notation as ''Release Candidate 1'' and showed how [[Dyalog APL 18.0]]'s updated version of [https://github.com/Dyalog/link/wiki Link] (a simple interface for using source code in text files, synchronising the file system and the [[workspace]]) includes experimental support for the array notation, including a facility to use multi-line array notation inside functions. He estimated that [[Dyalog APL 20.0]] would include native interpreter support for the notation in 2022.
 
== Design considerations ==
 
In creating the notation's specification, various alternatives were considered. The following requirements were proposed:<ref>[[Adám Brudzewsky|Brudzewsky, Adám]]. Internal documents. [[Dyalog Ltd.]] 30 Jun 2017.</ref>
 
# No new [[glyph]]s
# Reusing existing glyphs for similar purposes
# Similarity to other languages ([[K]], [[wikipedia:JSON|JSON]], [[wikipedia:CSS|CSS]])
# Visual attractiveness
# Intuitive syntax
# As little [[wikipedia:syntactic sugar|syntactic sugar]] as possible
 
=== Glyphs ===
 
The design requirement for no new glyphs was contentious, and both [[bi-glyph]] and non-ASCII brackets were considered. Bi-glyphs were rejected out of readability concerns, especially when nested. For example, <source lang=apl inline>1 1 3⍴2</source> could have been written as <source lang=apl inline>[[[[2 2 2]]]]</source>. Non-ASCII brackets were rejected for font and keyboarding reasons, as well as to make it easier for non-APL systems to generate APL data. For example, <source lang=apl inline>⟦</source>…<source lang=apl inline>⟧</source> was proposed to denote a collection of [[major cells]], forming a new array of rank one higher than that of the highest-[[rank]] constituent [[cell]]. However, few [[fonts]] support these glyphs.
 
The eventual choice was to go with existing symbols, and this had important implications for the specifics of the notation. While ideally, a notation would have been introduced for a collection of major cells, thereby handling both vectors and higher-rank arrays, a problem presents itself with [[axis|axes]] of length 1, because both square brackets and round parentheses already have meaning when surrounding a single statement (namely [[function axis]]/[[bracket indexing]] and [[precedence]]/[[function train]]s). Thus, while <source lang=apl inline>2 ⟦3⟧</source> could have denoted the [[nested array]] <source lang=apl inline>2 (1⍴3)</source>, this isn't viable with <source lang=apl inline>2 [1⍴3]</source> because this already denotes indexing <source lang=apl inline>2</source> using the indices <source lang=apl inline>1⍴3</source>. To disambiguate, at least one statement separator or line break must be present in each level of array notation brackets and parentheses.
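
A small sketch of why the separator is required: without one, parentheses keep their traditional grouping meaning, whereas a single diamond is enough to select the array notation reading. The names <source lang=apl inline>plain</source> and <source lang=apl inline>boxed</source> are hypothetical, the array notation appears only in a comment, and the equivalence on the second line follows the rules in the Description rather than a tested implementation.
<source lang=apl>
plain←(1 2)      ⍝ no separator: ordinary grouping, so the value is just the vector 1 2
boxed←,⊂1 2      ⍝ (1 2 ⋄)  one separator: a one-element vector whose element is 1 2
⍴plain           ⍝ 2
⍴boxed           ⍝ 1
</source>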
 
=== Minimum rank of major cells ===
 
While <source lang=apl inline>⟦⟦3⟧⟧</source> could denote <source lang=apl inline>1 1⍴3</source> using non-ASCII glyphs, an equivalent ASCII scheme instead would have required <source lang=apl inline>[[3⋄]⋄]</source> where the inner bracket creates a vector, and the outer creates a [[matrix]]. Using line breaks instead of diamonds, it was found to be counter-intuitive that <source lang=apl>[
3
5
  ]</source> was to denote a two-[[element]] vector while <source lang=apl>[
3 4
5 6
    ]</source> would be a two-row matrix. Therefore, a special rule was added to the effect that in such collections of major cells, every cell would be considered to have a rank of at least 1, even if it was a [[scalar]].
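
Under that rule, the two bracketed forms above would both yield matrices. The following sketch gives traditional expressions for the values they would denote; the names <source lang=apl inline>col</source> and <source lang=apl inline>mat</source> are hypothetical, and the first equivalence is inferred from the rule as stated here, not taken from an implementation.
<source lang=apl>
col←2 1⍴3 5        ⍝ [3 ⋄ 5]      scalar cells are treated as one-element vectors: a 2-by-1 matrix
mat←2 2⍴3 4 5 6    ⍝ [3 4 ⋄ 5 6]  vector cells become the rows of a 2-by-2 matrix
⍴col               ⍝ 2 1
⍴mat               ⍝ 2 2
</source>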
 
In turn, this choice introduced the need for a separate notation to allow vectors to be written over multiple lines, and therefore round parentheses were extended from their traditional use in [[strand notation]] to also denote a collection of [[enclose]]d elements.
 
=== Name-value pairs ===
 
As a notation for [[namespace]]s, several details were debated:
 
# Whether to use <source lang=apl inline>⋄</source> or <source lang=apl inline>;</source> to separate [[wikipedia:name-value pair|name-value pair]]s (in addition to line breaks)
# Which enclosure glyphs to use, <source lang=apl inline>(</source>…<source lang=apl inline>)</source> or <source lang=apl inline>[</source>…<source lang=apl inline>]</source>
# Which glyph should separate the name from the value, <source lang=apl inline>:</source> or <source lang=apl inline>←</source>
# In which scope the value expressions should be evaluated
 
The <source lang=apl inline>⋄</source> was chosen to separate name-value pairs, as it is generally exchangeable with a line break, while <source lang=apl inline>;</source> is not, even though it is used to separate names (without values) in [[Defined_function_(traditional)#Semi-colons|headers]] and in [[locals lines]]. Furthermore, it was seen as natural that the values would be computed in reading order (left-to-right), just like multiple statements are, and while <source lang=apl inline>⋄</source> would imply this, <source lang=apl inline>;</source> wouldn't. Indeed, in the statement <source lang=apl inline>A[B;C]</source>, expression <source lang=apl inline>C</source> is evaluated before expression <source lang=apl inline>B</source>. It was briefly considered to have values computed from the right, just like stranding is, but this was rejected because replacing the semi-colons with line breaks would then require evaluation beginning with the last line and working upwards!
 
Round parentheses were chosen because namespaces are seen as (unordered) lists, and so are more similar to vectors than higher-rank arrays. Furthermore, <source lang=apl inline>[]</source> already had meaning (indexing all elements of a vector) while <source lang=apl inline>()</source> didn't have any existing use, and so could be used to denote a new empty namespace, equivalent to <source lang=apl inline>⎕NS 0⍴⊂''</source>.
 
While initially, <source lang=apl inline>←</source> was seen as the obvious choice to separate the name and the value, it was soon discovered that a namespace with only one member would be indistinguishable from a parenthesised [[assignment]]. Furthermore, it was noted that value expressions could contain intermediary assignments, and that such assignments were of a fundamentally different nature from the name-value declaration. The intermediary assignments would happen in a temporary scope, with any created variables disappearing once the namespace member value was established.
 
Value expressions could be evaluated in the newly established namespace (similar to expressions in <source lang=apl inline>:Namespace</source> scripts), or in the surrounding scope (similar to inline expressions in [[wikipedia:JavaScript|JavaScript]]'s object notation). It was envisioned that a main usage of the literal notation would be to collect existing values into a namespace, and evaluating inside the new namespace would force the use of <source lang=apl inline>##.</source> to fetch values in the surrounding scope. In a departure from JavaScript, it was found most natural that such intermediate assignments be local to the value expression, similar to assignments in dfns. Global assignment is still available using <source lang=apl inline>⎕THIS.name←value</source>, just as in dfns.
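
The intended scoping can be modelled, very roughly, in traditional APL by letting a dfn stand in for the temporary scope of a value expression. This is an analogy under stated assumptions, not a description of how any interpreter implements the notation; the name <source lang=apl inline>ns</source> is hypothetical, and the last line assumes no variable <source lang=apl inline>b</source> existed beforehand.
<source lang=apl>
ns←⎕NS 0⍴⊂''
ns.a←{b,b←1 2}⍬    ⍝ (a:b,b←1 2)  the dfn plays the role of the value expression's own scope
ns.a               ⍝ 1 2 1 2
ns.⎕NC'b'          ⍝ 0: no member b was created in the namespace
⎕NC'b'             ⍝ 0: nothing leaked into the surrounding scope
</source>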


== References ==
<references/>
{{APL syntax}}[[Category:APL syntax]][[Category:Nested array model]]
