Array notation: Difference between revisions

From APL Wiki
m (APLAN abbreviation)
 
(46 intermediate revisions by 3 users not shown)
[[File:Array notation syntax.png|thumb|right|[[wikipedia:Railroad diagram|Railroad diagram]] for the array notation syntax.]]
{{Built-ins|Array notation|(⋄)|[]}}, abbreviated '''APLAN''' parallel to [[wikipedia:JSON|JSON]], is a way to write most [[array]]s literally, with no or minimal use of [[primitive function]]s, possibly over multiple code lines. It differs from the [[strand notation]] existing since [[APL\360]] in that it can be used to write arrays of rank greater than one. Array notation is supported in [[dzaima/APL]], [[BQN]] (using angle brackets <code>⟨⋄⟩</code> instead of round parentheses <code>(⋄)</code>), and some tools for [[Dyalog APL]], where it is planned as an eventual language feature.
'''Array notation''' is a way to write most [[array]]s literally, with no or minimal use of [[primitive function]]s, possibly over multiple code lines. While APL has had at least simple numeric [[strand notation]] since [[APL\360]], no major APL has implemented native support for an extended notation as of 2020.


Medium-sized array constants are often needed in code. Due to the lack of a native multi-line notation, programmers have resorted to various ad-hoc methods of approximating such, usually at the cost of reduced [[readability]]. A very common technique is repeated [[concatenate|concatenation]]:
Array notation generally consists of a vector notation written with parentheses <syntaxhighlight lang=apl inline>()</syntaxhighlight>, roughly equivalent to stranding, and a high-rank notation using square brackets <syntaxhighlight lang=apl inline>[]</syntaxhighlight>, indicating the [[Mix]] of a vector. It also supports [[namespace]]s, using <syntaxhighlight lang=apl inline>name:value</syntaxhighlight> syntax in round parentheses. [[Statement separator]]s must appear between elements and between [[wikipedia:name–value_pair|name–value pair]]s.
<source lang=apl>
poss←1 2⍴'fns'  ((0 1)(0.7 0)(0.7 0)×size)
poss⍪←  'fnd'  ((0 1)(0  0)(0  0)×size)
poss⍪←  'lines'((0 0)(0.7 0)(0.7 0)×size)
poss⍪←  'lnd'  ((0 0)(0  0)(0  0)×size)
</source>
Using the array notation described in this article, the array could for example be written as:
<source lang=apl>
poss←['fns'  ((0 1)(0.7 0)(0.7 0)×size)
      'fnd'  ((0 1)(0  0)(0  0)×size)
      'lines'((0 0)(0.7 0)(0.7 0)×size)
      'lnd'  ((0 0)(0  0)(0  0)×size)]
</source>
The array notation can also be used to express the inner vectors of vectors:
<source lang=apl>
poss←['fns'  ((0 1 ⋄ 0.7 0 ⋄ 0.7 0)×size)
      'fnd'  ((0 1 ⋄ 0  0 ⋄ 0  0)×size)
      'lines'((0 0 ⋄ 0.7 0 ⋄ 0.7 0)×size)
      'lnd'  ((0 0 ⋄ 0  0 ⋄ 0  0)×size)]
</source>
== Description ==
The notation is added to the language by giving meaning to previously invalid statements. The added syntax consists of three constructs that are currently [[SYNTAX ERROR]]s:


* ''broken'' round parentheses
== Examples ==
* ''broken'' square brackets
* empty round parentheses: <source lang=apl inline>()</source>


where ''broken'' means interrupted by one or more [[diamond]]s (<source lang=apl inline>⋄</source>) or line breaks (outside of [[dfn]]s).
Medium-sized array constants are often needed in code. Due to the lack of a native multi-line notation, programmers have resorted to various ad-hoc methods of approximating such, usually at the cost of reduced [[readability]]. A very common technique is repeated [[concatenate|concatenation]] resulting in the desired value being held in a variable (<syntaxhighlight lang=apl inline>z</syntaxhighlight> in the below examples), as opposed to array notation which can express the final value directly. In addition, the traditional technique sometimes involves the creation of helper variables as a side effect.
 
=== Basic arrays ===
 
{| class=wikitable
! Traditional method !! Array notation !! Description
|-
|<syntaxhighlight lang=apl>(0 6 1 8)(1 4 1 4 2)(2 7 1 8 2 8)(3 1 4 1 5)</syntaxhighlight>
|<syntaxhighlight lang=apl>(0 6 1 8 ⋄ 1 4 1 4 2 ⋄ 2 7 1 8 2 8 ⋄ 3 1 4 1 5)</syntaxhighlight>
|Vector of numeric vectors on a single line.
|-
|<syntaxhighlight lang=apl>z← (0 6 1 8)(1 4 1 4 2)
z,←(2 7 1 8 2 8)(3 1 4 1 5)</syntaxhighlight>
|<syntaxhighlight lang=apl>(0 6 1 8 ⋄ 1 4 1 4 2
2 7 1 8 2 8 ⋄ 3 1 4 1 5)</syntaxhighlight>
|Vector of numeric vectors split over two lines.
|-
|<syntaxhighlight lang=apl>z←,⊂'Three'
z,←⊂'Blind'
z,←⊂'Mice'</syntaxhighlight>
|<syntaxhighlight lang=apl>('Three'
'Blind'
'Mice')</syntaxhighlight>
|Vector of character vectors, one on each line. (The traditional method includes an unnecessary <syntaxhighlight lang=apl inline>,</syntaxhighlight> to indicate that <syntaxhighlight lang=apl inline>z</syntaxhighlight> will be a vector.)
|-
|<syntaxhighlight lang=apl>z←⍉⍪0 6 1 8
z⍪← 1 4 1 4
z⍪← 2 7 1 8
z⍪← 3 1 4 2</syntaxhighlight>
|<syntaxhighlight lang=apl>[0 6 1 8
1 4 1 4
2 7 1 8
3 1 4 2]</syntaxhighlight>
|Numeric matrix.
|-
|<syntaxhighlight lang=apl>z←⍪10
z⍪←20
z⍪←30
z⍪←40</syntaxhighlight>
|<syntaxhighlight lang=apl>[10
20
30
40]</syntaxhighlight>
|Column matrix.
|}
 
=== Involved arrays ===
 
{| class=wikitable
! Traditional method !! Array notation !! Description
|-
|<syntaxhighlight lang=apl>a←⍉⍪0 0 1
a⍪← 1 0 1
a⍪← 0 1 1
z←,⊂a
a←⍉⍪0 1 1
a⍪← 1 1 0
a⍪← 0 1 0
z,←⊂a
a←⍉⍪0 1 1 1
a⍪← 1 1 1 0
z,←⊂a
a←⍉⍪0 1 1 0
a⍪← 1 0 0 1
a⍪← 0 1 1 0
z,←⊂a</syntaxhighlight>
|<syntaxhighlight lang=apl>([0 0 1
1 0 1
0 1 1]
 
[0 1 1
  1 1 0
  0 1 0]
 
[0 1 1 1
  1 1 1 0]
 
[0 1 1 0
  1 0 0 1
  0 1 1 0])</syntaxhighlight>
|Vector of matrices.
|-
|<syntaxhighlight lang=apl>z←⍉⍪0 'OK'
z⍪← 1 'WS FULL'
z⍪← 2 'SYNTAX ERROR'
z⍪← 3 'INDEX ERROR'
z⍪← 4 'RANK ERROR'</syntaxhighlight>
|<syntaxhighlight lang=apl>[0 'OK'
1 'WS FULL'
2 'SYNTAX ERROR'
3 'INDEX ERROR'
4 'RANK ERROR']</syntaxhighlight>
|Table with numeric and text columns.
|-
|<syntaxhighlight lang=apl>a←⍉⍪3 1 4
a⍪← 1 5 0
a←↑a
b←⍉⍪2 7 0
b⍪← 2 0 0
z←a,[0.5] b</syntaxhighlight>
|<syntaxhighlight lang=apl>[[3 1 4
  1 5 0]
[2 7 0
  2 0 0]]</syntaxhighlight>
|Rank 3 numeric array.
|-
|<syntaxhighlight lang=apl>a←,⊂3 1 4
a,←⊂1 5
a←↑a
b←,⊂2 7
b,← 2
b←↑b
z←↑a b</syntaxhighlight>
|<syntaxhighlight lang=apl>[[3 1 4
  1 5]
[2 7
  2]]</syntaxhighlight>
|Rank 3 numeric array relying on automatic padding with [[fill element]]s.
|-
|<syntaxhighlight lang=apl>
z←⍉⍪'fns'  ((0 1)(0.7 0)(0.7 0)×size)
z⍪← 'fnd'  ((0 1)(0  0)(0  0)×size)
z⍪← 'lines'((0 0)(0.7 0)(0.7 0)×size)
z⍪← 'lnd'  ((0 0)(0  0)(0  0)×size)
</syntaxhighlight>
|<syntaxhighlight lang=apl>
['fns'  ((0 1 ⋄ 0.7 0 ⋄ 0.7 0)×size)
'fnd'  ((0 1 ⋄ 0  0 ⋄ 0  0)×size)
'lines'((0 0 ⋄ 0.7 0 ⋄ 0.7 0)×size)
'lnd'  ((0 0 ⋄ 0  0 ⋄ 0  0)×size)]
</syntaxhighlight>
|Matrix of simple and nested vectors, with dynamic values.
|}
 
=== Namespaces ===
{| class=wikitable
! Traditional method !! Array notation !! Description
|-
|<syntaxhighlight lang=apl>⎕NS⍬</syntaxhighlight>
|<syntaxhighlight lang=apl>()</syntaxhighlight>
|Empty namespace.
|-
|<syntaxhighlight lang=apl>⎕NS¨⍬⍬⍬</syntaxhighlight>or<syntaxhighlight lang=apl>(⎕NS⍬)(⎕NS⍬)(⎕NS⍬)</syntaxhighlight>
|<syntaxhighlight lang=apl>()()()</syntaxhighlight>
|Vector of namespaces.
|-
|<syntaxhighlight lang=apl>z←⎕NS⍬
z.x←'hello'</syntaxhighlight>
|<syntaxhighlight lang=apl>(x:'hello')</syntaxhighlight>
|Namespace with character vector member.
|-
|<syntaxhighlight lang=apl>z←⎕NS⍬
z.x←⍉⍪'hello'
z.x⍪← 'world'</syntaxhighlight>
|<syntaxhighlight lang=apl>(x:['hello'
    'world'])</syntaxhighlight>
|Namespace with character matrix member.
|-
|<syntaxhighlight lang=apl>z←⎕NS⍬
z.y←⎕NS⍬
z.y.x←⍉⍪'hello'
z.y.x⍪← 'world'</syntaxhighlight>
|<syntaxhighlight lang=apl>(y:(x:['hello'
      'world']))</syntaxhighlight>
|Nested namespace structure with matrix member.
|-
|<syntaxhighlight lang=apl>z←⎕NS⍬
z.f←+
a←⎕NS⍬
a.f←-
z,←a
a←⎕NS⍬
a.f←×
z,←a
a←⎕NS⍬
a.f←÷
z,←a
z←z.f</syntaxhighlight>
|<syntaxhighlight lang=apl>((f:+)(f:-)(f:×)(f:÷)).f</syntaxhighlight>
|[[Function array]].
|}
[[File:Array notation syntax.png|thumb|right|[[wikipedia:Railroad diagram|Railroad diagram]].]]
 
== Specification ==
The notation consists of syntax that was invalid before its introduction, thus causing no issues for [[backwards compatibility]]. The added syntax consists of three constructs that are currently [[SYNTAX ERROR]]s:
 
* ''broken'' round parentheses: <syntaxhighlight lang=apl inline>(</syntaxhighlight>…<syntaxhighlight lang=apl inline>)</syntaxhighlight>
* ''broken'' square brackets: <syntaxhighlight lang=apl inline>[</syntaxhighlight>…<syntaxhighlight lang=apl inline>]</syntaxhighlight>
* empty round parentheses: <syntaxhighlight lang=apl inline>()</syntaxhighlight>
 
where ''broken'' means interrupted by one or more [[diamond]]s (<syntaxhighlight lang=apl inline>⋄</syntaxhighlight>) or line breaks (outside of [[dfn]]s).


* A ''broken'' round parenthesis creates a [[namespace]] if every diamond/line break-separated statement is a ''name-value pair''.
* A ''broken'' round parenthesis creates a [[vector]] if every diamond/line break-separated statement is a value expression. In that case, every such statement forms an [[element]] in the resulting vector.
* A ''broken'' square bracket creates an [[array]] where every diamond/line break-separated statement forms a [[major cell]] in the resulting array.
* <span id=minrank1>A ''broken'' square bracket creates an [[array]] where every diamond/line break-separated statement forms a [[major cell]] in the resulting array.[[#minrank1note|*]]</span>
* <source lang=apl inline>()</source> is equivalent to <source lang=apl inline>(⎕NS 0⍴⊂'')</source>
* <syntaxhighlight lang=apl inline>()</syntaxhighlight> creates a new namespace — equivalent to <syntaxhighlight lang=apl inline>(⎕NS 0⍴⊂'')</syntaxhighlight>
* A ''name-value pair'' consists of a valid APL identifier, followed by a <source lang=apl inline>:</source> and a value expression.
* A ''name-value pair'' consists of a valid APL identifier, followed by a colon (<syntaxhighlight lang=apl inline>:</syntaxhighlight>) and a value expression.
 
<span id=minrank1note>[[#minrank1|*]]</span> This rule is followed strictly in [[dzaima/APL]], while [[Dyalog APL]] considers each statement to have a rank of at least 1, even if it is a [[scalar]].
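
The rules above can be illustrated with a minimal sketch, assuming an interpreter that implements array notation. The variable names are illustrative only, and the traditional expression in each comment is the value the form is expected to match according to the rules as stated here:

<syntaxhighlight lang=apl>
v←(1 2 ⋄ 3 4)       ⍝ broken parentheses with value statements: vector, same as (1 2)(3 4)
m←[1 2 ⋄ 3 4]       ⍝ broken brackets: each statement is a major cell, same as 2 2⍴1 2 3 4
c←[10 ⋄ 20]         ⍝ scalar statements: column matrix ⍪10 20 under the Dyalog rule,
                    ⍝ but the vector 10 20 under the strict dzaima/APL rule
ns←(x:1 ⋄ y:'ab')   ⍝ every statement is a name-value pair: namespace with members x and y
e←()                ⍝ empty parentheses: new empty namespace, same as ⎕NS 0⍴⊂''
</syntaxhighlight>

The third line is the only one whose result depends on the implementation, which is the difference described in the footnote.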


=== Formal syntax ===


== History ==
[[File:Nested Arrays System array notation.png|thumb|right|Array notation in [[NARS]].]]
:''See also the [[Array notation design considerations#Timeline]]''
=== 1981 ===
When [[NARS]] introduced [[nested array theory]], the need for a way to represent complex structures was already recognised. Though a formal notation was never adopted, much less implemented, the NARS reference manual included round parentheses to delimit nested arrays.<ref>Cheney, Carl M. ''APL*PLUS Nested Arrays System'' (reference manual). [http://www.sudleyplace.com/APL/Nested%20Arrays%20System.pdf#page=7 1.1 What are nested arrays?] [[STSC]]. 1981.</ref>
 
=== 2014 ===
At [[Dyalog '14]], [[Morten Kromberg]] said:
:''The emphasis on using scripts to store source code means that it's probably time for us to come up with a notation for constants in the language so that in your script you can declare matrices and so on in a nice readable fashion.''
Although no concrete proposal was made at the time, he set the expectation of this being the subject of a presentation the following year.<ref>[[Morten Kromberg|Kromberg, Morten]]. [https://dyalog.tv/Dyalog14/?v=rRRyDWaU1fA Technical Road Map]. [[Dyalog '14]].</ref>
 
=== 2015 ===
At [[Dyalog '15]], [[Phil Last]] explained that he considered the lack of such a notation a big hole in APL notation and gave a suggestion for such a notation. He presented a model using square brackets to indicate collections of [[major cell]]s of [[rank]] 1 or higher, delimited by line breaks and/or [[diamond]]s, for example <source lang=apl inline>[1 2 3 ⋄ 4 5 6]</source> would be equivalent to <source lang=apl inline>2 3⍴1 2 3 4 5 6</source>. He also proposed that if the delimited expressions were [[assignment]]s, then the notation would instead declare members of an anonymous [[namespace]], for example <source lang=apl inline>[a←3 ⋄ b←6]</source>. He pointed out that this overloading of the symbols meant that the array notation could only represent constants, as allowing general expressions would lead to ambiguity. He also mentioned that doubled symbols or [[Unicode]] brackets could be used instead.<ref>[[Phil Last|Last, Phil]]. [https://dyalog.tv/Dyalog15/?v=9-HAvTMhYao APL Array Notation] ([https://www.dyalog.com/uploads/conference/dyalog15/presentations/U07_APL_Array_Notation.pdf transcript]). [[Dyalog '15]].</ref>
 
After the presentation, Phil Last had a conversation with [[Adám Brudzewsky]] who had recently joined [[Dyalog Ltd.]], the [[language developer|language developer]] of [[Dyalog APL]], and who was inspired to begin an internal Dyalog research project on the matter. Meanwhile, Acre Desktop, a project manager that Last co-develops, moved from storing APL items in [[component file]]s to storing them in text files, necessitating a literal notation for arrays, and his notation for arrays was adopted. Acre stores unscripted namespaces as directories, so the need for a literal namespace notation only arises when a namespace is an element in a larger array, something that is quite unlikely for application constants.
 
=== 2016 ===
Phil Last published a more formal proposal in the [[Vector Journal]]. Again, the notation was only described as a serialisation format; not as an integral part of the language. He added escape sequences to [[string]]s, further distancing the notation from compatibility with existing APL code.<ref>Last, Phil. [http://archive.vector.org.uk/art10501450 A Notation for APL array Embedding and Serialization]. Vector Journal, Volume 26, number 4. [[British APL Association]]. 2016.</ref>
 
[[File:D11 Literal Notation for Arrays and Namespaces - Summary of notations.png|thumb|right|Array notation at [[Dyalog '17]].]]
===2017===
At [[Dyalog '17]], Adám Brudzewsky proposed an alternative notation using round parentheses to indicate collections of major cells of any rank, thus allowing the notation to express [[nested]] vectors through [[scalar]] major cells, for example <source lang=apl inline>(⊂1 2 3 ⋄ ⊂4 5 6)</source> would be equivalent to <source lang=apl inline>(1 2 3)(4 5 6)</source>. This notation had a striking similarity to the informal notation used in the [[NARS]] reference manual over 35 years prior. For namespaces, he proposed using colon (<source lang=apl inline>:</source>) to delimit [[wikipedia:name-value pair|name-value pair]]s, inspired by [[wikipedia:JSON|JSON]] in which colon is used in the same manner, despite assignment being denoted by <source lang=javascript inline>=</source> in [[wikipedia:JavaScript|JavaScript]], from which JSON was derived. This distinction allowed arbitrary expressions in arrays, opening the possibility of full integration into the language, while also allowing a namespace with no members to be denoted <source lang=apl inline>()</source>. Last's proposal required <source lang=apl inline>[:]</source> to distinguish it from [[bracket indexing]] into a vector while eliding the indices, a technique used to address all [[element]]s.
 
In addition to the main array notation, Brudzewsky also proposed allowing line breaks between quotes in [[string]]s to represent a vector of character vectors (with leading and trailing spaces trimmed).<ref>[[Adám Brudzewsky|Brudzewsky, Adám]]. [https://dyalog.tv/Dyalog17/?v=CRQNzL8cUQE Literal Notation for Arrays and Namespaces]. [[Dyalog '17]]</ref> While not included in the live presentation, Brudzewsky's slide deck included a discussion of whether expressions resulting in a scalar should be treated as [[singleton]] vectors or not. It concluded that if they were treated as [[vector]]s, then an alternative notation in the form of a [[wikipedia:line continuation|line continuation]] character would be necessary to allow writing large vectors over multiple lines of code.<ref>Brudzewsky, Adám [https://www.dyalog.com/uploads/conference/dyalog17/presentations/D11_Literal_Notation_for_Arrays_and_Namespaces.pdf Literal Notation for Arrays and Namespaces] (slides). [[Dyalog '17]]</ref>
[[File:D04 Array Notation Mk III - Summary - Arrays.png|thumb|right|Array notation at [[Dyalog '18]].]]
===2018===
At [[Dyalog '18]], Adám Brudzewsky returned with a solution to the issue of whether scalars should be regarded as 1-element vectors (thus increasing the rank of the containing array) or left as scalars (thus forming a vector). He reintroduced square brackets as collections of major cells of rank 1 or higher, repurposing round parentheses as vectors.
 
The namespace notation remained as before, using round parentheses so the empty namespace could be written in a consistent manner, but he presented formalised scoping rules for the value expressions, namely that these would run in the surrounding namespace, but within their own scope, so that any assignment made during such an expression would be local to it. For example, <source lang=apl inline>(a:b,b←1 2)</source> would neither populate the new namespace with a member <source lang=apl inline>b</source>, nor create such a variable in the global scope.<ref>Brudzewsky, Adám. [https://dyalog.tv/Dyalog18/?v=GAdQuOtPcfM Array Notation Mk III]. [[Dyalog '18]].</ref> Acre quickly adopted this notation.
 
[[File:D09 Array Notation RC1 - Questions.png|thumb|right|Array notation at [[Dyalog '20]]]]
===2020===
In the spring of 2020, [[dzaima/APL]] adopted the proposed array notation with the exception of forcing the result of statements in square brackets to rank 1 or higher.<ref>Stack Exchange user [https://codegolf.stackexchange.com/users/59183/dzaima dzaima]. [https://github.com/dzaima/APL dzaima/APL]. Git commit "[https://github.com/dzaima/APL/commit/dfebe5de3699b2e3f838a60f72c6b9a9f66317e7 <source lang=apl inline>[1 2⋄3 4]</source>,  <source lang=apl inline>⎕AV</source>,]". GitHub.</ref>
 
At [[Dyalog '20]], Adám Brudzewsky presented the notation as ''Release Candidate 1'' and showed how [[Dyalog APL 18.0]]'s updated version of [https://github.com/Dyalog/link/wiki Link] (a simple interface for using source code in text files, synchronising the file system and the [[workspace]]) includes experimental support for the array notation, including a facility to use multi-line array notation inside functions. He estimated that [[Dyalog APL 20.0]] would include native interpreter support for the notation in 2022.
 
== Design considerations ==
 
In creating the notation's specification, various alternatives were considered. The following requirements were proposed:<ref>[[Adám Brudzewsky]]. Internal documents. [[Dyalog Ltd.]] 30 Jun 2017.</ref>
 
# No new [[glyph]]s
# Reusing existing glyphs for similar purposes
# Similarity to other languages ([[K]], [[wikipedia:JSON|JSON]], [[wikipedia:CSS|CSS]])
# Visual attractiveness
# Intuitive syntax
# As little [[wikipedia:syntactic sugar|syntactic sugar]] as possible
 
=== Glyphs ===
 
The design requirement for no new glyphs was contentious, and both [[bi-glyph]] and non-ASCII brackets were considered. Bi-glyphs were rejected out of readability concerns, especially when nested. For example, <source lang=apl inline>1 1 3⍴2</source> could have been written as <source lang=apl inline>[[[[2 2 2]]]]</source>. Non-ASCII brackets were rejected for font and keyboarding reasons, as well as to make it easier for non-APL systems to generate APL data. For example, <source lang=apl inline>⟦</source>…<source lang=apl inline>⟧</source> was proposed to denote a collection of [[major cells]], forming a new array whose rank is one higher than that of the highest-[[rank]] constituent [[cell]]. However, few [[fonts]] support these glyphs.
 
The eventual choice was to go with existing symbols, and this had important implications for the specifics of the notation. While ideally, a notation would have been introduced for a collection of major cells, thereby handling both vectors and higher-rank arrays, a problem presents itself with [[axis|axes]] of length 1, because both square brackets and round parentheses already have meaning when surrounding a single statement (namely [[function axis]]/[[bracket indexing]] and [[precedence]]/[[function train]]s). Thus, while <source lang=apl inline>2 ⟦3⟧</source> could have denoted the [[nested array]] <source lang=apl inline>2 (1⍴3)</source>, this isn't viable with <source lang=apl inline>2 [1⍴3]</source> because this already denotes indexing <source lang=apl inline>2</source> using the indices <source lang=apl inline>1⍴3</source>. To disambiguate, at least one statement separator or line break must be present in each level of array notation brackets and parentheses.
 
=== Minimum rank of major cells ===
 
While <source lang=apl inline>⟦⟦3⟧⟧</source> could denote <source lang=apl inline>1 1⍴3</source> using non-ASCII glyphs, an equivalent ASCII scheme instead would have required <source lang=apl inline>[[3⋄]⋄]</source> where the inner bracket creates a vector, and the outer creates a [[matrix]]. Using line breaks instead of diamonds, it was found to be counter-intuitive that <source lang=apl>[
3
5
  ]</source> was to denote a two-[[element]] vector while <source lang=apl>[
3 4
5 6
    ]</source> would be a two-row matrix. Therefore, a special rule was added to the effect that in such collections of major cells, every cell would be considered to have a rank of at least 1, even if it was a [[scalar]].
 
In turn, this choice introduced the need for a separate notation to allow vectors to be written over multiple lines, and therefore round parentheses were extended from their traditional use in [[strand notation]] to also denote a collection of [[enclose]]d elements.


=== Name-value pairs ===
One-dimensional list syntax with surrounding brackets and delimiters, matching [[wikipedia:sequence|sequence]] notation in mathematics, is common in programming. It appears as early as [[wikipedia:ALGOL 68|ALGOL 68]] with parentheses, and square-bracket lists feature in languages from the 1970s such as [[wikipedia:ML (programming language)|ML]] and [[wikipedia:Icon (programming language)|Icon]]. [[MATLAB]] uses matrix syntax with square brackets, semicolons to separate rows, and commas to separate elements within a row. [[wikipedia:FP (programming language)|FP]] uses angle brackets for lists, and square brackets for function "construction", with behavior like [[function array]]s.


As a notation for [[namespace]]s, several details were debated:
List notation appears in [[Nial]] using brackets and commas like <syntaxhighlight lang=apl inline>[a,b,c]</syntaxhighlight>, and allowing function arrays called "atlases". [[A+]] and [[K]] have a list notation using parentheses and semicolons like <syntaxhighlight lang=apl inline>(a;b;c)</syntaxhighlight>. In A+ this is related to [[bracket indexing]] and an "expression group" notation written with curly braces and semicolons. It allows line breaks, but in addition to rather than in place of semicolons. The later K version corresponds more closely to APL: the semicolon is a statement separator and is interchangeable with a line break, and because K represents arrays with nested lists, it corresponds to both vector and high-rank array notation.


# Whether to use <source lang=apl inline>⋄</source> or <source lang=apl inline>;</source> to separate [[wikipedia:name-value pair|name-value pair]]s (in addition to line breaks)
The first published proposals that influenced [[Dyalog APL]]'s array notation were made by [[Phil Last]] at [[Dyalog '15]] and later in [[Vector Journal]].<ref>[[Phil Last]]. [https://dyalog.tv/Dyalog15/?v=9-HAvTMhYao APL Array Notation] ([https://www.dyalog.com/uploads/conference/dyalog15/presentations/U07_APL_Array_Notation.pdf transcript]). [[Dyalog '15]].</ref><ref>[[Phil Last]]. [http://archive.vector.org.uk/art10501450 A Notation for APL array Embedding and Serialization]. [[Vector Journal]], Volume 26, number 4. [[British APL Association]]. 2016.</ref> Last cited the syntax of [[dfn]]s as a sequence of expressions with enclosing braces, as well as [[APL#]]'s namespace notation enclosed in double brackets <code>[[]]</code>, as precursors. He also used the design in Acre Desktop, a project manager for Dyalog APL, to support storing constant arrays and namespaces in text files. Following the conference presentation, [[Adám Brudzewsky]] began work on array notation and presented on it in a series of conferences, initially using parentheses for the high-rank notation<ref>[[Adám Brudzewsky]]. [https://dyalog.tv/Dyalog17/?v=CRQNzL8cUQE Literal Notation for Arrays and Namespaces] ([https://www.dyalog.com/uploads/conference/dyalog17/presentations/D11_Literal_Notation_for_Arrays_and_Namespaces.pdf slides]). [[Dyalog '17]].</ref> and later returning to square brackets.<ref>[[Adám Brudzewsky]]. [https://dyalog.tv/Dyalog18/?v=GAdQuOtPcfM Array Notation Mk III]. [[Dyalog '18]].</ref><ref>[[Adám Brudzewsky]]. [https://apl-germany.de/wp-content/uploads/2021/11/APL_Journal_2020_1u2.pdf#page=34 A Notation for APL Arrays]. APL-Journal, Volume 2020, number 1-2. 
[[APL Germany|APL-Germany e.V.]] 2020.</ref> Because Last's use of <syntaxhighlight lang=apl inline>←</syntaxhighlight> to separate namespace keys from values prevented lists from including arbitrary expressions (which might contain assignment), Brudzewsky proposed a change to <syntaxhighlight lang=apl inline>:</syntaxhighlight> as in [[wikipedia:JSON|JSON]]. [[Dyalog APL 18.0]], released in 2020, included support for array notation in source files loaded by Link<ref>[[Dyalog Ltd]]. [https://dyalog.github.io/link/3.0/Discussion/TechDetails/#creating-apl-source-files-and-directories Link User Guide: Creating APL Source Files and Directories]. Retrieved 2022-08-24.</ref>, but not in the language itself.<ref>[[Adám Brudzewsky]]. [https://dyalog.tv/Dyalog20/?v=5drncJiWOM4 Array Notation RC1] ([https://www.dyalog.com/uploads/conference/dyalog20/presentations/D09_Array_Notation_RC1.pdf slides]). [[Dyalog '20]].</ref>
# Which enclosure glyphs to use, <source lang=apl inline>(</source><source lang=apl inline>)</source> or <source lang=apl inline>[</source><source lang=apl inline>]</source>
# Which glyph should separate the name from the value, <source lang=apl inline>:</source> or <source lang=apl inline>←</source>
# In which scope the value expressions should be evaluated


The <source lang=apl inline>⋄</source> was chosen to separate name-value pairs, as it is generally exchangeable with a line break, while <source lang=apl inline>;</source> is not, though it is used to separate names, without values, in [[Defined_function_(traditional)#Semi-colons|headers]] and in [[locals lines]]. Furthermore, it was seen as natural that the values would be computed in reading order (left-to-right) just like multiple statements are, and while <source lang=apl inline>⋄</source> would imply this, <source lang=apl inline>;</source> wouldn't. Indeed, in the statement <source lang=apl inline>A[B;C]</source>, expression <source lang=apl inline>C</source> is evaluated before expression <source lang=apl inline>B</source>. It was briefly considered to have values computed from the right, just like stranding is, but this was rejected because replacing the semi-colons with line breaks would then require evaluation beginning with the last line and working upwards!
The project manager Acre Desktop added support for the non-namespace parts of the notation in early 2018, together with Phil Last's original namespace notation, using square brackets and assignment arrow. [[dzaima/APL]] added support for vector notation with parentheses in 2018, namespaces and function arrays in 2019, and high-rank arrays with square brackets in 2020. [[BQN]] supported lists with angle brackets (<code>⟨</code><code>⟩</code>) in its initial implementation in 2020; square brackets (<code>[</code><code>]</code>) were reserved for high-rank array notation, which was implemented in 2022.


Round parentheses were chosen because namespaces are seen as (unordered) lists, and so are more similar to vectors than higher-rank arrays. Furthermore, <source lang=apl inline>[]</source> already had meaning (indexing all elements of a vector) while <source lang=apl inline>()</source> didn't have any existing use, and so could be used to denote a new empty namespace, equivalent to <source lang=apl inline>⎕NS 0⍴⊂''</source>.
On April 21, 2023, Dyalog Ltd published a blog post by Morten Kromberg announcing to the [[community]] the formal proposal for an APL array notation<ref name=formprop>Kromberg, Morten. [https://www.dyalog.com/blog/2023/04/formal-proposal-for-apl-array-notation-seeking-feedback/ Formal Proposal for APL Array Notation – Seeking Feedback]. Formal Proposal for APL Array Notation – Seeking Feedback. April 21, 2023.</ref> and by May 5, the specification for scoping in namespaces was changed due to feedback from Dyalog Ltd employee Peter Mikkelsen: Assignments inside value expressions would now affect the surrounding scope rather than having [[dfn]]-like auto-localisation, which can instead be achieved by wrapping the expression in an anonymous dfn.  


While initially, <source lang=apl inline>←</source> was seen as the obvious choice to separate the name and the value, it was soon discovered that a namespace with only one member would be indistinguishable from a parenthesised [[assignment]]. Furthermore, it was noted that value expressions could contain intermediary assignments, and that such assignments were of a fundamentally different nature from the name-value declaration. The intermediary assignments would happen in a temporary scope, with any created variables disappearing once the namespace member value was established.
{{Template:Comparison of array notations}}


Value expressions could be evaluated in the newly established namespace (similar to expressions in <source lang=apl inline>:Namespace</source> scripts), or in the surrounding scope (similar to inline expressions in [[wikipedia:JavaScript|JavaScript]]'s object notation). It was envisioned that a main usage of the literal notation would be to collect existing values into a namespace, and evaluating inside the new namespace would force the use of <source lang=apl inline>##.</source> to fetch values in the surrounding scope. In a departure from JavaScript, it was found most natural that such intermediate assignments be local to the value expression, similar to assignments in dfns. Global assignment is still available using <source lang=apl inline>⎕THIS.name←value</source>, just as in dfns.
== Documentation ==
* [https://mlochbaum.github.io/BQN/doc/arrayrepr.html#array-literals BQN] (as <code>⟨⋄⟩</code>, <code>[]</code>, and <code>{key⇐val⋄}</code>)
* [https://www.nial-array-language.org/ndocs/NialDict2.html#bracket-comma-notation Nial] (as <code>[,]</code> for vectors)
== External links ==
* [https://abrudz.github.io/aplan/Formal%20Proposal%20%E2%80%94%20APL%20Array%20Notation.pdf Formal Proposal] document
* [https://abrudz.github.io/aplan Evaluate APL Array Notation] sandbox


== References ==
<references/>
{{APL syntax}}[[Category:APL syntax]][[Category:Nested array model]]

Latest revision as of 04:23, 21 September 2023

(⋄) [⋄]

Array notation ((⋄), [⋄]), abbreviated APLAN parallel to JSON, is a way to write most arrays literally, with no or minimal use of primitive functions, possibly over multiple code lines. It differs from the strand notation existing since APL\360 in that it can be used to write arrays of rank greater than one. Array notation is supported in dzaima/APL, BQN (using angle brackets ⟨⋄⟩ instead of round parentheses (⋄)), and some tools for Dyalog APL, where it is planned as an eventual language feature.

Array notation generally consists of a vector notation written with parentheses (), roughly equivalent to stranding, and a high-rank notation using square brackets [], indicating the Mix of a vector. It also supports namespaces, using name:value syntax in round parentheses. Statement separators must appear between elements and between name–value pairs.

Examples

Medium-sized array constants are often needed in code. Due to the lack of a native multi-line notation, programmers have resorted to various ad-hoc methods of approximating such, usually at the cost of reduced readability. A very common technique is repeated concatenation resulting in the desired value being held in a variable (z in the below examples), as opposed to array notation which can express the final value directly. In addition, the traditional technique sometimes involves the creation of helper variables as a side effect.

Basic arrays

Traditional method Array notation Description
(0 6 1 8)(1 4 1 4 2)(2 7 1 8 2 8)(3 1 4 1 5)
(0 6 1 8 ⋄ 1 4 1 4 2 ⋄ 2 7 1 8 2 8 ⋄ 3 1 4 1 5)
Vector of numeric vectors on a single line.
z← (0 6 1 8)(1 4 1 4 2)
z,←(2 7 1 8 2 8)(3 1 4 1 5)
(0 6 1 8 ⋄ 1 4 1 4 2
 2 7 1 8 2 8 ⋄ 3 1 4 1 5)
Vector of numeric vectors split over two lines.
z←,⊂'Three'
z,←⊂'Blind'
z,←⊂'Mice'
('Three'
 'Blind'
 'Mice')
Vector of character vectors, one on each line. (The traditional method includes an unnecessary , to indicate that z will be a vector.)
z←⍉⍪0 6 1 8
z⍪← 1 4 1 4
z⍪← 2 7 1 8
z⍪← 3 1 4 2
[0 6 1 8
 1 4 1 4
 2 7 1 8
 3 1 4 2]
Numeric matrix.
z←⍪10
z⍪←20
z⍪←30
z⍪←40
[10
 20
 30
 40]
Column matrix.

Involved arrays

Traditional method Array notation Description
a←⍉⍪0 0 1
a⍪← 1 0 1
a⍪← 0 1 1
z←,⊂a
a←⍉⍪0 1 1
a⍪← 1 1 0
a⍪← 0 1 0
z,←⊂a
a←⍉⍪0 1 1 1
a⍪← 1 1 1 0
z,←⊂a
a←⍉⍪0 1 1 0
a⍪← 1 0 0 1
a⍪← 0 1 1 0
z,←⊂a
([0 0 1
 1 0 1
 0 1 1]

 [0 1 1
  1 1 0
  0 1 0]

 [0 1 1 1
  1 1 1 0]

 [0 1 1 0
  1 0 0 1
  0 1 1 0])
Vector of matrices.
z←⍉⍪0 'OK'
z⍪← 1 'WS FULL'
z⍪← 2 'SYNTAX ERROR'
z⍪← 3 'INDEX ERROR'
z⍪← 4 'RANK ERROR'
[0 'OK'
 1 'WS FULL'
 2 'SYNTAX ERROR'
 3 'INDEX ERROR'
 4 'RANK ERROR']
Table with numeric and text columns.
a←⍉⍪3 1 4
a⍪← 1 5 0
a←↑a
b←⍉⍪2 7 0
b⍪← 2 0 0
z←a,[0.5] b
[[3 1 4
  1 5 0]
 [2 7 0
  2 0 0]]
Rank 3 numeric array.
a←,⊂3 1 4
a,←⊂1 5
a←↑a
b←,⊂2 7
b,← 2
b←↑b
z←↑a b
[[3 1 4
  1 5]
 [2 7
  2]]
Rank 3 numeric array relying on automatic padding with fill element.
z←⍉⍪'fns'  ((0 1)(0.7 0)(0.7 0)×size)
z⍪← 'fnd'  ((0 1)(0   0)(0   0)×size)
z⍪← 'lines'((0 0)(0.7 0)(0.7 0)×size)
z⍪← 'lnd'  ((0 0)(0   0)(0   0)×size)
['fns'  ((0 1 ⋄ 0.7 0 ⋄ 0.7 0)×size)
 'fnd'  ((0 1 ⋄ 0   0 ⋄ 0   0)×size)
 'lines'((0 0 ⋄ 0.7 0 ⋄ 0.7 0)×size)
 'lnd'  ((0 0 ⋄ 0   0 ⋄ 0   0)×size)]
Matrix of simple and nested vectors, with dynamic values.

Namespaces

Traditional method Array notation Description
⎕NS⍬
()
Empty namespace.
⎕NS¨⍬⍬⍬
or
(⎕NS⍬)(⎕NS⍬)(⎕NS⍬)
()()()
Vector of namespaces.
z←⎕NS⍬
z.x←'hello'
(x:'hello')
Namespace with character vector member.
z←⎕NS⍬
z.x←⍉⍪'hello'
z.x⍪← 'world'
(x:['hello'
    'world'])
Namespace with character matrix member.
z←⎕NS⍬
z.y←⎕NS⍬
z.y.x←⍉⍪'hello'
z.y.x⍪← 'world'
(y:(x:['hello'
       'world']))
Nested namespace structure with matrix member.
z←⎕NS⍬
z.f←+
a←⎕NS⍬
a.f←-
z,←a
a←⎕NS⍬
a.f←×
z,←a
a←⎕NS⍬
a.f←÷
z,←a
z←z.f
((f:+)(f:-)(f:×)(f:÷)).f
Function array.

Specification

The notation consists of syntax that was invalid before its introduction, thus causing no issues for backwards compatibility. The added syntax consists of three constructs that are currently SYNTAX ERRORs:

  • broken round parentheses: ()
  • broken square brackets: []
  • empty round parentheses: ()

where broken means interrupted by one or more diamonds (⋄) or line breaks (outside of dfns).

  • A broken round parenthesis creates a namespace if every diamond/line break-separated statement is a name-value pair.
  • A broken round parenthesis creates a vector if every diamond/line break-separated statement is a value expression. In that case, every such statement forms an element in the resulting vector.
  • A broken square bracket creates an array where every diamond/line break-separated statement forms a major cell in the resulting array.*
  • () creates a new namespace — equivalent to (⎕NS 0⍴⊂'')
  • A name-value pair consists of a valid APL identifier, followed by a colon (:) and a value expression.

* This rule is followed strictly in dzaima/APL, while Dyalog APL considers each statement to have a rank of at least 1, even if it is a scalar.
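
The rules above can be illustrated with a minimal sketch. It assumes an interpreter that accepts array notation (for example dzaima/APL); the variable names are purely illustrative and the traditional equivalents are given in the comments.

```apl
v ← (1 2 ⋄ 'abc' ⋄ 3)    ⍝ three value expressions: 3-element vector, like (1 2)'abc' 3
m ← [1 2 3 ⋄ 4 5 6]      ⍝ two statements as major cells: 2×3 matrix, like ↑(1 2 3)(4 5 6)
n ← (x:1 ⋄ y:'two')      ⍝ every statement is a name-value pair: namespace with members x and y
e ← ()                   ⍝ empty namespace, like ⎕NS 0⍴⊂''
c ← [10 ⋄ 20]            ⍝ scalar statements: per the footnote, a 2-element vector in dzaima/APL
                         ⍝ but a 2×1 column matrix under the Dyalog APL rule
```

The last line is where the two interpretations diverge, so code meant to be portable may prefer to write the rank explicitly, for instance with one-element vectors as the statements.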

Formal syntax

The array notation can be described using Extended Backus–Naur form, where an expression is any traditional APL expression:

value    ::= expression | list | block | space
list     ::= '(' ( ( value sep )+ value? | ( sep value )+ sep? ) ')'
block    ::= '[' ( ( value sep )+ value? | ( sep value )+ sep? ) ']'
space    ::= '(' sep? ( name ':' value ( sep name ':' value )* )? sep? ')'
sep      ::= [⋄#x000A#x000D#x0085]+
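
Read literally, the grammar allows leading and trailing separators, which is how a one-element list or a single-row block can be written. The following lines are a sketch of inputs that each production would accept, assuming an implementation that follows the grammar as given:

```apl
(42 ⋄)        ⍝ list: a value followed by a separator, giving a one-element vector
(⋄ 42)        ⍝ list: a separator before the value, accepted by the second alternative
[1 2 3 ⋄]     ⍝ block: a single major cell, giving a one-row matrix
(x:1 ⋄)       ⍝ space: a trailing separator after the last name-value pair
(42)          ⍝ no separator at all: an ordinary parenthesised expression, not a list
```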

History

See also: Array notation design considerations, section "Timeline"

One-dimensional list syntax with surrounding brackets and delimiters, matching sequence notation in mathematics, is common in programming. It appears as early as ALGOL 68 with parentheses, and square-bracket lists feature in languages from the 1970s such as ML and Icon. MATLAB uses matrix syntax with square brackets, semicolons to separate rows, and commas to separate elements within a row. FP uses angle brackets for lists, and square brackets for function "construction", with behavior like function arrays.

List notation appears in Nial using brackets and commas like [a,b,c], and allowing function arrays called "atlases". A+ and K have a list notation using parentheses and semicolons like (a;b;c). In A+ this is related to bracket indexing and an "expression group" notation written with curly braces and semicolons. It allows line breaks, but in addition to rather than in place of semicolons. The later K version corresponds more closely to APL: the semicolon is a statement separator and is interchangeable with a line break, and because K represents arrays with nested lists, it corresponds to both vector and high-rank array notation.

The first published proposals that influenced Dyalog APL's array notation were made by Phil Last at Dyalog '15 and later in Vector Journal.[1][2] Last cited the syntax of dfns as a sequence of expressions with enclosing braces, as well as APL#'s namespace notation enclosed in double brackets [[]], as precursors. He also used the design in Acre Desktop, a project manager for Dyalog APL, to support storing constant arrays and namespaces in text files. Following the conference presentation, Adám Brudzewsky began work on array notation and presented on it in a series of conferences, initially using parentheses for the high-rank notation[3] and later returning to square brackets.[4][5] Because Last's use of ← to separate namespace keys from values prevented lists from including arbitrary expressions (which might contain assignment), Brudzewsky proposed a change to : as in JSON. Dyalog APL 18.0, released in 2020, included support for array notation in source files loaded by Link[6], but not in the language itself.[7]

The project manager Acre Desktop added support for the non-namespace parts of the notation in early 2018, together with Phil Last's original namespace notation, using square brackets and assignment arrow. dzaima/APL added support for vector notation with parentheses in 2018, namespaces and function arrays in 2019, and high-rank arrays with square brackets in 2020. BQN supported lists with angle brackets (⟨⟩) in its initial implementation in 2020; square brackets ([]) were reserved for high-rank array notation, which was implemented in 2022.

On April 21, 2023, Dyalog Ltd published a blog post by Morten Kromberg announcing the formal proposal for an APL array notation to the community.[8] By May 5, the specification for scoping in namespaces had been changed in response to feedback from Dyalog Ltd employee Peter Mikkelsen: assignments inside value expressions now affect the surrounding scope rather than having dfn-like auto-localisation, which can instead be achieved by wrapping the expression in an anonymous dfn.
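
The effect of the scoping change can be sketched as follows. This is an illustration under the revised rule as described above, not output from a particular interpreter; the names ns, sum, and v are hypothetical.

```apl
ns ← (sum: +/v←1 2 3)       ⍝ revised rule: v is assigned in the surrounding scope as well
ns ← (sum: {+/v←1 2 3}⍬)    ⍝ wrapping in an anonymous dfn keeps v local to the value expression
```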

Comparison of array notations

The following systems support list or vector notation in some form, beyond simple strand notation. The separators ; in A+ and K, and ⋄ in APL and BQN, indicate any separator, including a line break.

System            Vectors  High-rank  Namespaces    Function arrays  Assignable
Nial              [,]      No         N/A           Special          No
A+                (;)      No         N/A           First-class      Yes
K                 (;)      N/A        [key:val;]    First-class      Yes
BQN[9]            ⟨⋄⟩      [⋄]        {key⇐val⋄}    First-class      Yes
dzaima/APL        (⋄)      [⋄]        (key:val⋄)    Special          No
Dyalog Link       (⋄)      [⋄]        (key:val⋄)    No               No
Acre Desktop[10]  (⋄)      [⋄]        [key←val⋄]    No               N/A
TinyAPL           ⟨⋄⟩      [⋄]        ⦃key←val⋄⦄    First-class      Yes

Nial and A+ do not support namespaces, while K does not support high-rank arrays, so any such notation is not applicable. The "Function arrays" column indicates whether functions can be placed in array notation. "First-class" indicates that functions are first class, so this is possible without special consideration. "Special" indicates that the notation creates a special vector of functions that can be applied to arguments to return a list of results. The "Assignable" column indicates whether array notation can be used as an assignment target to perform destructuring. BQN's namespaces don't use a dedicated construction; instead, any block (like a dfn) with statements returns a namespace reference. Acre Desktop only uses array notation for storing literal arrays; it cannot appear in executable code.

Documentation

  • BQN (as ⟨⋄⟩, [⋄], and {key⇐val⋄})
  • Nial (as [,] for vectors)

External links

References
