This article details the design considerations for [[array notation]] in APL. It is also intended to solicit feedback via the [[{{TALKPAGENAME}}|Discussion page]]. Feedback from other media will also be posted to that page.
== Objectives ==
The following requirements were proposed as objectives for an APL array notation:<ref>[[Adám Brudzewsky]]. Internal documents. [[Dyalog Ltd.]] 30 Jun 2017.</ref>
# No new [[glyph]]s
# Reusing existing glyphs for similar purposes
# Similarity to other languages ([[K]], [[wikipedia:JSON|JSON]], [[wikipedia:CSS|CSS]])
# Visual attractiveness
# Intuitive syntax
# As little [[wikipedia:syntactic sugar|syntactic sugar]] as possible
== Specific considerations ==
Various alternatives were considered; the following sections detail each design decision.
=== Glyphs ===
The design requirement for no new glyphs was contentious, and both [[bi-glyph]] and non-ASCII brackets were considered. Bi-glyphs were rejected out of readability concerns, especially when nested. For example, <source lang=apl inline>1 1 3⍴2</source> could have been written as <source lang=apl inline>[[[[2 2 2]]]]</source>. Non-ASCII brackets were rejected for font and keyboarding reasons, as well as to make it easier for non-APL systems to generate APL data. For example, <source lang=apl inline>⟦</source>…<source lang=apl inline>⟧</source> was proposed to denote a collection of [[major cells]], forming a new array whose rank is one higher than that of the highest-[[rank]] constituent [[cell]]. However, few [[fonts]] support these glyphs.
The eventual choice was to use existing symbols, which had important implications for the specifics of the notation. Ideally, a single notation would have been introduced for a collection of major cells, thereby handling both vectors and higher-rank arrays. However, a problem arises with [[axis|axes]] of length 1, because both square brackets and round parentheses already have meaning when surrounding a single statement (namely [[function axis]]/[[bracket indexing]] and [[precedence]]/[[function train]]s). Thus, while <source lang=apl inline>2 ⟦3⟧</source> could have denoted the [[nested array]] <source lang=apl inline>2 (1⍴3)</source>, this isn't viable with <source lang=apl inline>2 [1⍴3]</source> because that already denotes indexing <source lang=apl inline>2</source> using the indices <source lang=apl inline>1⍴3</source>. To disambiguate, at least one statement separator or line break must be present in each level of array notation brackets and parentheses.
=== Disambiguating square brackets ===
The overloading of square brackets, currently in use only for [[function axis]] and [[bracket indexing]], to mean a higher-rank array poses a problem of disambiguation when there is only one major cell. For example, <source lang=apl inline>'abc'[3 3]</source> could be equivalent to <source lang=apl inline>'cc'</source> or <source lang=apl inline>'abc'(1 2⍴3)</source> depending on whether the brackets are interpreted as indexing or as an array. Two proposals have been made, and it is possible to support either or both:
# Square brackets are interpreted as representing an array if no other interpretation is possible, e.g. immediately following an opening round parenthesis, curly brace, or square bracket, or beginning a statement.
# Square brackets are interpreted as representing an array if they are "broken", i.e. contain a diamond or newline that isn't enclosed in another round parenthesis, curly brace, or square bracket.
Option 1 depends on the outer context of the notation, while option 2 depends on its inner content. The latter is similar to the manner in which a [[dfn]] is determined to be a function, a monadic operator, or a dyadic operator: if the curly braces ''contain'' <source lang=apl inline>⍵⍵</source> then the dfn is a dyadic operator; otherwise, a <source lang=apl inline>⍺⍺</source> indicates a monadic operator; and any other dfn is a function.
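The two proposals can be contrasted on the example above. The following is only an illustrative sketch, assuming an interpreter that implements the notation as described in this article; the results given in the comments follow from the text rather than from any particular implementation.
<source lang=apl>
'abc'[3 3]     ⍝ unbroken brackets after an array: bracket indexing, giving 'cc'
'abc'[3 3 ⋄]   ⍝ proposal 2: the diamond "breaks" the brackets, giving 'abc'(1 2⍴3)
([3 3])        ⍝ proposal 1: directly after "(" no indexing reading is possible, giving 1 2⍴3
</source>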
=== Minimum rank of major cells ===
While <source lang=apl inline>⟦⟦3⟧⟧</source> could denote <source lang=apl inline>1 1⍴3</source> using non-ASCII glyphs, an equivalent ASCII scheme would instead have required <source lang=apl inline>[[3⋄]⋄]</source> where the inner bracket creates a vector, and the outer creates a [[matrix]]. Using line breaks instead of diamonds, it was found to be counter-intuitive that <source lang=apl>[
3
5
]</source> was to denote a two-[[element]] vector while <source lang=apl>[
3 4
5 6
]</source> would be a two-row matrix. This is indeed the case in [[dzaima/APL]], as opposed to [[Dyalog APL]], where a special rule was added to the effect that in such collections of major cells, every cell is considered to have a rank of at least 1, even if it is a [[scalar]]. However, this choice introduced the need for a separate notation to allow vectors to be written over multiple lines, and therefore the round parentheses were extended from their traditional use in [[strand notation]] to also denote collections of [[enclose]]d elements.
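The effect of the minimum-rank rule can be sketched with diamonds in place of line breaks. This is a hedged illustration, assuming the Dyalog-style rule described above; the equivalences in the comments are derived from that description.
<source lang=apl>
[3 ⋄ 5]       ⍝ scalars are treated as one-element vectors: a 2-row, 1-column matrix, like 2 1⍴3 5
[3 4 ⋄ 5 6]   ⍝ two vector cells: a 2-row, 2-column matrix, like 2 2⍴3 4 5 6
(3 ⋄ 5)       ⍝ round parentheses: the two-element vector 3 5
(3 4 ⋄ 5 6)   ⍝ round parentheses: a nested vector, like (3 4)(5 6)
</source>
Without the special rule, as the text above describes for dzaima/APL, the first line would instead denote the vector <source lang=apl inline>3 5</source>.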
=== Name-value pairs ===
Several details of the notation for [[namespace]]s were debated, as described below.
==== Separators between name-value pairs ====
Should <source lang=apl inline>⋄</source> or <source lang=apl inline>;</source> be used to separate [[wikipedia:name-value pair|name-value pair]]s (in addition to line breaks)?
The <source lang=apl inline>⋄</source> was chosen to separate name-value pairs, as it is generally exchangeable with a line break, while <source lang=apl inline>;</source> is not, even though it is used to separate names (without values) in [[Defined_function_(traditional)#Semi-colons|headers]] and in [[locals lines]]. Furthermore, it was seen as natural that the values would be computed in reading order (left-to-right) just like multiple statements are, and while <source lang=apl inline>⋄</source> would imply this, <source lang=apl inline>;</source> wouldn't. Indeed, in the statement <source lang=apl inline>A[B;C]</source>, expression <source lang=apl inline>C</source> is evaluated before expression <source lang=apl inline>B</source>. It was briefly considered to have values computed from the right, just like stranding is, but this was rejected because replacing the semi-colons with line breaks would then require evaluation beginning with the last line and working upwards!
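The difference in evaluation order can be made visible with a helper that prints its argument. This is a minimal sketch, assuming an interpreter that implements the notation as described; <source lang=apl inline>Show</source>, <source lang=apl inline>M</source>, <source lang=apl inline>r</source> and <source lang=apl inline>ns</source> are hypothetical names introduced only for illustration.
<source lang=apl>
Show←{⎕←⍵}                 ⍝ hypothetical helper: prints its argument and returns it
M←3 3⍴⍳9
r←M[Show 1;Show 2]         ⍝ per the text above, the index after ; is evaluated first: prints 2, then 1
ns←(a:Show 1 ⋄ b:Show 2)   ⍝ name-value pairs are evaluated in reading order: prints 1, then 2
</source>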
==== Namespace delimiters ====
Should round parentheses (<source lang=apl inline>(</source>…<source lang=apl inline>)</source>) or square brackets (<source lang=apl inline>[</source>…<source lang=apl inline>]</source>) be used to enclose namespaces?
Round parentheses were chosen because namespaces are seen as (unordered) lists, and so are more similar to vectors than higher-rank arrays. Furthermore, <source lang=apl inline>[]</source> already had meaning (indexing all elements of a vector) while <source lang=apl inline>()</source> didn't have any existing use, and so could be used to denote a new empty namespace, equivalent to <source lang=apl inline>⎕NS 0⍴⊂''</source>.
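As a brief, hedged illustration of these choices, the sketch below uses the <source lang=apl inline>name:value</source> form listed for Dyalog-style notation in the comparison table of this article; the member and variable names are hypothetical.
<source lang=apl>
ns←(width:3 ⋄ height:4)   ⍝ a namespace with two members
ns.width×ns.height         ⍝ members are accessed with the usual dot syntax
empty←()                   ⍝ a new empty namespace, like ⎕NS 0⍴⊂''
v←'abc'
v[]                        ⍝ the existing meaning of []: all elements of v
</source>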
==== Separator between name and value ====
Should <source lang=apl inline>:</source> or <source lang=apl inline>←</source> separate the name from the value?
While <source lang=apl inline>←</source> was initially seen as the obvious choice to separate the name and the value, it was soon discovered that a namespace with only one member would be indistinguishable from a parenthesised [[assignment]]. Furthermore, it was noted that value expressions could contain intermediary assignments, and that such assignments were of a fundamentally different nature from the name-value declaration. The intermediary assignments would happen in a temporary scope, with any created variables disappearing once the namespace member value was established.
==== Scoping ====
In which scope should the value expressions be evaluated?
Value expressions could be evaluated in the newly established namespace (similar to expressions in <source lang=apl inline>:Namespace</source> scripts), or in the surrounding scope (similar to inline expressions in [[wikipedia:JavaScript|JavaScript]]'s object notation). It was envisioned that a main usage of the literal notation would be to collect existing values into a namespace, and evaluating inside the new namespace would force the use of <source lang=apl inline>##.</source> to fetch values in the surrounding scope. In a departure from JavaScript, it was found most natural that intermediate assignments within a value expression be local to that expression, similar to assignments in dfns. Global assignment is still available using <source lang=apl inline>⎕THIS.name←value</source>, just as in dfns.
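The scoping described above can be sketched as follows. This is an illustration only, assuming value expressions are evaluated in the surrounding scope with expression-local intermediate assignments, as the text describes; all names are hypothetical.
<source lang=apl>
x←10
ns←(y:x+1 ⋄ z:2×t←5)   ⍝ x is read from the surrounding scope, without ##.
ns.y                   ⍝ 11
ns.z                   ⍝ 10
⍝ t was assigned only while computing z; under the scoping described above it is
⍝ neither a member of ns nor a variable in the surrounding scope afterwards
</source>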
== Timeline ==
=== 1996 ===
Line 39: | Line 79: | ||
=== 2013 ===
Phil Last sent a proposal to Dyalog outlining two possible executable notations for creating multi-dimensional arrays without function application. One used the potential new system constructs <source lang=text inline>:Array</source> and <source lang=text inline>:Cell</source> in tradfns; the other used line-ends between balanced brackets to define arrays of rank 2 or greater in both dfns and tradfns.
It became RFE 9458: Large and higher rank literal values. See [[File:Embedding data.pdf]].
Line 79: | Line 119: | ||
[[APL Germany]]'s 2020 journal also included a description of the notation, including a discussion of potential issues with [[assignment]].<ref>Brudzewsky, Adám. [https://apl-germany.de/wp-content/uploads/2021/11/APL_Journal_2020_1u2.pdf#page=34 A Notation for APL Arrays]. APL-Journal, Volume 2020, number 1-2. [[APL Germany|APL-Germany e.V.]] 2020.</ref>
== Language comparison ==
The following systems support list or vector notation in some form, beyond simple [[strand notation]]. The separators <code>;</code> in A+ and K, and <code>⋄</code> in APL and BQN, indicate any separator, including a line break.
{| class=wikitable
! Language !! Vectors !! High-rank !! [[Namespace]]s !! [[Function array]]s !! Assignable
|-
| [[Nial]] || <code>[,]</code> || || || {{Yes}} || {{No}}
|-
| [[A+]] || <code>(;)</code> || || || {{Maybe|First-class}} || {{Yes}}
|-
| [[K]] || <code>(;)</code> || || || {{Maybe|First-class}} || {{Yes}}
|-
| [[dzaima/APL]] || <code>(⋄)</code> || <code>[⋄]</code> || <code>(key:val⋄)</code> || {{Yes}} || {{Maybe|N/A}}
|-
| [[BQN]]<ref>[[Marshall Lochbaum|Lochbaum, Marshall]]. [https://mlochbaum.github.io/BQN/doc/arrayrepr.html#array-literals BQN: Array notation and display; Array literals]. Retrieved 2022-09-01.</ref> || <code>⟨⋄⟩</code> || <code>[⋄]</code> || <code>{key⇐val⋄}</code> || {{Maybe|First-class}} || {{Yes}}
|-
| [[Dyalog Link]] || <code>(⋄)</code> || <code>[⋄]</code> || <code>(key:val⋄)</code> || {{No|No (indirect)}} || {{No}}
|-
| Acre Desktop<ref>The Carlisle Group. [https://github.com/the-carlisle-group/Acre-Desktop/wiki/APL-Array-Notation APL Array Notation]. Acre Desktop Wiki. GitHub. Retrieved 2022-09-01.</ref> || <code>(⋄)</code> || <code>[⋄]</code> || <code>[key←val⋄]</code> || {{No}} || {{No}}
|}
The "Function arrays" column indicates whether functions can be placed in array notation ([[function array]]s can be created in Dyalog by another method). "First-class" indicates that functions are first class, so this is possible without special consideration; in Nial and dzaima/APL, vectors of functions are a special form that can be applied to arguments to return a list of results. The "Assignable" column indicates whether array notation can be used as an assignment target to perform destructuring. BQN's namespaces don't use a dedicated construction; instead, any block (like a [[dfn]]) with <code>⇐</code> statements returns a namespace reference. Acre Desktop only uses array notation for storing literal arrays; it cannot appear in executable code.
== References ==
<references/>
{{APL syntax}}[[Category:APL syntax]][[Category:Nested array model]]