Array model

From APL Wiki
Miraheze>Adám Brudzewsky
No edit summary
Miraheze>Marshall
(Describe array models more generally)

Revision as of 14:11, 14 October 2019

The distinguishing feature of APL and the array language family is the focus on arrays. In most array languages the array is the only first-class datatype. While this sounds like a very strict model of language design, in fact it imposes no restrictions at all: any kind of data can be treated as a scalar, that is, an array of rank 0!

APL's array model is distinct from and richer than the one-dimensional data structures given the name "array" in languages such as Python, JavaScript, and Java. These structures correspond to APL vectors, sometimes with the requirement that all elements have the same type. In APL it is the arrangement of data into a multidimensional shape, and not any requirement about the way it is stored or the type of its elements, that defines an array. APL arrays are most closely related to multidimensional FORTRAN or C arrays.

An array is a rectangular collection of elements, arranged along zero or more axes. The number of axes is called the array's rank while their lengths make up the shape. Names are given to arrays with particular ranks:

  • An array with 0 axes is a scalar.
  • An array with 1 axis is a vector.
  • An array with 2 axes is a matrix.

An array's shape is then a vector whose elements are axis lengths, and its rank is a scalar.
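The relationship between shape and rank can be sketched outside APL. The following Python example is only an illustration under stated assumptions: it models a flat rectangular array as nested lists, treats anything that is not a list as a scalar, and uses the hypothetical helper names `shape` and `rank`. It is not how any APL implementation stores arrays, and it cannot represent arrays whose elements are themselves arrays.

```python
def shape(array):
    """Return the axis lengths of a rectangular nested list, as a list.

    Assumption for this sketch: any value that is not a list is a scalar,
    which has no axes and therefore an empty shape.
    """
    if not isinstance(array, list):
        return []
    if not array:
        return [0]
    inner = shape(array[0])
    # Every item along this axis must have the same shape (rectangularity).
    for item in array[1:]:
        if shape(item) != inner:
            raise ValueError("not a rectangular array")
    return [len(array)] + inner


def rank(array):
    """Return the number of axes, i.e. the length of the shape."""
    return len(shape(array))


scalar = 5
vector = [1, 2, 3]
matrix = [[1, 2, 3], [4, 5, 6]]

print(shape(scalar), rank(scalar))
print(shape(vector), rank(vector))
print(shape(matrix), rank(matrix))
```

In this model the shape of any array is itself a rank-1 value and the rank is a rank-0 value, matching the statement above that the shape is a vector and the rank is a scalar.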

In a language with stranding, such as APL2, creating a 1-dimensional array is very simple: just write the elements next to each other, separated by spaces. The APL2 family allows any array to be used as an element, including scalar numbers or characters (written with quotes) as well as larger arrays. In order to include a stranded array inside another array it must be parenthesized.

In most APLs, "string" is just a term for a character vector, and a string may be written with single quotes around the entire string.

Array model variations

Within the general definition given above, APLs and other array languages differ on many details such as which values are permissible as elements and how arrays are traversed and compared. In some APLs the elements of an array can never be arrays; in others they are always arrays!

"Array languages" without arrays

Some languages, despite deriving from APL, do not use APL-style arrays at all! Examples include K and I, which only have vectors, and MATLAB, which has true multidimensional arrays but usually treats data as matrices. Languages like K are usually considered part of the array language or APL family, but may or may not be considered array languages themselves.

Homogeneous and inhomogeneous arrays

Some APLs impose the rule that all elements of arrays have the same type, such as all character or all numeric. In the very earliest instantiations of APL as a mathematical notation the question of type probably never came up: all array elements were numbers. IBM's APL\360 imposed the rule that arrays should be homogeneous (all elements have the same type), and this rule has been maintained in newer languages such as SHARP APL and J. With the creation of the nested array model many APLs began to allow "mixed" arrays containing both character and numeric data.

In order to allow programmers to work with inhomogeneous data, many languages with homogeneous arrays define a special kind of element which "encloses" or "boxes" an array.
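The idea of boxing can be sketched in Python. This is a rough model under assumptions, not the mechanism of any particular language: `Box`, `element_kind`, and `is_homogeneous` are hypothetical names, and a vector is represented as a plain Python list. A box counts as a single kind of element regardless of what it holds, so a vector made only of boxes stays homogeneous even when the boxed contents differ in type.

```python
class Box:
    """Opaque wrapper: any value becomes a single 'boxed' element."""

    def __init__(self, content):
        self.content = content


def element_kind(element):
    """Classify an element as 'box', 'character', or 'numeric'."""
    if isinstance(element, Box):
        return "box"
    if isinstance(element, str):
        return "character"
    if isinstance(element, (int, float, complex)):
        return "numeric"
    raise TypeError("unsupported element: %r" % (element,))


def is_homogeneous(vector):
    """True when all elements of the vector are of one kind."""
    return len({element_kind(e) for e in vector}) <= 1


print(is_homogeneous([1, 2.5, 3]))                      # all numeric
print(is_homogeneous([1, "a"]))                         # mixed
print(is_homogeneous([Box(1), Box("a"), Box([1, 2])]))  # all boxes
```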

Whether an array language is homogeneous or inhomogeneous depends only on the language's behavior from the programmer's perspective. Nearly all APLs use homogeneous arrays for implementation purposes, with pointers to enclose elements. However, the language's array model is determined by what is presented to the programmer and not what is stored in memory. If a programmer can create and manipulate arrays with both character and numeric data the same way they would work with completely numeric arrays, then the language is inhomogeneous!

"Flat" versus "Nested" arrays

A user of Iverson notation manipulates arrays but never works directly with the contents of arrays. The elements of arrays are characters or numbers, and APL primitives transform arrays into other arrays. The elements of arrays are only touched indirectly.

A historically important extension to APL was to allow arrays to contain other arrays, inductively. The Nested Array Research System (NARS) was developed to study this extension. Languages which use this extension describe arrays which do not contain other arrays as "simple" and those which do as "nested".

In APLs with a nested array model, the programmer can Pick an element of an array directly. The resulting element is always an array, even for simple arrays: Pick never returns something which is "just a number". This may be viewed in multiple ways: either an array's elements are in fact always arrays, or Pick and similar functions wrap non-array elements so they are still arrays. An APL could be defined which gives an error rather than allow a program to pick into a simple array. Another choice might be to return a non-array character, but an APL which allowed such values to be used might no longer be considered an array language.

"Floating" versus "fixed"

The floating array model is a modification to the nested array model. It defines an array whose only element is a simple scalar to be identical to the scalar itself. In implementation terms, a simple scalar will "float" to the top of a stack of rank-0 arrays. Languages which do not make this identification are called "fixed".
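The difference between the two models can be illustrated with a small Python sketch. Everything here is an assumption made for the example: `Enclosed` stands in for a rank-0 array holding one element, numbers and single-character strings stand in for simple scalars, and `enclose_fixed` and `enclose_floating` are hypothetical names rather than functions of any APL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enclosed:
    """A rank-0 array whose single element is `item`."""
    item: object


def is_simple_scalar(value):
    """Numbers and single characters play the role of simple scalars."""
    if isinstance(value, (int, float, complex)):
        return True
    return isinstance(value, str) and len(value) == 1


def enclose_fixed(value):
    """Fixed model: enclosing always adds a layer of rank-0 array."""
    return Enclosed(value)


def enclose_floating(value):
    """Floating model: enclosing a simple scalar gives back the scalar."""
    return value if is_simple_scalar(value) else Enclosed(value)


print(enclose_floating(enclose_floating(5)))
print(enclose_fixed(enclose_fixed(5)))
print(enclose_floating([1, 2, 3]))
```

In the floating sketch, enclosing a simple scalar any number of times leaves it unchanged, while the fixed sketch builds up one layer per enclose. Both treat a non-scalar value such as a vector the same way.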

Array characteristics

Depth

Main article: Depth

The depth of an array is its level of nesting. A simple scalar (e.g. a number or character) has depth 0, an array of simple (i.e. non-nested) scalars has depth 1, and an array containing only depth-1 arrays has depth 2. Some arrays are less regular: an array may contain both vectors and scalars, for example. In cases like this, the depth is reported as negative. You can find the depth of an array with the Depth function (≡).
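The depth rules described above can be modelled in a few lines of Python. This is a sketch under assumptions: Python lists stand in for vectors, any non-list value stands in for a simple scalar, an empty list is given depth 1, and the hypothetical function `depth` follows the convention of reporting a negative number when the nesting is uneven. Dialects differ in the details, so this should not be read as the definition used by any specific APL.

```python
def depth(array):
    """Return the nesting depth, negated when the nesting is uneven."""
    if not isinstance(array, list):
        return 0  # simple scalar
    if not array:
        return 1  # assumption: an empty vector has depth 1
    depths = [depth(item) for item in array]
    # The magnitude is one more than the deepest item.
    magnitude = 1 + max(abs(d) for d in depths)
    # Nesting is even only if all items agree and none is itself uneven.
    uniform = depths[0] >= 0 and all(d == depths[0] for d in depths)
    return magnitude if uniform else -magnitude


print(depth(7))
print(depth([1, 2, 3]))
print(depth([[1, 2], [3, 4]]))
print(depth([[1, 2], 3]))
```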

Rank

Main article: Rank

The concept of rank is very important in APL, and isn't present in many other languages. It refers to the number of dimensions in the array. Rank-0 arrays are scalars, rank-1 arrays are vectors, and rank-2 arrays are usually called matrices or tables.

External links

Formal definition

Chat lesson

Vector notation
