Performance: Difference between revisions

Jump to navigation Jump to search
1,334 bytes added ,  08:26, 14 May 2020
m
no edit summary
No edit summary
mNo edit summary
(2 intermediate revisions by the same user not shown)
Line 7: Line 7:


=== Internal datatypes ===
=== Internal datatypes ===
{{Main|Internal datatypes}}
{{Main|Internal type}}
Most APLs expose only a small number of scalar types to the user: one or two [[numeric]] types (such as double-precision real or [[complex]] numbers), and a single [[character]] type. However, for performance reasons these types can be implemented internally using various subset types. For example, [[APL\360#Internal types|APL\360 uses]] numeric arrays of 1-bit [[Boolean]]s, 4-byte integers, or 8-byte floating point numbers, but converts between them transparently so that from the user's perspective all numbers behave like 8-byte floats (as this type contains the others). In [[Dyalog APL]] this hierarchy is [[Dyalog APL#Internal types|significantly expanded]], adding 1-byte and 2-byte integers as well as 16-byte [[complex]] numbers containing the other types (however, Dyalog also allows the user access to [[decimal float]]s if requested, which breaks the strict hierarchy).
Most APLs expose only a small number of scalar types to the user: one or two [[numeric]] types (such as double-precision real or [[complex]] numbers), and a single [[character]] type. However, for performance reasons these types can be implemented internally using various subset types. For example, [[APL\360#Internal types|APL\360 uses]] numeric arrays of 1-bit [[Boolean]]s, 4-byte integers, or 8-byte floating point numbers, but converts between them transparently so that from the user's perspective all numbers behave like 8-byte floats (as this type contains the others). In [[Dyalog APL]] this hierarchy is [[Dyalog APL#Internal types|significantly expanded]], adding 1-byte and 2-byte integers as well as 16-byte [[complex]] numbers containing the other types (however, Dyalog also allows the user access to [[decimal float]]s if requested, which breaks the strict hierarchy).


When working with large arrays, an implementation can dynamically choose the type of arrays as execution progresses. For some operations it is advantageous to force an array to the smallest possible type, a procedure known as "squeezing". The ability to dynamically change array type can be a practical advantage of interpreted array languages over statically typed compiled languages, since the interpreter is sometimes able to choose a smaller type than the compiler. This may be because the programmer chooses a suboptimal type or because the interpreter can take advantage of situations where an array could possible require a larger type, but doesn't in a particular instance of a program. With an implementation using [[vector instruction]]s, a smaller internal type can directly translate to faster execution because a vector register (and hence a vector operation) can fit more elements when they are smaller.<ref name="advantage"/>
When working with large arrays, an implementation can dynamically choose the type of arrays as execution progresses. For some operations it is advantageous to force an array to the smallest possible type, a procedure known as "squeezing". The ability to dynamically change array type can be a practical advantage of interpreted array languages over statically typed compiled languages, since the interpreter is sometimes able to choose a smaller type than the compiler. This may be because the programmer chooses a suboptimal type or because the interpreter can take advantage of situations where an array could possible require a larger type, but doesn't in a particular instance of a program. With an implementation using [[vector instructions]], a smaller internal type can directly translate to faster execution because a vector register (and hence a vector operation) can fit more elements when they are smaller.<ref name="advantage"/>


=== Fast array operations ===
=== Fast array operations ===
Line 22: Line 22:
=== Alternate array representations ===
=== Alternate array representations ===
Internally, APL arrays are usually stored as two lists in memory. The first is a list of the shape (although some implementations also include the "stride"<ref>Nick Nickolov ''Compiling APL to JavaScript'' (Vector Volume 26)</ref>). The second is the ravel of elements in the array. Nested arrays consist of pointers to arrays which may be distributed across memory, their use can lead to very inefficient memory read patterns - in contrast to flat arrays which are stored as a contiguous block.
Internally, APL arrays are usually stored as two lists in memory. The first is a list of the shape (although some implementations also include the "stride"<ref>Nick Nickolov ''Compiling APL to JavaScript'' (Vector Volume 26)</ref>). The second is the ravel of elements in the array. Nested arrays consist of pointers to arrays which may be distributed across memory, their use can lead to very inefficient memory read patterns - in contrast to flat arrays which are stored as a contiguous block.
=== Reference counting and data reuse ===
Because APL's immutable arrays do not permit [[wikipedia:circular reference|circular reference]]s (although other features like [[objects]] might), APL implementations almost universally use [[wikipedia:reference counting|reference counting]] as a memory management technique. In some implementations, such as [[Dyalog APL]], reference counting is supplemented with [[wikipedia:tracing garbage collection|tracing garbage collection]], which is run infrequently to handle circular references.
Because reference counting keeps track of the exact number of references, and not just whether an array is referenced or not, it can be used not only to find when an array can be released (reference count 0), but also to find when it can be reused when passed as an argument (reference count 1).<ref>Cantrill, Brian. [https://queue.acm.org/detail.cfm?id=1531242 "A Conversation with Arthur Whitney"]. 2009.</ref> When permitted, reusing arguments can reduce memory usage and improve cache locality. In some cases, it also allows for faster primitive implementations: for example, [[Reshape]] can change only an array's [[shape]] while leaving its [[ravel]] data in place, [[Take]] can free trailing [[major cell]]s from an array while leaving the remainder, and [[At]] can modify only part of an array.


=== Operation merging and dynamic compilation ===
=== Operation merging and dynamic compilation ===

Navigation menu