Performance

'''Performance''' refers to the speed with which programs are executed in a particular language implementation. While a language such as APL cannot inherently be fast or slow, it is often described as being suitable to high-performance implementation, and there are many APL implementations focused partially or exclusively on performance. Currently-developed array-family implementations that advertise high performance include [[Dyalog APL]], [[J]], [[K]] (both Kx and Shakti), and [[Q]], while research projects focused primarily on performance include [[APEX]], [[Co-dfns]], [[SaC]], [[Futhark]], and [[TAIL]].

While dynamically-typed interpreted languages are typically considered to be slow (that is, by nature they lead implementations to run slowly), APL code which uses primarily flat arrays has been described as an excellent fit for modern hardware,<ref>Martin Thompson. "Rectangles All The Way Down" ([https://www.dyalog.com/uploads/conference/dyalog18/presentations/U12_Rectangles_All_The_Way_Down.pdf slides], [https://dyalog.tv/Dyalog18/?v=mK2WUDIY4hk video]) at [[Dyalog '18]].</ref> and [[Dyalog APL]] can in some cases perform better than straightforward [[wikipedia:C (programming language)|C]] implementations.<ref>Matthew Maycock. [https://ummaycoc.github.io/wc.apl/ Beating C with Dyalog APL: wc]. 2019-10.</ref><ref name="advantage">[[Marshall Lochbaum]]. "The Interpretive Advantage" ([https://www.dyalog.com/user-meetings/uploads/conference/dyalog18/presentations/D15_The_Interpretive_Advantage.zip slides (0.5 MB)], [https://dyalog.tv/Dyalog18/?v=-6no6N3i9Tg video]) at [[Dyalog '18]].</ref> Taking advantage of a high-performance implementation often requires writing in a flatter style, with few or no [[box]]es or [[nested]] arrays, and compiled or GPU-based APLs may not fully support nested arrays.
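
The memory-layout reason for this advice can be sketched in C (illustrative only; these functions belong to no actual implementation). Summing a flat array walks one contiguous buffer, which prefetches and vectorizes well, while summing a nested array must follow a pointer for each element:
<syntaxhighlight lang="c">
#include <stddef.h>

/* Flat array: one contiguous buffer. Sequential access lets the CPU
   prefetch effectively and lets the compiler vectorize the loop. */
double sum_flat(const double *data, size_t n) {
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Nested array: a vector of pointers to elements boxed elsewhere in
   memory. Every element costs an extra, potentially cache-missing, load. */
double sum_nested(const double *const *boxes, size_t n) {
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += *boxes[i];
    return total;
}
</syntaxhighlight>
The flat loop is typically limited by memory bandwidth, the nested one by memory latency, which is why the flatter style can be dramatically faster even before vector instructions are considered.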

== Arrays and performance ==
=== Magic functions ===
{{Main|Magic function}}
The technique of implementing APL primitives using other primitives, or even simpler cases of the same primitive, can be advantageous for performance in addition to being easier for the implementer.<ref>[[Roger Hui]]. [http://www.dyalog.com/blog/2015/06/in-praise-of-magic-functions-part-one/ "In Praise of Magic Functions: Part I"]. [[Dyalog Ltd.|Dyalog]] blog. 2015-06-22.</ref> Even when a primitive does not use APL directly, reasoning in APL can lead to faster implementation techniques.<ref>[[Marshall Lochbaum]]. [https://www.dyalog.com/blog/2018/06/expanding-bits-in-shrinking-time/ "Expanding Bits in Shrinking Time"]. [[Dyalog Ltd.|Dyalog]] blog. 2018-06-11.</ref>
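
As a sketch of how an interpreter might organize such fallbacks (the table and function names below are hypothetical, not taken from any actual implementation), the definitions can be stored as APL source and evaluated when no specialized native code path applies:
<syntaxhighlight lang="c">
#include <stdio.h>
#include <string.h>
#include <stddef.h>

/* Stand-in for the interpreter's evaluator: runs an APL dfn with ⍺ and ⍵
   bound to the primitive's arguments. Stubbed out here for illustration. */
static void eval_apl(const char *src) {
    printf("evaluating fallback definition: %s\n", src);
}

/* "Magic" fallback definitions of primitives, written in APL itself.
   These identities are standard for vector arguments. */
static const struct { const char *glyph, *apl_src; } magic_table[] = {
    { "~", "{(~⍺∊⍵)/⍺}" },   /* without:   keep elements of ⍺ not in ⍵ */
    { "∩", "{(⍺∊⍵)/⍺}" },    /* intersect: keep elements of ⍺ in ⍵     */
    { "∪", "{⍺,⍵~⍺}" },      /* union:     ⍺ followed by ⍵ without ⍺   */
};

/* Dispatch: run the magic definition if one exists for this glyph. */
static int run_magic(const char *glyph) {
    for (size_t i = 0; i < sizeof magic_table / sizeof *magic_table; i++)
        if (strcmp(magic_table[i].glyph, glyph) == 0) {
            eval_apl(magic_table[i].apl_src);
            return 1;
        }
    return 0;  /* no fallback: the primitive needs a native implementation */
}

int main(void) {
    run_magic("∪");  /* would evaluate {⍺,⍵~⍺} on the arguments */
    return 0;
}
</syntaxhighlight>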

=== Alternate array representations ===
Internally, APL arrays are usually stored as two lists in memory. The first is a list of the shape (although it is also possible to store the "stride", enabling different views of the same data<ref>NumPy Reference. [https://numpy.org/doc/stable/reference/generated/numpy.ndarray.strides.html "ndarray.strides"]. Accessed 2020-11-09.</ref><ref>Nick Nickolov. [http://archive.vector.org.uk/art10501160 "Compiling APL to JavaScript"]. [[Vector Journal]] Volume 26 No. 1. 2013-09. (The strided representation was later removed from [[ngn/apl]].)</ref>). The second is the ravel of the array's elements. Nested arrays consist of pointers to arrays that may be scattered across memory, so their use can lead to very inefficient memory read patterns; flat arrays, in contrast, are stored as a single contiguous block.
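
A minimal C sketch of these layouts (the type and field names are illustrative, not those of any particular interpreter):
<syntaxhighlight lang="c">
#include <stddef.h>

/* The usual flat representation: a shape vector plus a contiguous ravel. */
typedef struct {
    size_t rank;     /* number of axes */
    size_t *shape;   /* axis lengths, rank entries */
    double *ravel;   /* all elements, contiguous in ravel order */
} Array;

/* A strided variant adds a step per axis, so slices and transposes can be
   views that reinterpret a shared ravel without copying, as in NumPy. */
typedef struct {
    size_t rank;
    size_t *shape;
    ptrdiff_t *strides;  /* step, in elements, along each axis */
    double *ravel;       /* possibly shared with other views */
} StridedView;

/* Read one element of a strided view at a multi-dimensional index. */
double sv_get(const StridedView *v, const size_t *index) {
    ptrdiff_t offset = 0;
    for (size_t axis = 0; axis < v->rank; axis++)
        offset += (ptrdiff_t)index[axis] * v->strides[axis];
    return v->ravel[offset];
}
</syntaxhighlight>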


=== Reference counting and data reuse ===
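A common form of this technique: when an array's reference count is 1, a primitive holds the only reference and may overwrite the array's storage in place rather than allocating a fresh result. A minimal C sketch, with all names hypothetical:
<syntaxhighlight lang="c">
#include <stddef.h>
#include <stdlib.h>

/* A hypothetical reference-counted array: a header plus contiguous data. */
typedef struct {
    int refcount;   /* number of live references to this array */
    size_t n;       /* element count */
    double *data;   /* contiguous ravel */
} RcArray;

/* Negate every element, consuming the caller's reference to a.
   With refcount 1 the buffer is reused in place: no allocation, no copy. */
RcArray *negate(RcArray *a) {
    if (a->refcount == 1) {
        for (size_t i = 0; i < a->n; i++)
            a->data[i] = -a->data[i];
        return a;
    }
    /* Shared elsewhere: allocate a fresh result and leave a intact. */
    RcArray *r = malloc(sizeof *r);
    r->refcount = 1;
    r->n = a->n;
    r->data = malloc(a->n * sizeof *r->data);
    for (size_t i = 0; i < a->n; i++)
        r->data[i] = -a->data[i];
    a->refcount--;   /* the caller's reference to a is given up */
    return r;
}
</syntaxhighlight>
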
=== Ahead-of-time compilation ===
* [https://snakeisland.com/ Snake Island Research]
=== APL hardware ===
{{Main|APL hardware}}
APL hardware is hardware designed to natively support APL array operations; it was a topic of some interest in the 1970s and 80s, but no such hardware was ever developed. However, [[wikipedia:SIMD|SIMD]] and [[wikipedia:vector processor|vector processor]]s serve similar purposes and are in some cases directly inspired by APL. SIMD instructions are now widely available on consumer hardware, having been introduced to Intel's processors beginning in the late 1990s.
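
For example, a single SSE2 instruction adds four 32-bit integers at once; a minimal C sketch using compiler intrinsics (x86-specific, purely illustrative):
<syntaxhighlight lang="c">
#include <emmintrin.h>  /* SSE2 intrinsics, baseline on x86-64 */
#include <stddef.h>
#include <stdint.h>

/* Add two int32 vectors four lanes at a time, with a scalar tail loop. */
void add_i32(const int32_t *a, const int32_t *b, int32_t *out, size_t n) {
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(out + i), _mm_add_epi32(va, vb));
    }
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
</syntaxhighlight>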


== Performant usage ==
