APL hardware: Difference between revisions

223 bytes added ,  08:12, 7 July 2022
Line 2: Line 2:


== Cellular APL Computer ==
== Cellular APL Computer ==
[https://ieeexplore.ieee.org/document/1671509 System Design of a Cellular APL Computer], written in April 1970 by Kenneth J. Thurber and John W. Myrna, is a paper describing a possible general design for a computer which implements a dialect of APL as its machine language. The purpose of the design was to take advantage of the inherent parallelism in APL by being flexible enough to operate on entire arrays. The design was built to be cellular, meaning that each component would handle a separate part of the APL logic.  
A 1970 paper describes a possible general design for a computer which implements a dialect of APL as its machine language. The purpose of the design was to take advantage of the inherent parallelism in APL by being flexible enough to operate on entire arrays. The design was built to be cellular, meaning that each component would handle a separate part of the APL logic.<ref>Thurber, Kenneth J. and Myrna, John W. [https://ieeexplore.ieee.org/document/1671509 System Design of a Cellular APL Computer]. IEEE Transactions on Computers, volume C-19, issue 4. Institute of Electrical and Electronics Engineers. April 1970.</ref>
=== Design ===
=== Design ===
The specified design contains:
The specified design contains:
Line 11: Line 11:
* thirty-two vector accumulators (VA1, VA2..., VA32)
* thirty-two vector accumulators (VA1, VA2..., VA32)
* input-output controllers (IOC)
* input-output controllers (IOC)
* a preprocessor (PP)
* a pre-processor (PP)


The MLIM is a 32x32 array of memory cells. Each memory cell contains four shift registers named A, B, C, and T. This is equivalent to creating 4 arrays of memory cells with one shift register each. The arrays created by registers A, B, and C are used to store and operate on data, while T is temporary array storage for the result of an operation. The operations which the MLIM can perform can either read from A and B to store the result in C, or read from C and B and store the result in A. The RL processing helps place each memory array in the correct locations of A, B, and C such that the operands line up before the MLIM performs its operations.  
The MLIM is a 32x32 array of memory cells. Each memory cell contains four shift registers named A, B, C, and T. This is equivalent to creating 4 arrays of memory cells with one shift register each. The arrays created by registers A, B, and C are used to store and operate on data, while T is temporary array storage for the result of an operation. The operations which the MLIM can perform can either read from A and B to store the result in C, or read from C and B and store the result in A. The RL processing helps place each memory array in the correct locations of A, B, and C such that the operands line up before the MLIM performs its operations.  
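
The two data paths described above (A and B into C, or C and B into A) can be sketched as a small Python model. This is only an illustration under stated assumptions: the class name <code>MLIM</code>, the <code>apply</code> method, and the choice to stage the result in T before copying it to the destination are hypothetical and are not taken from the paper. Register widths and the shift-register mechanics are ignored.

```python
import operator

N = 32  # the MLIM is described as a 32x32 array of cells


def make_plane(fill=0):
    """Build one 32x32 plane, i.e. one register taken across every cell."""
    return [[fill] * N for _ in range(N)]


class MLIM:
    """Toy model: four planes named A, B, C and T (hypothetical API)."""

    def __init__(self):
        self.planes = {name: make_plane() for name in "ABCT"}

    def apply(self, op, src, dst):
        """Combine plane `src` with plane B element-wise and store in `dst`.

        Only the two paths named in the text are allowed:
        (A, B) -> C and (C, B) -> A.
        """
        if (src, dst) not in (("A", "C"), ("C", "A")):
            raise ValueError("unsupported data path")
        left, right = self.planes[src], self.planes["B"]
        # Assumption: the result is staged in T, then copied to the target.
        self.planes["T"] = [
            [op(left[i][j], right[i][j]) for j in range(N)] for i in range(N)
        ]
        self.planes[dst] = [row[:] for row in self.planes["T"]]


m = MLIM()
m.planes["A"] = make_plane(2)
m.planes["B"] = make_plane(3)
m.apply(operator.add, "A", "C")  # C = A + B
m.apply(operator.mul, "C", "A")  # A = C * B
```

Alternating the two paths lets a chain of operations run without moving the intermediate result, which is one plausible reason for the symmetric design.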


MA1 through MA16 are each 32x32 arrays of memory cells which can each store one word (16 bits). This means the total array storage of the computer is 16,384 words.  
MA1 through MA16 are each 32×32 arrays of memory cells which can each store one word (16 bits). This means the total array storage of the computer is 16,384 words.  
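
The storage figure follows directly from the dimensions given above. A minimal check in Python; the variable names are illustrative, and the bit count is derived here rather than stated in the source:

```python
ARRAYS = 16          # MA1 through MA16
ROWS = COLS = 32     # each array is 32x32 cells
BITS_PER_WORD = 16   # one word per cell

total_words = ARRAYS * ROWS * COLS
total_bits = total_words * BITS_PER_WORD

print(total_words)  # 16384
print(total_bits)
```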


The IMU is a temporary location for instructions, to "give the programmer a usable memory of 16,384 words." Each cell is a 32-bit read-only memory cell.  
The IMU is a temporary location for instructions, to "give the programmer a usable memory of 16,384 words." Each cell is a 32-bit read-only memory cell.  
Line 23: Line 23:
The PP would handle storage allocation, basic operations, and other operations which the MLIM cannot perform.
The PP would handle storage allocation, basic operations, and other operations which the MLIM cannot perform.


VA1 through VA32 are each composed of two 32 bit registers A and B. Register A of each accumulator is connected to the Routing and Control Logic board via a decoder. Each decoder is connected to its VA via a 32-bit bus, to the Routing and Control Logic board via a 32-bit output bus, and to the PP via a 5-bit input logic bus. if p is the value on the 5-bit bus such that 0≤p≤31, then the 32-bit bus shifts the bits in the output such that it returns 32-p, 32-p+1, ..., 32. Thus, it shifts the input left by p bits, masking out the indices that are greater than or equal to 32. Thus, this type of register is called a ''shift register''. The decoders are considered a part of the RL cell. Register B is directly connected to register A, and has a direct vector routing bus connected to the Routing and Control Logic board.  
VA1 through VA32 are each composed of two 32-bit registers, A and B. Register A of each accumulator is connected to the Routing and Control Logic board via a decoder. Each decoder is connected to its VA via a 32-bit bus, to the Routing and Control Logic board via a 32-bit output bus, and to the PP via a 5-bit input logic bus. If p is the value on the 5-bit bus such that 0 ≤ p ≤ 31, then the 32-bit bus shifts the bits in the output such that it returns 32−p, 32−p+1, …, 32. It thus shifts the input left by p bits, masking out the indices that are greater than or equal to 32, which is why this type of register is called a ''shift register''. The decoders are considered a part of the RL cell. Register B is directly connected to register A, and has a direct vector routing bus connected to the Routing and Control Logic board.
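
The decoder's shift-and-mask behaviour can be read in more than one way. The sketch below follows one reading only: the output is the input shifted left by p positions, and any position whose source index would be 32 or more is masked to zero. The function name <code>decode_shift</code> and the 0-based, left-to-right bit indexing are assumptions made for illustration, not details from the paper.

```python
WIDTH = 32  # register A and the decoder buses are 32 bits wide


def decode_shift(bits, p):
    """Shift a 32-element bit list left by p, masking out-of-range sources.

    Hypothetical model: `bits` is indexed 0..31 from the left and `p` is
    the value on the 5-bit control bus (0 <= p <= 31).
    """
    if len(bits) != WIDTH:
        raise ValueError("expected exactly 32 bits")
    if not 0 <= p < WIDTH:
        raise ValueError("p must fit on a 5-bit bus")
    # Output position i reads source position i + p; sources >= 32 give 0.
    return [bits[i + p] if i + p < WIDTH else 0 for i in range(WIDTH)]


# A 3-element pattern stored right-justified in the register.
stored = [0] * 29 + [1, 0, 1]
aligned = decode_shift(stored, 29)
print(aligned[:3])
```

Under this reading, a pattern held at the right-hand end of the register can be brought to the left-hand end with a single control value.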


An important feature is that a vector can be loaded right-justified into a VA and then read at an offset, such that its length is ≤ 32. Because of register B, each accumulator can perform reductions by repeatedly adding register A.
An important feature is that a vector can be loaded right-justified into a VA and then read at an offset, such that its length is ≤ 32. Because of register B, each accumulator can perform reductions by repeatedly adding register A.
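
The role of register B in a reduction can be sketched as follows. This is a simplified model, assuming that each step loads one operand into register A and adds it into register B, and that sums wrap at 32 bits. The class and function names are hypothetical, and only addition is modelled.

```python
MASK32 = 0xFFFFFFFF  # assumption: registers wrap around at 32 bits


class VectorAccumulator:
    """Toy model of one VA with registers A and B (hypothetical API)."""

    def __init__(self):
        self.a = 0
        self.b = 0

    def load(self, value):
        """Place the next operand in register A."""
        self.a = value & MASK32

    def accumulate(self):
        """Add register A into register B and return the running total."""
        self.b = (self.b + self.a) & MASK32
        return self.b


def plus_scan(values):
    """Read register B after every step: the running sums, as in +\\ ."""
    va = VectorAccumulator()
    out = []
    for v in values:
        va.load(v)
        out.append(va.accumulate())
    return out


def plus_reduce(values):
    """Read register B only after the last step, as in +/ ."""
    partial = plus_scan(values)
    return partial[-1] if partial else 0


print(plus_scan([1, 2, 3, 4]))
print(plus_reduce([1, 2, 3, 4]))
```

The same loop serves both purposes: a scan is a reduction whose intermediate totals are read out along the way.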
Line 70: Line 70:
M⍲N
M⍲N
</source>
</source>
The masking functionality of the RL combined with the native ability for scan (reduce while reading the output in between each step) and reduce (<source lang=apl inline>\</source> and <source lang=apl inline>/</source> respectively) allows for iota (monadic <source lang=apl inline>⍳</source>) to be defined, while its generalized indexing functionality allows reverse and transpose (monadic <source lang=apl inline>⌽ ⍉</source>) to be defined. Shape and Ravel (monadic <source lang=apl inline>, </source>) can also be defined by using the RL and PP in parallel. Thus, a list of complex operators can be defined:  
The masking functionality of the RL combined with the native ability for [[Scan]] (reduce while reading the output in between each step) and [[Reduce]] (<source lang=apl inline>\</source> and <source lang=apl inline>/</source> respectively) allows for [[Index Generator]] (monadic <source lang=apl inline>⍳</source>) to be defined, while its generalized indexing functionality allows [[Reverse]] and [[Transpose]] (monadic <source lang=apl inline>⌽</source> and <source lang=apl inline>⍉</source>) to be defined. [[Shape]] and [[Ravel]] (monadic <source lang=apl inline>⍴</source> and <source lang=apl inline>,</source>) can also be defined by using the RL and PP in parallel. Thus, a list of complex operators can be defined:  
<source lang=apl>
<source lang=apl>
⍳N
⍳N
