Unicode
The advent of Unicode solved many problems in handling APL characters. However, there was still some wiggle room as to which Unicode codepoints were to be used in a Unicode implementation of APL, and different implementors made different choices. This article, which documents these differences, is adapted from an original paper by Bob Smith[1] that attempted to raise awareness of these issues, because the differences impede the transfer of information.
The relevant document for the APL character set is the APL Character Repertoire (ACR)[2]. For whatever reason, that document never became a standard, but it does provide some guidance, and following it is better than each implementor making separate choices.
Introduction
There are a surprising number of similar APL characters in Unicode, and in several cases some implementors went one way and others went the other. The following table lists the characters in question, along with the way APL2, Dyalog, GNU APL, NARS2000, ngn/apl, and dzaima/APL behave. APL2000 states that "Generally the default codepoint scheme for the VisualAPL product follows the IBM APL2 workstation scheme." Please edit the table if you believe other characters should be included, or to add another dialect.
When there are differences among APL implementations, users can become confused. They type something into one APL system, copy it to another and are greeted by a SYNTAX ERROR or the like.
The whole basis for the confusion in a lengthy thread on comp.lang.apl entitled "caret vs and"[3] was that in some implementations the symbol for the logical And function was U+005E only, in some it was U+2227 only, and in some both characters worked. The original poster encountered some APL text from the APL Wiki that had been produced by a system that supports U+005E, and copied it into a system that accepted only U+2227, which failed on U+005E.
When systems differ in the set of acceptable characters for the same function, it serves only to confuse the end user, to the detriment of the community. The cautious APL programmer can avoid such problems by choosing symbols that work across dialects. Note that in the table below, there is exactly one universally accepted codepoint for each symbol (marked "Universal" in the table), except for And, where APL2 doesn't recognise the otherwise universal U+2227. However, APL2 does not have And extended to Least Common Multiple, so there And applies only to Boolean arguments, where it is equivalent to Times (×), which can therefore be used instead for truly portable code.
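A cautious programmer could check pasted code for non-portable characters before running it. The following is a minimal sketch in Python rather than APL, so that it runs outside any particular dialect. The name `find_aliases` and the mapping are illustrative assumptions built from the comparison table: each alias is paired with the codepoint the table marks "Universal" for the same symbol. And is left out, because the table shows no universal codepoint for it.

```python
# Alias codepoints from the comparison table, each mapped to the codepoint
# that the table marks "Universal" for the same symbol.
# And (U+005E / U+2227) is omitted: the table shows no universal codepoint.
UNIVERSAL_FOR_ALIAS = {
    "\u22C6": "\u002A",  # star operator        -> asterisk
    "\u2212": "\u002D",  # minus sign           -> hyphen-minus
    "\u2223": "\u007C",  # divides              -> vertical line
    "\u223C": "\u007E",  # tilde operator       -> tilde
    "\u03B1": "\u237A",  # Greek alpha          -> APL alpha
    "\u03B9": "\u2373",  # Greek iota           -> APL iota
    "\u03C1": "\u2374",  # Greek rho            -> APL rho
    "\u03C9": "\u2375",  # Greek omega          -> APL omega
    "\u03F5": "\u220A",  # lunate epsilon       -> small element of
    "\u2208": "\u220A",  # element of           -> small element of
    "\u25E6": "\u2218",  # white bullet         -> ring operator
    "\u2A7D": "\u2264",  # slanted less-equal   -> less-than or equal to
    "\u2A7E": "\u2265",  # slanted greater-eq.  -> greater-than or equal to
    "\u22BD": "\u2371",  # nor                  -> APL down caret tilde
    "\u22BC": "\u2372",  # nand                 -> APL up caret tilde
    "\u25C7": "\u22C4",  # white diamond        -> diamond operator
    "\u25CA": "\u22C4",  # lozenge              -> diamond operator
    "\u2B26": "\u22C4",  # U+2B26               -> diamond operator
    "\u25AF": "\u2395",  # white vert. rect.    -> APL quad
    "\u26AA": "\u25CB",  # medium white circle  -> white circle
}


def find_aliases(source):
    """Return (offset, found codepoint, suggested codepoint) for each alias."""
    hits = []
    for offset, ch in enumerate(source):
        if ch in UNIVERSAL_FOR_ALIAS:
            suggestion = UNIVERSAL_FOR_ALIAS[ch]
            hits.append((offset, f"U+{ord(ch):04X}", f"U+{ord(suggestion):04X}"))
    return hits


if __name__ == "__main__":
    sample = "2\u22C63 \u25CA 1\u2A7D2"  # star operator, lozenge, slanted <=
    for hit in find_aliases(sample):
        print(hit)
```

The check is purely character based, so it also reports aliases that appear inside string literals and comments, where they may be intentional.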
Comparison of implementations
The following characters have been encountered in APL code displayed somewhere on the Internet or in a PDF file. Blindly copying them into an APL session can produce an error which might well confuse the user.
| APL name | Glyph | Codepoint | Unicode name | APL2 | Dyalog APL | GNU APL | NARS2000 | ngn/apl | dzaima/APL | Use (monadic / dyadic) |
|---|---|---|---|---|---|---|---|---|---|---|
| Star | * | U+002A | Asterisk | Universal | Universal | Universal | Universal | Universal | Universal | Exponential / Power |
| | ⋆ | U+22C6 | Star operator | No | No | Yes | Yes | Yes | No | |
| Minus | - | U+002D | Hyphen-minus | Universal | Universal | Universal | Universal | Universal | Universal | Negate / Minus |
| | − | U+2212 | Minus sign | No | No | Yes | Yes | Yes | No | |
| Logical And | ^ | U+005E | Circumflex accent | Yes | Yes | Yes | Yes | Yes | No | And |
| | ∧ | U+2227 | Logical And | No[4] | Yes | Yes | Yes | Yes | Yes | |
| Stile | \| | U+007C | Vertical line | Universal | Universal | Universal | Universal | Universal | Universal | Magnitude / Residue |
| | ∣ | U+2223 | Divides | Yes | Yes | Yes | Yes | Yes | No | |
| Tilde | ~ | U+007E | Tilde | Universal | Universal | Universal | Universal | Universal | Universal | Not / Without |
| | ∼ | U+223C | Tilde operator | No | No | Yes | Yes | No | No | |
| Alpha | α | U+03B1 | Greek small letter Alpha | No | Yes | Yes | No | No | | Left Argument |
| | ⍺ | U+237A | APL functional symbol Alpha | Universal | Universal | Universal | Universal | Universal | Universal | |
| Iota | ι | U+03B9 | Greek small letter Iota | No | No | Yes | No | No | No | Index Generator / Index Of |
| | ⍳ | U+2373 | APL functional symbol Iota | Universal | Universal | Universal | Universal | Universal | Universal | |
| Rho | ρ | U+03C1 | Greek small letter Rho | No | No | Yes | No | No | No | Shape / Reshape |
| | ⍴ | U+2374 | APL functional symbol Rho | Universal | Universal | Universal | Universal | Universal | Universal | |
| Omega | ω | U+03C9 | Greek small letter Omega | No | Yes | Yes | No | No | | Right Argument |
| | ⍵ | U+2375 | APL functional symbol Omega | Universal | Universal | Universal | Universal | Universal | Universal | |
| Epsilon | ϵ | U+03F5 | Greek lunate Epsilon symbol | No | No | Yes | No | No | No | Enlist or Type / Membership |
| | ∈[5] | U+2208 | Element of | No | No | Yes | Yes | No | No | |
| | ∊ | U+220A | Small Element of | Universal | Universal | Universal | Universal | Universal | Universal | |
| Jot | ∘ | U+2218 | Ring operator | Universal | Universal | Universal | Universal | Universal | Universal | Outer product / Beside or Bind |
| | ◦ | U+25E6 | White bullet | No | No | Yes | Yes | No | No | |
| Less than or equal to | ≤ | U+2264 | Less-than or equal to | Universal | Universal | Universal | Universal | Universal | Universal | Less than or equal to |
| | ⩽ | U+2A7D | Less-than or slanted equal to | No | No | Yes | Yes | No | No | |
| Greater than or equal to | ≥ | U+2265 | Greater-than or equal to | Universal | Universal | Universal | Universal | Universal | Universal | Greater than or equal to |
| | ⩾ | U+2A7E | Greater-than or slanted equal to | No | No | Yes | Yes | No | No | |
| Logical Nor | ⊽ | U+22BD | Nor | No | No | Yes | Yes | No | No | Nor |
| | ⍱ | U+2371 | APL functional symbol down caret tilde | Universal | Universal | Universal | Universal | Universal | Universal | |
| Logical Nand | ⊼ | U+22BC | Nand | No | No | Yes | Yes | No | No | Nand |
| | ⍲ | U+2372 | APL functional symbol up caret tilde | Universal | Universal | Universal | Universal | Universal | Universal | |
| Diamond | ⋄ | U+22C4 | Diamond operator | Universal | Universal | Universal | Universal | Universal | Universal | Statement Separator |
| | ◇ | U+25C7 | White diamond | No | No | No | Yes | No | No | |
| | ◊ | U+25CA | Lozenge | No | No | Yes | Yes | No | No | |
| | ⬦ | U+2B26 | White medium diamond | No | No | Yes | Yes | No | No | |
| Quad | ⎕ | U+2395 | APL functional symbol Quad | Universal | Universal | Universal | Universal | Universal | Universal | Quad name |
| | ▯ | U+25AF | White vertical rectangle | Yes | No | Yes | Yes | No | No | |
| Circle | ○ | U+25CB | White circle | Universal | Universal | Universal | Universal | Universal | Universal | Pi Times / Circular |
| | ⚪ | U+26AA | Medium white circle | No | No | Yes | Yes | No | No | |
Functionality
The following statements can be used to test the functionality of the symbols:
      ⍎⎕← '1',(⎕UCS 16⊥0 0 2 10),'1'      ⍝ Star
      ⍎⎕← '1',(⎕UCS 16⊥2 2 12 6),'1'      ⍝ Star
      ⍎⎕← '1',(⎕UCS 16⊥0 0 2 13),'1'      ⍝ Minus
      ⍎⎕← '1',(⎕UCS 16⊥2 2 1 2),'1'       ⍝ Minus
      ⍎⎕← '1',(⎕UCS 16⊥0 0 5 14),'1'      ⍝ And
      ⍎⎕← '1',(⎕UCS 16⊥2 2 2 7),'1'       ⍝ And
      ⍎⎕← '1',(⎕UCS 16⊥0 0 7 12),'1'      ⍝ Modulus
      ⍎⎕← '1',(⎕UCS 16⊥2 2 2 3),'1'       ⍝ Modulus
      ⍎⎕←     (⎕UCS 16⊥0 0 7 14),'1'      ⍝ Tilde
      ⍎⎕←     (⎕UCS 16⊥2 2 3 12),'1'      ⍝ Tilde
      ⍎⎕←'1{',(⎕UCS 16⊥0 3 11 1),'}1'     ⍝ Alpha
      ⍎⎕←'1{',(⎕UCS 16⊥2 3 7 10),'}1'     ⍝ Alpha
      ⍎⎕← '{',(⎕UCS 16⊥0 3 12 9),'}1'     ⍝ Omega
      ⍎⎕← '{',(⎕UCS 16⊥2 3 7 5),'}1'      ⍝ Omega
      ⍎⎕← '1',(⎕UCS 16⊥2 2 0 8),'1'       ⍝ Epsilon
      ⍎⎕← '1',(⎕UCS 16⊥2 2 0 10),'1'      ⍝ Epsilon
      ⍎⎕← '1',(⎕UCS 16⊥2 2 1 8),'.=1'     ⍝ Jot
      ⍎⎕← '1',(⎕UCS 16⊥2 5 14 6),'.=1'    ⍝ Jot
      ⍎⎕← '1',(⎕UCS 16⊥2 2 6 4),'1'       ⍝ Less than or equal to
      ⍎⎕← '1',(⎕UCS 16⊥2 10 7 13),'1'     ⍝ Less than or equal to
      ⍎⎕← '1',(⎕UCS 16⊥2 2 6 5),'1'       ⍝ Greater than or equal to
      ⍎⎕← '1',(⎕UCS 16⊥2 10 7 14),'1'     ⍝ Greater than or equal to
      ⍎⎕← '1',(⎕UCS 16⊥2 2 11 13),'1'     ⍝ Nor
      ⍎⎕← '1',(⎕UCS 16⊥2 3 7 1),'1'       ⍝ Nor
      ⍎⎕← '1',(⎕UCS 16⊥2 2 11 12),'1'     ⍝ Nand
      ⍎⎕← '1',(⎕UCS 16⊥2 3 7 2),'1'       ⍝ Nand
      ⍎⎕← '1',(⎕UCS 16⊥2 2 12 4),'1'      ⍝ Diamond
      ⍎⎕← '1',(⎕UCS 16⊥2 5 12 7),'1'      ⍝ Diamond
      ⍎⎕← '1',(⎕UCS 16⊥2 5 12 10),'1'     ⍝ Diamond
      ⍎⎕← '1',(⎕UCS 16⊥2 11 2 6),'1'      ⍝ Diamond
      ⍎⎕←     (⎕UCS 16⊥2 3 9 5),'←1'      ⍝ Quad
      ⍎⎕←     (⎕UCS 16⊥2 5 10 15),'←1'    ⍝ Quad
      ⍎⎕← '1',(⎕UCS 16⊥2 5 12 11),'1'     ⍝ Circle
      ⍎⎕← '1',(⎕UCS 16⊥2 6 10 10),'1'     ⍝ Circle
Note that the four Alpha and Omega lines will not work on a system that doesn't support dfns.
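Each test builds its character from hexadecimal digits instead of typing it directly. For example, `16⊥2 2 12 6` evaluates the digit vector in base 16, giving 0x22C6, and `⎕UCS` turns that number into the character. The same arithmetic can be sketched in Python; `decode_digits` is a name made up for this illustration, and `chr` stands in for `⎕UCS`.

```python
def decode_digits(digits, base=16):
    """Evaluate a digit vector as dyadic decode does for a scalar base."""
    value = 0
    for digit in digits:
        # Shift the running value one place to the left, then add the digit.
        value = value * base + digit
    return value


# The digits used by the second Star test above.
codepoint = decode_digits([2, 2, 12, 6])
print(f"U+{codepoint:04X}", chr(codepoint))
```

Building the characters this way keeps the test lines free of the characters under test, so they can be pasted even into a system that rejects some of them.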
Atomic Vector
If the atomic vector (⎕AV) has no room in which to include these new characters, an implementation can translate them on entry to the corresponding symbol that is in ⎕AV. NARS2000 even has a means of translating symbols on the way out via Copy (Ctrl+C in Windows) for various other APL systems that don't support the same set of principal characters that NARS2000 uses for the functions in the above table.
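Translation on the way out can be pictured as one character table per target system. The sketch below is an illustration in Python, not a description of how NARS2000 implements Copy. The names `TO_TARGET` and `export_for` are hypothetical, and the two tables cover only the And rows of the comparison table, where APL2 is listed as rejecting U+2227 and dzaima/APL as rejecting U+005E.

```python
# Hypothetical per-target translation tables, derived only from the And rows
# of the comparison table above.
TO_TARGET = {
    "APL2": str.maketrans({"\u2227": "\u005E"}),        # logical and -> circumflex
    "dzaima/APL": str.maketrans({"\u005E": "\u2227"}),  # circumflex -> logical and
}


def export_for(source, target):
    """Rewrite source so that And uses a codepoint the target accepts."""
    table = TO_TARGET.get(target)
    # Targets without a table are assumed to accept the text unchanged.
    return source.translate(table) if table else source


if __name__ == "__main__":
    print(export_for("1\u22271", "APL2"))      # And written with U+005E
    print(export_for("1^1", "dzaima/APL"))     # And written with U+2227
```

A character-level table like this also rewrites matching characters inside string literals and comments, which a real implementation would probably want to leave alone.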
Considerations
Unicode was a great start to enabling APL characters to be used. However, for there to be interoperability, implementors have to agree upon which characters are functional. It doesn't matter if one's system can change the mapping of glyphs to codepoints, as the vast majority of users won't change from the default behavior. Implementors therefore have to decide if it is worthwhile to support the above codepoints.
References
1. Smith, Bob. APL Characters and Their Aliases. 14 Dec 2013–25 Dec 2019. Sudley Place Software.
2. ISO-IEC/JTC1/SC22/WG3. N3067: APL Character Repertoire. 28 Dec 1999.
3. comp.lang.apl. caret vs and. 28 Oct 2013–9 Dec 2013.
4. × (U+00D7) is a universally supported substitute.
5. Found by Hanspeter Moser in The Toronto Toolkit.