Interval Index: Difference between revisions

Jump to navigation Jump to search
189 bytes added ,  22:18, 10 September 2022
m
Text replacement - "</source>" to "</syntaxhighlight>"
m (Text replacement - "<source" to "<syntaxhighlight")
m (Text replacement - "</source>" to "</syntaxhighlight>")
Line 1: Line 1:
:''This page is about classifying data. See [[Indexing]], [[Indices]], [[Index Generator]], and [[Index of]] for other operations named after indices.''
:''This page is about classifying data. See [[Indexing]], [[Indices]], [[Index Generator]], and [[Index of]] for other operations named after indices.''


{{Built-in|Interval Index|⍸}}, or '''Bins''' (<syntaxhighlight lang=apl inline>⍋</source>) in [[A+]] and [[BQN]], is a [[primitive function|primitive]] [[search function]] which locates [[cell]]s of its right [[argument]] relative to cells of a [[sorted]] left argument. Like searching for the right page for a word in a dictionary, knowing the first word on each page, Interval Index finds for each cell (word to search for) an [[index]] (page number) in the left argument (dictionary) so that the target cell is between the [[major cell]] at that index (first word on the page) and the major cell at the following index (first word on the next page). In general computer science, this problem is known as a '''predecessor search''' (see [[wikipedia:predecessor problem|predecessor problem]]), since it searches for the predecessor—entry immediately preceding—of an array, but may be referred to by the name of a common technique used to solve it, the [[wikipedia:binary search|binary search]]. Interval Index may also be considered a way to perform [[wikipedia:data binning|data binning]].
{{Built-in|Interval Index|⍸}}, or '''Bins''' (<syntaxhighlight lang=apl inline>⍋</syntaxhighlight>) in [[A+]] and [[BQN]], is a [[primitive function|primitive]] [[search function]] which locates [[cell]]s of its right [[argument]] relative to cells of a [[sorted]] left argument. Like searching for the right page for a word in a dictionary, knowing the first word on each page, Interval Index finds for each cell (word to search for) an [[index]] (page number) in the left argument (dictionary) so that the target cell is between the [[major cell]] at that index (first word on the page) and the major cell at the following index (first word on the next page). In general computer science, this problem is known as a '''predecessor search''' (see [[wikipedia:predecessor problem|predecessor problem]]), since it searches for the predecessor—entry immediately preceding—of an array, but may be referred to by the name of a common technique used to solve it, the [[wikipedia:binary search|binary search]]. Interval Index may also be considered a way to perform [[wikipedia:data binning|data binning]].


Like [[Grade]], Interval Index relies on [[array ordering]], but is not subject to [[comparison tolerance]]. Its result depends on [[index origin]] when present.
Like [[Grade]], Interval Index relies on [[array ordering]], but is not subject to [[comparison tolerance]]. Its result depends on [[index origin]] when present.
Line 13: Line 13:
       'AEJR' ⍸ 'Ken' 'Adin' 'Larry' 'Phil' 'Roger'
       'AEJR' ⍸ 'Ken' 'Adin' 'Larry' 'Phil' 'Roger'
3 1 3 3 4
3 1 3 3 4
</source>
</syntaxhighlight>
{{Works in|[[Dyalog APL]]}}
{{Works in|[[Dyalog APL]]}}
In [[J]] the ''last'' letter of each group should be used instead, and the left argument must be converted to a list of [[box]]ed strings: <syntaxhighlight lang=j inline>(<@,"0'DIQZ') I. 'Ken';'Adin';'Larry';'Phil';'Roger'</source>. The results are one smaller, because J uses [[index origin]] 0.
In [[J]] the ''last'' letter of each group should be used instead, and the left argument must be converted to a list of [[box]]ed strings: <syntaxhighlight lang=j inline>(<@,"0'DIQZ') I. 'Ken';'Adin';'Larry';'Phil';'Roger'</syntaxhighlight>. The results are one smaller, because J uses [[index origin]] 0.


Left argument elements with multiple characters can also be used. Here, Ken and Larry are placed in the same group because their first names fall between "Ke" and "Lo".
Left argument elements with multiple characters can also be used. Here, Ken and Larry are placed in the same group because their first names fall between "Ke" and "Lo".
Line 21: Line 21:
       ('A' 'Ke' 'Lo' 'Pa') ⍸ 'Ken' 'Adin' 'Larry' 'Phil' 'Roger'
       ('A' 'Ke' 'Lo' 'Pa') ⍸ 'Ken' 'Adin' 'Larry' 'Phil' 'Roger'
2 1 2 4 4
2 1 2 4 4
</source>
</syntaxhighlight>
{{Works in|[[Dyalog APL]]}}
{{Works in|[[Dyalog APL]]}}


Line 28: Line 28:
       1 10 100 1000 ⍸ 44 2 1 0 2481
       1 10 100 1000 ⍸ 44 2 1 0 2481
2 1 1 0 4
2 1 1 0 4
</source>
</syntaxhighlight>
{{Works in|[[Dyalog APL]]}}
{{Works in|[[Dyalog APL]]}}


Line 44: Line 44:
       schedule ⍸ timestamps
       schedule ⍸ timestamps
1 2 2
1 2 2
</source>
</syntaxhighlight>


== Description ==
== Description ==
Line 55: Line 55:
* The [[index]] of the last left argument major cell preceding that cell, or
* The [[index]] of the last left argument major cell preceding that cell, or
* The number of left argument major cells which precede it.
* The number of left argument major cells which precede it.
Either definition may be adjusted to use either strict or non-strict precedence, and the result may be offset by some amount. Note that the result in the first case is an [[index]], and should be subject to [[index origin]], while the result in the second is merely a count, and should not. Therefore the definitions are incompatible in any language with variable index origin, and at least one must be adjusted by <syntaxhighlight lang=apl inline>⎕IO</source>.
Either definition may be adjusted to use either strict or non-strict precedence, and the result may be offset by some amount. Note that the result in the first case is an [[index]], and should be subject to [[index origin]], while the result in the second is merely a count, and should not. Therefore the definitions are incompatible in any language with variable index origin, and at least one must be adjusted by <syntaxhighlight lang=apl inline>⎕IO</syntaxhighlight>.


== History ==
== History ==
Line 61: Line 61:
In 1987 [[Roger Hui]] presented the function ''classify'', with a definition very similar to that later introduced in [[Dyalog APL]], but restricted to [[vector]]s.<ref>[[Roger Hui|Hui, Roger]], [https://www.jsoftware.com/papers/from.htm Some Uses of { and }], §1.2. [[APL87]]. APL Quote Quad, Volume 17, Number 4. 1987-05.</ref>
In 1987 [[Roger Hui]] presented the function ''classify'', with a definition very similar to that later introduced in [[Dyalog APL]], but restricted to [[vector]]s.<ref>[[Roger Hui|Hui, Roger]], [https://www.jsoftware.com/papers/from.htm Some Uses of { and }], §1.2. [[APL87]]. APL Quote Quad, Volume 17, Number 4. 1987-05.</ref>


[[A+]] was the first APL to feature Interval Index as a primitive. Called Bins (<syntaxhighlight lang=apl inline>⍋</source>), it assumed the left argument to be sorted ascending with no duplicates and returned the number of cells strictly less than the target cell, giving right-closed indices. A+ used the same [[High-rank set functions|cell rank convention]] as Find ([[Index Of]]) for Interval Index, a choice that was also adopted in later implementations of the primitive.
[[A+]] was the first APL to feature Interval Index as a primitive. Called Bins (<syntaxhighlight lang=apl inline>⍋</syntaxhighlight>), it assumed the left argument to be sorted ascending with no duplicates and returned the number of cells strictly less than the target cell, giving right-closed indices. A+ used the same [[High-rank set functions|cell rank convention]] as Find ([[Index Of]]) for Interval Index, a choice that was also adopted in later implementations of the primitive.


Interval Index (<syntaxhighlight lang=j inline>I.</source>) in [[J]] was implemented in release 6.01, available in 2006.<ref>Jsoftware. [https://www.jsoftware.com/docs/archive/release/binsearch.htm I. ''Interval Index'' Implemented]. 2005.</ref> The choice of <syntaxhighlight lang=j inline>I.</source> for a symbol introduced the pairing of Interval Index with the monadic function [[Indices]]. In release 6.02 (2008), [[Index Of]] was extended to allow non-matching cell shapes, but Interval Index was not similarly extended.
Interval Index (<syntaxhighlight lang=j inline>I.</syntaxhighlight>) in [[J]] was implemented in release 6.01, available in 2006.<ref>Jsoftware. [https://www.jsoftware.com/docs/archive/release/binsearch.htm I. ''Interval Index'' Implemented]. 2005.</ref> The choice of <syntaxhighlight lang=j inline>I.</syntaxhighlight> for a symbol introduced the pairing of Interval Index with the monadic function [[Indices]]. In release 6.02 (2008), [[Index Of]] was extended to allow non-matching cell shapes, but Interval Index was not similarly extended.


Interval Index was also introduced in [[Dyalog APL 16.0]], initially with the requirement that the left argument be sorted ascending with no duplicates. Dyalog extended it by adding [[total array ordering]] in [[Dyalog APL 17.0]], and by removing the no-duplicate requirement in [[Dyalog APL 17.1]]. Dyalog's introduction was the first time the primitive was added to a language with [[index origin]]. The primitive was chosen to depend on <syntaxhighlight lang=apl inline>⎕IO</source>, with its result decreased by 1 relative to A+ and J with index origin 0 in order to accommodate the index origin 1 case (thus, the index origin 1 case matches A+ and J in terms of the possible result indices).
Interval Index was also introduced in [[Dyalog APL 16.0]], initially with the requirement that the left argument be sorted ascending with no duplicates. Dyalog extended it by adding [[total array ordering]] in [[Dyalog APL 17.0]], and by removing the no-duplicate requirement in [[Dyalog APL 17.1]]. Dyalog's introduction was the first time the primitive was added to a language with [[index origin]]. The primitive was chosen to depend on <syntaxhighlight lang=apl inline>⎕IO</syntaxhighlight>, with its result decreased by 1 relative to A+ and J with index origin 0 in order to accommodate the index origin 1 case (thus, the index origin 1 case matches A+ and J in terms of the possible result indices).


{|class=wikitable
{|class=wikitable
! Language      !! Name          !! [[Glyph]]                          !! Minimum result                        !! Interval shape !! Left argument requirements
! Language      !! Name          !! [[Glyph]]                          !! Minimum result                        !! Interval shape !! Left argument requirements
|-
|-
| [[A+]]        || Bins          || <syntaxhighlight lang=apl inline>⍋</source>  || <syntaxhighlight lang=apl inline>0</source>    || (]            || Ascending; no duplicates (unchecked)
| [[A+]]        || Bins          || <syntaxhighlight lang=apl inline>⍋</syntaxhighlight>  || <syntaxhighlight lang=apl inline>0</syntaxhighlight>    || (]            || Ascending; no duplicates (unchecked)
|-
|-
| [[J]]          || Interval Index || <syntaxhighlight lang=j inline>I.</source>  || <syntaxhighlight lang=j inline>0</source>      || (]            || Ascending or descending (unchecked)
| [[J]]          || Interval Index || <syntaxhighlight lang=j inline>I.</syntaxhighlight>  || <syntaxhighlight lang=j inline>0</syntaxhighlight>      || (]            || Ascending or descending (unchecked)
|-
|-
| [[Dyalog APL]] || Interval Index || <syntaxhighlight lang=apl inline>⍸</source>  || <syntaxhighlight lang=apl inline>⎕IO-1</source> || [)            || Ascending; no duplicates prior to [[Dyalog APL 17.1|17.1]]
| [[Dyalog APL]] || Interval Index || <syntaxhighlight lang=apl inline>⍸</syntaxhighlight>  || <syntaxhighlight lang=apl inline>⎕IO-1</syntaxhighlight> || [)            || Ascending; no duplicates prior to [[Dyalog APL 17.1|17.1]]
|-
|-
| [[BQN]]        || Bins          || <syntaxhighlight lang=apl inline>⍋⍒</source> || <syntaxhighlight lang=apl inline>0</source>    || [)            || Ascending (<syntaxhighlight lang=apl inline>⍋</source>) or descending (<syntaxhighlight lang=apl inline>⍒</source>)
| [[BQN]]        || Bins          || <syntaxhighlight lang=apl inline>⍋⍒</syntaxhighlight> || <syntaxhighlight lang=apl inline>0</syntaxhighlight>    || [)            || Ascending (<syntaxhighlight lang=apl inline>⍋</syntaxhighlight>) or descending (<syntaxhighlight lang=apl inline>⍒</syntaxhighlight>)
|}
|}


Navigation menu