4,506
edits
m (Text replacement - "<source" to "<syntaxhighlight") |
m (Text replacement - "</source>" to "</syntaxhighlight>") |
||
Line 1: | Line 1: | ||
:''This page is about classifying data. See [[Indexing]], [[Indices]], [[Index Generator]], and [[Index of]] for other operations named after indices.'' | :''This page is about classifying data. See [[Indexing]], [[Indices]], [[Index Generator]], and [[Index of]] for other operations named after indices.'' | ||
{{Built-in|Interval Index|⍸}}, or '''Bins''' (<syntaxhighlight lang=apl inline>⍋</ | {{Built-in|Interval Index|⍸}}, or '''Bins''' (<syntaxhighlight lang=apl inline>⍋</syntaxhighlight>) in [[A+]] and [[BQN]], is a [[primitive function|primitive]] [[search function]] which locates [[cell]]s of its right [[argument]] relative to cells of a [[sorted]] left argument. Like searching for the right page for a word in a dictionary, knowing the first word on each page, Interval Index finds for each cell (word to search for) an [[index]] (page number) in the left argument (dictionary) so that the target cell is between the [[major cell]] at that index (first word on the page) and the major cell at the following index (first word on the next page). In general computer science, this problem is known as a '''predecessor search''' (see [[wikipedia:predecessor problem|predecessor problem]]), since it searches for the predecessor—entry immediately preceding—of an array, but may be referred to by the name of a common technique used to solve it, the [[wikipedia:binary search|binary search]]. Interval Index may also be considered a way to perform [[wikipedia:data binning|data binning]]. | ||
Like [[Grade]], Interval Index relies on [[array ordering]], but is not subject to [[comparison tolerance]]. Its result depends on [[index origin]] when present. | Like [[Grade]], Interval Index relies on [[array ordering]], but is not subject to [[comparison tolerance]]. Its result depends on [[index origin]] when present. | ||
Line 13: | Line 13: | ||
'AEJR' ⍸ 'Ken' 'Adin' 'Larry' 'Phil' 'Roger' | 'AEJR' ⍸ 'Ken' 'Adin' 'Larry' 'Phil' 'Roger' | ||
3 1 3 3 4 | 3 1 3 3 4 | ||
</ | </syntaxhighlight> | ||
{{Works in|[[Dyalog APL]]}} | {{Works in|[[Dyalog APL]]}} | ||
In [[J]] the ''last'' letter of each group should be used instead, and the left argument must be converted to a list of [[box]]ed strings: <syntaxhighlight lang=j inline>(<@,"0'DIQZ') I. 'Ken';'Adin';'Larry';'Phil';'Roger'</ | In [[J]] the ''last'' letter of each group should be used instead, and the left argument must be converted to a list of [[box]]ed strings: <syntaxhighlight lang=j inline>(<@,"0'DIQZ') I. 'Ken';'Adin';'Larry';'Phil';'Roger'</syntaxhighlight>. The results are one smaller, because J uses [[index origin]] 0. | ||
Left argument elements with multiple characters can also be used. Here, Ken and Larry are placed in the same group because their first names fall between "Ke" and "Lo". | Left argument elements with multiple characters can also be used. Here, Ken and Larry are placed in the same group because their first names fall between "Ke" and "Lo". | ||
Line 21: | Line 21: | ||
('A' 'Ke' 'Lo' 'Pa') ⍸ 'Ken' 'Adin' 'Larry' 'Phil' 'Roger' | ('A' 'Ke' 'Lo' 'Pa') ⍸ 'Ken' 'Adin' 'Larry' 'Phil' 'Roger' | ||
2 1 2 4 4 | 2 1 2 4 4 | ||
</ | </syntaxhighlight> | ||
{{Works in|[[Dyalog APL]]}} | {{Works in|[[Dyalog APL]]}} | ||
Line 28: | Line 28: | ||
1 10 100 1000 ⍸ 44 2 1 0 2481 | 1 10 100 1000 ⍸ 44 2 1 0 2481 | ||
2 1 1 0 4 | 2 1 1 0 4 | ||
</ | </syntaxhighlight> | ||
{{Works in|[[Dyalog APL]]}} | {{Works in|[[Dyalog APL]]}} | ||
Line 44: | Line 44: | ||
schedule ⍸ timestamps | schedule ⍸ timestamps | ||
1 2 2 | 1 2 2 | ||
</ | </syntaxhighlight> | ||
== Description == | == Description == | ||
Line 55: | Line 55: | ||
* The [[index]] of the last left argument major cell preceding that cell, or | * The [[index]] of the last left argument major cell preceding that cell, or | ||
* The number of left argument major cells which precede it. | * The number of left argument major cells which precede it. | ||
Either definition may be adjusted to use either strict or non-strict precedence, and the result may be offset by some amount. Note that the result in the first case is an [[index]], and should be subject to [[index origin]], while the result in the second is merely a count, and should not. Therefore the definitions are incompatible in any language with variable index origin, and at least one must be adjusted by <syntaxhighlight lang=apl inline>⎕IO</ | Either definition may be adjusted to use either strict or non-strict precedence, and the result may be offset by some amount. Note that the result in the first case is an [[index]], and should be subject to [[index origin]], while the result in the second is merely a count, and should not. Therefore the definitions are incompatible in any language with variable index origin, and at least one must be adjusted by <syntaxhighlight lang=apl inline>⎕IO</syntaxhighlight>. | ||
== History == | == History == | ||
Line 61: | Line 61: | ||
In 1987 [[Roger Hui]] presented the function ''classify'', with a definition very similar to that later introduced in [[Dyalog APL]], but restricted to [[vector]]s.<ref>[[Roger Hui|Hui, Roger]], [https://www.jsoftware.com/papers/from.htm Some Uses of { and }], §1.2. [[APL87]]. APL Quote Quad, Volume 17, Number 4. 1987-05.</ref> | In 1987 [[Roger Hui]] presented the function ''classify'', with a definition very similar to that later introduced in [[Dyalog APL]], but restricted to [[vector]]s.<ref>[[Roger Hui|Hui, Roger]], [https://www.jsoftware.com/papers/from.htm Some Uses of { and }], §1.2. [[APL87]]. APL Quote Quad, Volume 17, Number 4. 1987-05.</ref> | ||
[[A+]] was the first APL to feature Interval Index as a primitive. Called Bins (<syntaxhighlight lang=apl inline>⍋</ | [[A+]] was the first APL to feature Interval Index as a primitive. Called Bins (<syntaxhighlight lang=apl inline>⍋</syntaxhighlight>), it assumed the left argument to be sorted ascending with no duplicates and returned the number of cells strictly less than the target cell, giving right-closed indices. A+ used the same [[High-rank set functions|cell rank convention]] as Find ([[Index Of]]) for Interval Index, a choice that was also adopted in later implementations of the primitive. | ||
Interval Index (<syntaxhighlight lang=j inline>I.</ | Interval Index (<syntaxhighlight lang=j inline>I.</syntaxhighlight>) in [[J]] was implemented in release 6.01, available in 2006.<ref>Jsoftware. [https://www.jsoftware.com/docs/archive/release/binsearch.htm I. ''Interval Index'' Implemented]. 2005.</ref> The choice of <syntaxhighlight lang=j inline>I.</syntaxhighlight> for a symbol introduced the pairing of Interval Index with the monadic function [[Indices]]. In release 6.02 (2008), [[Index Of]] was extended to allow non-matching cell shapes, but Interval Index was not similarly extended. | ||
Interval Index was also introduced in [[Dyalog APL 16.0]], initially with the requirement that the left argument be sorted ascending with no duplicates. Dyalog extended it by adding [[total array ordering]] in [[Dyalog APL 17.0]], and by removing the no-duplicate requirement in [[Dyalog APL 17.1]]. Dyalog's introduction was the first time the primitive was added to a language with [[index origin]]. The primitive was chosen to depend on <syntaxhighlight lang=apl inline>⎕IO</ | Interval Index was also introduced in [[Dyalog APL 16.0]], initially with the requirement that the left argument be sorted ascending with no duplicates. Dyalog extended it by adding [[total array ordering]] in [[Dyalog APL 17.0]], and by removing the no-duplicate requirement in [[Dyalog APL 17.1]]. Dyalog's introduction was the first time the primitive was added to a language with [[index origin]]. The primitive was chosen to depend on <syntaxhighlight lang=apl inline>⎕IO</syntaxhighlight>, with its result decreased by 1 relative to A+ and J with index origin 0 in order to accommodate the index origin 1 case (thus, the index origin 1 case matches A+ and J in terms of the possible result indices). | ||
{|class=wikitable | {|class=wikitable | ||
! Language !! Name !! [[Glyph]] !! Minimum result !! Interval shape !! Left argument requirements | ! Language !! Name !! [[Glyph]] !! Minimum result !! Interval shape !! Left argument requirements | ||
|- | |- | ||
| [[A+]] || Bins || <syntaxhighlight lang=apl inline>⍋</ | | [[A+]] || Bins || <syntaxhighlight lang=apl inline>⍋</syntaxhighlight> || <syntaxhighlight lang=apl inline>0</syntaxhighlight> || (] || Ascending; no duplicates (unchecked) | ||
|- | |- | ||
| [[J]] || Interval Index || <syntaxhighlight lang=j inline>I.</ | | [[J]] || Interval Index || <syntaxhighlight lang=j inline>I.</syntaxhighlight> || <syntaxhighlight lang=j inline>0</syntaxhighlight> || (] || Ascending or descending (unchecked) | ||
|- | |- | ||
| [[Dyalog APL]] || Interval Index || <syntaxhighlight lang=apl inline>⍸</ | | [[Dyalog APL]] || Interval Index || <syntaxhighlight lang=apl inline>⍸</syntaxhighlight> || <syntaxhighlight lang=apl inline>⎕IO-1</syntaxhighlight> || [) || Ascending; no duplicates prior to [[Dyalog APL 17.1|17.1]] | ||
|- | |- | ||
| [[BQN]] || Bins || <syntaxhighlight lang=apl inline>⍋⍒</ | | [[BQN]] || Bins || <syntaxhighlight lang=apl inline>⍋⍒</syntaxhighlight> || <syntaxhighlight lang=apl inline>0</syntaxhighlight> || [) || Ascending (<syntaxhighlight lang=apl inline>⍋</syntaxhighlight>) or descending (<syntaxhighlight lang=apl inline>⍒</syntaxhighlight>) | ||
|} | |} | ||