Find: Difference between revisions

From APL Wiki
m (Text replacement - "<source" to "<syntaxhighlight")
Tags: Mobile edit Mobile web edit
 
(5 intermediate revisions by 2 users not shown)
Line 1: Line 1:
{{Built-in|Find|⍷}} is a [[dyadic]] [[primitive function]] which tests if the left [[argument]] appears as a contiguous [[subarray]] of the right argument.
{{Built-in|Find|⍷}} is a [[dyadic]] [[primitive function]] which indicates where the left [[argument]] appears as a contiguous [[subarray]] of the right argument.


== Examples ==
Line 5: Line 5:
Both arguments can be arrays of any [[shape]]. The entire left argument is tested against each position in the right argument. The result is a [[boolean]] array having the same shape as the right argument, where a 1 indicates the position of the [[first]] element of the matched subarray (which can be seen as the "leftmost" or "top left" position in case of a [[vector]] or [[matrix]]). If the left argument has lower [[rank]], it is treated as if the shape is prepended with ones. If the left argument has higher rank, Find does not error, but it is never found in the right argument (resulting in an all-zero array).


<source lang=apl>
<syntaxhighlight lang=apl>
       'ANA'⍷'BANANA'  ⍝ Matches may overlap
0 1 0 1 0 0
Line 27: Line 27:
       WEEK⍷'DAY'  ⍝ WEEK not found in 'DAY'; left arg may have higher rank but it is never found
0 0 0
</source>
</syntaxhighlight>


For [[nested array|nested arrays]], Find tests for exact [[match]] between the elements.


<source lang=apl>
<syntaxhighlight lang=apl>
       'BIRDS' 'NEST'⍷'BIRDS' 'NEST' 'SOUP'
1 0 0
</source>
</syntaxhighlight>


== Model ==
Find can be modelled as follows, where all possible subarrays of the right argument are checked to see if they [[match]] the left argument:<ref>[[Roger Hui|Hui, Roger]]. [https://forums.dyalog.com/viewtopic.php?f=30&t=1735 ⍷ follies]. Dyalog Forums. 16 Feb 2021.</ref>
Find can be modelled as follows (for non-empty left arguments), where all possible subarrays of the right argument are checked to see if they [[match]] the left argument:<ref>[[Roger Hui|Hui, Roger]]. [https://forums.dyalog.com/viewtopic.php?f=30&t=1735 ⍷ follies]. Dyalog Forums. 16 Feb 2021.</ref>
<source lang=apl>
<syntaxhighlight lang=apl>
ebar←{⎕IO←0
  r←(≢⍴⍺)⌈≢⍴⍵                    ⍝ maximum rank
Line 46: Line 46:
  (⍴⍵) ↑ ⍺∘{⍺≡(⍴⍺)↑⍵↓ww}¨ ⍳(×⍴⍺)+(⍴⍵)-⍴⍺
}
</source>
</syntaxhighlight>
== Empty left argument ==
=== Empty left argument ===
Implementations differ in their treatment of empty left arguments:
* [[APL2]], [[GNU APL]], [[NARS2000]], and [[Dyalog APL]] indicate positions where the left argument can fit, even if the [[prototype]]s don't match.
* [[APLX]] never finds any empty arrays.
* [[APL+]] finds empty arrays everywhere, even where they would extend beyond the edges of the right argument.
=== Discussion ===
In 2021, internal discussions about the correctness of the implemented primitive for empty left arguments took place at [[Dyalog Ltd]].<ref>[[Dyalog Ltd]]. Internal emails. ''more ⍷ follies'', 15–19 Feb; ''ancient bug in ⍷ with empty left argument'', 7–8 Apr, and 26 May 2021.</ref>


In February, [[Roger Hui]] argued that the primitive had a bug: it found empty subarrays of the wrong type even though it is defined in terms of [[match]] (<source lang=apl inline>≡</source>), which does distinguish between empty arrays of unequal type. [[Adám Brudzewsky]] devised an alternative mental model to describe Find's behaviour: rather than checking whether the left argument could be ''extracted'' from the right argument by peeling off outer elements, one could check whether the left argument could be ''overlaid'' on the right argument without changing the right argument. He wrote two almost identical models to emphasise the difference between the extraction model and the overlay model:
The prototype is never used, in contrast to [[Match]], which in many APLs compares prototypes of empty arrays. The behavior may come from the use of the [[inner product]] <syntaxhighlight lang=apl inline>∧.=</syntaxhighlight> in early [[Array_model#Flat_array_theory|flat]] APLs where [[Match]] is not a primitive; this function naturally checks elements and not the prototype. [[Adin Falkoff]] presented code for Find-like functions using <syntaxhighlight lang=apl inline>∧.=</syntaxhighlight> at [[APL79]].<ref>[[Adin Falkoff|Falkoff, Adin]]. [https://dl.acm.org/doi/10.1145/390009.804448 A note on pattern matching: Where do you find the match to an empty array?] at [[APL79]].</ref>
<source lang=apl>
 
ee←{ ⍝ extraction model
Correspondingly, the above model can be amended to match the behaviour of the primitive (in APL2, etc.) by replacing the <syntaxhighlight lang=apl inline>≡</syntaxhighlight> with <syntaxhighlight lang=apl inline>{(⍺≡⍥⍴⍵)∧(∧/⍺≡¨⍥,⍵)}</syntaxhighlight> which ignores prototypes, only comparing shape and elements:<ref>[[Adám Brudzewsky|Brudzewsky, Adám]]. [https://forums.dyalog.com/viewtopic.php?f=30&t=1735&p=7416#p7416 RE: ⍷ follies]. Dyalog Forums. 23 Aug 2022.</ref>
    ⎕IO←0
<syntaxhighlight lang=apl>
    ra←≢sa←⍴⍺ ⋄ rw←≢sw←⍴⍵
ebar2←{⎕IO←0
    rm←ra⌈rw
r←(≢⍴⍺)⌈≢⍴⍵                    ⍝ maximum rank
    rm>ra:⍵ ∇⍨⍺⍴⍨sa,⍨1⍴⍨rm-ra
r>≢⍴⍺:(⍺⍴⍨(⍴⍺),⍨(r-≢⍴⍺)⍴1)∇ ⍵  ⍝ if ⍺ has lesser rank, make it the same rank
    sa∨.>rm↑sw,¯1:sw⍴0
(⍴⍺)∨.>r↑(⍴⍵),¯1:(⍴⍵)⍴0        ⍝ return 0s if ⍺ has greater rank or is longer
    _Extract_←{ ⍝ does extracting ⍺⍺ from ⍵⍵ change ⍺⍺?
ww←⍵
        ⍺⍺≡⍺↑⍵↓⍵⍵
(⍴⍵) ↑ ⍺∘{⍺ {(⍺≡⍥⍴⍵)∧(∧/⍺≡¨⍥,⍵)} (⍴⍺)↑⍵↓ww}¨ ⍳(×⍴⍺)+(⍴⍵)-⍴⍺
    }
    sw↑sa∘(⍺ _Extract_ ⍵)¨(-⍨∘×⍨sa)↓⍳sw
}
}
eo←{ ⍝ overlay model
</syntaxhighlight>
    ⎕IO←0
    ra←≢sa←⍴⍺ ⋄ rw←≢sw←⍴⍵
    rm←ra⌈rw
    rm>ra:⍵ ∇⍨⍺⍴⍨sa,⍨1⍴⍨rm-ra
    sa∨.>rm↑sw,¯1:sw⍴0
    _Overlay_←{ ⍝ does overlaying ⍺⍺ on ⍵⍵ change ⍵⍵?
        ⍵⍵≡⍺⍺@((⍳⍺)+⊂⍵)⊢⍵⍵
    }
    sw↑sa∘(⍺ _Overlay_ ⍵)¨(-⍨∘×⍨sa)↓⍳sw
}
</source>
[[Morten Kromberg]] speculated that the behaviour stemmed from early [[Array_model#Flat_array_theory|flat]] APL where [[Match]] didn't exist. Instead, common practice was to use [[And]]-[[reduce|reduction]] (often written as the [[inner product]] <source lang=apl inline>∧.=</source>) over element-wise [[equal to|equality]], which ignores type mismatches because the comparison of two empty arrays (a [[scalar function]] application) itself is empty, thus making the reduction yield the [[identity element]] of And, which is true (<source lang=apl inline>1</source>).
 
In April, Hui wrote that he ''disagree[d] strongly with the "alternative APL and mental model"'' which Brudzewsky had devised, because it ''among other things [meant he] can not give a good accounting of it.  Also that all the descriptions (APL or non-APL) of string search/find that [he had] seen do not use that mental model.''
 
Kromberg agreed with Hui that Brudzewsky's model was ''strained at best'' and ''clearly a modern construction based on a more complete understanding of <source lang=apl inline>≡</source> and prototypes, than a possible explanation for what the implementors where thinking when they did this work.'' He reiterated his theory about And-reductions over equality in a moving window, arguing that the current behaviour can therefore be seen as correct.
 
In May, Brudzewsky found support for Kromberg's theory, based on that exact usage in a conference proceeding,<ref>[[Adin Falkoff|Falkoff, Adin]]. [A note on pattern matching: Where do you find the match to an empty array?] [[APL79]]. doi:[https://doi.org/10.1145/800136.804470 10.1145/800136.804470].</ref> finding that redefining <source lang=apl inline>≡</source> accordingly as <source lang=apl inline>{(⍺≡⍥⍴⍵)∧(∧/⍺≡¨⍥,⍵)}</source> would make Hui's <source lang=apl inline>ebar</source> model align with the behaviour of the primitive as implemented. Hui promised to write an appendix to his earlier forum post ''at an appropriate time'', but passed away before being able to do so.
 
It is
 
== See also ==
* [[Membership]]

Latest revision as of 21:36, 10 September 2022

Find (⍷) is a dyadic primitive function which indicates where the left argument appears as a contiguous subarray of the right argument.

Examples

Both arguments can be arrays of any shape. The entire left argument is tested against each position in the right argument. The result is a boolean array having the same shape as the right argument, where a 1 indicates the position of the first element of the matched subarray (which can be seen as the "leftmost" or "top left" position in case of a vector or matrix). If the left argument has lower rank, it is treated as if the shape is prepended with ones. If the left argument has higher rank, Find does not error, but it is never found in the right argument (resulting in an all-zero array).

      'ANA'⍷'BANANA'  ⍝ Matches may overlap
0 1 0 1 0 0

      WEEK
SUNDAY
MONDAY
TUESDAY
WEDNESDAY
THURSDAY
FRIDAY
SATURDAY
      'DAY'⍷WEEK  ⍝ Find the pattern 'DAY' in WEEK; right arg may have higher rank
0 0 0 1 0 0 0 0 0
0 0 0 1 0 0 0 0 0
0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0
0 0 0 1 0 0 0 0 0
0 0 0 0 0 1 0 0 0
      WEEK⍷'DAY'  ⍝ WEEK not found in 'DAY'; left arg may have higher rank but it is never found
0 0 0
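
The "top left" convention for a matrix left argument can be illustrated with a small example. The matrix M below is a made-up array used only for illustration and is not taken from the article's sources:

      M←3 4⍴'ABCDEFGHIJKL'
      M
ABCD
EFGH
IJKL
      (2 2⍴'FGJK')⍷M  ⍝ the 1 marks the top left corner of the match
0 0 0 0
0 1 0 0
0 0 0 0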

For nested arrays, Find tests for exact match between the elements.

      'BIRDS' 'NEST'⍷'BIRDS' 'NEST' 'SOUP'
1 0 0

Model

Find can be modelled as follows (for non-empty left arguments), where all possible subarrays of the right argument are checked to see if they match the left argument:[1]

ebar←{⎕IO←0
 r←(≢⍴⍺)⌈≢⍴⍵                    ⍝ maximum rank
 r>≢⍴⍺:(⍺⍴⍨(⍴⍺),⍨(r-≢⍴⍺)⍴1)∇ ⍵  ⍝ if ⍺ has lesser  rank, make it the same rank
 (⍴⍺)∨.>r↑(⍴⍵),¯1:(⍴⍵)⍴0        ⍝ return 0s if ⍺ has greater rank or is longer
 ww←⍵
 (⍴⍵) ↑ ⍺∘{⍺≡(⍴⍺)↑⍵↓ww}¨ ⍳(×⍴⍺)+(⍴⍵)-⍴⍺
}
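
As a quick check, assuming ebar has been defined as above in a Dyalog APL session, it should agree with the primitive on the first example:

      'ANA' ebar 'BANANA'
0 1 0 1 0 0
      ('ANA' ebar 'BANANA')≡'ANA'⍷'BANANA'
1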

Empty left argument

Implementations differ in their treatment of empty left arguments:

  • APL2, GNU APL, NARS2000, and Dyalog APL indicate positions where the left argument can fit, even if the prototypes don't match.
  • APLX never finds any empty arrays.
  • APL+ finds empty arrays everywhere, even where they would extend beyond the edges of the right argument.

The prototype is never used, in contrast to Match, which in many APLs compares prototypes of empty arrays. The behavior may come from the use of the inner product ∧.= in early flat APLs where Match is not a primitive; this function naturally checks elements and not the prototype. Adin Falkoff presented code for Find-like functions using ∧.= at APL79.[2]

Correspondingly, the above model can be amended to match the behaviour of the primitive (in APL2, etc.) by replacing the ≡ with {(⍺≡⍥⍴⍵)∧(∧/⍺≡¨⍥,⍵)} which ignores prototypes, only comparing shape and elements:[3]

ebar2←{⎕IO←0
 r←(≢⍴⍺)⌈≢⍴⍵                    ⍝ maximum rank
 r>≢⍴⍺:(⍺⍴⍨(⍴⍺),⍨(r-≢⍴⍺)⍴1)∇ ⍵  ⍝ if ⍺ has lesser  rank, make it the same rank
 (⍴⍺)∨.>r↑(⍴⍵),¯1:(⍴⍵)⍴0        ⍝ return 0s if ⍺ has greater rank or is longer
 ww←⍵
 (⍴⍵) ↑ ⍺∘{⍺ {(⍺≡⍥⍴⍵)∧(∧/⍺≡¨⍥,⍵)} (⍴⍺)↑⍵↓ww}¨ ⍳(×⍴⍺)+(⍴⍵)-⍴⍺
}
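
The two models differ when the left argument is empty and its prototype does not match that of the right argument. A small sketch, assuming both functions above are defined in a Dyalog APL version that supports ⍥ (18.0 or later):

      ⍬ ebar 'ABC'   ⍝ ⍬≡'' is 0 because the prototypes differ
0 0 0
      ⍬ ebar2 'ABC'  ⍝ shapes agree and there are no elements to compare
1 1 1
      ⍬⍷'ABC'        ⍝ the primitive behaves like ebar2 here
1 1 1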

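See also

  • Membership

External links

Documentation

References

  1. Hui, Roger. ⍷ follies. Dyalog Forums. 16 Feb 2021.
  2. Falkoff, Adin. A note on pattern matching: Where do you find the match to an empty array? at APL79.
  3. Brudzewsky, Adám. RE: ⍷ follies. Dyalog Forums. 23 Aug 2022.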
