Find: Difference between revisions

From APL Wiki
(Model and empty args)
Line 36: Line 36:
</source>
</source>


== Model ==
Find can be modelled as follows, where all possible subarrays of the right argument are checked to see if they [[match]] the left argument:<ref>[[Roger Hui|Hui, Roger]]. [https://forums.dyalog.com/viewtopic.php?f=30&t=1735 ⍷ follies]. Dyalog Forums. 16 Feb 2021.</ref>
<source lang=apl>
ebar←{⎕IO←0
r←(≢⍴⍺)⌈≢⍴⍵                    ⍝ maximum rank
r>≢⍴⍺:(⍺⍴⍨(⍴⍺),⍨(r-≢⍴⍺)⍴1)∇ ⍵  ⍝ if ⍺ has lesser  rank, make it the same rank
(⍴⍺)∨.>r↑(⍴⍵),¯1:(⍴⍵)⍴0        ⍝ return 0s if ⍺ has greater rank or is longer
ww←⍵
(⍴⍵) ↑ ⍺∘{⍺≡(⍴⍺)↑⍵↓ww}¨ ⍳(×⍴⍺)+(⍴⍵)-⍴⍺
}
</source>
== Empty left argument ==
Implementations differ in their treatment of empty left arguments:
* [[APL2]], [[GNU APL]], [[NARS2000]], and [[Dyalog APL]] indicate positions where the left argument can fit, even if the [[prototype]]s don't match.
* [[APLX]] never finds any empty arrays.
* [[APL+]] finds empty arrays everywhere, even where they would extend beyond the edges of the right argument.
=== Discussion ===
In February, April, and May 2021, [[Dyalog Ltd]] held internal discussions about the correctness of the implemented primitive for empty left arguments.<ref>[[Dyalog Ltd]]. Internal emails. ''more ⍷ follies'', 15–19 Feb; ''ancient bug in ⍷ with empty left argument'', 7–8 Apr and 26 May 2021.</ref>
In February, [[Roger Hui]] argued that the primitive had a bug: it was finding empty subarrays of the wrong type despite being defined in terms of [[match]] (<source lang=apl inline>≡</source>), which does distinguish between empty arrays of unequal type. [[Adám Brudzewsky]] proposed an alternative mental model of Find's behaviour: rather than checking whether the left argument could be ''extracted'' from the right argument by peeling off outer elements, one could check whether the left argument could be ''overlaid'' on the right argument without changing the right argument. He wrote two almost identical models to emphasise the difference between the extraction model and the overlay model:
<source lang=apl>
ee←{ ⍝ extraction model
    ⎕IO←0
    ra←≢sa←⍴⍺ ⋄ rw←≢sw←⍴⍵
    rm←ra⌈rw
    rm>ra:⍵ ∇⍨⍺⍴⍨sa,⍨1⍴⍨rm-ra
    sa∨.>rm↑sw,¯1:sw⍴0
    _Extract_←{ ⍝ does extracting ⍺⍺ from ⍵⍵ change ⍺⍺?
        ⍺⍺≡⍺↑⍵↓⍵⍵
    }
    sw↑sa∘(⍺ _Extract_ ⍵)¨(-⍨∘×⍨sa)↓⍳sw
}
eo←{ ⍝ overlay model
    ⎕IO←0
    ra←≢sa←⍴⍺ ⋄ rw←≢sw←⍴⍵
    rm←ra⌈rw
    rm>ra:⍵ ∇⍨⍺⍴⍨sa,⍨1⍴⍨rm-ra
    sa∨.>rm↑sw,¯1:sw⍴0
    _Overlay_←{ ⍝ does overlaying ⍺⍺ on ⍵⍵ change ⍵⍵?
        ⍵⍵≡⍺⍺@((⍳⍺)+⊂⍵)⊢⍵⍵
    }
    sw↑sa∘(⍺ _Overlay_ ⍵)¨(-⍨∘×⍨sa)↓⍳sw
}
</source>
[[Morten Kromberg]] speculated that the behaviour stemmed from early [[Array_model#Flat_array_theory|flat]] APL, where [[Match]] didn't exist. Instead, common practice was to use [[And]]-[[reduce|reduction]] (often written as the [[inner product]] <source lang=apl inline>∧.=</source>) over element-wise [[equal to|equality]]. This ignores type mismatches: the comparison of two empty arrays (a [[scalar function]] application) is itself empty, so the reduction yields the [[identity element]] of And, which is true (<source lang=apl inline>1</source>).
In April, Hui wrote that he ''disagree[d] strongly with the "alternative APL and mental model"'' which Brudzewsky had devised, because it ''among other things [meant he] can not give a good accounting of it.  Also that all the descriptions (APL or non-APL) of string search/find that [he had] seen do not use that mental model.''
Kromberg agreed with Hui that Brudzewsky's model was ''strained at best'' and ''clearly a modern construction based on a more complete understanding of <source lang=apl inline>≡</source> and prototypes, than a possible explanation for what the implementors where thinking when they did this work.'' He reiterated his theory about And-reductions over equality in a moving window, arguing that the current behaviour can thus be seen as correct.
In May, Brudzewsky found support for Kromberg's theory in a conference paper showing that exact usage,<ref>[[Adin Falkoff|Falkoff, Adin]]. ''A note on pattern matching: Where do you find the match to an empty array?'' [[APL79]]. doi:[https://doi.org/10.1145/800136.804470 10.1145/800136.804470].</ref> and found that redefining <source lang=apl inline>≡</source> accordingly as <source lang=apl inline>{(⍺≡⍥⍴⍵)∧(∧/⍺≡¨⍥,⍵)}</source> would make Hui's <source lang=apl inline>ebar</source> model align with the behaviour of the primitive as implemented. Hui promised to write an appendix to his earlier forum post ''at an appropriate time'', but passed away before being able to do so.
It is
== See also ==
* [[Membership]]
== External links ==



Revision as of 08:14, 23 August 2022

Find (⍷) is a dyadic primitive function which tests if the left argument appears as a contiguous subarray of the right argument.

Examples

Both arguments can be arrays of any shape. The entire left argument is tested against each position in the right argument. The result is a boolean array having the same shape as the right argument, where a 1 indicates the position of the first element of the matched subarray (which can be seen as the "leftmost" or "top left" position in case of a vector or matrix). If the left argument has lower rank, it is treated as if the shape is prepended with ones. If the left argument has higher rank, Find does not error, but it is never found in the right argument (resulting in an all-zero array).

      'ANA'⍷'BANANA'  ⍝ Matches may overlap
0 1 0 1 0 0

      WEEK
SUNDAY
MONDAY
TUESDAY
WEDNESDAY
THURSDAY
FRIDAY
SATURDAY
      'DAY'⍷WEEK  ⍝ Find the pattern 'DAY' in WEEK; right arg may have higher rank
0 0 0 1 0 0 0 0 0
0 0 0 1 0 0 0 0 0
0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 0 0
0 0 0 1 0 0 0 0 0
0 0 0 0 0 1 0 0 0
      WEEK⍷'DAY'  ⍝ WEEK not found in 'DAY'; left arg may have higher rank but it is never found
0 0 0

For nested arrays, Find tests for exact match between the elements.

      'BIRDS' 'NEST'⍷'BIRDS' 'NEST' 'SOUP'
1 0 0

Model

Find can be modelled as follows, where all possible subarrays of the right argument are checked to see if they match the left argument:[1]

ebar←{⎕IO←0
 r←(≢⍴⍺)⌈≢⍴⍵                    ⍝ maximum rank
 r>≢⍴⍺:(⍺⍴⍨(⍴⍺),⍨(r-≢⍴⍺)⍴1)∇ ⍵  ⍝ if ⍺ has lesser  rank, make it the same rank
 (⍴⍺)∨.>r↑(⍴⍵),¯1:(⍴⍵)⍴0        ⍝ return 0s if ⍺ has greater rank or is longer
 ww←⍵
 (⍴⍵) ↑ ⍺∘{⍺≡(⍴⍺)↑⍵↓ww}¨ ⍳(×⍴⍺)+(⍴⍵)-⍴⍺
}

Empty left argument

Implementations differ in their treatment of empty left arguments:

  • APL2, GNU APL, NARS2000, and Dyalog APL indicate positions where the left argument can fit, even if the prototypes don't match.
  • APLX never finds any empty arrays.
  • APL+ finds empty arrays everywhere, even where they would extend beyond the edges of the right argument.
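
The difference can be probed with a pair of expressions such as the following. This is only a sketch based on the description above, not output captured from any particular interpreter, and the sample arguments are illustrative assumptions.

```apl
      ''⍷'abc'   ⍝ empty character vector; same prototype as the right argument
      ⍬⍷'abc'    ⍝ empty numeric vector; prototype differs from the right argument
```

Going by the list above, APL2, GNU APL, NARS2000, and Dyalog APL would be expected to mark every position in both cases, whereas APLX would be expected to mark none.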

Discussion

In February, April, and May 2021, Dyalog Ltd held internal discussions about the correctness of the implemented primitive for empty left arguments.[2]

In February, Roger Hui argued that the primitive had a bug: it was finding empty subarrays of the wrong type despite being defined in terms of match (≡), which does distinguish between empty arrays of unequal type. Adám Brudzewsky proposed an alternative mental model of Find's behaviour: rather than checking whether the left argument could be extracted from the right argument by peeling off outer elements, one could check whether the left argument could be overlaid on the right argument without changing the right argument. He wrote two almost identical models to emphasise the difference between the extraction model and the overlay model:

ee←{ ⍝ extraction model
    ⎕IO←0
    ra←≢sa←⍴⍺ ⋄ rw←≢sw←⍴⍵
    rm←ra⌈rw
    rm>ra:⍵ ∇⍨⍺⍴⍨sa,⍨1⍴⍨rm-ra
    sa∨.>rm↑sw,¯1:sw⍴0
    _Extract_←{ ⍝ does extracting ⍺⍺ from ⍵⍵ change ⍺⍺?
        ⍺⍺≡⍺↑⍵↓⍵⍵
    }
    sw↑sa∘(⍺ _Extract_ ⍵)¨(-⍨∘×⍨sa)↓⍳sw
}
eo←{ ⍝ overlay model
    ⎕IO←0
    ra←≢sa←⍴⍺ ⋄ rw←≢sw←⍴⍵
    rm←ra⌈rw
    rm>ra:⍵ ∇⍨⍺⍴⍨sa,⍨1⍴⍨rm-ra
    sa∨.>rm↑sw,¯1:sw⍴0
    _Overlay_←{ ⍝ does overlaying ⍺⍺ on ⍵⍵ change ⍵⍵?
        ⍵⍵≡⍺⍺@((⍳⍺)+⊂⍵)⊢⍵⍵
    }
    sw↑sa∘(⍺ _Overlay_ ⍵)¨(-⍨∘×⍨sa)↓⍳sw
}

Morten Kromberg speculated that the behaviour stemmed from early flat APL, where Match didn't exist. Instead, common practice was to use And-reduction (often written as the inner product ∧.=) over element-wise equality. This ignores type mismatches: the comparison of two empty arrays (a scalar function application) is itself empty, so the reduction yields the identity element of And, which is true (1).
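
A short session sketch of this reasoning, assuming a Dyalog-style nested APL in which ⍬ is the empty numeric vector and '' the empty character vector (the particular pair of empties is chosen here only for illustration):

```apl
      ⍴⍬=''       ⍝ Equal is a scalar function, so comparing two empties gives an empty result
0
      ∧/⍬=''      ⍝ And-reduction of an empty vector yields the identity element of And
1
      ⍬∧.=''      ⍝ the same test written as an inner product
1
      ⍬≡''        ⍝ Match, by contrast, tells the two empties apart
0
```

The element-wise test never looks at an element, so it has no opportunity to notice that the two empties differ in type.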

In April, Hui wrote that he "disagree[d] strongly with the 'alternative APL and mental model'" which Brudzewsky had devised, because it "among other things [meant he] can not give a good accounting of it. Also that all the descriptions (APL or non-APL) of string search/find that [he had] seen do not use that mental model."

Kromberg agreed with Hui that Brudzewsky's model was "strained at best" and "clearly a modern construction based on a more complete understanding of ≡ and prototypes, than a possible explanation for what the implementors where thinking when they did this work." He reiterated his theory about And-reductions over equality in a moving window, arguing that the current behaviour can thus be seen as correct.

In May, Brudzewsky found support for Kromberg's theory in a conference paper showing that exact usage,[3] and found that redefining ≡ accordingly as {(⍺≡⍥⍴⍵)∧(∧/⍺≡¨⍥,⍵)} would make Hui's ebar model align with the behaviour of the primitive as implemented. Hui promised to write an appendix to his earlier forum post "at an appropriate time", but passed away before being able to do so.
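
The effect of that redefinition on empty arguments can be sketched as follows. The name match is hypothetical and introduced here only for illustration, and the session assumes an interpreter that provides the Over operator (⍥), such as Dyalog APL 18.0 or later.

```apl
      match←{(⍺≡⍥⍴⍵)∧(∧/⍺≡¨⍥,⍵)}   ⍝ hypothetical name for the redefined ≡
      ⍬ match ''      ⍝ equal shapes and no elements to compare
1
      ⍬≡''            ⍝ the primitive Match distinguishes the empties
0
      'ab' match 'ab' ⍝ non-empty arguments are still compared element by element
1
      'ab' match 'ac'
0
```

Only arguments with equal element counts are used here, because ≡¨ requires conforming arguments. Within ebar this holds, since each candidate subarray is taken with the shape of the left argument.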

It is

See also

External links

Documentation

References

  1. Hui, Roger. ⍷ follies. Dyalog Forums. 16 Feb 2021.
  2. Dyalog Ltd. Internal emails. more ⍷ follies, 15–19 Feb; ancient bug in ⍷ with empty left argument, 7–8 Apr and 26 May 2021.
  3. Falkoff, Adin. A note on pattern matching: Where do you find the match to an empty array? APL79. doi:10.1145/800136.804470.