Accelerate some more predicates (like ==) #17
It's super common to count, search for, or filter data using predicates other than `isequal` or `isless`, and we should really support things like `==` (e.g. `count(==(0), vector)`).

The interesting thing is that it's not clear whether we even can accelerate something like `==` in theory, because in theory there are no constraints on the relationship between `==` and `isequal`, so having a dictionary of values (where dictionaries use `isequal` comparisons) doesn't necessarily make things faster. However, ignoring theory, in practice we can do something dirty that works for common types like `AbstractFloat`.

The approach here attempts to be somewhat flexible, allowing one to expand the search space via a new internal `other_equal` function (e.g. `other_equal(0.0) == [-0.0]` but `other_equal(1.0) == []`). We also check for the case that `==(x, x)` is `false`. The approach won't work for `Base` types like `Matrix{Complex{Float64}}` since there is a combinatorial explosion in the search space.

Another crappy thing is the way `count(==(0), vector)` doesn't actually work as advertised here, haha, you need to type `count(==(0.0), vector)`. We also might like to use `count(iszero, vector)` but the search space for `iszero` is a bit hard to define (e.g. `Complex{<:AbstractFloat}` has four distinct values where this is `true`).

Other predicates like `isone` are a pain (besides numbers, we have `isone("")` is `true` but `isone('x')` is an error...).

One way of constraining the search space might be to limit to certain element types. But honestly I believe everything should still function as normal on arrays of `Any` so dispatching on element type can become an antipattern. I suppose it's OK in the case of an acceleration that doesn't impact the result... stronger typing should make things execute faster... hmm...

CC @bkamins this is one of those things that needs to be done to make this package more generally usable - I'm still not convinced of what approach to take.
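
For reference, the `other_equal` idea described above can be sketched roughly as follows. This is only an illustration under assumptions, not the code in this PR: `build_index` and `count_equal` are hypothetical helper names, the index is a plain `Dict`, and the method bodies for `other_equal` are guesses that return tuples rather than arrays.

```julia
# Illustrative sketch only. `other_equal` mirrors the name used in this PR,
# but its bodies here, and the helpers `build_index` / `count_equal`, are
# hypothetical.

# Values that are `==` to `x` but not `isequal` to it.
other_equal(x) = ()                                   # default: nothing extra
other_equal(x::AbstractFloat) = iszero(x) ? (-x,) : ()  # 0.0 <-> -0.0

# Build a value => indices lookup. Dict keys are compared with `isequal`,
# so 0.0 and -0.0 end up under separate keys.
function build_index(v::AbstractVector)
    index = Dict{eltype(v), Vector{Int}}()
    for (i, x) in pairs(v)
        push!(get!(() -> Int[], index, x), i)
    end
    return index
end

# Count elements `== x` via the lookup instead of scanning the vector.
function count_equal(index::AbstractDict, x)
    x == x || return 0   # e.g. NaN: `==(x, x)` is false, so nothing matches
    n = length(get(index, x, ()))
    for y in other_equal(x)
        n += length(get(index, y, ()))
    end
    return n
end

v = [0.0, -0.0, 1.0, NaN, 0.0]
index = build_index(v)
@assert count_equal(index, 0.0) == count(==(0.0), v)  # counts both zeros
@assert count_equal(index, NaN) == count(==(NaN), v)  # NaN never `==` itself
```

The fallback method returns an empty tuple, so types without extra `==` matches pay only for a single dictionary lookup. The `x == x` guard covers values such as `NaN` that are stored as keys but never match under `==`.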