Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize CollectionSpec checker #436

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 6 additions & 8 deletions src/cljc/schema/spec/collection.cljc
Original file line number Diff line number Diff line change
Expand Up @@ -23,19 +23,18 @@
(let [_ (macros/assert! (= 2 (count e)) "remaining can have only one schema.")
c (spec/sub-checker (second e) params)]
#?(:clj (fn [^java.util.List res x]
(doseq [i x]
(.add res (c i)))
(reduce (fn [res i] (doto res (.add (c i)))) res x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this expected to be faster? The List can't be used safely concurrently so I would be surprised if you could do better than the doseq. Have you profiled and seen a speedup?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reduce (and run!) is faster and more memory efficient because as an internal iterator it can be fully specialized to the collection and usually does not need to allocate at all. I have profiled it sometime but by now it is just a habit like using (into [] (map ...) x) instead of (into [] (map ... x)). Since reduce works on all seqs really the only reason to ever use doseq is if :when/:while/:let is super useful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The weird thing about reduce here is that it's threading through an extra argument that you don't care about because you're mutating the argument anyway. I think it makes the code slightly less clear and I would be a bit surprised if there was a performance benefit, but I'd be happy to believe it if there were accompanying measurements.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The weird thing about reduce here is that it's threading through an extra argument that you don't care about because you're mutating the argument anyway.

Might be worth comparing to (reduce #(.add res (c %2))) nil x).

Copy link
Contributor

@frenchy64 frenchy64 Mar 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, it would probably beat the current code handily since I don't see a type hint on the inner res.

EDIT: I've added a reflection check to the master build.
EDIT2: Of course I missed the doto propagating the type hint, nevermind.

(then res nil))
:cljs (fn [res x]
(swap! res into (map c x))
(reduce (fn [res i] (doto res (.push (c i)))) res x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that something mutable could be faster here, but I'm again unsure about the use of reduce. Can you share profiling results?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

into uses reduce too! Here it is more important that the data structure is mutable. I'll do some profiling later.

(then res nil)))))

(let [parser (:parser e)
c (spec/sub-checker e params)]
#?(:clj (fn [^java.util.List res x]
(then res (parser (fn [t] (.add res (if (utils/error? t) t (c t)))) x)))
:cljs (fn [res x]
(then res (parser (fn [t] (swap! res conj (if (utils/error? t) t (c t)))) x)))))))
(then res (parser (fn [t] (.push res (if (utils/error? t) t (c t)))) x)))))))

(defn- sequence-transformer [elts params then]
(macros/assert! (not-any? #(and (vector? %) (= (first %) ::remaining)) (butlast elts))
Expand All @@ -58,7 +57,7 @@

:cljs
(defn- has-error? [l]
(some utils/error? l)))
(.some l utils/error?)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super familiar with cljs, can you explain the change here please?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that l is a JS array it just uses Array#some.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Is that expected to be appreciably faster?


(defn subschemas [elt]
(if (map? elt)
Expand All @@ -75,9 +74,8 @@
t (sequence-transformer elements params (fn [_ x] x))]
(fn [x]
(or (pre x)
(let [res #?(:clj (java.util.ArrayList.) :cljs (atom []))
remaining (t res x)
res #?(:clj res :cljs @res)]
(let [res #?(:clj (java.util.ArrayList.) :cljs #js [])
remaining (t res x)]
(if (or (seq remaining) (has-error? res))
(utils/error (on-error x res remaining))
(constructor res))))))))
Expand Down