Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggested addition: index-and-map-by #50

Open
vvvvalvalval opened this issue Sep 4, 2020 · 11 comments
Open

Suggested addition: index-and-map-by #50

vvvvalvalval opened this issue Sep 4, 2020 · 11 comments

Comments

@vvvvalvalval
Copy link

Wondering if you'd be open to a function index-and-map-by generalizing over index-by like the following:

(defn index-and-map-by
  [kf vf coll]
  (into {} 
    (map 
      (fn [x]
        [(kf x) (vf x)]))
    coll))

(Of course, you might want to improve the peformance of this implementation, e.g with reduce and transients).

(index-and-map-by kf vf coll) would be equivalent to (->> coll (index-by kf) (map-vals vf))

I'm asking because I keep defining this index-and-map-by function in my projects.

@weavejester
Copy link
Owner

Why not just use (->> coll (index-by kf) (map-vals vf))? It's not particularly more verbose, and it makes it clear which function is the indexer and which function maps the values. Or alternatively, (->> coll (map vf) (index-by kf)) would work as well, no? And that's only four characters longer.

@vvvvalvalval
Copy link
Author

Why not just use (->> coll (index-by kf) (map-vals vf))? It's not particularly more verbose, and it makes it clear which function is the indexer and which function maps the values.

What can I say, the main reason is frequency :) . This pattern occurs so often in my code that I will eventually add this utility function anyway.

Another reason might be performance of course, doing one scan and indexing instead of two, though I concede that's less problematic in general.

Or alternatively, (->> coll (map vf) (index-by kf)) would work as well, no? And that's only four characters longer.

No, because in vf you typically do a projection that loses the information required for indexing by kf, for instance (def user-id->name (index-and-map-by :id :name users)).

@weavejester
Copy link
Owner

No, because in vf you typically do a projection that loses the information required for indexing by kf

Right. So map-vals needs to be used, which means that a single function would save 9 characters, and using map and juxt would save 4:

(index-and-map-by :id :name coll)
(into {} (map (juxt :id :name)) coll)
(->> coll (index-by :id) (map-vals :name))

To my eyes the last one is the clearest in terms of intent, though likely also the least performant. I'll need to think about this one for a while.

@borkdude
Copy link
Contributor

borkdude commented Sep 4, 2020

Maybe and in the name indications some kind of permutation problem. Maybe I'd also like to have index-and-keep-by, etc. I don't think medley can possibly host every possible combination of functions?

@borkdude
Copy link
Contributor

borkdude commented Sep 4, 2020

(defn index-and-hof-by
  [hof kf vf coll]
  (into {} 
    (f 
      (fn [x]
        [(kf x) (vf x)]))
    coll))

@vvvvalvalval
Copy link
Author

@borkdude arguable, at this point you're just looking for transduce :)

I'm really only asking for a narrow but common use case.

@vvvvalvalval
Copy link
Author

@weavejester I would argue that maybe characters count is not the best proxy for clarity in this case?

Even if (->> coll (index-by :id) (map-vals :name)) is only 9 characters longer, it means that the reader has to work out the overarching indexing intention by interpreting the two function calls in her head.

If we agreed that this pattern is frequent enough that the reader finds it worth memorizing by the index-and-map-by name, then (index-and-map-by :id :name coll) would be the most readable presentation, right?

@weavejester
Copy link
Owner

I personally don't find enough use for the function to be worth memorizing, so I'd find index-by and map-vals to be easier to understand. The use of the separate functions also makes it clear which key is being used for indexing, and which key is being used to map over the values.

@vvvvalvalval
Copy link
Author

@weavejester your call, there's no arguing against personal judgement. I guess I'll keep writing this function on every project :) thanks for considering it!

@weavejester
Copy link
Owner

Give me some time to think about it. There may also be performance improvements, or another way to write it. I'll leave this issue open.

@vvvvalvalval
Copy link
Author

If that helps, the way I would write it for performance is:

(defn index-and-map-by
  [kf vf coll]
  (persistent!
    (reduce
      (fn [tm x]
        (let [k (kf x)
              v (vf x)]
          (assoc! tm k v)))
      (transient {})
      coll)))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants