2
$\begingroup$

I've asked a question here which was resolved by tweaking my code a little. So I shall reiterate, for the following list,

l = {{"a", "b"}, {"a", "b"}, {"c", "d"}, {"e", "f"}, {"e", "f"}, {"x", "y", "z"}, {"m", "n", "o"}, {"p", "q", "s"}}; 

I want to count each sublist and make a polynomial from them, I do this by:

AssociationThread[ l -> Table[ Total[(x^-Length@l[[i]]) x^ Map[Length, Select[l, ContainsAll[l[[i]]]]]], {i, Length[l]}]] 

This code does the job correctly yet, the original list I am working with has 38968 sublists and thus the code would take a lot of time to be complied - I wonder how does one optimise this?

I used ParallelTable but for some reason it takes more time!

In[130]:= AssociationThread[ l -> Table[ Total[(x^-Length@l[[i]]) x^ Map[Length, Select[l, ContainsAll[l[[i]]]]]], {i, Length[l]}]] // AbsoluteTiming AssociationThread[ l -> ParallelTable[ Total[(x^-Length@l[[i]]) x^ Map[Length, Select[l, ContainsAll[l[[i]]]]]], {i, Length[l]}]] // AbsoluteTiming Out[130]= {0.000812, <|{"a", "b"} -> 2, {"c", "d"} -> 1, {"e", "f"} -> 2, {"x", "y", "z"} -> 1, {"m", "n", "o"} -> 1, {"p", "q", "s"} -> 1|>} Out[131]= {0.002713, <|{"a", "b"} -> 2, {"c", "d"} -> 1, {"e", "f"} -> 2, {"x", "y", "z"} -> 1, {"m", "n", "o"} -> 1, {"p", "q", "s"} -> 1|>} 
$\endgroup$
4
  • $\begingroup$ I'm confused. Isn't your desired output identical to the output of Counts[l]? What am I missing? $\endgroup$ Commented Sep 10, 2019 at 19:37
  • $\begingroup$ @MarcoB I got mislead as well: try this list: l = {{"a", "a", "b"}, {"a", "b"}, {"a", "a"}, {"b", "b"}, {"b", "b"}, {"c", "d"}, {"e", "f"}, {"e", "f"}, {"x", "y", "z"}, {"m", "n", "o"}, {"p", "q", "s"}}; and you'll see that the output is different. Maybe it's worth editing the OP with this example as well to clarify the difference, but I'm not 100% sure $\endgroup$ Commented Sep 11, 2019 at 8:06
  • $\begingroup$ @marcoB in this very specific case the output is like that because powers of x are cancelling each other - this is not a general rule for the list that I am working with so the output is indeed correct as you see and one must use the formula I used... $\endgroup$ Commented Sep 11, 2019 at 11:07
  • $\begingroup$ I suggest you to update the test list with something more clear, as the one I provide in my answer below :) $\endgroup$ Commented Sep 12, 2019 at 7:49

2 Answers 2

2
$\begingroup$

Here is an approach that is around 10 times faster (see below for timing of old approach on my machine):

al = AssociationThread[# -> 1] & /@ l; AssociationThread[ Keys /@ al -> Table[ Total[ (x^-Length@al[[i]]) x^Map[ Length, Pick[al, Times @@@ Lookup[al, Keys@al[[i]], 0], 1] ] ], {i, Length[al]} ] ] // RepeatedTiming (* {0.00012, <|{"a", "b"} -> 2, {"c", "d"} -> 1, {"e", "f"} -> 2, {"x", "y", "z"} -> 1, {"m", "n", "o"} -> 1, {"p", "q", "s"} -> 1|>} *) 

This works by (ab)using Associations as hash-maps for faster membership checks. We first convert the list l into a list of associations (al) where the keys are the elements from the sublists of l and the values are all 1. Using Lookup, we perform a lookup in all of al at once, with non-existing keys defaulting to 0. We then Pick the sublists where the product of all lookup results is 1.

Timing of old approach

Compared to the one in the question:

AssociationThread[ l -> Table[ Total[ (x^-Length@l[[i]]) x^Map[ Length, Select[l, ContainsAll[l[[i]]]] ] ], {i, Length[l]} ] ] // RepeatedTiming (* {0.0009, <|{"a", "b"} -> 2, {"c", "d"} -> 1, {"e", "f"} -> 2, {"x", "y", "z"} -> 1, {"m", "n", "o"} -> 1, {"p", "q", "s"} -> 1|>} *) 
$\endgroup$
1
$\begingroup$

Let's first create a test list:

l = Join @@ ConstantArray[{{"a", "a", "b"}, {"a", "b"}, {"a", "b"}, {"a", "a"}, {"b", "b"}, {"b", "b"}, {"c", "d"}, {"e", "f"}, {"e", "f"}, {"x", "y", "z"}, {"m", "n", "o"}, {"p", "q", "s"}}, 20]; 

One can use the following code to speed up the calculation:

RepeatedTiming[ dl = DeleteDuplicates[l]; exp1 = Length /@ dl; exp2 = Length /@ Select[l, ContainsAll[#]] & /@ dl; m3 = AssociationThread[dl -> Total /@ ((x^-exp1) (x^exp2))]] 

{0.023, <|{"a", "a", "b"} -> 20 + 40/x, {"a", "b"} -> 40 + 20 x, {"a", "a"} -> 60 + 20 x, {"b", "b"} -> 80 + 20 x, {"c", "d"} -> 20, {"e", "f"} -> 40, {"x", "y", "z"} -> 20, {"m", "n", "o"} -> 20, {"p", "q", "s"} -> 20|>}

Compared with the old method:

RepeatedTiming[m1 = AssociationThread[l -> Table[ Total[ (x^-Length@l[[i]]) x^Map[Length, Select[l, ContainsAll[l[[i]]]]] ], {i, Length[l]}] ]] 

{0.624, <|{"a", "a", "b"} -> 20 + 40/x, {"a", "b"} -> 40 + 20 x, {"a", "a"} -> 60 + 20 x, {"b", "b"} -> 80 + 20 x, {"c", "d"} -> 20, {"e", "f"} -> 40, {"x", "y", "z"} -> 20, {"m", "n", "o"} -> 20, {"p", "q", "s"} -> 20|>}

m1==m3 

True

and ~ 30 times faster.

$\endgroup$
2
  • $\begingroup$ Thank you but could you please read the comment I've posted under my question - it will answer your question. As it stands the output of yours is wrong as it is no matching with mine. $\endgroup$ Commented Sep 11, 2019 at 11:09
  • $\begingroup$ Are you sure? can you provide an example list where the output doesn't match yours? It seems to me that the two codes do the same thing :) If you notice, I changed the list so that the powers of X don't simplify, and the output of your code is the same as mine! $\endgroup$ Commented Sep 11, 2019 at 11:34

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.