0
$\begingroup$

I have an association list, S, of size about 5*10^4. I also have a function F that takes ordered pairs of elements of S and returns ordered pairs of integers. Some of these outputs coincide.

I'd like to compute the exact cardinality of the image of F, if that's possible.


Edited to add: By modifying the function F, we may as well assume that S is just Range[1,2.5*10^9] and that F outputs a single 20-digit integer for each input. Computing F[S] is highly memory-intensive. One way to handle this might be to write each output to one of 1000 files according to its residue mod 1000, and then count the distinct outputs in each file separately. Are there fast ways to do that?
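To make the bucketing idea concrete, here is a minimal in-memory sketch of it, not the file-based version. Since the residue classes partition the outputs, the per-class distinct counts add up to the total. The names f, n, m and countClass are illustrative assumptions: f[i_, j_] := i*j is only a toy stand-in for the real F, and n and m are kept small for demonstration.

```mathematica
(* Toy stand-in for F (an assumption; the real F is not shown) *)
f[i_, j_] := i*j;

n = 200;  (* illustrative size of S = Range[n] *)
m = 10;   (* number of residue classes, standing in for 1000 *)

(* Count the distinct outputs in residue class r.
   Only the values of this one class are held in memory. *)
countClass[r_] := Module[{seen = <||>},
  Do[
    With[{v = f[i, j]},
      If[Mod[v, m] == r, seen[v] = True]],
    {i, n}, {j, n}];
  Length[seen]];

(* The classes are disjoint, so the counts can simply be added *)
total = Total[countClass /@ Range[0, m - 1]]

(* Sanity check against the direct computation; feasible only for small n *)
total == Length[Union[Flatten[Table[f[i, j], {i, n}, {j, n}]]]]
```

This variant trades time for memory: it evaluates f on every pair once per residue class. Writing the outputs to one file per class would need only a single pass over the pairs, at the cost of disk I/O.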

$\endgroup$
3
  • 5
    $\begingroup$ Can you post F and a small part of S? $\endgroup$ Commented Oct 5, 2019 at 1:18
  • 3
    $\begingroup$ CountDistinctBy[F][S]? $\endgroup$ Commented Oct 5, 2019 at 6:20
  • $\begingroup$ One can take S to be Range[1,5*10^4]. For the purposes of my computation, we can take F[i_,j_]:=i*j. $\endgroup$ Commented Oct 7, 2019 at 14:36

1 Answer

1
$\begingroup$

You can use CountDistinctBy:

CountDistinctBy[F][S] 
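
If F takes two arguments, one possible way to apply this to ordered pairs is sketched below. It assumes S is small enough that all pairs fit in memory; s and f are illustrative stand-ins, not the actual S and F.

```mathematica
(* Illustrative stand-ins (assumptions): a small S and a toy two-argument F *)
s = Range[100];
f[i_, j_] := i*j;

(* Tuples[s, 2] lists every ordered pair; Apply[f] turns {i, j} into f[i, j] *)
CountDistinctBy[Tuples[s, 2], Apply[f]]

(* Cross-check: the same count via Union *)
Length[Union[f @@@ Tuples[s, 2]]]
```

Both expressions build the full list of pairs, so this is only practical for small S.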
$\endgroup$
2
  • $\begingroup$ My function isn't defined on S, it is defined on ordered pairs from S. By modifying F a little, you can just define it on S=Range[1,2.5*10^9], but then creating the set S takes 20 GB of RAM, and running the function you gave definitely goes over my memory limits. $\endgroup$ Commented Oct 7, 2019 at 20:16
  • $\begingroup$ @PaceNielsen, maybe (for the example function F) Length[Union @@ ParallelTable[i j, {i, 10000}, {j, 1, i}]] or Length[DeleteDuplicates[Join @@ ParallelTable[i j, {i, 10000}, {j, 1, i}]]]? $\endgroup$ Commented Oct 8, 2019 at 19:28
