2
$\begingroup$

Problem: Let A and B be lists of lists of integers. I wish to find every k-element set, which is a subset of one of the lists in A but not a subset of any of the lists in B. In other words, given $A,B\subseteq2^V$, I wish to efficiently compute the set $$\Big\{\sigma;\;|\sigma|\!=\!k,\: \sigma\in\big(\bigcup_{a\in A}2^a\big)\setminus\big(\bigcup_{b\in B}2^b\big)\Big\}.$$ We can assume the lists in A (and B) are pairwise incomparable (otherwise we replace A by its maximal elements via Carl Woll's answer), and their elements are non-repeating (otherwise we use DeleteDuplicates).

Test Examples: In the code below, A (resp. B) consists of nA (resp. nB) randomly chosen subsets of an n-element set, with the number of elements in the range rA (resp. rB).

n=30; V=Range@n; nA=30; nB=300; rA={25,27}; rB={23,26}; A= Join@@Subsets[V,rA,{#}]& /@ RandomInteger[{1,Sum[Binomial[n,i],{i,rA[[1]],rA[[2]]}]},nA]; B= Join@@Subsets[V,rB,{#}]& /@ RandomInteger[{1,Sum[Binomial[n,i],{i,rB[[1]],rB[[2]]}]},nB]; 

Note that B={} is always an important case.

Inefficient Solutions:

subsets1[A_,B_,k_]:= Module[{X,Y}, X=ParallelCombine[DeleteDuplicates[Join@@Table[Subsets[s,{k}],{s,#}]]&,A,Union,Method->"CoarsestGrained"]; If[B=!={},Y=ParallelCombine[DeleteDuplicates[Join@@Table[Subsets[s,{k}],{s,#}]]&,B,Union,Method->"CoarsestGrained"]; X=Complement[X,Y]; ]; X]; subsets2[A_,B_,k_]:= Module[{X={},aB,Y}, Do[Y=Subsets[a,{k}]; If[B=!={},aB=Intersection[a,#]& /@B; Y=Complement[Y,ParallelCombine[DeleteDuplicates[Join@@Table[Subsets[s,{k}],{s,#}]]&,aB,Union,Method->"CoarsestGrained"]]]; X=X\[Union]Y,{a,A}]; X]; 

Remark: There are several difficulties. (1) If $a\!\in\!A$ have large intersections, then first computing all $2^a$ and than taking their union (deleting duplicates) is wasting RAM. (2) If $a\!\in\!A$ are small but $b\!\in\!B$ are large, then computing $X\!=\!\bigcup_{a\in A}\!2^a$ and $Y\!=\!\bigcup_{b\in B}\!2^b$ and then $X\!\setminus\!Y$ is wasting RAM because of $Y$. (3) If $A$ is large but its elements are small, then doing $X\!=\!\{\}$ and $X=X\cup\big(2^a\!\setminus\!\big(\!\bigcup_{b\in B}\!2^{a\cap b}\big)\Big)\big)$ for all $a\!\in\!A$ is slow because of changing $X$ many times with $\cup$.

Motivation: This is useful in math. For instance, subsets[A,{},k] are k-faces of a simplicial complex with facets A. Similarly, subsets[A,B,k] are generators of the relative chain complex of a simplicial pair with facets A and B.

$\endgroup$
4
  • $\begingroup$ If $\bigcup_{a\in A}\!a$ contains $n$ elements and we write sets as lists of 0's and 1's of length $n$, is there a quick way to check if $\sigma\in\bigcup_{b\in B}\!2^b$? Perhaps with matrix operations? $\endgroup$ Commented Dec 9, 2020 at 20:00
  • $\begingroup$ Is there a way to go through each element of $\bigcup_{a\in A}\!2^a$ only once? (iterator) $\endgroup$ Commented Dec 9, 2020 at 20:30
  • $\begingroup$ It would be useful to include code to generate example data for A and B $\endgroup$ Commented Dec 9, 2020 at 20:35
  • $\begingroup$ @SimonWoods I added the examples. $\endgroup$ Commented Dec 9, 2020 at 21:09

1 Answer 1

2
$\begingroup$

You can use BooleanCountingFunction along with SatisfiabilityInstances. BooleanCountingFunction[{k}, len][v] indicates that exactly k of the variables v are True. BooleanCountingFunction[k-1, len][v] indicates that at most k-1 of the variables v are True. A function that does this is:

KSubsets[A_, B_, k_] := Module[{a, b, s, v, set, bool, instances}, (* exactly k elements from one of the a sets *) a = Or @@ (BooleanCountingFunction[{k}, Length[#]] @@ Thread[s[#]]& /@ A); (* less than k elements from each of the b sets *) b = And @@ (BooleanCountingFunction[k-1, Length[#]] @@ Thread[s[#]]& /@ B); (* variables *) v = BooleanVariables[a && b]; (* exactly k elements *) set = BooleanCountingFunction[{k}, Length[v]] @@ v; (* boolean function *) bool = BooleanConvert[a && b && set, "BFF"]; instances = SatisfiabilityInstances[bool, v, All]; With[{i = v /. s[x_]:>x}, Pick[i, #]& /@ instances ] ] 

Your test data:

SeedRandom[1]; n=30; V=Range@n; nA=30; nB=300; rA={25,27}; rB={23,26}; A = Join@@Subsets[V,rA,{#}]&/@RandomInteger[{1,Sum[Binomial[n,i],{i,rA[[1]],rA[[2]]}]},nA]; B = Join@@Subsets[V,rB,{#}]&/@RandomInteger[{1,Sum[Binomial[n,i],{i,rB[[1]],rB[[2]]}]},nB]; 

Comparison with your function (on a reduced dataset):

r1 = subsets1[A[[;;10]], B[[;;135]], 8]; //AbsoluteTiming r2 = KSubsets[A[[;;10]], B[[;;135]], 8]; //AbsoluteTiming Sort[r1] === Sort[r2] 

{53.4153, Null}

{0.496283, Null}

True

Timing for the complete dataset:

KSubsets[A, B, 9] //Length //AbsoluteTiming KSubsets[A, B, 10] //Length //AbsoluteTiming KSubsets[A, B, 11] //Length //AbsoluteTiming 

{2.1075, 0}

{6.16393, 8}

{21.1174, 4016}

$\endgroup$
1
  • $\begingroup$ Thank you! But its performance seems worse then mine. I observed AbsoluteTiming@MaxMemoryUsed@ when computing all simplices in the triangulation of the real projective space $\mathbb{P}^5$ (i.e. $|V|\!=\!63$, $|A|\!=\!2520$, $|a|\!=\!6$ for all $a\!\in\!A$, $|B|\!=\!0$, $0\!\leq\!k\!\leq\!6$). My subsets1 needs {0.140631, 3856800}, but yours doesn't finish after 10min. Is there a way to fix this? $\endgroup$ Commented Dec 11, 2020 at 11:17

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.