Is this variation of Max-Coverage NP-hard?

Question

Setup

An instance of Max-Coverage is typically defined by a collection of $n$ sets $S = \{s_1, s_2, \dots, s_n\}$, and a budget $k$, where the objective is to select a subset $U\subset S$ such that $$\big|U\big| \leq k,~\text{ and }~\big|\cup_{s\in U}s\big|~\text{ is maximized}.$$

The variation I am interested in is as follows. Instead of being given a collection of sets, we are given a collection of $n$ pairs of sets, $S = \big\{p_1 =\{s_{1, 0}, s_{1, 1}\}, p_2=\{s_{2, 0}, s_{2, 1}\}, \dots, p_n=\{s_{n, 0}, s_{n, 1}\}\big\}$. Further, instead of selecting $k$ sets, we now have to select one set from each pair of sets say $U = \{ s_{1, i_1}, s_{2, i_2}, \dots s_{n, i_n}\}$ s.t. $i_j\in\{0, 1\}$ and $$\big|\cup_{s\in U}s\big|~\text{ is maximized}.$$

Questions

1.) Is it NP-hard to optimally select one set from each pair such their union is maximized?

2.) Let $A$ be the universe of possible set elements and for each $i \leq n$, we have $s_{i, 0} \cup s_{i, 1} = A$ and $s_{i, 0} \cap s_{i, 1} = \emptyset$.

Steven · Accepted Answer · 2021-07-13 17:48:08Z

Regarding 1. Consider a CNF-SAT formula $\phi$ with $n$ variables $x_1, \dots, x_n$ and $m$ clauses $C_1, \dots, C_m$. For $i=1,\dots,n$ define $s_{i,0} = \{ C_j \mid \overline{x}_i \in C_j\}$ and $s_{i,1} = \{ C_j \mid x_i \in C_j\}$.

The formula $\phi$ is satisfiable if and only if it is possible to select one $s^*_i \in \{s_{i,0}, s_{i,1}\}$ for each $i$ such that $\cup_{i=1}^n s^*_i = \{C_1, \dots, C_m\}$, i.e., $\left| \cup_{i=1}^n s^*_i \right|=m$. This shows that your variant of the problem is $\mathsf{NP}$-hard.

Regarding 2. The problem is equivalent to satisfying the maximum number of clauses in the CNF-SAT formula $\phi$ constructed as follows: create a variable $x_i$ for each pair $p_i = \{s_{i,0}, s_{i,1}\}$ and a clause $C_j$ for each element $a_j \in A$, where $C_j = \bigg( \bigvee\limits_{i : a_j \in s_{i,0}} \overline{x}_i \bigg) \vee \bigg( \bigvee\limits_{i : a_j \in s_{i,1}} x_i \bigg)$. Let $m = |A|$ be the number of clauses of $\phi$.

We say that two clauses are equivalent if they contain exactly the same literals and we look at the equivalence classes $\mathcal{C}_1, \mathcal{C}_2, \dots, \mathcal{C}_{\ell}$ of the set of clauses w.r.t. the relation "being equivalent".

Since each clause $C_j$ contains all variables, $C_j$ is false only in $1$ out of the $2^n$ possible truth assignments, and so are all clauses that are equivalent to $C_j$. This means that, if $\ell < 2^n$, there is always a truth assignment that satisfies all clauses.

We now consider the complementary case, namely $\ell \ge 2^n$. In this case it is impossible to satisfy all clauses simultaneously, i.e., in any truth assignment there is at least one clause equivalence class $\mathcal{C}$ such that all clauses in $\mathcal{C}$ are false.

Among all these classes we pick a class $\mathcal{C}^*$ that minimizes $|\mathcal{C}^*|$. From the above discussion we know that the formula $\phi^*$ obtained from $\phi$ by deleting the clauses in $\mathcal{C}^*$ is satisfiable and, by our choice of $\mathcal{C}^*$, we know that $m - |\mathcal{C}^*|$ is exactly the maximum number of satisfiable clauses of $\phi$.

All of the above can be done in polynomial-time and shows that this version of your problem is not $\mathsf{NP}$-hard, unless $\mathsf{P}=\mathsf{NP}$.

Stack Exchange Network

Is this variation of Max-Coverage NP-hard?

1 Answer 1

Hot Network Questions

Is this variation of Max-Coverage NP-hard?

1 Answer 1

Related

Hot Network Questions