Planted Matching in Hypergraphs

Question

I’m working on the planted perfect matching problem in random $k$-uniform hypergraphs $k \ge 3$, and I’m stuck on rigorizing the impossibility (lower bound) side of what looks like the information-theoretic threshold. I’d be very grateful for references or structural ideas.

Model

Let $k \ge 3$ and $n \in \mathbb{N}$. Consider a $k$-uniform hypergraph $H$ on $N = kn$ vertices, either:

$k$-partite with parts of size $n$, or
a non-partite $k$-uniform model (both versions behave similarly for my purposes).

We generate $H$ by:

Planted perfect matching $M^*$: a fixed perfect matching of size $n$ (partitioning the vertices).
Noise edges $U$: we either
- include each potential $k$-tuple as a noise edge independently with probability
  $$ p = \frac{d}{n^{k-1}} $$ (so $\mathbb{E}|U| \approx dn$), or
- equivalently, condition on $|U| = m = dn$.

We only observe the unlabeled hypergraph $$ H = M^* \cup U, $$ and the inference problem is to recover the planted matching $M^*$.

What is already known / what I have

For a fixed $\alpha \in (0,1)$, call a perfect matching $M$ a $t$-swap matching (or “at distance $t$ from $M^*$”) if $$ |M \triangle M^*| = 2t, $$ i.e., $M$ uses exactly $t = \alpha n$ noise edges and $n - t$ planted edges.

Using a first-moment / union-bound argument over all $t$-swap matchings (for linear $t$), I compute an explicit rate function and get a threshold of the form $$ d^*(k) \approx e^{k-1}-1 \quad (k \to \infty), $$ such that (roughly):

If $d < d^*(k)$, then for each fixed $\alpha$, the expected number of $t$-swap matchings is $\exp\{-c(\alpha,k)\,n\}$; summing over $t$ shows that with high probability $M^*$ is the unique perfect matching. This is the achievability (upper) bound.

I am now trying to prove the converse:

If $d > d^*(k)$, then exact recovery of $M^*$ is impossible, because there are exponentially many alternative perfect matchings indistinguishable from $M^*$.

Approach: Kahn-style reverse deletion and the obstruction

I’m trying to adapt Jeff Kahn’s reverse-deletion martingale method from his proof of Shamir’s problem.

Start with the full noise set $U$ and delete noise edges in a uniform random order: $$ e_1, e_2, \dots, e_{|U|}. $$
Let $$ H_i = M^* \cup (U \setminus \{e_1,\dots,e_i\}) $$ be the hypergraph after $i$ deletions.

Fix a deviation $t = \alpha n$. Let $\mathcal{F}_t(H_i)$ be the set of $t$-swap perfect matchings in $H_i$, and define $$ A_i := |\mathcal{F}_t(H_i)|. $$

When we delete $e_i$, a fraction of the matchings are killed: $$ A_i = A_{i-1}(1 - \xi_i), $$ where $\xi_i$ is the “heaviness” of edge $e_i$: the fraction of $t$-swap matchings that contain $e_i$.

Conditionally, one can compute $$ \gamma_i := \mathbb{E}[\xi_i \mid H_{i-1}] = \frac{t}{m_i}, $$ where $m_i = |U \cap E(H_{i-1})|$. Summing $\gamma_i$ over a certain “log–linear” window $m \in [dn, \theta n \log n]$, one exactly recovers the exponent from the first-moment threshold $d^*(k)$. So if we could replace $\log(1-\xi_i)$ by $-\xi_i$, we would match the upper bound.

The problem is controlling the second-order loss $$ Z := \sum_i \big(\log(1-\xi_i) + \xi_i\big) \le 0, $$ which is essentially dictated by the sum of squares $\sum_i \xi_i^2$.

Define $$ D_m := \log(1 - \bar{p}_m) - \frac{1}{m}\sum_e \log(1 - p_e), $$ where:

$p_e$ is the inclusion probability of edge $e$ under the uniform measure on $\mathcal{F}_t(H_m)$, and
$\bar{p}_m = t/m$.

Then one has $$ \mathbb{E}[\log(1-\xi_i)\mid H_m] = \log(1 - \bar{p}_m) - D_m, $$ so $$ Z = -\sum_m D_m. $$

Analytically, using a Jensen / Taylor argument for $f(x) = \log(1-x)$, one can show $$ D_m \;\lesssim\; \operatorname{Var}_m(p_e), $$ where the variance is over a uniform random edge $e$ in $H_m$.

Thus, what I really need is a bound of the form $$ \sum_m D_m \;\le\; K_k\,n $$ for some $K_k = K(k)$ (ideally decaying like $1/d$ as $d$ increases). A sufficient condition would be something like $$ \operatorname{Var}_m(p_e) \ll \bar{p}_m^2 $$ uniformly in $m$ in the $[dn, \theta n \log n]$ window, or, more concretely, a “no heavy edges” statement: no edge is contained in more than a constant multiple of the typical fraction $\bar{p}_m = t/m$ of $t$-swap matchings.

This is precisely where I get stuck: I don’t know how to control the edge marginals $p_e$ in this conditional planted ensemble.

I’ve been working on this for a while; another professor recently told me the Kahn-style martingale approach is probably doomed in this sparse regime, but I’d really like to understand whether there is already something in the literature that rules it out or, conversely, a technique (SSM, cluster expansion, switching, …) that could give the needed variance bounds. As a sanity check, I’ve simulated the $k=3$ case with $n$ from $[50,100]$ using a SAT solver to test existence of alternate perfect matchings; empirically the threshold where the probability of an alternate matching jumps from $\sim 0.5$ to $\sim 1$ matches the $d^*(3)$ I computed, but of course that’s not a proof.

Cross posted on Math Overflow

Why can't you black-box JKV? It seems to me like you just need their result that with very high probability you're within an exponential factor of the expected number of matchings, or am I missing something? — Bruno Andrades
– Bruno Andrades, Commented Nov 19 at 1:43
And that every set is in the right number of perfect matchings* — Bruno Andrades
– Bruno Andrades, Commented Nov 19 at 1:46
The problem is the jkv result assumes some uniformity/symmetry because they are working in the dense regime i.e. the martingale stops early but dealing with xi_I squared term later on is very hard because the edge marginals are correlated I.e there could just be a couple of edges that matter a lot (heavy so no symmetric) — Leo V.
– Leo V., Commented Nov 19 at 1:47
That's why I'm saying to black-box their result. What they do already implies there's no heavy edges a.a.s. unless I'm misremembering — Bruno Andrades
– Bruno Andrades, Commented Nov 19 at 1:50
It could also help a lot if you could take a moment to state the problem properly and redact your question a bit more clearly — Bruno Andrades
– Bruno Andrades, Commented Nov 19 at 2:01

Stack Exchange Network

Planted Matching in Hypergraphs

Model

What is already known / what I have

Approach: Kahn-style reverse deletion and the obstruction

0

You must log in to answer this question.

Hot Network Questions

Planted Matching in Hypergraphs

Model

What is already known / what I have

Approach: Kahn-style reverse deletion and the obstruction

0

You must log in to answer this question.

Related

Hot Network Questions