I am considering the following probabilistic balls-into-bins model. There are $n$ bins and two types of balls. For each type, there are $\rho$ balls. Each ball independently lands in bin $i$ with probability $p_i$, where $\sum_{i=1}^{n} p_i = 1$.
Define the following random variables:
- Let $X_i$ be the number of type-1 balls in bin $i$.
- Let $Y_i$ be the number of type-2 balls in bin $i$.
- A collision in bin $i$ occurs if there is at least one ball of each type in bin $i$, i.e., if $X_i \geq 1$ and $Y_i \geq 1$.
- Let $C_i$ be the indicator random variable for a collision in bin $i$, so $C_i = \mathbf{1}_{X_i \geq 1, Y_i \geq 1}$.
I am interested in bounding the probability that there is no collision anywhere, i.e., $$ P\left( \sum_{i=1}^{n} C_i = 0 \right) = P(\forall i, C_i = 0), $$ with an upper bound $O(n^{-c})$ where $\rho = O(\sqrt{n \log n})$ (the constant $c$ is hidden inside $\rho$).
Challenges:
A natural approach is to use the independence of the bins. However, the random variables $C_i$ are not independent, since they depend on the shared total number of balls of each type. This prevents a straightforward product-bound.
A common technique for approximating such problems is the Poisson approximation (see e.g., Mitzenmacher & Upfal, Probability and Computing), which often gives a good upper bound up to a factor of 2. My goal is to use this approximation to obtain a tight upper bound, possibly followed by Jensen’s inequality or other methods.
Roadblock:
The usual Poisson approximation applies to a single balls-into-bins process, while here we have two dependent processes. One idea I had was to define an alternative single-process model:
- Consider $2\rho$ balls of one type.
- Instead of $n$ bins, use $2n$ bins, indexed by $\{1, \dots, n, n+1, \dots, 2n\}$.
- Place each ball in bin $i$ or bin $i+n$ with probability $0.5 p_i$, so that the probability of either bin $i$ or bin $i+n$ receiving a ball is still $p_i$.
- I define a collision in this model as bins $i$ and $i+n$ both containing a ball.
I thought that this new process may upper bound the original one, and I was hoping to use the Poisson approximation on it. However, I am unsure how to formally argue that this new model indeed provides the desired upper bound for the original process.
Question:
How can I rigorously bound $P(\forall i, C_i = 0)$ using the Poisson approximation or another technique? Specifically, is my proposed transformation into a single-process model valid for upper bounding the original probability? If not, is there another way to effectively apply the Poisson approximation in this setting?