Skip to main content

Minimum number of searches to find matchinglength of sublist so that they have one element in two listcommon

(Since this post is just a copy of math SE question I am modifying it to Cryoto SE)

I have two list, say list1 and list2, containing random integers and both having size $N$. We have a fixed overlap ratio in range $(0, 1)$between them. If overlap ratio is $0.3$ and $N = 10$ it means both list have around $3$ integers in common. However this doesn’t guarantee that the indices for these elements will be same. In most of the cases they will be different.

I want help regarding the minimum number of brute force searches I need to perform to find an element that occurs in both list.

Since we need to find a common element, which is guaranteed by the overlap ratio, We start by creating two list, say $A$ and $B$, and add element into them from list1 and list2 until we have: $|A \cap B| = 1$. That is both sub lists have at least one element in common.

The final goal: Minimum number of length of sublists (number of elements in them) so that both have at-least one element in common.

Here is how search is performedsublist are created: YouWe pick a random index, $i$, in range $(0, N)$ and create two sub lists: sublist1 and sublist2.

Step 3: Check if sublist1 and sublist2 have common elements irrespective of their indices in sublist. Assume that this step is also very simple and completes instantly.

I wrote a python program with overlap rationratio $0.3$ and here are the results

  1. Given $N = 10^{80}$ and overlap ratio $=0.3$ what will be the minimum number of trials that ensure there is atleast one element in both sublist?
  2. What will be the cost of implementing this (as asked in 1) and whether it's feasible?
  3. IfWhether this matching follows a birthday bound, if yes, then why?

The final goal: Minimum number of trials required to find an element of list2 that is also present in list1. And in both sublist.

Example:

Here we have $9651$ in sublist2 which occur in sublist1. Hence here trials werethe length is $7$.

Minimum number of searches to find matching element in two list

I have two list, say list1 and list2, containing integers and both having size $N$. We have a fixed overlap ratio in range $(0, 1)$. If overlap ratio is $0.3$ and $N = 10$ it means both list have around $3$ integers in common. However this doesn’t guarantee that the indices for these elements will be same. In most of the cases they will be different.

Here is how search is performed: You pick a random index, $i$, in range $(0, N)$ and create two sub lists: sublist1 and sublist2.

Step 3: Check if sublist1 and sublist2 have common elements irrespective of their indices in sublist

I wrote a python program with overlap ration $0.3$ and here are the results

  1. Given $N = 10^{80}$ and overlap ratio $=0.3$ what will be the minimum number of trials that ensure there is atleast one element in both sublist?
  2. What will be the cost of implementing this (as asked in 1) and whether it's feasible?
  3. If this matching follows a birthday bound, if yes, then why?

The final goal: Minimum number of trials required to find an element of list2 that is also present in list1. And in both sublist.

Example:

Here we have $9651$ in sublist2 which occur in sublist1. Hence here trials were $7$.

Minimum number of length of sublist so that they have one element in common

(Since this post is just a copy of math SE question I am modifying it to Cryoto SE)

I have two list, say list1 and list2, containing random integers and both having size $N$. We have a fixed overlap ratio between them. If overlap ratio is $0.3$ and $N = 10$ it means both list have around $3$ integers in common. However this doesn’t guarantee that the indices for these elements will be same. In most of the cases they will be different.

I want help regarding the minimum number of brute force searches I need to perform to find an element that occurs in both list.

Since we need to find a common element, which is guaranteed by the overlap ratio, We start by creating two list, say $A$ and $B$, and add element into them from list1 and list2 until we have: $|A \cap B| = 1$. That is both sub lists have at least one element in common.

The final goal: Minimum number of length of sublists (number of elements in them) so that both have at-least one element in common.

Here is how sublist are created: We pick a random index, $i$, in range $(0, N)$ and create two sub lists: sublist1 and sublist2.

Step 3: Check if sublist1 and sublist2 have common elements irrespective of their indices in sublist. Assume that this step is also very simple and completes instantly.

I wrote a python program with overlap ratio $0.3$ and here are the results

  1. Given $N = 10^{80}$ and overlap ratio $=0.3$ what will be the minimum number of trials that ensure there is atleast one element in both sublist?
  2. What will be the cost of implementing this (as asked in 1) and whether it's feasible?
  3. Whether this matching follows a birthday bound, if yes, then why?

Example:

Here we have $9651$ in sublist2 which occur in sublist1. Hence here the length is $7$.

Minor formatting
Source Link

I have two list, say list1list1 and list2list2, containing integers and both having size $N$. We have a fixed overlap ratio in range $(0, 1)$. If overlap ratio is $0.3$ and $N = 10$ it means both list have around $3$ integers in common. However this doesn’t guarantee that the indices for these elements will be same. In most of the cases they will be different.

$N$ denotes the length of list. Then I ran the code multiple times and after running it multiple times I calculated the average trials needed that is in Avg Trials. % of N% of N indicates the percentage of range needed to search (average).

The final goal: Minimum number of trials required to find an element of list2list2 that is also present in list1list1. And in both sublist.

Trial 7At 7th trial the picture looks like

Here we have $9651$ in sublist2 which occur in sublist1. Hence here trials arewere $7$.

I have two list, say list1 and list2, containing integers and both having size $N$. We have a fixed overlap ratio in range $(0, 1)$. If overlap ratio is $0.3$ and $N = 10$ it means both list have around $3$ integers in common. However this doesn’t guarantee that the indices for these elements will be same. In most of the cases they will be different.

$N$ denotes the length of list. Then I ran the code multiple times and after running it multiple times I calculated the average trials needed that is in Avg Trials. % of N indicates the percentage of range needed to search (average).

The final goal: Minimum number of trials required to find an element of list2 that is also present in list1. And in both sublist.

Trial 7 the picture looks like

Here we have $9651$ in sublist2 which occur in sublist1. Hence here trials are $7$.

I have two list, say list1 and list2, containing integers and both having size $N$. We have a fixed overlap ratio in range $(0, 1)$. If overlap ratio is $0.3$ and $N = 10$ it means both list have around $3$ integers in common. However this doesn’t guarantee that the indices for these elements will be same. In most of the cases they will be different.

$N$ denotes the length of list. Then I ran the code multiple times and after running it multiple times I calculated the average trials needed that is in Avg Trials. % of N indicates the percentage of range needed to search (average).

The final goal: Minimum number of trials required to find an element of list2 that is also present in list1. And in both sublist.

At 7th trial the picture looks like

Here we have $9651$ in sublist2 which occur in sublist1. Hence here trials were $7$.

Source Link

Minimum number of searches to find matching element in two list

I am aware that cross posting isn't typically allowed on SE but I think this question, earlier posted on math SE, is worth considering on crypto SE.

I have two list, say list1 and list2, containing integers and both having size $N$. We have a fixed overlap ratio in range $(0, 1)$. If overlap ratio is $0.3$ and $N = 10$ it means both list have around $3$ integers in common. However this doesn’t guarantee that the indices for these elements will be same. In most of the cases they will be different.

Here is how search is performed: You pick a random index, $i$, in range $(0, N)$ and create two sub lists: sublist1 and sublist2.

Step 1: Send list1[i] element to sublist1

Step 2: Send list2[i] element to sublist2

Step 3: Check if sublist1 and sublist2 have common elements irrespective of their indices in sublist

Step 4: If no common element is found increase $i$ with $1$ and repeat. If $i = N +1$ set $i = 1$ again.

I wrote a python program with overlap ration $0.3$ and here are the results

N Avg Trials % of N √N Ratio (Trials/√N) 120 17 14.17% 11 1.55 12,000 177 1.47% 110 1.61 100,000 539 0.539% 316 1.70 1,000,000 1,584 0.1584% 1,000 1.58 10,000,000 4,975 0.04975% 3,162 1.57 

$N$ denotes the length of list. Then I ran the code multiple times and after running it multiple times I calculated the average trials needed that is in Avg Trials. % of N indicates the percentage of range needed to search (average).

I am not good at probability. This suggests that it might be following the birthday bound. If someone could help me with more details and help generalise these results. I seek help for following:

  1. Given $N = 10^{80}$ and overlap ratio $=0.3$ what will be the minimum number of trials that ensure there is atleast one element in both sublist?
  2. What will be the cost of implementing this (as asked in 1) and whether it's feasible?
  3. If this matching follows a birthday bound, if yes, then why?

The final goal: Minimum number of trials required to find an element of list2 that is also present in list1. And in both sublist.

Example:

List1 = $[52579, 46780, 36228, 40369, 26694, 36744, 9651, 98806, 5305, 64566]$

List2 = $[46780, 56910, 80182, 63250, 53981, 5305, 76771, 9651, 61251, 76094]$

Let $i = 2$

Trial 1

sublist1 = $[46780]$

sublist2 = $[56910]$

Since there is no matching element we include next elements

Thus $i = 2+1 = 3$.

sublist1 = $[46780, 36228]$

sublist2 = $[56910, 80182]$

Again there is no matching element so we keep including more additional element.

Trial 7 the picture looks like

sublist1 = $[46780, 36228, 40369, 26694, 36744, 9651, 98806]$

sublist2 = $[56910, 80182, 63250, 53981, 5305, 76771, 9651]$

Here we have $9651$ in sublist2 which occur in sublist1. Hence here trials are $7$.