0

I have two lists of tuples, A and B, that store pairs of data IDs. I want to remove a pair from A if that pair, as either (x, y) or (y, x), also appears in B.

I tried doing it with a for loop, as below:

A = [(321, 33), (56, 991), (645, 2), (8876, 556)]
B = [(33, 321), (645, 2)]
for pair in B:
    if pair in A:
        A.remove(pair)
    elif (pair[1], pair[0]) in A:
        A.remove((pair[1], pair[0]))
print(A)  # [(56, 991), (8876, 556)]

but when the lists are large, this code runs very slowly.

So I want to make this code faster, possibly by avoiding for loops or by using a completely different approach. Can someone help me solve this problem?

2
  • 0) Don't use remove: it scans the whole list every time to find the value to remove. Iterate over your list using an index value instead; 1) Which list is larger, A or B? Currently you iterate over B once and over A multiple times (len(B) times); 2) Can you make A sorted, somehow? 3) Don't Commented Dec 21, 2021 at 7:08
  • Sounds like it might make more sense for you to be using sets instead of lists in the first place; and possibly frozensets instead of tuples. Commented Dec 21, 2021 at 7:58
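
The frozenset idea from the comment above could look roughly like this. It is only a sketch, not the commenter's code: the names `B_keys` and `result` are made up for illustration, and it assumes every pair holds hashable values.

```python
# Sketch: treat each pair as an unordered frozenset,
# so that (x, y) and (y, x) compare equal.
A = [(321, 33), (56, 991), (645, 2), (8876, 556)]
B = [(33, 321), (645, 2)]

# Build the lookup set once; membership tests are then O(1) on average.
B_keys = {frozenset(pair) for pair in B}

# Keep the pairs of A, in their original order, whose unordered form is not in B.
result = [pair for pair in A if frozenset(pair) not in B_keys]
print(result)  # [(56, 991), (8876, 556)]
```

This makes one pass over B and one pass over A, instead of rescanning A for every pair in B.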

3 Answers

2

If A has unique items, you can convert both lists to sets and use set difference, with the caveat that you must also add the reversed tuples to set_B:

set_B = set(B).union([(j, i) for (i, j) in B])
out = list(set(A) - set_B)

Output (set order is not guaranteed):

[(56, 991), (8876, 556)]

If A doesn't have unique items and you want to keep duplicate items, you can use a list comprehension for the second line instead (this also preserves the order of A):

set_B = set(B).union([(j, i) for (i, j) in B])
out = [tpl for tpl in A if tpl not in set_B]

1 Comment

@AppleBS I mean the real performance difference will show for large lists. I ran my solution and OP's solution on lists with len(A)=1000 and len(B)=100, and my solution is about 58 times faster, at 159 µs vs 9.39 ms.
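
A comparison along these lines can be reproduced with `timeit`. The sketch below is not the commenter's benchmark: the test data, the function names `loop_remove` and `set_filter`, and the repeat count are all assumptions, and absolute timings will differ by machine.

```python
import timeit

# Assumed test data: 1000 unique pairs in A, and 100 of them (reversed) in B.
A = [(i, i + 10**6) for i in range(1000)]
B = [(y, x) for x, y in A[::10]]

def loop_remove(A, B):
    """The question's approach: scan A again for every pair in B."""
    A = list(A)  # work on a copy so every timing run starts from the same data
    for pair in B:
        if pair in A:
            A.remove(pair)
        elif (pair[1], pair[0]) in A:
            A.remove((pair[1], pair[0]))
    return A

def set_filter(A, B):
    """Set-based approach: build B plus its reversed pairs once, then filter A."""
    set_B = set(B).union((j, i) for (i, j) in B)
    return [tpl for tpl in A if tpl not in set_B]

runs = 100  # assumed repeat count
for fn in (loop_remove, set_filter):
    seconds = timeit.timeit(lambda: fn(A, B), number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e6:.0f} us per call")
```

Both functions return the same 900 pairs for this data, so the timings compare like for like.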
0

Provided that you use sets instead of lists and that the second set contains both variants of each element ((x, y) and (y, x))…

A = {(321, 33), (56, 991), (645, 2), (8876, 556)}
B = {(33, 321), (645, 2), (321, 33), (2, 645)}

You only need a set subtraction:

print(A - B) 

0

Instead of removing, create a new list by excluding the common elements:

[(x,y) for x,y in A if (x,y) not in B and (y,x) not in B] 

Output:

[(56, 991), (8876, 556)]
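
One possible refinement, sketched here under the assumption that the pairs are hashable: when B is a list, each `not in B` test scans B linearly. Converting B to a set first makes each test O(1) on average. The names `B_set` and `filtered` are illustrative only.

```python
A = [(321, 33), (56, 991), (645, 2), (8876, 556)]
B = [(33, 321), (645, 2)]

# One-time conversion; each membership test below is then O(1) on average.
B_set = set(B)

# Same comprehension as above, but testing against the set instead of the list.
filtered = [(x, y) for x, y in A if (x, y) not in B_set and (y, x) not in B_set]
print(filtered)  # [(56, 991), (8876, 556)]
```

The result keeps the order of A and any duplicates in it, since only the lookup structure changes.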

2 Comments

[(x, y) for x, y in A if (x, y) not in B and (y, x) not in B]
I ran a simple benchmark of your code vs kalcium's code. Your code is significantly slower at 145ms vs 101ms when looped 100000 times.
