Remove sublist from list

Question

I want to do the following in Python:

A = [1, 2, 3, 4, 5, 6, 7, 7, 7] C = A - [3, 4] # Should be [1, 2, 5, 6, 7, 7, 7] C = A - [4, 3] # Should not be removing anything, because sequence 4, 3 is not found

So, I simply want to remove the first appearance of a sublist (as a sequence) from another list. How can I do that?

Edit: I am talking about lists, not sets. Which implies that ordering (sequence) of items matter (both in A and B), as well as duplicates.

does order matter? what is the output if A = [1, 2, 3, 4, 5, 6, 7, 3, 8, 5, 4]? — pault
– pault, Commented Apr 9, 2018 at 16:39
so just removed all when multiple matches? or only first match? — Sphinx
– Sphinx, Commented Apr 9, 2018 at 16:40
@Sphinx Only the first match. I updated my question. Apparently a lot of things were not clear. — dimitris93
– dimitris93, Commented Apr 9, 2018 at 16:50

Netwave · Accepted Answer · 2020-12-10 11:28:22Z

31

Use sets:

C = list(set(A) - set(B))

In case you want to mantain duplicates and/or oder:

filter_set = set(B) C = [x for x in A if x not in filter_set]

edited Dec 10, 2020 at 11:28

answered Apr 9, 2018 at 16:39

Netwave

43.2k6 gold badges69 silver badges111 bronze badges

Sign up to request clarification or add additional context in comments.

7 Comments

dimitris93 Over a year ago

This is not going to maintain duplicates... Right?

dimitris93 Over a year ago

Look at my updated question. I am talking about lists, not sets. Order also matters.

Feng Over a year ago

This will not maintain the original order of the elements. Do something like [i for i in A if i not in B]

Netwave Over a year ago

@Feng, care to read the second part of the answer...

Martin Over a year ago

None of these answers work for B= [4, 3], which as the OP said, should not remove anything. Thanks for the answers though, as they would help those seeking for removing values from lists in a simple and straightforward manner.

|

pault · Accepted Answer · 2018-04-09 18:25:15Z

If you want to remove exact sequences, here is one way:

Find the bad indices by checking to see if the sublist matches the desired sequence:

bad_ind = [range(i,i+len(B)) for i,x in enumerate(A) if A[i:i+len(B)] == B] print(bad_ind) #[[2, 3]]

Since this returns a list of lists, flatten it and turn it into a set:

bad_ind_set = set([item for sublist in bad_ind for item in sublist]) print(bad_ind_set) #set([2, 3])

Now use this set to filter your original list, by index:

C = [x for i,x in enumerate(A) if i not in bad_ind_set] print(C) #[1, 2, 5, 6, 7, 7, 7]

The above bad_ind_set will remove all matches of the sequence. If you only want to remove the first match, it's even simpler. You just need the first element of bad_ind (no need to flatten the list):

bad_ind_set = set(bad_ind[0])

Update: Here is a way to find and remove the first matching sub-sequence using a short circuiting for loop. This will be faster because it will break out once the first match is found.

start_ind = None for i in range(len(A)): if A[i:i+len(B)] == B: start_ind = i break C = [x for i, x in enumerate(A) if start_ind is None or not(start_ind <= i < (start_ind + len(B)))] print(C) #[1, 2, 5, 6, 7, 7, 7]

I am thinking about this question is similar with substring query, why not consider one solution like KMP etc

Sphinx · Accepted Answer · 2018-04-09 17:25:47Z

I considered this question was like one substring search, so KMP, BM etc sub-string search algorithm could be applied at here. Even you'd like support multiple patterns, there are some multiple pattern algorithms like Aho-Corasick, Wu-Manber etc.

Below is KMP algorithm implemented by Python which is from GitHub Gist. PS: the author is not me. I just want to share my idea.

class KMP: def partial(self, pattern): """ Calculate partial match table: String -> [Int]""" ret = [0] for i in range(1, len(pattern)): j = ret[i - 1] while j > 0 and pattern[j] != pattern[i]: j = ret[j - 1] ret.append(j + 1 if pattern[j] == pattern[i] else j) return ret def search(self, T, P): """ KMP search main algorithm: String -> String -> [Int] Return all the matching position of pattern string P in S """ partial, ret, j = self.partial(P), [], 0 for i in range(len(T)): while j > 0 and T[i] != P[j]: j = partial[j - 1] if T[i] == P[j]: j += 1 if j == len(P): ret.append(i - (j - 1)) j = 0 return ret

Then use it to calcuate out the matched position, finally remove the match:

A = [1, 2, 3, 4, 5, 6, 7, 7, 7, 3, 4] B = [3, 4] result = KMP().search(A, B) print(result) #assuming at least one match is found print(A[:result[0]:] + A[result[0]+len(B):])

Output:

[2, 9] [1, 2, 5, 6, 7, 7, 7, 3, 4] [Finished in 0.201s]

PS: You can try other algorithms also. And @Pault 's answers is good enough unless you care about the performance a lot.

Vasilis G. · Accepted Answer · 2021-05-09 23:24:41Z

Here is another approach:

# Returns that starting and ending point (index) of the sublist, if it exists, otherwise 'None'. def findSublist(subList, inList): subListLength = len(subList) for i in range(len(inList)-subListLength): if subList == inList[i:i+subListLength]: return (i, i+subListLength) return None # Removes the sublist, if it exists and returns a new list, otherwise returns the old list. def removeSublistFromList(subList, inList): indices = findSublist(subList, inList) if not indices is None: return inList[0:indices[0]] + inList[indices[1]:] else: return inList A = [1, 2, 3, 4, 5, 6, 7, 7, 7] s1 = [3,4] B = removeSublistFromList(s1, A) print(B) s2 = [4,3] C = removeSublistFromList(s2, A) print(C)

Thanks, you just wrote and s1 when you wanted to say C = removeSublistFromList(s2, A), at the end.

Sam S. · Accepted Answer · 2023-05-31 02:38:41Z

Using numpy:

import numpy as np A = [1, 2, 3, 4, 5, 6, 7, 7, 7] B = [3,4] C = [4,3] list(np.setdiff1d(A,B, assume_unique=True)) output: [1, 2, 5, 6, 7, 7, 7] list(np.setdiff1d(A,C, assume_unique=True)) output: [1, 2, 5, 6, 7, 7, 7]

Collectives™ on Stack Overflow

Remove sublist from list

5 Answers 5

7 Comments

1 Comment

Comments

1 Comment

Comments

Linked

Hot Network Questions

Collectives™ on Stack Overflow

5 Answers 5

7 Comments

1 Comment

Comments

1 Comment

Comments

Linked

Related