Given the clarifications added to the question, several of the answers posted here do not actually provide the expected answer, and others are not efficient.
One solution is to take @dreamcrash's answer but using sets for the look-up:
def order_by1(vals, xor): set_vals = set(vals) set_xor = set(xor) return [v for v in xor if v in set_vals] + [v for v in vals if v not in set_xor]
This should increase efficiency by removing the time-consuming (O(n)) list lookup on each iteration of the loop.
Since you don't care about the order of the values that are not found in xor, a variant of this is to replace the 2nd list comprehension with a set operation:
def order_by2(vals, xor): set_vals = set(vals) return [v for v in xor if v in set_vals] + list(set_vals - set(xor))
Another solution is to use ordered sets
def order_by3(vals, xor): vals_set = OrderedSet(vals) return (OrderedSet(xor) & vals_set) | vals_set
Ordered sets use dictionaries under the hood. We can instead use dictionaries directly, making use of the fact that they are ordered:
def order_by4(vals, xor): """ Order list a by each item's position in list xor, if it is found in xor, otherwise by its position in list vals """ d = {k: False for k in xor} for k in vals: d[k] = True return [k for k, v in d.items() if v]
The only other solution that actually does what is required is the first one in @schwobaseggl's answer. Timing these with lists of 800,000 and 300,000 random phrases of which 150,000 overlap:
import random import string import timeit def random_phrase(): return "".join(random.choices(string.ascii_letters, k=10)) def generate_lists(M, N, overlap): a = [random_phrase() for _ in range(M)] b = ([random_phrase() for _ in range(N - overlap)] + random.sample(a, k=overlap)) random.shuffle(b) return a, b def dreamcrash(vals, xor): return [v for v in xor if v in vals] + [v for v in vals if v not in xor] def schwobaseggl(vals, xor): order = {n: i for i, n in enumerate(xor)} len_xor = len(xor) # saves some time when sorting return sorted(vals, key=lambda x: order.get(x, len_xor)) vals, xor = generate_lists(800000, 300000, overlap=150000) # this takes a while for f in [dreamcrash, order_by1, order_by2, order_by3, order_by4, schwobaseggl]: print(f, end='...') print(timeit.timeit(stmt=f"{f.__name__}(vals, xor)", number=1, globals=globals()))
I gave up on timing dreamcrash because it was taking too long. order_by2 seems fastest, followed by order_by4, order_by1, and schwobaseggl, each taking around .5 - 1.5s on my computer. The ordered set solution is a lot slower. Checking whether a set contains an item, and setting an item in a dictionary, are both O(1) in the average case, O(n) in the worst case which explains why the dictionary and set based versions have similar performance.