1

I need to fill a column with the values that are present in a given set but not present in any other column of the same row.

initial df

   c0  c1  c2  c3  c4  c5
0   4   5   6   3   2   1
1   1   5   4   0   2   3
2   5   6   4   0   1   3
3   5   4   6   2   0   1
4   5   6   4   0   1   3
5   0   1   4   5   6   2

I need a df['c6'] column that holds, for each row, the set difference between set([0,1,2,3,4,5,6]) and the values in that row,

so that the resulting df is

   c0  c1  c2  c3  c4  c5  c6
0   4   5   6   3   2   1   0
1   1   5   4   0   2   3   6
2   5   6   4   0   1   3   2
3   5   4   6   2   0   1   3
4   5   6   4   0   1   3   2
5   0   1   4   5   6   2   3

Thank you!

3 Answers

2

Slightly different approach:

df['c6'] = sum(range(7)) - df.sum(axis=1) 

or if you want to be more verbose:

df['c6'] = sum([0,1,2,3,4,5,6]) - df.sum(axis=1) 
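The sum trick only works when every row holds all but one value of the set, with no duplicates. A minimal, self-contained sketch that checks this before applying it is shown below; the frame is copied from the question, and the names `full`, `rows` and `rows_ok` are illustrative, not part of the answer.

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame(
    [[4, 5, 6, 3, 2, 1],
     [1, 5, 4, 0, 2, 3],
     [5, 6, 4, 0, 1, 3],
     [5, 4, 6, 2, 0, 1],
     [5, 6, 4, 0, 1, 3],
     [0, 1, 4, 5, 6, 2]],
    columns=["c0", "c1", "c2", "c3", "c4", "c5"],
)

full = list(range(7))  # the complete set 0..6

# Assumption behind the trick: each row contains len(full) - 1 distinct
# values, all drawn from `full`, so exactly one value is missing.
rows = df.to_numpy().tolist()
rows_ok = all(
    len(set(row)) == len(full) - 1 and set(row) <= set(full)
    for row in rows
)
if not rows_ok:
    raise ValueError("sum trick needs exactly one missing value per row")

# Missing value = total of the full set minus the row total (vectorised).
df["c6"] = sum(full) - df.sum(axis=1)
print(df["c6"].tolist())
```

If the check fails, one of the set-based approaches in the other answers is probably a better fit.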

2 Comments

but would scale quite well for large dataframes because of vectorisation, I think.
Though this approach is not universal, it works best for my particular problem. Thank you!
0

Use NumPy's setdiff1d to find the difference between the full array and each row, and assign the result to column c6:

import numpy as np

ck = np.array([0,1,2,3,4,5,6])
M = df.to_numpy()
# setdiff1d returns the sorted values of ck that are absent from row i
df['c6'] = [np.setdiff1d(ck,i)[0] for i in M]

   c0  c1  c2  c3  c4  c5  c6
0   4   5   6   3   2   1   0
1   1   5   4   0   2   3   6
2   5   6   4   0   1   3   2
3   5   4   6   2   0   1   3
4   5   6   4   0   1   3   2
5   0   1   4   5   6   2   3
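Note that `[0]` keeps only the smallest missing value and raises an IndexError when a row is missing nothing. A hedged, self-contained variant that makes this explicit might look like the sketch below; the guard and the name `missing` are additions for illustration, not part of the answer.

```python
import numpy as np
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame(
    [[4, 5, 6, 3, 2, 1],
     [1, 5, 4, 0, 2, 3],
     [5, 6, 4, 0, 1, 3],
     [5, 4, 6, 2, 0, 1],
     [5, 6, 4, 0, 1, 3],
     [0, 1, 4, 5, 6, 2]],
    columns=["c0", "c1", "c2", "c3", "c4", "c5"],
)

ck = np.arange(7)  # the full set 0..6 as an array

# One array of missing values per row; setdiff1d returns them sorted and unique.
missing = [np.setdiff1d(ck, row) for row in df.to_numpy()]

# Guard (an assumption about what is wanted): insist on exactly one missing
# value per row instead of silently taking the first or hitting an IndexError.
if any(len(m) != 1 for m in missing):
    raise ValueError("expected exactly one missing value in every row")

df["c6"] = [int(m[0]) for m in missing]
print(df["c6"].tolist())
```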

0

A simple way I could think of is using a list comprehension and set difference:

s = {0, 1, 2, 3, 4, 5, 6}
s
{0, 1, 2, 3, 4, 5, 6}

df['c6'] = [tuple(s.difference(vals))[0] for vals in df.values]
df

   c0  c1  c2  c3  c4  c5  c6
0   4   5   6   3   2   1   0
1   1   5   4   0   2   3   6
2   5   6   4   0   1   3   2
3   5   4   6   2   0   1   3
4   5   6   4   0   1   3   2
5   0   1   4   5   6   2   3
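One thing to watch with this version: `df.values` covers every column, so running the assignment a second time (once c6 exists) leaves the difference empty and `[0]` raises an IndexError. A possible way around that, sketched here with a hypothetical helper `add_missing_column` that is not part of the answer, is to name the source columns explicitly. It uses `sorted(...)[0]` rather than `tuple(...)[0]` so the choice is deterministic if more than one value happens to be missing.

```python
import pandas as pd


def add_missing_column(df, universe, source_cols, name="c6"):
    """Hypothetical helper: return a copy of df with the missing value per row."""
    universe = set(universe)
    out = df.copy()
    out[name] = [
        # Smallest value of the universe that is absent from this row.
        sorted(universe.difference(vals))[0]
        for vals in out[source_cols].to_numpy().tolist()
    ]
    return out


# Sample frame copied from the question.
df = pd.DataFrame(
    [[4, 5, 6, 3, 2, 1],
     [1, 5, 4, 0, 2, 3],
     [5, 6, 4, 0, 1, 3],
     [5, 4, 6, 2, 0, 1],
     [5, 6, 4, 0, 1, 3],
     [0, 1, 4, 5, 6, 2]],
    columns=["c0", "c1", "c2", "c3", "c4", "c5"],
)

cols = ["c0", "c1", "c2", "c3", "c4", "c5"]
result = add_missing_column(df, range(7), cols)
# Calling it again on the result is safe because c6 is not a source column.
again = add_missing_column(result, range(7), cols)
print(again["c6"].tolist())
```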
