40

Given an array 'a' I would like to sort the array by columns sort(a, axis=0) do some stuff to the array and then undo the sort. By that I don't mean re sort but basically reversing how each element was moved. I assume argsort() is what I need but it is not clear to me how to sort an array with the results of argsort() or more importantly apply the reverse/inverse of argsort()

Here is a little more detail

I have an array a, shape(a) = rXc I need to sort each column

aargsort = a.argsort(axis=0) # May use this later aSort = a.sort(axis=0) 

now average each row

aSortRM = asort.mean(axis=1) 

now replace each col in a row with the row mean. is there a better way than this

aWithMeans = ones_like(a) for ind in range(r) # r = number of rows aWithMeans[ind]* aSortRM[ind] 

Now I need to undo the sort I did in the first step. ????

2
  • 1
    Why can't you just make a copy of the array: a.copy() before any transformations or use aSort = numpy.sort(axis=0) (that will return sorted copy)? btw, a.sort() returns nothing therefore there is no point to assign its return value. Commented Mar 20, 2010 at 16:05
  • @J.F. Sebastian, Thanks you are right I fixed it Commented Mar 20, 2010 at 16:46

7 Answers 7

87

There are probably better solutions to the problem you are actually trying to solve than this (performing an argsort usually precludes the need to actually sort), but here you go:

>>> import numpy as np >>> a = np.random.randint(0,10,10) >>> aa = np.argsort(a) >>> aaa = np.argsort(aa) >>> a # original array([6, 4, 4, 6, 2, 5, 4, 0, 7, 4]) >>> a[aa] # sorted array([0, 2, 4, 4, 4, 4, 5, 6, 6, 7]) >>> a[aa][aaa] # undone array([6, 4, 4, 6, 2, 5, 4, 0, 7, 4]) 
Sign up to request clarification or add additional context in comments.

1 Comment

Very descriptive variable names ;)
13

For all those still looking for an answer:

In [135]: r = rand(10) In [136]: i = argsort(r) In [137]: r_sorted = r[i] In [138]: i_rev = zeros(10, dtype=int) In [139]: i_rev[i] = arange(10) In [140]: allclose(r, r_sorted[i_rev]) Out[140]: True 

1 Comment

BTW, zeros(10, dtype=int) can be replaced by zeros_like(i).
9

I'm not sure how best to do it in numpy, but, in pure Python, the reasoning would be:

aargsort is holding a permutation of range(len(a)) telling you where the items of aSort came from -- much like, in pure Python:

>>> x = list('ciaobelu') >>> r = range(len(x)) >>> r.sort(key=x.__getitem__) >>> r [2, 4, 0, 5, 1, 6, 3, 7] >>> 

i.e., the first argument of sorted(x) will be x[2], the second one x[4], and so forth.

So given the sorted version, you can reconstruct the original by "putting items back where they came from":

>>> s = sorted(x) >>> s ['a', 'b', 'c', 'e', 'i', 'l', 'o', 'u'] >>> original = [None] * len(s) >>> for i, c in zip(r, s): original[i] = c ... >>> original ['c', 'i', 'a', 'o', 'b', 'e', 'l', 'u'] >>> 

Of course there are going to be tighter and faster ways to express this in numpy (which unfortunately I don't know inside-out as much as I know Python itself;-), but I hope this helps by showing the underlying logic of the "putting things back in place" operation you need to perform.

Comments

8

Super late to the game, but here:

import numpy as np N = 1000 # or any large integer x = np.random.randn( N ) I = np.argsort( x ) J = np.argsort( I ) print( np.allclose( x[I[J]] , x ) ) >> True 

Basically, argsort the argsort because the nth element of the reverse sort is J[n] = k : I[k] = n. That is, I[J[n]] = n, so J sorts I.

4 Comments

This is by far the best solution
It is a good answer, but how does it differ from @paul's solution?
Thanks Dexter. Its the same. Clearer, more generally tested, and actually explained instead of just given. I'm obviously biased though. But same as @paul.
that doesn't undo the argsort. I[J] is just 0,1,2,3,4,5,6,... so it is just a no-op. Instead you should do x[I][J]
2

indices=np.argsort(a) gives you the sorting indices, such that x = a[indices] is the sorted array. y = b[indices] pushs forward the array b to sorted domain. c[indices] = z pulls back z from sorted domain into c in source domain.

For instance,

import numpy as np n = 3 a = np.random.randint(0,10,n) # [1, 5, 2] b = np.random.randn(n) # [-.1, .5, .2] c = np.empty_like(a) # indices that sort a: x=a[indices], x==np.sort(a) is all True indices = np.argsort(a) # [0,2,1] # y[i] is the value in b at the index of the i-th smallest value in a y = b[indices] # [-.1, .2, .5] # say z[i] is some value related to the i-th smallest entry in a z = np.random.randn(n) # [-1.1, 1.2, 1.3] c[indices] = z # inverted the sorting map, c = [-1.1, 1.3, 1.2] 

Comments

1

I was not able to follow your example, but the more abstract problem--i.e., how to sort an array then reverse the sort--is straightforward.

import numpy as NP # create an 10x6 array to work with A = NP.random.randint(10, 99, 60).reshape(10, 6) # for example, sort this array on the second-to-last column, # breaking ties using the second column (numpy requires keys in # "reverse" order for some reason) keys = (A[:,1], A[:,4]) ndx = NP.lexsort(keys, axis=0) A_sorted = NP.take(A, ndx, axis=0) 

To "reconstruct" A from A_sorted is trivial because remember that you used an index array ('ndx') to sort the array in the first place.

# ndx array for example above: array([6, 9, 8, 0, 1, 2, 4, 7, 3, 5]) 

In other words, the 4th row in A_sorted was the 1st row in the original array, A, etc.

1 Comment

I actually want to sort each column individually, I correct my code at the top but I need to work with np.sort(A, axis=0) so the inex would be np.argsort(x, axis=0)
1

A faster alternative to getting the inverse of argsort, inspired by Alex Martelli

def get_inv (argsort): argsort_inv = np.arange(len(argsort)) argsort_inv[argsort] = argsort_inv.copy() return argsort_inv 

For arrays greater in size than 1E4 I see a 10x performance gain

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.