10

I can't wrap my head around the csr_matrix examples in the SciPy documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html

Can someone explain how this example works?

>>> row = np.array([0, 0, 1, 2, 2, 2])
>>> col = np.array([0, 2, 2, 0, 1, 2])
>>> data = np.array([1, 2, 3, 4, 5, 6])
>>> csr_matrix((data, (row, col)), shape=(3, 3)).toarray()
array([[1, 0, 2],
       [0, 0, 3],
       [4, 5, 6]])

I believe this follows this format:

csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)])

where data, row_ind and col_ind satisfy the relationship a[row_ind[k], col_ind[k]] = data[k].

What is a here?

1
  • 2
    a is the matrix. Commented Nov 11, 2018 at 23:44

5 Answers

19

row = np.array([0, 0, 1, 2, 2, 2])
col = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])

From the arrays above, for each k from 0 to 5 (where row_ind is row and col_ind is col):

a[row_ind[k], col_ind[k]] = data[k]

a[row[0], col[0]] = a[0, 0] = 1 (from data[0])
a[row[1], col[1]] = a[0, 2] = 2 (from data[1])
a[row[2], col[2]] = a[1, 2] = 3 (from data[2])
a[row[3], col[3]] = a[2, 0] = 4 (from data[3])
a[row[4], col[4]] = a[2, 1] = 5 (from data[4])
a[row[5], col[5]] = a[2, 2] = 6 (from data[5])

So, arranging the matrix a with shape (3, 3), where every position not listed above stays 0:

a    0  1  2
0  [1, 0, 2]
1  [0, 0, 3]
2  [4, 5, 6]
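
The assignment rule above can be checked with a short sketch. It fills a plain NumPy array one entry at a time and compares the result with what csr_matrix produces. The names `a`, `sparse` and `same` are only illustrative, and the loop matches SciPy here because no (row, col) pair is repeated; with repeated pairs SciPy sums the values instead of overwriting them.

```python
import numpy as np
from scipy.sparse import csr_matrix

row = np.array([0, 0, 1, 2, 2, 2])
col = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])

# Start from an all-zero 3x3 array and apply a[row[k], col[k]] = data[k].
a = np.zeros((3, 3), dtype=data.dtype)
for k in range(len(data)):
    a[row[k], col[k]] = data[k]

# Build the same matrix through SciPy and compare the dense forms.
sparse = csr_matrix((data, (row, col)), shape=(3, 3))
same = np.array_equal(a, sparse.toarray())
print(a)
print(same)
```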

9

This is a sparse matrix, so it stores only the explicit indices and the values at those indices. For example, row=0 and col=0 correspond to 1 (the first entries of all three arrays in your example), so the [0,0] entry of the matrix is 1. The same holds for every other position k in the arrays.
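
To see which entries are actually stored, one option is to convert the matrix to COO form and list its (row, col, value) triples. This is a minimal sketch using the question's arrays; `m`, `coo` and `triples` are illustrative names.

```python
import numpy as np
from scipy.sparse import csr_matrix

row = np.array([0, 0, 1, 2, 2, 2])
col = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])

m = csr_matrix((data, (row, col)), shape=(3, 3))

# COO form exposes one (row, col, value) triple per stored entry.
coo = m.tocoo()
triples = sorted(zip(coo.row.tolist(), coo.col.tolist(), coo.data.tolist()))

print(m.nnz)      # number of stored entries
print(triples)
```

Only 6 of the 9 positions are stored; the other 3 are implied zeros.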

4

Represent the "data" in a 4 X 4 Matrix:

data = np.array([10,0,5,99,25,9,3,90,12,87,20,38,1,8])
indices = np.array([0,1,2,3,0,2,3,0,1,2,3,1,2,3])
indptr = np.array([0,4,7,11,14])

[Figure: illustration of CSR_Matrix]

  • 'indptr' (index pointers) is an array of offsets into 'indices' (the column indices) and 'data'; it is not a linked list.
  • indptr[i] and indptr[i+1] delimit row i: its column indices are indices[indptr[i]:indptr[i+1]] and its values are data[indptr[i]:indptr[i+1]].
  • 14 is len(data), so 'indptr' can also be written as indptr = np.array([0,4,7,11,len(data)]).
  • 0,4 --> 0:4 points to column indices 0,1,2,3 (row 0)
  • 4,7 --> 4:7 points to column indices 0,2,3 (row 1)
  • 7,11 --> 7:11 points to column indices 0,1,2,3 (row 2)
  • 11,14 --> 11:14 points to column indices 1,2,3 (row 3)
# Representing the data in a 4 x 4 matrix
a = csr_matrix((data, indices, indptr), shape=(4, 4), dtype=int)
a.todense()
matrix([[10,  0,  5, 99],
        [25,  0,  9,  3],
        [90, 12, 87, 20],
        [ 0, 38,  1,  8]])
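
The way indptr slices indices and data can also be spelled out by hand. The following is a rough sketch, not how SciPy does it internally: it walks the rows, takes the slice indptr[i]:indptr[i+1], and writes the values into a dense array. `dense`, `start` and `end` are illustrative names.

```python
import numpy as np
from scipy.sparse import csr_matrix

data = np.array([10, 0, 5, 99, 25, 9, 3, 90, 12, 87, 20, 38, 1, 8])
indices = np.array([0, 1, 2, 3, 0, 2, 3, 0, 1, 2, 3, 1, 2, 3])
indptr = np.array([0, 4, 7, 11, 14])

n_rows = len(indptr) - 1          # one pointer pair per row
dense = np.zeros((n_rows, 4), dtype=data.dtype)

for i in range(n_rows):
    start, end = indptr[i], indptr[i + 1]
    # Columns indices[start:end] of row i receive data[start:end].
    dense[i, indices[start:end]] = data[start:end]

# Compare with SciPy's own expansion of the same three arrays.
matches = np.array_equal(dense, csr_matrix((data, indices, indptr), shape=(4, 4)).toarray())
print(dense)
print(matches)
```

Note that data[1] is an explicitly stored 0 in this example, which is why row 0 has four stored entries even though one of them is zero.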

Another Stackoverflow explanation

1

As far as I understand, the row and col arrays hold the indices of the non-zero values in the matrix: a[0, 0] = 1, a[0, 2] = 2, a[1, 2] = 3, and so on. Since there are no indices for a[0, 1], a[1, 0] and a[1, 1], the corresponding values in the matrix are 0.
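
A quick way to confirm this is to read the positions that have no entry in row and col. A small sketch with the question's arrays; `m` and `missing` are illustrative names.

```python
import numpy as np
from scipy.sparse import csr_matrix

row = np.array([0, 0, 1, 2, 2, 2])
col = np.array([0, 2, 2, 0, 1, 2])
data = np.array([1, 2, 3, 4, 5, 6])

m = csr_matrix((data, (row, col)), shape=(3, 3))

# These positions never appear in (row, col), so they read back as 0.
missing = [int(m[0, 1]), int(m[1, 0]), int(m[1, 1])]
print(missing)
```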

Also, maybe this little intro will be helpful for you: https://www.youtube.com/watch?v=Lhef_jxzqCg

0

@Rohit Pandey explained it correctly; I just want to add an example.

When most of the elements of a matrix are 0, we call it a sparse matrix. A sparse format leaves out the zero elements, which saves memory and computing time. We store only the non-zero items together with their respective row and column indices. For example:

0 3 0 4
0 5 7 0
0 0 0 0
0 2 6 0

We build the sparse representation by listing, for each non-zero item, its row index first, then its column index, and finally its value, like the following:

Row     0 0 1 1 3 3
Column  1 3 1 2 1 2
Value   3 4 5 7 2 6

By reversing the process we get the ordinary (dense) matrix form back from the sparse form.
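
Both directions can be sketched in a few lines. Here np.nonzero extracts the row and column indices of the non-zero items, and csr_matrix rebuilds the dense matrix from them. `dense`, `rows`, `cols`, `values` and `rebuilt` are illustrative names.

```python
import numpy as np
from scipy.sparse import csr_matrix

dense = np.array([[0, 3, 0, 4],
                  [0, 5, 7, 0],
                  [0, 0, 0, 0],
                  [0, 2, 6, 0]])

# Dense -> sparse triples: indices of the non-zero items, then their values.
rows, cols = np.nonzero(dense)
values = dense[rows, cols]
print(rows, cols, values)

# Sparse triples -> dense: the reverse step.
rebuilt = csr_matrix((values, (rows, cols)), shape=dense.shape).toarray()
print(np.array_equal(rebuilt, dense))
```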
