I am trying to convert an Eigen sparse matrix of size NxN to a cuSPARSE matrix so that I can use CUDA to solve the system Ax=B. I came across this thread, which is pretty old (Convert Eigen::SparseMatrix to cuSparse and vice versa). I am not sure what the size of *val is in that example or what it represents. I would appreciate any help, as I am new to CUDA.

1 Answer

In Eigen's sparse matrix representation:

  • A.valuePtr() / *val points to the non-zero values (length = A.nonZeros()).

  • A.innerIndexPtr() contains the column indices (if row-major) or row indices (if column-major) corresponding to each non-zero value (length = A.nonZeros()).

  • A.outerIndexPtr() contains offsets marking the start of each row (in row-major CSR) or column (in column-major CSC) in the above arrays (length = N + 1).
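To make the three arrays concrete, here is a minimal sketch in plain C++ (no Eigen or CUDA required) that builds them from a small dense matrix. The struct CsrArrays and the function denseToCsr are hypothetical names used only for illustration; they are not part of Eigen or cuSPARSE.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for illustration. It mirrors the three arrays
// that Eigen exposes for a compressed row-major (CSR) matrix.
struct CsrArrays {
    std::vector<float> val;  // non-zero values, length nnz       (valuePtr)
    std::vector<int> col;    // column index of each value, nnz   (innerIndexPtr)
    std::vector<int> row;    // start offset of each row, rows+1  (outerIndexPtr)
};

// Build CSR arrays from a dense matrix stored row by row, skipping exact zeros.
CsrArrays denseToCsr(const std::vector<float>& dense, int rows, int cols) {
    assert(dense.size() == static_cast<std::size_t>(rows) * cols);
    CsrArrays out;
    out.row.push_back(0);  // the first row always starts at offset 0
    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < cols; ++j) {
            const float v = dense[static_cast<std::size_t>(i) * cols + j];
            if (v != 0.0f) {
                out.val.push_back(v);
                out.col.push_back(j);
            }
        }
        // One past the last entry of row i, which is also the start of row i+1.
        out.row.push_back(static_cast<int>(out.val.size()));
    }
    return out;
}
```

For the 3x3 matrix with rows (1 0 2), (0 0 3), (4 5 0), this gives val = {1, 2, 3, 4, 5}, col = {0, 2, 2, 0, 1} and row = {0, 2, 3, 5}. So val has one entry per non-zero, and the last entry of row equals nnz.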

First, make sure your matrix is compressed:

A.makeCompressed(); 

If you want CSR format explicitly (cuSPARSE code commonly uses CSR), either declare Eigen's matrix as row-major from the beginning:

Eigen::SparseMatrix<float, Eigen::RowMajor> A(N, N); 

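or, if you already have a column-major matrix A, assign it to a row-major one before extracting the pointers. Eigen converts the storage order during the assignment, so no transpose is needed (assigning A.transpose() would give you the CSR arrays of the transposed matrix instead):

Eigen::SparseMatrix<float, Eigen::RowMajor> A_CSR = A;
A_CSR.makeCompressed();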
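Going from column-major (CSC) to row-major (CSR) storage means regrouping the same non-zeros by row instead of by column. The following plain C++ sketch shows one common way to do that with a counting pass and a prefix sum. It is only an illustration of the idea, not Eigen's actual implementation, and the names Compressed and cscToCsr are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: values, inner indices, outer offsets.
// For CSC, inner = row indices and outer = column offsets.
// For CSR, inner = column indices and outer = row offsets.
struct Compressed {
    std::vector<float> val;
    std::vector<int> inner;
    std::vector<int> outer;
};

Compressed cscToCsr(const Compressed& csc, int rows, int cols) {
    assert(csc.outer.size() == static_cast<std::size_t>(cols) + 1);
    const int nnz = csc.outer[cols];
    Compressed csr;
    csr.val.resize(nnz);
    csr.inner.resize(nnz);
    csr.outer.assign(rows + 1, 0);

    // 1. Count the non-zeros in each row (stored shifted by one slot).
    for (int k = 0; k < nnz; ++k) {
        csr.outer[csc.inner[k] + 1]++;
    }
    // 2. Prefix sum turns the counts into row start offsets.
    for (int i = 0; i < rows; ++i) {
        csr.outer[i + 1] += csr.outer[i];
    }
    // 3. Scatter each entry into the next free slot of its row.
    //    Visiting columns in order keeps column indices sorted within a row.
    std::vector<int> next(csr.outer.begin(), csr.outer.end() - 1);
    for (int j = 0; j < cols; ++j) {
        for (int k = csc.outer[j]; k < csc.outer[j + 1]; ++k) {
            const int dst = next[csc.inner[k]]++;
            csr.val[dst] = csc.val[k];
            csr.inner[dst] = j;
        }
    }
    return csr;
}
```

Note that the values are reordered and the index arrays change, which is why the CSC arrays of a matrix cannot simply be passed to a routine that expects CSR (doing so describes the transposed matrix).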

Then extract data pointers:

int N = A_CSR.rows();
int nnz = A_CSR.nonZeros();
float* h_val = A_CSR.valuePtr();
int* h_col = A_CSR.innerIndexPtr();
int* h_row = A_CSR.outerIndexPtr();

Allocate GPU memory and copy data:

float* d_val;
int* d_col;
int* d_row;
cudaMalloc(&d_val, nnz * sizeof(float));
cudaMalloc(&d_col, nnz * sizeof(int));
cudaMalloc(&d_row, (N + 1) * sizeof(int));
cudaMemcpy(d_val, h_val, nnz * sizeof(float), cudaMemcpyHostToDevice);
cudaMemcpy(d_col, h_col, nnz * sizeof(int), cudaMemcpyHostToDevice);
cudaMemcpy(d_row, h_row, (N + 1) * sizeof(int), cudaMemcpyHostToDevice);

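Finally, call a cuSPARSE routine, for example y = alpha * A * x + beta * y. Here handle, descr, alpha, beta, d_x and d_y are assumed to be set up already. The legacy call looks like this (cusparseScsrmv was removed in CUDA 11; newer toolkits provide the generic cusparseSpMV API for the same operation):

cusparseScsrmv(handle, CUSPARSE_OPERATION_NON_TRANSPOSE,
               N, N, nnz, &alpha, descr,
               d_val, d_row, d_col,
               d_x, &beta, d_y);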
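When you are new to this, it helps to check the GPU result against a simple host-side version of the same operation. Below is a minimal CPU reference for y = alpha * A * x + beta * y over the CSR arrays. The function name csrMatVecReference is hypothetical, and the sketch assumes zero-based indices, which is what Eigen produces.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference for y = alpha * A * x + beta * y, with A given in CSR form.
// rowPtr has rows + 1 entries; val and colInd have one entry per non-zero.
void csrMatVecReference(int rows,
                        const std::vector<float>& val,
                        const std::vector<int>& rowPtr,
                        const std::vector<int>& colInd,
                        float alpha,
                        const std::vector<float>& x,
                        float beta,
                        std::vector<float>& y) {
    assert(rowPtr.size() == static_cast<std::size_t>(rows) + 1);
    assert(y.size() == static_cast<std::size_t>(rows));
    for (int i = 0; i < rows; ++i) {
        float sum = 0.0f;
        // Entries of row i live in the half-open range [rowPtr[i], rowPtr[i+1]).
        for (int k = rowPtr[i]; k < rowPtr[i + 1]; ++k) {
            sum += val[k] * x[colInd[k]];
        }
        y[i] = alpha * sum + beta * y[i];
    }
}
```

You could copy d_y back to the host with cudaMemcpy and compare it element by element with this result, using a small tolerance for floating-point differences.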

In short, Eigen's valuePtr(), innerIndexPtr() and outerIndexPtr() give you the three compressed-storage arrays directly; copy them to the GPU and let cuSPARSE handle the rest.

2 Comments

Not sure why you are transposing. cuSPARSE supports CSC and CSR format, just like Eigen. Plus, I don't think it's a good idea to transpose after extracting the pointers. But what you should do before getting the pointers is call A.makeCompressed() to ensure that the matrix is indeed in that exact format.
Thank you for the comment, I am a bit rusty on the topic. I edited my answer based on yours.
