I have a NumPy array:

A = np.array(['abcd','bcde','cdef']) 

I need a hash array B of A, where each element is computed as

B[i] = ord(A[i][1]) * 256 + ord(A[i][2])

so that for the array above:

B = np.array([ord('b') * 256 + ord('c'), ord('c') * 256 + ord('d'), ord('d') * 256 + ord('e')])

How can I do it?

2 Answers

Based on the question, I assume the strings are ASCII and that every string has at least 3 characters.

You can start by converting the strings to ASCII byte strings, for the sake of performance and simplicity (this creates a new temporary array). You can then view that array as one big flat buffer of bytes without any copy (NumPy stores fixed-width strings contiguously in memory), which also converts the characters to integers at no extra cost. Finally, strided slicing computes all the hashes in a vectorized way. Here is how:

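ascii = A.astype('S')          # unicode -> fixed-width byte strings (temporary copy)
buff = ascii.view(np.uint8)    # flat view of all the bytes, no copy
# Cast to uint16 before multiplying: uint8 cannot hold values >= 256,
# and NumPy 2 raises OverflowError for a uint8 array times 256.
result = buff[1::ascii.itemsize].astype(np.uint16) * 256 + buff[2::ascii.itemsize]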
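As a quick sanity check, here is a minimal sketch that wraps the steps above in a function and compares the result with the pure-Python formula from the question. The name `two_byte_hash` is only an illustration (it is not a NumPy function), and the sketch assumes ASCII strings with at least 3 characters each.

```python
import numpy as np

def two_byte_hash(strings):
    """Hypothetical helper: ord(s[1]) * 256 + ord(s[2]) for every string."""
    # Unicode -> fixed-width byte strings; raises if a string is not ASCII.
    ascii_arr = np.asarray(strings).astype('S')
    # Flat uint8 view over the bytes of all strings (no copy).
    buff = ascii_arr.view(np.uint8)
    step = ascii_arr.itemsize            # bytes reserved per string
    hi = buff[1::step].astype(np.uint16) # second character, widened so * 256 fits
    lo = buff[2::step]                   # third character
    return hi * 256 + lo

A = np.array(['abcd', 'bcde', 'cdef'])
expected = np.array([ord(s[1]) * 256 + ord(s[2]) for s in A])
result = two_byte_hash(A)
print(np.array_equal(result, expected))
```

The stride is `itemsize` because every string occupies the same number of bytes in the converted array, so the second and third characters of consecutive strings are always `itemsize` bytes apart.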
2 Comments

Super! And if the input array "A" contains strings of different lengths, how can I cut them to length 3: ascii = A.astype('S3')?
Yes, S3 does the job. Note that this is not required here (NumPy uses a fixed-size buffer per string internally and pads shorter strings with null bytes to support variable-length strings). However, converting only 3 characters per string should be faster than converting every character of the unicode strings.
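To illustrate the 'S3' variant from this exchange, here is a small sketch with strings of different lengths. The sample strings are made up for the example, and it still assumes ASCII input with at least 3 characters per string.

```python
import numpy as np

A = np.array(['abcd', 'bcdefgh', 'cde'])   # lengths 4, 7 and 3
ascii3 = A.astype('S3')                    # keep only the first 3 bytes of each string
buff = ascii3.view(np.uint8)               # flat byte view; itemsize is now 3
# Widen to uint16 before multiplying so the product fits.
result = buff[1::ascii3.itemsize].astype(np.uint16) * 256 + buff[2::ascii3.itemsize]

expected = [ord(s[1]) * 256 + ord(s[2]) for s in A]
print(ascii3)
print(result.tolist() == expected)
```

With 'S3' the stride is always 3, regardless of the longest string in the input.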
Congratulations! The speed increased about four times!

import time
import numpy as np

Iter = 1000000
A = np.array(['abcd','bcde','cdef','defg'] * Iter)

# Baseline: pure-Python loop
Ti = time.time()
B = np.zeros(A.size)
for i in range(A.size):
    B[i] = ord(A[i][1]) * 256 + ord(A[i][2])
DT1 = time.time() - Ti

# Vectorized version from the answer above
Ti = time.time()
ascii = A.astype('S')
buff = ascii.view(np.uint8)
# Cast to uint16 first: NumPy 2 raises OverflowError for a uint8 array times 256
result = buff[1::ascii.itemsize].astype(np.uint16) * 256 + buff[2::ascii.itemsize]
DT2 = time.time() - Ti

print("Equal = %s" % np.array_equal(B, result))
print("DT1=%7.2f Sec, DT2=%7.2f Sec, DT1/DT2=%6.2f" % (DT1, DT2, DT1/DT2))

Output:

Equal = True

DT1= 3.37 Sec, DT2= 0.82 Sec, DT1/DT2= 4.11
