I am trying to understand the values output by an example from a Python tutorial. The output doesn't seem to be in any order that I can understand. These particular Python lines are causing me trouble:
    vocab_size = 13  # just to provide all variable values
    m = 84  # just to provide all variable values
    Y_one_hot = np.zeros((vocab_size, m))
    Y_one_hot[Y.flatten(), np.arange(m)] = 1

The input Y.flatten() evaluates to the following NumPy array:
    [ 8 9 7 4 9 7 8 4 8 7 8 12 4 8 9 8 12 7 8 9 7 12 7 2 9 7 8 7 2 0 7 8 12 2 0 8 8 12 7 0 8 6 12 7 2 8 6 5 7 2 0 6 5 10 2 0 8 5 10 1 0 8 6 10 1 3 8 6 5 1 3 11 6 5 10 3 11 5 10 1 11 10 1 3]

np.arange(m) is an array of the integers from 0 to 83:
    np.arange(m)
    [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83]

OK, so the output I am having trouble understanding is the new Y_one_hot. I receive a NumPy array containing 13 arrays (as expected), but I do not understand why the ones are located where they are, given the Y.flatten() input. For example, here is the first of the 13 arrays:

    [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

Could someone please explain how that single line gets from that input to that output array? The ones seem to be in random positions, and in some of the other 13 arrays the number of ones also seems to be random. Is this the intended behavior?
Here is a full runnable example:

    import numpy as np
    import sys
    import re

    # turn Y into one hot encoding
    Y = np.array([ 8, 9, 7, 4, 9, 7, 8, 4, 8, 7, 8, 12, 4, 8, 9, 8, 12, 7, 8, 9, 7, 12, 7, 2, 9, 7, 8, 7, 2, 0, 7, 8, 12, 2, 0, 8, 8, 12, 7, 0, 8, 6, 12, 7, 2, 8, 6, 5, 7, 2, 0, 6, 5, 10, 2, 0, 8, 5, 10, 1, 0, 8, 6, 10, 1, 3, 8, 6, 5, 1, 3, 11, 6, 5, 10, 3, 11, 5, 10, 1, 11, 10, 1, 3])
    m = 84
    vocab_size = 13
    Y_one_hot = np.zeros((vocab_size, m))
    Y_one_hot[Y.flatten(), np.arange(m)] = 1
    np.set_printoptions(threshold=sys.maxsize)
    print(Y_one_hot.astype(int))
Y.flatten() is selecting indices in the first dimension. np.arange(m) is selecting indices in the second dimension. Using the first item from each: Y_one_hot[8,0] = 1.

"Is this the intended behavior?" Are you asking why your assignment expression worked that way, or are you asking if that is the correct way to make the encoding?

np.vstack((Y,np.arange(m))).T will show you how the indices are being paired up. You can see that the 30th entry (np.vstack((Y,np.arange(m))).T[29]) is [0,29]. So your expression is assigning a one to Y_one_hot[0,29]. If that still is not making sense to you, you need to spend more time with the NumPy documentation and playing around with the examples. SO isn't a tutorial. The doc reference linked to in jakevdp's answer is relevant to your question.
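To make the pairing described in the comments concrete, here is a minimal sketch on a much smaller input. The five-element Y and vocab_size = 3 below are made-up values for illustration, not the data from the question. It compares the indexed assignment with an explicit loop over (row, column) pairs, which should give the same result:

```python
import numpy as np

# Hypothetical small labels (NOT the 84 values from the question).
Y = np.array([2, 0, 1, 0, 2])
m = Y.size
vocab_size = 3

# Indexed assignment, same form as in the question.
Y_one_hot = np.zeros((vocab_size, m))
Y_one_hot[Y.flatten(), np.arange(m)] = 1

# Explicit loop: element j of Y is the row, j itself is the column.
looped = np.zeros((vocab_size, m))
for row, col in zip(Y.flatten(), np.arange(m)):
    looped[row, col] = 1

print(Y_one_hot.astype(int))
print(np.array_equal(Y_one_hot, looped))

# Row 0 has a one in every column j where Y[j] == 0.
print(np.flatnonzero(Y_one_hot[0]), np.flatnonzero(Y == 0))
```

Read this way, each column j of Y_one_hot is the one-hot vector for Y[j], and row k has a one wherever the value k occurs in Y. That would explain why the ones in the question's first array sit at columns 29, 34, 39, 50, 55 and 60 (the positions of the zeros in Y), and why the count of ones differs per row: it is just how often each value appears.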