I have a CSV file whose data I'm trying to convert into a Dense Design Matrix using the Pylearn2 module. However, the error comes from my own implementation rather than from Pylearn2.

    import csv
    import sys

    import numpy as np


    def load_data(fileName_X, start, stop):
        with open(fileName_X, 'r') as f:
            reader_X = csv.reader(f, delimiter=';')
            X = []
            header = True
            size = 0
            for row_X in reader_X:
                # Skip the header line
                if header:
                    header = False
                    continue
                # Convert every field of the row to a float
                row_X = [float(elem_X) for elem_X in row_X]
                size += 1
                X.append(row_X)
        X = np.asarray(X)
        return X


    if __name__ == "__main__":
        train = load_data(sys.argv[1], 0, 10)

Here are some rows of the .csv file:

    -1;-0.844511;-0.339286;-1;0.0769231;-0.25;-0.929825;1;1;-0.880597;1;0;-0.92;-0.99
    1;-0.796992;-0.8275;-1;0.0769231;-0.25;-0.859649;1;1;-0.671642;1;-1;-0.8;-0.94
    1;-0.611429;-0.875;-1;0.0769231;-0.25;-0.929825;1;1;-0.850746;1;-1;-0.84;-0.88446
    -1;-0.661654;-0.119286;-1;0.846154;0.75;-0.754386;1;1;-0.820896;-1;-1;-0.6;-0.99084

I expect row_X to hold the individual values of each row, split on the ; delimiter. But when I run the program, I get this error:

    Traceback (most recent call last):
      File "make_dataset.py", line 68, in <module>
        train = load_data(sys.argv[1], 0, 10)
      File "make_dataset.py", line 53, in load_data
        row_X = [float(elem_X) for elem_X in row_X]
    ValueError: invalid literal for float(): 1;-0.796992;-0.8275;-1;0.0769231;-0.25;-0.859649;1;1;-0.671642;1;-1;-0.8;-0.94

What I can't figure out is why the program works properly with another CSV file, which contains these rows (shown here so its format can be compared with the first one):

    7.4;0.7;0;1.9;0.076;11;34;0.9978;3.51;0.56;9.4;5
    7.8;0.88;0;2.6;0.098;25;67;0.9968;3.2;0.68;9.8;5
    7.8;0.76;0.04;2.3;0.092;15;54;0.997;3.26;0.65;9.8;5
    11.2;0.28;0.56;1.9;0.075;17;60;0.998;3.16;0.58;9.8;6
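
The traceback shows float() receiving a whole line, which suggests that csv.reader returned that row as a single field. A rough way to check this is to print how many fields each row actually has. The sketch below is written for Python 3; `inspect_rows` is a hypothetical helper (not part of the question's code), and it runs on in-memory strings rather than the real file. The quoted line at the end is only an illustration of one situation in which the csv module yields a single field; nothing in the question confirms that this is what the file contains.

```python
import csv


def inspect_rows(lines, delimiter=";"):
    """Return (index, field_count, fields) for every row csv.reader yields."""
    reader = csv.reader(lines, delimiter=delimiter)
    return [(index, len(fields), fields) for index, fields in enumerate(reader)]


# Two data rows copied from the first file in the question.
sample = [
    "-1;-0.844511;-0.339286;-1;0.0769231;-0.25;-0.929825;1;1;-0.880597;1;0;-0.92;-0.99\n",
    "1;-0.796992;-0.8275;-1;0.0769231;-0.25;-0.859649;1;1;-0.671642;1;-1;-0.8;-0.94\n",
]

report = inspect_rows(sample)
for index, count, fields in report:
    print(index, count, fields)

# Illustration only: a line wrapped in double quotes is read as one quoted
# field, so the semicolons inside it are not treated as delimiters.
quoted_report = inspect_rows(['"1;-0.796992;-0.8275"\n'])
print(quoted_report)
```

Running the same helper on the lines of the real file (for example `inspect_rows(open(path, newline=""))`) and looking at `repr` of any row whose field count is 1 should show what is different about that line.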
  • Did you read this? stackoverflow.com/questions/14145460/… Commented Jul 16, 2015 at 19:46
  • It's trying to parse the entire line into a single number. You need to split it up and parse each element in the line. Commented Jul 16, 2015 at 19:55
  • @TigerhawkT3: You are correct, but he's using csv.reader which should be splitting the line correctly. Commented Jul 16, 2015 at 19:56
  • @MarkRansom: It's important to note that the binary flag is needed in Python 2 only. So if OP is using Python 2, he should try this. In Python 3 the docs change to "If csvfile is a file object, it should be opened with newline=''." Commented Jul 16, 2015 at 19:58
  • @StevenRumbalski it's not just Python 2, but Python 2 on Windows - Linux qualifies as a platform where 'b' doesn't make a difference. Commented Jul 16, 2015 at 20:00
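
The file-opening advice in the comments can be sketched as follows for Python 3, where the file is opened with `newline=''` before being handed to csv.reader. `load_rows` and the temporary demo file are assumptions made for illustration, and the sketch returns plain lists of floats instead of a NumPy array to stay dependency-free. Whether this resolves the error in the question is not confirmed by the thread.

```python
import csv
import os
import tempfile


def load_rows(path, delimiter=";"):
    """Read a delimited file, skip its header line, return rows of floats."""
    # Python 3: the csv documentation asks for newline='' when opening.
    # Python 2 (per the comments): open(path, 'rb') would be used instead.
    with open(path, "r", newline="") as f:
        reader = csv.reader(f, delimiter=delimiter)
        next(reader, None)  # skip the header line, if there is one
        # Blank lines come back as empty lists and are skipped.
        return [[float(value) for value in row] for row in reader if row]


# Demo file with a made-up header and Windows-style line endings.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline=""
) as tmp:
    tmp.write("a;b;c\r\n")
    tmp.write("7.4;0.7;0\r\n")
    tmp.write("7.8;0.88;0\r\n")
    demo_path = tmp.name

rows = load_rows(demo_path)
os.remove(demo_path)
print(rows)
```

Using `next(reader, None)` to drop the header replaces the `header` flag from the question's loop; the behaviour is the same for files that start with one header line.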
