I have a large CSV file whose column headers encode the name and index of an array, e.g.:

    time, dataset1[0], dataset1[1], dataset1[2], dataset2[0], dataset2[1], dataset2[2]
    0, 43, 35, 29, 21, 59, 39
    1, 21, 59, 39, 43, 35, 29

You get the idea (obviously there is far more data in the arrays).

Any ideas on how I can easily parse this into an efficient DataFrame?

[EDIT]

Ideally I'm after a structure like this:

       time    dataset1    dataset2
    0     0  [43,35,29]  [21,59,39]
    1     1  [21,59,39]  [43,35,29]

where the indices have been stripped from the labels and turned into NumPy array indices.

1 Answer

    from pandas import read_csv

    # Load the file as-is; the headers keep their "name[index]" form
    df = read_csv('data.csv')
    print(df)

Gives as output:

       time  dataset1[0]  dataset1[1]  dataset1[2]  dataset2[0]  dataset2[1]  \
    0     0           43           35           29           21           59
    1     1           21           59           39           43           35

       dataset2[2]
    0           39
    1           29
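
To get from a flat frame like this to the structure asked for in the question, one possible approach is to match each header against a `name[index]` pattern and gather the matching columns into one array-valued column per name. This is only a sketch: the helper `collapse_indexed_columns`, the regular expression, and the inlined sample text are assumptions made for illustration, and `skipinitialspace=True` is there because the sample has a space after each comma.

```python
import io
import re

import pandas as pd

# Sample data from the question, inlined so the sketch is self-contained.
CSV_TEXT = """time, dataset1[0], dataset1[1], dataset1[2], dataset2[0], dataset2[1], dataset2[2]
0, 43, 35, 29, 21, 59, 39
1, 21, 59, 39, 43, 35, 29
"""

# Assumed header shape: a name followed by a non-negative integer in brackets.
HEADER_RE = re.compile(r"^(?P<name>.+)\[(?P<index>\d+)\]$")


def collapse_indexed_columns(df):
    """Group 'name[i]' columns into one column of NumPy arrays per name."""
    groups = {}  # name -> list of (index, original column label)
    plain = []   # columns without an index suffix, e.g. 'time'
    for col in df.columns:
        match = HEADER_RE.match(col)
        if match:
            key = match.group("name")
            groups.setdefault(key, []).append((int(match.group("index")), col))
        else:
            plain.append(col)

    out = df[plain].copy()
    for name, members in groups.items():
        # Sort by the numeric index so the array order does not depend on file order.
        ordered = [col for _, col in sorted(members)]
        # Iterating a 2-D array yields one 1-D array per row.
        out[name] = pd.Series(list(df[ordered].to_numpy()), index=df.index)
    return out


df = pd.read_csv(io.StringIO(CSV_TEXT), skipinitialspace=True)
result = collapse_indexed_columns(df)
print(result)
```

Note that the resulting columns have `object` dtype (each cell holds a separate array), which matches the requested layout but is generally slower than keeping plain numeric columns.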

3 Comments

Thanks, but I'd like the indices stripped from the column labels to form array indices (would make it easier). I've clarified this a little in the OP.
@AndyM Can you show the code of what you have tried yourself and where you are stuck?
I don't really have any code for this, as I can't work out how to do it; I'm not sure whether I should be trying to split the header strings at the opening square bracket. As it stands, I can visually inspect each file and tailor a script to it (ignoring the headers), but I wondered whether there was a clever, easy way.
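
Splitting each header at the opening square bracket, as suggested in the last comment, could look roughly like the sketch below. It assumes every column other than `time` has the form `name[index]`; the helper `split_header` and the level names `dataset` and `index` are made up for this example. Rather than storing arrays in cells, it keeps the data numeric and puts the name and index into a two-level column index, so each dataset can be pulled out as a 2-D NumPy array.

```python
import io

import pandas as pd

# Sample data from the question, inlined so the sketch is self-contained.
CSV_TEXT = """time, dataset1[0], dataset1[1], dataset1[2], dataset2[0], dataset2[1], dataset2[2]
0, 43, 35, 29, 21, 59, 39
1, 21, 59, 39, 43, 35, 29
"""


def split_header(label):
    """Turn 'dataset1[2]' into ('dataset1', 2) by splitting at the first '['."""
    name, bracket, rest = label.partition("[")
    if not bracket or not rest.endswith("]"):
        raise ValueError(f"header {label!r} is not of the form name[index]")
    return name, int(rest[:-1])


# Use 'time' as the row index so only the name[index] columns remain.
df = pd.read_csv(io.StringIO(CSV_TEXT), skipinitialspace=True, index_col="time")
df.columns = pd.MultiIndex.from_tuples(
    [split_header(col) for col in df.columns], names=["dataset", "index"]
)

# Selecting a top-level name gives all of its indexed columns at once.
dataset1 = df["dataset1"].to_numpy()  # rows are times, columns are array indices
row_at_time_1 = df["dataset2"].loc[1].to_numpy()
print(dataset1)
print(row_at_time_1)
```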
