I've been working on Python for around 2 months now so I have a OK understanding of it.
My goal is to create a matrix using CSV data, then populating that matrix from the data in the 3rd column of that CSV file.
I came up with this code thus far:
import csv import csv def readcsv(csvfile_name): with open(csvfile_name) as csvfile: file=csv.reader(csvfile, delimiter=",") #remove rubbish data in first few rows skiprows = int(input('Number of rows to skip? ')) for i in range(skiprows): _ = next(file) #change strings into integers/floats for z in file: z[:2]=map(int, z[:2]) z[2:]=map(float, z[2:]) print(z[:2]) return After removing the rubbish data with the above code, the data in the CSV file looks like this:
Input: 1 1 51 9 3 1 2 39 4 4 1 3 40 3 9 1 4 60 2 . 1 5 80 2 . 2 1 40 6 . 2 2 28 4 . 2 3 40 2 . 2 4 39 3 . 3 1 10 . . 3 2 20 . . 3 3 30 . . 3 4 40 . . . . . . . The output should look like this:
1 2 3 4 . . 1 51 39 40 60 2 40 28 40 39 3 10 20 30 40 . . There are about a few thousand rows and columns in this CSV file, however I'm only interested is the first 3 columns of the CSV file. So the first and second columns are basically like co-ordinates for the matrix, and then populating the matrix with data in the 3rd column.
After lots of trial and error, I realised that numpy was the way to go with matrices. This is what I tried thus far with example data:
left_column = [1, 2, 1, 2, 1, 2, 1, 2] middle_column = [1, 1, 3, 3, 2, 2, 4, 4] right_column = [1., 5., 3., 7., 2., 6., 4., 8.] import numpy as np m = np.zeros((max(left_column), max(middle_column)), dtype=np.float) for x, y, z in zip(left_column, middle_column, right_column): x -= 1 # Because the indicies are 1-based y -= 1 # Need to be 0-based m[x, y] = z print(m) #: array([[ 1., 2., 3., 4.], #: [ 5., 6., 7., 8.]]) However, it is unrealistic for me to specify all of my data in my script to generate the matrix. I tried using the generators to pull the data out of my CSV file but it didn't work well for me.
I learnt as much numpy as I could, however it appears like it requires my data to already be in matrix form, which it isn't.