I have perfectly working code, but when I run it on a large CSV file (around 2 GB) it takes about 15-20 minutes to finish. Is there a way I could optimise the code below so it takes less time to execute and thus improves performance?
```python
from csv import reader, writer
import pandas as pd

path = (r"data.csv")
data = pd.read_csv(path, header=None)
last_column = data.iloc[:, -1]
arr = [i+1 for i in range(len(last_column)-1) if (last_column[i] == 1 and last_column[i+1] == 0)]

ch_0_6 = []
ch_7_14 = []
ch_16_22 = []

with open(path, 'r') as read_obj:
    csv_reader = reader(read_obj)
    rows = list(csv_reader)
    for j in arr:
        # Channel 1-7
        ch_0_6_init = [int(rows[j][k]) for k in range(1, 8)]
        bin_num = ''.join([str(x) for x in ch_0_6_init])
        dec_num = int(f'{bin_num}', 2)
        ch_0_6.append(dec_num)
        ch_0_6_init = []

        # Channel 8-15
        ch_7_14_init = [int(rows[j][k]) for k in range(8, 16)]
        bin_num = ''.join([str(x) for x in ch_7_14_init])
        dec_num = int(f'{bin_num}', 2)
        ch_7_14.append(dec_num)
        ch_7_14_init = []

        # Channel 16-22
        ch_16_22_init = [int(rows[j][k]) for k in range(16, 23)]
        bin_num = ''.join([str(x) for x in ch_16_22_init])
        dec_num = int(f'{bin_num}', 2)
        ch_16_22.append(dec_num)
        ch_16_22_init = []
```

Sample Data:
```
0.0114,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,1,0,0,0,1
0.0112,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,1,0,0,0,0
0.0115,0,1,0,1,1,1,0,1,0,0,1,0,0,0,1,1,1,0,1,0,0,0,1
0.0117,0,1,0,1,1,1,0,1,0,0,1,0,0,0,1,1,1,0,1,0,0,0,0
0.0118,0,1,0,0,1,1,0,0,0,1,0,1,0,0,1,1,1,0,1,0,0,0,1
```

Join the binary digits to form a decimal number depending upon the channels chosen.
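To illustrate what I mean, here is a minimal sketch (using only the first sample row above, with the same column indices as in my code) of how the bits for one channel group are joined and converted:

```python
# Minimal sketch, using the first sample row above: the bits in the chosen
# channel columns are joined into a binary string and converted to decimal.
row = "0.0114,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,1,0,0,0,1".split(",")

# Columns 1-7, i.e. the same slice as range(1, 8) in the code above
bits = "".join(row[1:8])   # "0100000"
value = int(bits, 2)       # 32
print(bits, value)
```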