I have a large CSV file from the 311 Service Requests dataset. The file is about 11 GB, with 19 million rows and 41 columns.
I want to extract only the rows where the City column is NEW JERSEY. When I limit the read to 500,000 rows with nrows, this query works:
```python
import pandas as pd

NYPD = pd.read_csv('c:/1/311_Service_Requests_from_2010_to_Present.csv',
                   nrows=500000, low_memory=False)
M = NYPD.loc[NYPD.City == 'NEW JERSEY', :]
M.to_csv('c:/1/NJ_NYPD.csv')
```

But I need the matching rows from the whole CSV file, not just the first 500,000 rows. I think I need a loop with chunksize=500000, but I don't know how to write it. This is what I tried:
```python
chunksize = 500000
i = 0
j = 1
for df in pd.read_csv('c:/1/311_Service_Requests_from_2010_to_Present.csv',
                      chunksize=chunksize, iterator=True, low_memory=False):
    df.loc[df.City == 'NEW JERSEY', :]
    df.index += j
    i += 1
    df.to_csv('c:/1/NJ_NYPD.csv')
```

I don't want to convert the CSV into a database.
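Here is a minimal sketch of what I think the loop should look like, assuming the filtered rows from each chunk can be appended to one output file with mode='a' and the header written only for the first chunk (the column name City, the value 'NEW JERSEY', and the file paths are the same ones used above):

```python
import pandas as pd

chunksize = 500000
first_chunk = True

# Read the 11 GB file in 500,000-row pieces instead of all at once
for chunk in pd.read_csv('c:/1/311_Service_Requests_from_2010_to_Present.csv',
                         chunksize=chunksize, low_memory=False):
    # Keep only the rows for the city I am interested in
    nj = chunk.loc[chunk.City == 'NEW JERSEY', :]
    # Append each filtered chunk to the same output file;
    # the header is written only once, with the first chunk
    nj.to_csv('c:/1/NJ_NYPD.csv', mode='w' if first_chunk else 'a',
              header=first_chunk, index=False)
    first_chunk = False
```

Is appending chunk by chunk like this the right way to get the result for all 19 million rows, or is there a better pattern?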
