I have to process, say, thousands of records in an array. I did a normal for loop like this:
    for record in records:
        results = processFile(record)
        write_output_record(o, results)

The script above took 427.270612955 seconds!
As there is no dependency between these records, I used a Python thread pool (multiprocessing.dummy) in the hope of speeding up the process. Below is my implementation:
    import multiprocessing
    from multiprocessing.dummy import Pool as ThreadPool

    pool = ThreadPool(processes=threads)   # threads = worker count, set elsewhere
    results = pool.map(processFile, records)
    pool.close()
    pool.join()
    write_output(o, results)

My computer has 8 CPUs, and this version takes 852.153398991 seconds. Can somebody help me understand what I am doing wrong?
PS: the processFile function does no I/O; it mostly processes the record and sends back the updated record.
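For reference, here is a rough, self-contained stand-in for what I am timing. The real processFile is more involved, but like this placeholder it is purely CPU-bound:

    import time

    def processFile(record):
        # placeholder for my real function: CPU-bound work on one record, no I/O
        total = 0
        for x in record:
            total += x * x
        return total

    records = [list(range(1000)) for _ in range(10000)]

    start = time.time()
    results = [processFile(record) for record in records]
    print("sequential run took %.2f seconds" % (time.time() - start))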
It would help to see what records (i.e. list, queue, tuple, deque, etc.), record and processFile look like. But I think it is suspicious that the run time is now almost exactly double what it was. That hints at one single thing somewhere being doubled up and dominating the run time.
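Beyond that, since processFile is CPU-bound, note that multiprocessing.dummy gives you a pool of threads, and CPython's GIL lets only one thread execute Python bytecode at a time, so threads add scheduling overhead without adding parallelism here. A real process pool is the usual thing to try. A minimal sketch, assuming processFile is a picklable top-level function and records is a plain list:

    import multiprocessing

    if __name__ == '__main__':
        pool = multiprocessing.Pool(processes=8)   # one worker per CPU
        # chunksize batches many records per task, cutting per-record IPC overhead
        results = pool.map(processFile, records, chunksize=100)
        pool.close()
        pool.join()
        write_output(o, results)

If each record (or each result) is large, the cost of pickling data to the workers and back can still dominate, so it is worth timing the pool.map call on its own to see where the time actually goes.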