I'm a novice Python hobbyist and have started experimenting with multithreading using concurrent.futures.
Each individual thread is supposed to analyse an HTML file and then append certain items to a list. Once all threads have finished, the resulting list is then written to a CSV file.
The surprising result is that certain parts of a row seem to be offset by 1 row in the list, e.g.:
Expected result:

    caseList = [ [a1, a2, a3], [b1, b2, b3], [c1, c2, c3], [d1, d2, d3] ]

Actual result:

    caseList = [ [a1, a2, a3], [b1, a2, a3], [c1, b2, b3], [d1, c2, c3] ]

Here each letter represents one HTML file, which is supposed to be analysed by one thread. I can't pinpoint exactly where it changes: the list starts off correct, but then certain rows partly contain items that should belong to the previous row.
I have read about race conditions and locking, but have also read comments that list.append should be thread safe. So not entirely sure what's at play here.
Here's my code:
    caseList = []
    with concurrent.futures.ThreadPoolExecutor() as executor:
        results = [executor.submit(searchCase, filename, pattern) for filename in logContents]
        for f in concurrent.futures.as_completed(results):
            caseList.append(f.result())
            print(f.result())

Is there anything that I am obviously doing wrong here?
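For debugging, it may help to record which file each future belongs to, so every row can be checked against its source file. Below is a minimal sketch of that pattern. `search_case` is only a hypothetical stand-in for the real `searchCase`, and `collect_rows` and the file names are made up for illustration.

```python
import concurrent.futures


def search_case(filename, pattern):
    # Hypothetical stand-in for the real searchCase: returns one row per file.
    return [f"{filename}:{pattern}:{i}" for i in range(1, 4)]


def collect_rows(filenames, pattern):
    rows = {}
    with concurrent.futures.ThreadPoolExecutor() as executor:
        # Remember which file each future was submitted for.
        future_to_name = {
            executor.submit(search_case, name, pattern): name
            for name in filenames
        }
        # Futures finish in arbitrary order; the dict tells us whose row it is.
        for future in concurrent.futures.as_completed(future_to_name):
            rows[future_to_name[future]] = future.result()
    # Rebuild the list in the original file order.
    return [rows[name] for name in filenames]


case_list = collect_rows(["a.html", "b.html"], "x")
```

With the file name attached to each row, a row whose items come from a different file becomes easy to spot, and the final list no longer depends on which thread finishes first.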
list.append() isn't an issue here, since you are doing that entirely in the main thread. This looks like your threads are somehow sharing working variables.

searchCase only calls other functions which all use local variables only, so I'm unsure how this could happen. I will go back and double-check that again!
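Since searchCase isn't shown, the following is only a guess at the mechanism, not a diagnosis: a minimal sketch of how a helper that keeps working data outside its local scope can mix items between rows. All names here (`search_case_shared`, `search_case_local`, `_shared`, `_barrier`, `run`) are hypothetical, and the `threading.Barrier` exists only to force the two threads to overlap so the demo behaves the same on every run.

```python
import concurrent.futures
import threading

# Hypothetical module-level working data, shared by every thread.
_shared = {}
# Demo-only: makes both workers reach the same point before continuing.
_barrier = threading.Barrier(2, timeout=5)


def search_case_shared(name):
    _shared["id"] = name + "1"   # write into shared state
    _barrier.wait()              # by now both threads have written
    # Reads back whichever value was written last, possibly the other thread's.
    return [_shared["id"], name + "2", name + "3"]


def search_case_local(name):
    first = name + "1"           # local variable: private to this call
    _barrier.wait()
    return [first, name + "2", name + "3"]


def run(worker, names):
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
        futures = [executor.submit(worker, n) for n in names]
        # Collect in submission order so rows line up with names.
        return [f.result() for f in futures]


shared_rows = run(search_case_shared, ["a", "b"])
local_rows = run(search_case_local, ["a", "b"])
```

In `shared_rows` the first item is identical in both rows, so one row carries an item from the other file, while `local_rows` stays consistent. Things worth checking in the real code would be module-level variables, mutable default arguments, and objects (such as a parser or a list) created once and reused across calls.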