Which is faster at processing a 1TB file: a single machine or 5 networked machines? ("To process" here means finding the single UTF-16 character with the most occurrences in that 1TB file.) The data-transfer rate is 1Gbit/sec, the entire 1TB file resides on one machine, and each machine has a quad-core CPU.
Below is my attempt at the question using an array of longs (with 2^16 entries) to keep track of the character counts. This easily fits into the memory of a single machine, since 2^16 entries x 2^3 bytes (size of a long) = 2^19 bytes = 0.5MB. Any help (links, comments, suggestions) would be much appreciated. I used the latency numbers cited by Jeff Dean, and I tried to use the best approximations I knew of. The final answers are:
Single Machine: 5.8 hrs (due to slowness of reading from disk)
5 Networked Machines: 7.64 hrs (due to reading from disk and network)
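For concreteness, here is a minimal sketch of the counting approach described above, assuming the file is raw little-endian UTF-16 and counting per 16-bit code unit (the class name, buffer size, and the simplified handling of odd-length reads are illustrative only):

    import java.io.BufferedInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;

    public class CharCount {
        public static void main(String[] args) throws IOException {
            long[] counts = new long[1 << 16];   // one slot per 16-bit UTF-16 code unit (0.5MB of longs)
            byte[] buf = new byte[1 << 20];      // 1MB read buffer
            try (BufferedInputStream in = new BufferedInputStream(new FileInputStream(args[0]), buf.length)) {
                int n;
                while ((n = in.read(buf)) > 0) {
                    // For simplicity, assume each read returns an even number of bytes;
                    // a real version would carry a stray byte over to the next read.
                    for (int i = 0; i + 1 < n; i += 2) {
                        int codeUnit = (buf[i] & 0xFF) | ((buf[i + 1] & 0xFF) << 8);   // little-endian pair
                        counts[codeUnit]++;
                    }
                }
            }
            int best = 0;
            for (int c = 1; c < counts.length; c++) {
                if (counts[c] > counts[best]) best = c;
            }
            System.out.printf("Most frequent code unit: U+%04X (%d occurrences)%n", best, counts[best]);
        }
    }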
1) Single Machine

a) Time to read the file from disk --> 5.8 hrs
-If it takes 20ms to read 1MB sequentially from disk, then reading 1TB from disk takes:
 20ms/MB x 1024MB/GB x 1024GB/TB = 20,971,520 ms = 20,972 secs = 350 mins = 5.8 hrs

b) Time needed to fill the array with complete count data --> 0 sec, since it is computed while doing step 1a
-At 0.5MB, the count array fits into L2 cache. Since L2 cache takes only 7ns to access, the CPU can read and write the count array while waiting on the disk read.

c) Time to iterate through the entire array to find the max count --> 0.00625ms
-Since it takes 0.0125ms to read and write 1MB from L2 cache and the array is 0.5MB, iterating through the array takes:
 0.0125ms/MB x 0.5MB = 0.00625ms

d) Total time
Total = a + b + c =~ 5.8 hrs (due to the slowness of reading from disk)

2) 5 Networked Machines

a) Time to read and transfer (4/5) of the file over 1Gbit/s --> 6.48 hrs
-Transferring the entire 1TB over 1Gbit/s would take:
 1TB x 1024GB/TB x 8Gbit/GB x 1s/Gbit = 8,192s = 137 mins = 2.3 hrs
-But since the original machine keeps a fifth of the data, it only needs to send (4/5)ths of it, so the transfer time is:
 2.3 hrs x 4/5 = 1.84 hrs
-To send that data, it first has to be read from disk, which takes (4/5)(answer 1a) = (4/5)(5.8 hrs) = 4.64 hrs
-So the total time is 1.84 hrs + 4.64 hrs = 6.48 hrs

b) Time for the original machine to fill its array with count data --> 1.16 hrs
-The original machine (the one holding the 1TB file) still needs to read its remaining fifth of the data to fill its count array, which takes (1/5)(answer 1a) = 1.16 hrs. The CPU time to read and write the array is negligible, as shown in 1b.

c) Time for the other machines to fill their arrays with counts --> not counted
-The count arrays can be computed while the file is being transferred, so this time is not counted.

d) Time required to receive the 4 count arrays --> (2^-6)s
-Each count array is 0.5MB:
 0.5MB x 4 arrays x 8bits/B x 1s/Gbit = 2^19B x 2^2 x 2^3 bits/B x 1s/2^30bits = 2^24/2^30 s = (2^-6)s

e) Time to merge the arrays --> 0 sec, since they can be merged while being received

f) Total time
Total = a + b + c + d + e =~ a + b = 6.48 hrs + 1.16 hrs = 7.64 hrs
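To make step 2e concrete, here is a minimal sketch of the merge on the original machine, assuming the 4 remote count arrays (2^16 longs each) have already been received into memory; the class and method names are illustrative only:

    public class MergeCounts {
        // Element-wise merge of the 4 remote count arrays into the local one,
        // then a single scan for the most frequent UTF-16 code unit.
        static int mergeAndFindMax(long[] localCounts, long[][] remoteCounts) {
            for (long[] remote : remoteCounts) {
                for (int c = 0; c < localCounts.length; c++) {
                    localCounts[c] += remote[c];
                }
            }
            int best = 0;
            for (int c = 1; c < localCounts.length; c++) {
                if (localCounts[c] > localCounts[best]) best = c;
            }
            return best;   // the 16-bit code unit with the highest combined count
        }
    }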
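And as a sanity check on the headline numbers, a small sketch that recomputes both estimates from the assumed constants (20ms per sequential 1MB disk read, a 1Gbit/s link, and the 4/5 : 1/5 data split used above):

    public class TimeEstimates {
        public static void main(String[] args) {
            double diskSecPerMB = 0.020;              // 20ms to read 1MB sequentially from disk
            double mbPerTB = 1024.0 * 1024.0;         // MB in 1TB
            double linkBitsPerSec = Math.pow(2, 30);  // 1Gbit/s

            double diskReadSec = diskSecPerMB * mbPerTB;          // read the whole 1TB file
            double fileBits = mbPerTB * (1 << 20) * 8;            // 1TB expressed in bits
            double transferAllSec = fileBits / linkBitsPerSec;    // ship the whole file over the link

            double singleHrs = diskReadSec / 3600;
            double networkedHrs = (0.8 * diskReadSec              // read the 4/5 that must be shipped
                                 + 0.8 * transferAllSec           // send it to the other 4 machines
                                 + 0.2 * diskReadSec)             // read the 1/5 kept locally
                                 / 3600;

            System.out.printf("Single machine:       %.2f hrs%n", singleHrs);    // ~5.8 hrs
            System.out.printf("5 networked machines: %.2f hrs%n", networkedHrs); // ~7.6 hrs
        }
    }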