
I tried to find the fastest possible way to copy large files...

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.util.ArrayList;

    public class FastFileCopy {

        public static void main(String[] args) {
            try {
                String from = "...";
                String to = "...";
                FileInputStream fis = new FileInputStream(from);
                FileOutputStream fos = new FileOutputStream(to);
                ArrayList<Transfer> transfers = new ArrayList<>();
                long position = 0, estimate;
                int count = 1024 * 64;
                boolean lastChunk = false;
                while (true) {
                    if (position + count < fis.getChannel().size()) {
                        transfers.add(new Transfer(fis, fos, position, position + count));
                        position += count + 1;
                        estimate = position + count;
                        if (estimate >= fis.getChannel().size()) {
                            lastChunk = true;
                        }
                    } else {
                        lastChunk = true;
                    }
                    if (lastChunk) {
                        transfers.add(new Transfer(fis, fos, position, fis.getChannel().size()));
                        break;
                    }
                }
                for (Transfer transfer : transfers) {
                    transfer.start();
                }
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }
    }

Then create this class:

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.nio.channels.FileChannel;

    public class Transfer extends Thread {

        private FileChannel inChannel = null;
        private FileChannel outChannel = null;
        private long position, count;

        public Transfer(FileInputStream fis, FileOutputStream fos, long position, long count) {
            this.position = position;
            this.count = count;
            inChannel = fis.getChannel();
            outChannel = fos.getChannel();
        }

        @Override
        public void run() {
            try {
                inChannel.transferTo(position, count, outChannel);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

I tested it and the result was very impressive... but there is a big problem: the copied file is much larger than the original file!

So please check it and help me find the problem. Thank you :))

  • Is Files.copy(source, destination) not fast enough for you? Also, if the file is on a single hard drive, using more than one thread will decrease performance. Commented Mar 19, 2014 at 10:53
  • No :)) ... you can copy 3 GB in just 20 seconds this way. Commented Mar 19, 2014 at 10:55
  • Have you at least tried Files.copy()? Commented Mar 19, 2014 at 10:56
  • Bugs in your code aside, what exactly made you think multithreading would make it faster? Do you have a multithreaded disk? Striped perhaps? Commented Mar 19, 2014 at 10:58
  • Threads can't possibly speed up an operation which uses maybe 1% of CPU. Commented Mar 19, 2014 at 10:59

2 Answers


This is an XY problem. Just use Files.copy().

Look at this and see whether it is fast enough for you:

    $ ls -lh ~/ubuntu-13.04-desktop-amd64.iso
    -rw-rw-r-- 1 fge fge 785M Jul 12  2013 /home/fge/ubuntu-13.04-desktop-amd64.iso
    $ cat Foo.java
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;

    public class Foo
    {
        public static void main(final String... args)
            throws IOException
        {
            Files.copy(Paths.get("/home/fge/ubuntu-13.04-desktop-amd64.iso"),
                Paths.get("/tmp/t.iso"), StandardCopyOption.REPLACE_EXISTING);
        }
    }
    $ time java Foo

    real    0m1.860s
    user    0m0.077s
    sys     0m0.648s
    $ time java Foo

    real    0m1.851s
    user    0m0.101s
    sys     0m0.598s

And it could be even faster. For whatever reason, Oracle doesn't use sendfile(2) here, even though this is Java 8 and sendfile(2) has been available since Linux 2.2.


1 Comment

On the machine of any self-respecting developer, this program will cache the file in memory, and the OS will write back the dirty FS cache after the program ends. Add a time sync after the program's execution and add the two times together.

Since you increment position by count + 1 on each iteration and construct each Transfer with (fis, fos, position, position + count), your code creates Transfer objects as follows:

    new Transfer(fis, fos, 0, count)
    new Transfer(fis, fos, count+1, 2count+1)
    new Transfer(fis, fos, 2count+2, 3count+2)
    new Transfer(fis, fos, 3count+3, 4count+3)
    ...

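So although you create only about filesize / count Transfer objects, you are asking them to transfer roughly (count + 1) * (1 + 2 + 3 + ...) bytes in total, because the fourth constructor argument is handed to transferTo() as a number of bytes, not as an end position.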
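To make the arithmetic concrete, here is a small sketch that replays the chunking loop from the question without touching any file. The class name RequestedBytes and the method simulate are made up for illustration. It assumes every transferTo() call copies as much as it can, which the API does not guarantee, so the second figure is an upper bound.

```java
public class RequestedBytes {

    /**
     * Replays the question's loop for a file of the given size.
     * Returns {bytes requested, bytes that could actually be copied},
     * assuming each transferTo() call copies everything available to it.
     */
    static long[] simulate(long size, long count) {
        long position = 0;
        long requested = 0;
        long transferable = 0;
        boolean lastChunk = false;
        while (true) {
            if (position + count < size) {
                // The question passes position + count as the byte count.
                long asked = position + count;
                requested += asked;
                // transferTo() cannot read past the end of the source file.
                transferable += Math.min(asked, size - position);
                position += count + 1;
                if (position + count >= size) {
                    lastChunk = true;
                }
            } else {
                lastChunk = true;
            }
            if (lastChunk) {
                // The final Transfer is created with the whole file size as its count.
                long asked = size;
                requested += asked;
                transferable += Math.min(asked, size - position);
                break;
            }
        }
        return new long[] {requested, transferable};
    }

    public static void main(String[] args) {
        long[] result = simulate(1000, 100);
        System.out.println("requested:    " + result[0]);
        System.out.println("transferable: " + result[1]);
    }
}
```

For a 1000-byte file and a count of 100, this reports 5536 bytes requested and up to 2975 bytes copied, which is why the output ends up several times larger than the input.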

Furthermore, I don't think FileChannel.transferTo() works the way you think it does. position specifies the position in the source file where reading starts. It doesn't specify the position to which you write in the destination channel. So even once you get the sizes right, you'll end up with an output file of the right size whose contents are jumbled up in whatever order the threads happen to write them.

You could call outChannel.position() to skip to the right place. It's not clear to me what kind of chaos might happen as multiple threads extend the filesize in this way.


It's fine to experiment, and I encourage you to try this out and benchmark. However the comments are correct that the approach is misguided. There is only one disk, backed by just one filesystem buffer, and having multiple threads fight over it won't make it work faster -- and may make it slower.

You're unlikely to improve on:

    long count = 0;
    long size = src.size();
    while (count < size) {
        count += src.transferTo(count, size - count, dest);
    }
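For completeness, here is one way that loop could be wrapped into a runnable program. This is only a sketch: the class name ChannelCopy and the method copy are illustrative, and it assumes the source file does not shrink while it is being copied (otherwise transferTo() could keep returning 0 and the loop would not terminate).

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ChannelCopy {

    /** Copies 'from' to 'to' on a single thread and returns the number of bytes copied. */
    static long copy(Path from, Path to) throws IOException {
        try (FileChannel src = FileChannel.open(from, StandardOpenOption.READ);
             FileChannel dest = FileChannel.open(to,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long copied = 0;
            long size = src.size();
            // transferTo() may copy fewer bytes than requested, so loop until done.
            while (copied < size) {
                copied += src.transferTo(copied, size - copied, dest);
            }
            return copied;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ChannelCopy <source> <destination>");
            return;
        }
        long copied = copy(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(copied + " bytes copied");
    }
}
```

Because a single thread writes to dest sequentially, the bytes land in the destination in the same order as in the source.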

Do also note that it's very difficult to make judgements about the performance of file operations, since the filesystem will be caching both reads and writes, so an awful lot of what you do will just be super-cheap operations on RAM.

Also note that, at least when benchmarking, you're going to need to join() with all the threads you've started, before considering the copy complete.
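A minimal sketch of that start-then-join pattern follows. The names JoinAll and runAndWait are hypothetical, and the workers here only sleep as a stand-in for real transfer threads.

```java
import java.util.ArrayList;
import java.util.List;

public class JoinAll {

    /** Starts every worker, then blocks until all of them have finished. */
    static void runAndWait(List<Thread> workers) throws InterruptedException {
        for (Thread worker : workers) {
            worker.start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            // Stand-in for a Transfer thread: just sleeps briefly.
            workers.add(new Thread(() -> {
                try {
                    Thread.sleep(50);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        long start = System.nanoTime();
        runAndWait(workers);
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        // Only now is it meaningful to report a duration.
        System.out.println("all workers finished after " + elapsedMillis + " ms");
    }
}
```

Stopping the clock right after the start() calls would measure only how long it takes to launch the threads, not how long the copy takes.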
