Timeline for "Is it necessary to read every single byte to check if a copied file is identical to the original?"
Current License: CC BY-SA 3.0
21 events
| when | what | by | license | comment |
|---|---|---|---|---|
| Jun 27, 2013 at 19:35 | answer added | user14517 | | timeline score: 0 |
| Feb 9, 2012 at 7:43 | history edited | user8 | | Fix tags |
| Jan 20, 2012 at 1:58 | answer added | Loren Pechtel | | timeline score: 1 |
| Jan 19, 2012 at 23:30 | comment added | psr | | Short answer - No, it's best just to have your computer do it for you. |
| Jan 19, 2012 at 23:18 | comment added | jasonk | | There's also a non-zero probability of two fluke disk read errors covering up an issue, or of a solar flare corrupting a single bit. It all depends on your comfort level. Servers have ECC ram for this reason. |
| Dec 5, 2011 at 7:26 | vote accept | Koen027 | | |
| Dec 3, 2011 at 23:46 | history edited | user1249 | CC BY-SA 3.0 | edited title |
| Dec 3, 2011 at 23:37 | answer added | Keith Thompson | | timeline score: 45 |
| Dec 3, 2011 at 23:26 | comment added | Dean Harding | | @KeithThompson: I think your first comment should be an answer :-) |
| Dec 3, 2011 at 23:10 | comment added | Keith Thompson | | As for the likelihood of collision, if you use a decent hash like sha1sum you pretty much don't have to worry about it, unless someone is deliberately and expensively constructing files whose sha1sums collide. I don't have a source for this, but I've heard (in the context of git) that the probability of two different files having the same sha1sum is about the same as the probability of every member of your development team being eaten by wolves. On the same day. In completely unrelated incidents. |
| Dec 3, 2011 at 23:09 | comment added | user1249 | | Also how will you ensure that the checksums/hashes are correct? |
| Dec 3, 2011 at 23:07 | comment added | Keith Thompson | | Calculating CRCs (or, better, sha1sums) on both files requires reading every byte anyway. If you do a byte-by-byte comparison, you can quit as soon as you see a mismatch -- and you don't have to worry about two different files that happen to have the same checksum (though that's vanishingly unlikely for sha1sum). On the other hand, checksum comparisons are useful when you're comparing files that aren't on the same machine; the checksums can be computed locally, and you don't have to transfer the entire content over the network. |
| Dec 3, 2011 at 22:48 | answer added | NoChance | | timeline score: 0 |
| Dec 3, 2011 at 22:30 | comment added | user1249 | | Have a look at how "rsync" handles this. |
| Dec 3, 2011 at 21:55 | history edited | yannis | CC BY-SA 3.0 | deleted 5 characters in body; edited title |
| Dec 3, 2011 at 20:23 | history tweeted | | | twitter.com/#!/StackProgrammer/status/143062801143959552 |
| Dec 3, 2011 at 16:21 | comment added | Joey Adams | | Even that isn't perfect if the file's content is cached in RAM or on the disk's write cache. |
| Dec 3, 2011 at 15:57 | answer added | JohnFx | | timeline score: 11 |
| Dec 3, 2011 at 15:21 | answer added | user7007 | | timeline score: 3 |
| Dec 3, 2011 at 15:20 | answer added | Dave Rager | | timeline score: 5 |
| Dec 3, 2011 at 15:08 | history asked | Koen027 | CC BY-SA 3.0 | |
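
Keith Thompson's comment of Dec 3, 2011 at 23:10 puts the collision risk in wolf-attack terms; the standard birthday bound makes the same point quantitatively. The estimate below is not from the original discussion, and it assumes SHA-1 behaves like a uniformly random 160-bit function with no one deliberately constructing collisions:

```latex
% Birthday bound: probability that any two of n distinct files
% share a 160-bit hash, under the random-output assumption.
P(\text{collision}) \approx \frac{n(n-1)}{2 \cdot 2^{160}}
% Example: n = 10^{12} files gives roughly
% 10^{24} / 2^{161} \approx 3 \times 10^{-25},
% negligible next to ordinary hardware error rates.
```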
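Keith Thompson's comment of Dec 3, 2011 at 23:07 contrasts early-exit byte-by-byte comparison with checksumming. Here is a minimal Python sketch of both routes; the function names and the 1 MiB chunk size are illustrative choices, not anything prescribed in the thread:

```python
import hashlib

def files_identical(path_a, path_b, chunk_size=1 << 20):
    """Byte-by-byte comparison: reads both files in chunks but can
    quit at the first mismatching block, as the comment describes."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(chunk_size)
            block_b = b.read(chunk_size)
            if block_a != block_b:
                return False  # early exit; also catches differing lengths
            if not block_a:
                return True   # both files ended together with no mismatch

def sha1sum(path, chunk_size=1 << 20):
    """Checksum route: still reads every byte of one file, but only
    the digest needs to cross the network for a remote comparison."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()
```

For two local files, the early-exit comparison never does more I/O than hashing both; for files on different machines, comparing `sha1sum` outputs avoids transferring either file's contents.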