Recently I was looking for a program that runs as a daemon, finds files with the same size and type, checks whether their contents are identical, and if so replaces them with hard links to a single copy. That got me wondering why operating systems don't do this automatically.
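To make the idea concrete, here is a minimal sketch of that kind of tool as a one-shot pass rather than a daemon. It is only an illustration under some assumptions: the function names and the `.dedup-tmp` suffix are made up for this example, all files are assumed to live on one filesystem (hard links cannot cross filesystems), and SHA-256 is just one reasonable way to narrow candidates before the final byte-by-byte comparison.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group regular files under root by size, then by content hash."""
    by_size = defaultdict(list)
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate, so skip hashing
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


def hardlink_duplicates(root):
    """Replace duplicates with hard links to one copy; return how many were linked."""
    linked = 0
    for keep, *dupes in find_duplicates(root):
        for dupe in dupes:
            if os.path.samefile(keep, dupe):
                continue  # already hard links to the same data
            if not filecmp.cmp(keep, dupe, shallow=False):
                continue  # hashes matched but bytes differ: leave it alone
            tmp = dupe + ".dedup-tmp"  # hypothetical temporary name
            os.link(keep, tmp)         # create the new link first...
            os.replace(tmp, dupe)      # ...then swap it in over the duplicate
            linked += 1
    return linked
```

Calling `hardlink_duplicates("/some/dir")` on a directory would link every set of identical files in it. Note that hard-linked files share their contents and metadata afterwards, so writing to one in place changes all of them.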
At first I thought it might be because it would be time consuming, but the system wouldn't need to check anything unless new files were added outside of the cache directory, and comparing sizes first would rapidly cut down the search space. Then I thought maybe it's because duplicates don't come up very often; but if that were the case, I would expect game consoles to do this, since most games use the same stock sound effects package, for instance, and yet they don't. Having two games from one series takes up as much space as the sum of their two sizes, even though tons of assets would be reused.
Or take a system like YouTube: they compare videos against other videos when checking for copyright violations, but they don't seem to store two identical videos only once, considering how mirroring a video can prevent it from being taken off the site (e.g. when 'youtube vs the users' kept being mirrored, they took it out of the search results rather than continuing to take the copies off the site).
So, what's the reason systems don't compress things this way?