I'd like to get a list of all files in my Gentoo Linux system that were not installed by the package manager (Portage). This is because I want to keep my system as clean as possible, removing all useless files lying around.
Here is what I've tried so far. First, I generate the list of all files that belong to some package tracked by Portage:

    equery files "*" | sort | uniq > portage.txt

Then I generate the list of all files on my system, except those that I don't care about:

    find / \( -path /dev -o -path /proc -o -path /sys -o -path /media \
      -o -path /mnt -o -path /usr/portage -o -path /var/db/pkg \
      -o -path /var/www/localhost/htdocs -o -path /lib64/modules \
      -o -path /usr/src -o -path /var/cache -o -path /home \
      -o -path /root -o -path /run -o -path /var/run -o -path /var/tmp \
      -o -path /var/log -o -path /tmp -o -path /etc/config-archive \
      -o -path /usr/local/portage -o -path /boot \) -prune \
      -o -type f | sort | uniq > all.txt

Finally, I get the list of all files that are not tracked by Portage:

    comm -13 portage.txt all.txt > extra.txt

Some statistics:

    wc -l portage.txt all.txt extra.txt
    127724 portage.txt
     78371 all.txt
      8438 extra.txt

As you can see, I still get more than eight thousand extra files. I'd like to reduce that number so that I can focus on the files that really need to be deleted.
I noticed that extra.txt contains thousands of files from a small number of directories, such as /usr/lib64/gcc, /usr/lib64/python2.7 and /usr/lib64/python3.2. The /usr/lib64/gcc/x86_64-pc-linux-gnu/4.6.3/crtbegin.o file, for example, is not in portage.txt because Portage records it as /usr/lib/gcc/x86_64-pc-linux-gnu/4.6.3/crtbegin.o instead. On my system /usr/lib is a symlink to /usr/lib64, so it seems that I need to handle symlinks properly to get better results, perhaps by adding to portage.txt all the files those paths point to. I don't really know how to do that.
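The symlink idea could be sketched roughly as follows. This is only a minimal, untested-on-Gentoo sketch: it assumes GNU coreutils `realpath` is available and that no file name contains a newline. The function name `canonicalize_parents` and the output file name `portage-canonical.txt` are made up for illustration. It resolves symlinks in the directory part of each path (so /usr/lib/... becomes /usr/lib64/...) and leaves the last component untouched.

```shell
# Hypothetical helper: read one path per line on stdin and print it with
# symlinked parent directories resolved (e.g. /usr/lib -> /usr/lib64).
# The final path component is kept as-is, so entries that are themselves
# symlinks are not replaced by their targets.
canonicalize_parents() {
    while IFS= read -r path; do
        dir=$(dirname -- "$path")
        base=$(basename -- "$path")
        # -m: do not fail when a directory no longer exists on disk
        resolved=$(realpath -m -- "$dir")
        # Strip a trailing slash so that a parent of "/" does not yield "//name"
        printf '%s/%s\n' "${resolved%/}" "$base"
    done
}

# Possible usage (file names are assumptions):
#   canonicalize_parents < portage.txt | sort -u > portage-canonical.txt
#   comm -13 portage-canonical.txt all.txt > extra.txt
```

Running one `realpath` per line is slow for a list of more than a hundred thousand paths, but it keeps the logic easy to follow.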
Also, why is portage.txt bigger than all.txt? Shouldn't it be the other way around, since the files tracked by Portage are a subset of all the files on my system?
Finally, am I forgetting any other location that should also be excluded in the find command?