Comparing two directories

Sometimes it's useful to check whether two directories contain the same files, and whether the files that match are exactly the same.

Assume you have two directories, dir1 and dir2.

First, compare the listing of files in the dirs:

find dir1 |sort > contents_dir1.txt
find dir2 |sort > contents_dir2.txt

(I'm sure there are ways to exclude certain patterns, but I can't be bothered because I don't need it at this point.)
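For the record, find can do the excluding itself. Here is a minimal sketch, assuming you want to skip names that match a shell glob. The function name list_files and the '*.bak' pattern in the example are made up for illustration. Note that it only skips the matching names themselves, not the contents of a directory whose name matches.

```shell
# list_files DIR [GLOB]: print everything under DIR, sorted, optionally
# skipping names that match GLOB (hypothetical helper, not a standard tool).
list_files() {
    dir=$1
    skip=${2:-}
    if [ -n "$skip" ]; then
        # "! -name" drops every entry whose base name matches the glob
        find "$dir" ! -name "$skip" | sort
    else
        find "$dir" | sort
    fi
}

# Example: list_files dir1 '*.bak' > contents_dir1.txt
```

Quote the glob when calling it, otherwise the shell expands it before find ever sees it.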

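Next, compare the two listings using diff (piping it to colordiff gives pretty colours, and less lets you actually read the differences if the list is very long). Every line in the listings starts with the directory name, so strip that off first, otherwise every single line shows up as different.

diff -u <(sed 's|^dir1||' contents_dir1.txt) <(sed 's|^dir2||' contents_dir2.txt) |colordiff |less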

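If you see files here that you don't need to check for identical content, remove them from the listings. Sed is your friend if there are many of them. Replace PATTERN with a regular expression that matches the lines you want to drop.

sed -i '/PATTERN/d' contents_dir1.txt
sed -i '/PATTERN/d' contents_dir2.txt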

Now make md5sums of all files listed in contents_dir1.txt and contents_dir2.txt. Redirecting stderr to /dev/null suppresses the "Is a directory" warnings from the md5sum command. The -d '\n' option (GNU xargs) makes xargs treat each line as one file name, so names with spaces or quotes survive.

cat contents_dir1.txt |xargs -d '\n' md5sum > md5sums_dir1.txt 2> /dev/null
cat contents_dir2.txt |xargs -d '\n' md5sum > md5sums_dir2.txt 2> /dev/null
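If you don't need to hand-edit the listings first, the listing and checksum steps can be rolled into one. The sketch below assumes GNU find, sort, xargs and md5sum. The function name checksums is made up for illustration. It runs from inside the directory so both outputs use the same relative paths, and it only looks at regular files, so there are no directory warnings to suppress.

```shell
# checksums DIR: print "md5  ./relative/path" for every regular file
# under DIR, sorted by path (hypothetical helper, GNU tools assumed).
checksums() {
    # NUL-separated names keep spaces and newlines in file names intact;
    # -r stops xargs from running md5sum when there are no files at all.
    (cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r md5sum)
}

# Example:
#   checksums dir1 > md5sums_dir1.txt
#   checksums dir2 > md5sums_dir2.txt
```

Because the paths are relative to each directory, identical trees give byte-for-byte identical output.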

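And once again, use diff to compare the md5sums, with the directory name stripped from the paths so that identical files produce identical lines. From here, it's up to you to decide what to do with files that are different :-)

diff -u <(sed 's|  dir1/|  |' md5sums_dir1.txt) <(sed 's|  dir2/|  |' md5sums_dir2.txt) |colordiff |less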
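
If all you want is a quick yes or no, diff can also walk both trees by itself. This is a different technique from the md5sum approach above: diff compares file contents directly, and you lose the chance to prune the listings by hand. A minimal sketch; the function name compare_trees is made up for illustration:

```shell
# compare_trees DIR1 DIR2: exit status 0 if both trees hold the same
# files with the same contents, otherwise print what differs
# (hypothetical wrapper around diff).
compare_trees() {
    # -r recurses into subdirectories, -q only reports whether files differ
    diff -rq "$1" "$2"
}

# Example: compare_trees dir1 dir2 && echo "identical"
```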