Copying lots of small files

25 12 2013

So I have this situation where I need to copy lots of small files (tens of thousands of files, each around 100 KB) between Linux machines. The problem with scp is that it is slow. My guess is that per-file overhead is to blame: scp handles files one at a time and waits for an acknowledgement after each one, so with this many files the round trips add up. Another option is bbcp, but I'm not sure it solves the problem, as it is tuned more for splitting large files across multiple parallel streams.

Solution? tar the files, pipe the archive through ssh to the remote machine, and untar it there 🙂

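Say you want to transfer the contents of dir1 on machine1 into /dir2 on machine2.

On machine1:

cd dir1
tar czf - . | ssh machine2 "cd /dir2/ && tar xzf -"

Using . instead of * also picks up dotfiles and avoids an overly long argument list when the directory holds tens of thousands of files. The && makes sure nothing gets extracted in the wrong place if the cd fails.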

Much faster 🙂
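
If you do this often, the pipeline can be wrapped in a small shell function. This is only a rough sketch: the name copy_tree and its arguments are made up for illustration, and it assumes tar on both ends supports -C (GNU tar and BSD tar do) and that the destination path contains no single quotes. Leaving out the host runs the same pipe locally, which is a handy way to check it before pointing it at a real machine.

```shell
# copy_tree SRC DEST [HOST]  (hypothetical helper, not from the original post)
# Streams the contents of SRC into DEST as a single tar archive.
# With HOST set, the receiving tar runs on that machine over ssh;
# without it, the copy is local, which is useful for a dry run.
# Assumption: DEST has no single quotes (it is quoted naively for the remote shell).
copy_tree() (
    src=$1
    dest=$2
    host=${3:-}
    if [ -n "$host" ]; then
        # Remote case: one ssh connection carries the whole archive.
        tar -C "$src" -czf - . |
            ssh "$host" "mkdir -p '$dest' && tar -C '$dest' -xzf -"
    else
        # Local case: same pipe, no ssh in the middle.
        mkdir -p "$dest" &&
            tar -C "$src" -czf - . | tar -C "$dest" -xzf -
    fi
)
```

For the example above that would be: copy_tree dir1 /dir2 machine2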