1

I've got a site-to-site VPN connection between two data centers (one in San Jose, the other in Toronto).

I need to send a 32GB file from one data center to the other as fast as possible.

I've found a shell script that splits the 32GB file into smaller files and then uses scp to transfer them in parallel.

The question is: how do I determine the optimal size for the smaller files sent across the site-to-site VPN connection? I'd like to maximize use of the available bandwidth.
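One way to frame the sizing question is to work backward from the number of parallel transfers rather than picking a size directly. The following is only a minimal sketch under stated assumptions: the names `chunk_size_for`, `chunk_ranges`, `streams` and `rounds` are hypothetical, "32GB" is taken as 32 GiB, and each scp process is assumed to carry one chunk at a time.

```python
def chunk_size_for(total_bytes: int, streams: int, rounds: int = 1) -> int:
    """Return a chunk size in bytes so the file splits into about streams * rounds pieces.

    `streams` is how many scp processes run at once; `rounds` is how many
    chunks each process handles in turn. Both names are hypothetical.
    """
    if total_bytes <= 0 or streams <= 0 or rounds <= 0:
        raise ValueError("total_bytes, streams and rounds must be positive")
    pieces = streams * rounds
    # Integer ceiling division, so the last chunk is never larger than the rest.
    return (total_bytes + pieces - 1) // pieces


def chunk_ranges(total_bytes: int, chunk_bytes: int):
    """Yield (offset, length) pairs that cover the file exactly once."""
    offset = 0
    while offset < total_bytes:
        length = min(chunk_bytes, total_bytes - offset)
        yield offset, length
        offset += length


if __name__ == "__main__":
    total = 32 * 1024**3  # the 32GB file, taken here as 32 GiB
    size = chunk_size_for(total, streams=8)
    print(size, len(list(chunk_ranges(total, size))))
```

Seen this way, the chunk size follows from the stream count. Raising `rounds` gives smaller chunks, which likely means less to resend if one scp fails, at the cost of more process startups.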

Presumably, the more scp processes I run on the server, the more load is put on that server.

EEAA
  • 109,904
Simon
  • 13

3 Answers

1

Forget about the site-to-site link for a minute: as long as it's IPsec and your endpoints are not toasters, it's unlikely to be the bottleneck in itself. Have a quick look at bbcp:

http://www.slac.stanford.edu/~abh/bbcp/

Here is a line from the Perl script we used during our last migration, which had the same requirement as yours, i.e. move the data fast:

sprintf('/usr/local/bin/bbcp -a -F -s 16 -P 10 -T "ssh -x -a -oFallBackToRsh=no %%I -l %%U %%H /usr/local/bin/bbcp" -d . -v %s %s:%s',
  join(' ', @files_to_copy), $remote_host, $destination_dir);

Play with the options, especially the number of threads.

Questions you will want answered are:

  • What is the latency of the links?
  • What is the packet jitter likely to be?
  • What is the total maximum bandwidth I can expect?
  • Who/what else will I stomp on by hogging the entire link?
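
The latency and bandwidth answers combine into the bandwidth-delay product, which indicates roughly how much data has to be in flight to keep the link full, and therefore how many parallel streams might be needed. This is only a rough sketch: the function names are hypothetical, and the 1 Gbit/s, 70 ms and 64 KiB figures are illustrative assumptions, not measurements of this link.

```python
import math


def bdp_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    if bandwidth_bits_per_s <= 0 or rtt_s <= 0:
        raise ValueError("bandwidth and RTT must be positive")
    return bandwidth_bits_per_s * rtt_s / 8


def streams_needed(bandwidth_bits_per_s: float, rtt_s: float, window_bytes: int) -> int:
    """Streams required if each one can keep at most window_bytes in flight.

    window_bytes stands for whatever effective per-connection window applies
    (an assumption here; measure it rather than trusting a default).
    """
    if window_bytes <= 0:
        raise ValueError("window_bytes must be positive")
    return math.ceil(bdp_bytes(bandwidth_bits_per_s, rtt_s) / window_bytes)


if __name__ == "__main__":
    link = 1e9   # assumed 1 Gbit/s link
    rtt = 0.070  # assumed 70 ms round trip
    print(bdp_bytes(link, rtt), streams_needed(link, rtt, 64 * 1024))
```

If the effective window is small relative to the bandwidth-delay product, a single stream cannot fill the link, which is one reason multi-stream tools tend to help on long-distance paths.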

With the right flags, bbcp should be able to saturate any link to the point where CPU becomes your bottleneck. Good luck.

1

I'd look at rsync for anything that large.

Something like:

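rsync -av --partial -e "ssh -c arcfour -o Compression=no -x" source_file user@destination:/path/to/dest

http://en.wikipedia.org/wiki/RC4

That way, if the copy gets interrupted for any reason, you can rerun the command and rsync will reuse the partially transferred data instead of starting over. The --partial flag matters here: without it, rsync deletes an incomplete file when the transfer is interrupted. Note that arcfour (RC4) was removed from OpenSSH in 7.6, so on newer systems substitute a fast cipher that your build supports.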

dmourati
  • 25,870
0

If you're willing to consider commercial solutions, Aspera or Signiant are much faster than scp, sftp, or rsync.

Kenster
  • 2,162