Darwinweb

Deploying Websites with Rsync from OS X

May 22, 2006

Although rsync has a ton of options and is considered by many to be difficult to use, there’s no denying its utility for deploying changes to websites, especially sites that have a ton of files, such as those using almost any open-source component.

I once tried to upload a full copy of phpMyAdmin through Transmit and it took over an hour on a high-speed connection. Command-line tools such as scp will easily be 10 times as fast, but even that is nothing compared to the time saved on repeated site updates with rsync.

Howto

Rsync does have a dizzying array of functionality, but I prefer to think of it much like the cp or mv utilities. You give it a source and a destination and a few options. My rsync commands often look something like this:

rsync -rlt ./ ssh_user@web_server.com:public_html

The options stand for recursive copying of subdirectories (-r), copying symlinks as symlinks instead of their targets (-l), and preserving file modification times (-t).

Other options I may throw in include --delete, which removes files from the remote end that don’t exist in the source directory. Along with that I like to add -n (dry run), usually together with -v so the changes are actually listed, to see what will be deleted before I commit to it.
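The dry-run habit can be sketched as a two-step routine. This is only an illustration: the function names preview_deploy and run_deploy are made up here, and the destination is the placeholder from the command above. The -v flag is added to the preview because a dry run prints nothing useful without some verbosity.

```shell
# Placeholder destination taken from the example command; adjust to taste.
DEST='ssh_user@web_server.com:public_html'

# Step 1: list what would be copied and deleted. With -n nothing is changed.
preview_deploy() {
    rsync -rlt --delete -n -v ./ "$DEST"
}

# Step 2: the same command without -n, run only after reviewing the preview.
run_deploy() {
    rsync -rlt --delete ./ "$DEST"
}

# Usage, from the root of the local site copy:
#   preview_deploy
#   run_deploy
```

Keeping the two commands identical apart from -n and -v is the point: the preview is only trustworthy if it describes exactly the transfer that follows.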

Gotchas

Rsync can be dangerous when used off the cuff like this. For complex syncing (particularly with the --delete option), it’s often better to develop a well-tested wrapper script so you don’t have to worry about typos. However, the practice of always doing a dry run first makes it safe enough for basic use by a conscientious user. It’s a good idea to have the disaster scenarios in your head beforehand:

Multiple Developers / Remote Updates: Rsync is not a version-control system! If you are making changes directly on your server, you have to be careful not to overwrite them with your next sync. The --update option will prevent overwriting files that are newer on the receiving end, but you’ll still have no way to merge changes from both locations. The trick to using rsync successfully is to always have a canonical version of a site. In other words, if you make changes on the server, that becomes the canonical version and you need to sync it back to your local copy before you can make any more local changes.
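Pulling server-side edits back before resuming local work might look like the sketch below. The function names are invented for illustration and the host and path are the placeholders used earlier. Note that --update only skips files that are newer on the receiving side; it does not merge anything.

```shell
# Placeholder remote location from the earlier example command.
REMOTE='ssh_user@web_server.com:public_html'

# Bring changes made on the server down into the current directory.
# The trailing slash on the remote path copies its contents, not the directory.
pull_from_server() {
    rsync -rlt --update "$REMOTE/" ./
}

# Push local work up, leaving alone any remote file newer than the local one.
push_to_server() {
    rsync -rlt --update ./ "$REMOTE"
}

# Usage: after editing on the server, pull first, then carry on locally.
#   pull_from_server
#   push_to_server
```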

Trailing Slash on Source Directory: If you don’t put a slash after the name of the directory you are syncing, the directory itself will be created at the receiving end. If you do put a slash, the contents of the directory will be copied, not the directory itself. In other words, rsync -rlt ~/Sites/domain.com ... will result in the directory domain.com being created inside the destination, while rsync -rlt ~/Sites/domain.com/ ... copies only its contents.
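The two forms can be put side by side. This is a sketch: the remote host and path are borrowed from the earlier example, and contents_of is a made-up helper that forces exactly one trailing slash so a script always syncs a directory's contents.

```shell
# Without a trailing slash: creates public_html/domain.com/ on the server.
#   rsync -rlt ~/Sites/domain.com  ssh_user@web_server.com:public_html
# With a trailing slash: the files land directly in public_html/.
#   rsync -rlt ~/Sites/domain.com/ ssh_user@web_server.com:public_html

# Hypothetical helper: print the given path with exactly one trailing slash.
contents_of() {
    path=$1
    # Strip any run of trailing slashes, then add a single one back.
    while [ "${path%/}" != "$path" ]; do
        path=${path%/}
    done
    printf '%s/\n' "$path"
}

# Usage:
#   rsync -rlt "$(contents_of ~/Sites/domain.com)" ssh_user@web_server.com:public_html
```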

OS X Extras: As of Mac OS X 10.4, rsync, tar, and a host of other commands will automatically handle resource forks and extended attributes. On non-HFS filesystems this is supported by creating a companion file named ._filename for each filename. Needless to say, these can exist for almost every file on a Mac. There are also the annoying .DS_Store files that the Finder creates. To deal with these, find is your friend:

find . -name '._*' -exec rm {} \;
find . -name '.DS_Store' -exec rm {} \;
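As an alternative to deleting after the fact, the same files can be kept out of the transfer with rsync's --exclude option. This is a sketch using the earlier placeholder destination; the function name is made up for illustration.

```shell
# Hypothetical helper: deploy while skipping Finder and resource-fork debris.
deploy_without_mac_files() {
    # The patterns are quoted so the shell passes them to rsync unexpanded.
    rsync -rlt --exclude='.DS_Store' --exclude='._*' ./ ssh_user@web_server.com:public_html
}

# Usage, from the root of the local site copy:
#   deploy_without_mac_files
```

This leaves the local files untouched, which may be preferable if anything on the Mac still relies on them.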

Rsync Bugs: As of 10.4.6, rsync has a number of bugs that can cause segmentation faults as well as other misbehaviour. If you are serious about using rsync on OS X, it’s probably in your best interest to patch and compile your own rsync.
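The wrapper script suggested under Gotchas could start out as something like the following. It is a rough sketch rather than a tested deployment tool: safe_sync and its prompt are invented for illustration, and it assumes the caller passes the source with a trailing slash when only the contents should be synced.

```shell
# Hypothetical wrapper: preview the sync, ask for confirmation, then apply it.
safe_sync() {
    src=$1
    dest=$2
    if [ -z "$src" ] || [ -z "$dest" ]; then
        echo "usage: safe_sync SOURCE_DIR DESTINATION" >&2
        return 2
    fi
    # A mistyped source combined with --delete is the disaster to guard against.
    if [ ! -d "$src" ]; then
        echo "safe_sync: no such directory: $src" >&2
        return 1
    fi
    # Dry run first: -n changes nothing, -v lists what would happen.
    rsync -rlt --delete -n -v "$src" "$dest" || return 1
    printf 'Apply these changes? [y/N] '
    read -r answer
    case $answer in
        y|Y) rsync -rlt --delete "$src" "$dest" ;;
        *)   echo "Aborted; nothing was changed." ;;
    esac
}

# Usage:
#   safe_sync ~/Sites/domain.com/ ssh_user@web_server.com:public_html
```

Anything other than an explicit y aborts, so an absent-minded Enter does no harm.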