Why uid != username

Due to various problems with Ubuntu Server, I’m migrating our server at the shop to CentOS.  Part of that migration included copying over the chroot environments that we use for PXE booting.  When it comes to copying large amounts of data over the network, especially when the permissions and ownership of that data are crucial, rsync is always my go-to tool.  It can use compression to save network bandwidth, transfers only the deltas for files that already exist on the destination but have changed, and has options to delete files from the destination that don’t exist in the source.  On the whole, rsync is pretty easy to use as well and the man page has EXCELLENT examples.  Here’s a simplified version of what I was using (the real one uses ssh tricks to get rsync to run as sudo on the other end):

rsync -avz 192.168.1.14:/nfsrootmav/ /nfsrootmav/

Looks good, right?  What this tells rsync to do is copy recursively and maintain ownership/permissions (-a), use verbose output messages (-v), and use compression during the copy (-z).  I ran this command and all my files copied happily and without complaint.  When I tried to test the copied chroot environment, however, it wasn’t quite right.  When I tried to shut the computer down (booted to the chroot over PXE) it logged out of Gnome instead.  The shutdown command in GDM didn’t work either.

I knew that the only difference between the original and the copy was that the copy…had been copied.  This led me in the direction of permissions, though I couldn’t understand why they’d be different if rsync was running in archive mode.  I tried running diff on `ls` dumps of the original vs. the copied directories, but I never really got the outputs on the different servers lined up well enough for the diff to tell me everything I wanted to know.  What I did glean from my diff was that if I looked at the uids and gids of the files numerically, the original and copy had different ownerships on some files!
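In hindsight, a dump that prints only numeric ownership would have been easier to diff than raw `ls` output.  Here’s a rough sketch of what I mean, not what I actually ran: `dump_owners` is a name I made up for this post, and it assumes GNU find (for `-printf`) on both servers.

```shell
# Print "uid:gid mode path" for everything under a tree, using numeric ids only.
# dump_owners is a hypothetical helper; -printf requires GNU find.
dump_owners() {
    # $1 = root of the tree to inspect; paths are printed relative to it
    # so that dumps taken on two different servers line up.
    (cd "$1" && find . -printf '%U:%G %m %p\n' | LC_ALL=C sort -k3)
}
```

Running something like `dump_owners /nfsrootmav > owners.txt` on each server and then diffing the two files should show exactly which paths ended up with a different uid or gid.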

What?  How could that be?  Then it occurred to me…I had copied the chroot from the host OS, not from within the chroot.  It turns out that by default rsync copies the user and group owners of a file by name, not uid and gid.  That makes sense, doesn’t it?  If I have a file that’s owned by, say, “messagebus” on one computer and I copy it to another, I want it to be owned by “messagebus” on the other computer too even if the “messagebus” user has a different numeric uid.  It went wrong here because files are stored with a numeric uid/gid, and any time an operation requests or sets the text user/group, the name is simply looked up in /etc/passwd or /etc/group.  The files in my chroot had owners whose uids matched the chroot’s own /etc/passwd, but rsync was referencing the host OS’s /etc/passwd.
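To make the mismatch concrete, here’s a small sketch that looks up the same numeric uid in two different passwd files.  The helper name `name_for_uid` and the uid 102 are made up for illustration; the only thing taken from my setup is that the chroot lives at /nfsrootmav.

```shell
# Look up the user name for a numeric uid in a passwd-format file.
# name_for_uid is a hypothetical helper, not part of rsync.
name_for_uid() {
    # $1 = numeric uid, $2 = path to a passwd-format file
    awk -F: -v uid="$1" '$3 == uid { print $1; exit }' "$2"
}

# Example (uid 102 is only an illustration):
#   name_for_uid 102 /etc/passwd              # name on the host OS
#   name_for_uid 102 /nfsrootmav/etc/passwd   # name inside the chroot
```

If the two lookups print different names (or one prints nothing), then any file in the chroot owned by that uid is a candidate for being re-owned when it’s copied by name from the host.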

The solution?  rsync has an option called “--numeric-ids”.  This will cause rsync to use the numeric uids and gids directly rather than trying to look up the plain text names.  I added that and my files copied properly.  The PXE-booted computer shut down as it should and all was well with the world.
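For reference, here’s the simplified command from earlier with the extra option added.  I’ve wrapped it in a function purely for this post (`sync_chroot` is a made-up name), and like the earlier example it leaves out the ssh/sudo tricks the real command needs.

```shell
# Copy a tree with rsync, preserving numeric uids/gids instead of mapping by name.
# sync_chroot is a hypothetical wrapper; the flags are the ones described above.
sync_chroot() {
    # $1 = source (may be host:/path/), $2 = destination
    rsync -avz --numeric-ids "$1" "$2"
}

# Example, using the same host and paths as the simplified command:
#   sync_chroot 192.168.1.14:/nfsrootmav/ /nfsrootmav/
```

Note that actually setting ownership on the destination still requires rsync to run with enough privileges there, which is why the real command runs as root on both ends.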

I actually never would have had this problem if I’d chrooted into my /nfsrootmav directory and rsynced out from there because rsync would have been looking at the right /etc/passwd and /etc/group, but then I never would have learned this great lesson.

