Thursday, July 11, 2013

Socorro File System Storage Revisited - the inode problem


The Socorro radix file system storage classes were originally written to be permanent crash storage. We eventually started using HBase instead of the file system. However, the file system storage lives on as temporary storage on the Socorro collectors. Even through the 2013 rewrite of the file system classes, the mindset that this was permanent storage persisted.

This creates a problem. The collectors dump all the crashes into file system storage. The crash movers then spool these temporarily stored crashes into HBase. As the crash movers do their work, crashes are removed from file system storage. For efficiency's sake when working with a resource that other processes read and write simultaneously, the file system classes do not clean up empty directories. As the days progress, this leaves large empty directory trees hanging around on the disk. Eventually, this becomes a problem as the disk runs out of inodes. We did not solve this problem in the 2013 FS rewrite. We couldn't come up with a good way to clean up the old directories without slowing down the collectors or crash movers, or adding a separate cleaning process or cron job.
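To see how close a disk is to this limit, something like the following could be used. It is only a sketch: the function name inode_usage is made up for illustration, it relies on os.statvfs (Unix only), and some file systems report zero total inodes, in which case it returns None.

```python
import os


def inode_usage(path):
    """Return the fraction of inodes in use on the file system holding path.

    Hypothetical helper, not part of Socorro. Returns None when the file
    system does not report an inode total.
    """
    stats = os.statvfs(path)
    total = stats.f_files  # total number of inodes
    if total == 0:
        return None
    used = total - stats.f_ffree  # f_ffree is the number of free inodes
    return used / float(total)
```

Every empty directory left behind still holds one inode, so this number creeps upward even while the disk itself looks nearly empty.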

Here's a simple solution: recycling. The file system classes could have a setting for the number of days to keep old directories, say defaulting to one week. So in a week, we've got these directories created:

primaryCrashStorage/20130501
primaryCrashStorage/20130502
primaryCrashStorage/20130503
primaryCrashStorage/20130504
primaryCrashStorage/20130505
primaryCrashStorage/20130506
primaryCrashStorage/20130507

On May 8th, we'd normally create a new directory called primaryCrashStorage/20130508. All those older directories are empty trees; the crash movers consumed their contents days earlier. I suggest that the file system classes take the oldest and simply rename it for the current day. The internal directory structure has already been built, so storing crashes can begin immediately.

mv primaryCrashStorage/20130501 primaryCrashStorage/20130508
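In Python, the recycling step might look roughly like this. It is a sketch, not the actual Socorro code: the function name open_date_directory and the days_to_keep parameter are invented for illustration, it assumes the oldest directory really has been emptied by the crash movers (it does not check), and it ignores the case of several collector processes racing to do the rename.

```python
import os
from datetime import date, timedelta


def open_date_directory(root, today, days_to_keep=7):
    """Return today's date directory under root, recycling the oldest if possible.

    Hypothetical helper: assumes date directories are named YYYYMMDD and that
    any directory at least days_to_keep days old holds no crashes.
    """
    todays_path = os.path.join(root, today.strftime("%Y%m%d"))
    if os.path.isdir(todays_path):
        return todays_path

    # YYYYMMDD names sort chronologically when sorted as strings
    dated = sorted(
        name for name in os.listdir(root)
        if len(name) == 8 and name.isdigit()
        and os.path.isdir(os.path.join(root, name))
    )
    cutoff = (today - timedelta(days=days_to_keep)).strftime("%Y%m%d")

    if dated and dated[0] <= cutoff:
        # reuse the old tree, radix subdirectories and all
        os.rename(os.path.join(root, dated[0]), todays_path)
    else:
        # nothing old enough to recycle yet, so start a fresh tree
        os.makedirs(todays_path)
    return todays_path
```

During the first week nothing is old enough, so the function falls back to creating directories as usual. From then on, each new day costs one rename.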

The number_of_days should be set a bit larger than the longest time that we could expect the crash movers to be out of commission in a crisis. If we want the file system to be permanent storage, set the constant to MAXINT (or make a special case for 0).
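The special case for 0 could be handled with a small check like this one. Again a sketch under assumptions: may_recycle is a made-up name, and directory names are taken to be YYYYMMDD strings as in the listing above.

```python
from datetime import date, timedelta


def may_recycle(directory_name, today, number_of_days):
    """Say whether a YYYYMMDD directory is old enough to be recycled.

    Hypothetical helper: number_of_days == 0 is treated as "permanent
    storage", so nothing is ever recycled.
    """
    if number_of_days == 0:
        return False
    cutoff = today - timedelta(days=number_of_days)
    # string comparison works because the names are zero-padded dates
    return directory_name <= cutoff.strftime("%Y%m%d")
```

With number_of_days set to 7 and today being May 8th, 20130501 qualifies and 20130502 does not.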

This system has two benefits. First, the inode problem goes away without having to slow down anything or create another process. Second, it actually speeds up the collector: after the first week, it doesn't have to create radix directories any more because they've already been created.

I've no idea why this idea didn't come to me before – it seems to be nothing but win-win.