Monday, May 16, 2011

Configuration - Part 1

I've been obsessed with configuration lately.  Over the course of my career, I've noticed that there is less and less programming to do, corresponding to more and more configuration to do.  Why create a complicated program from scratch when you could take existing programs and modules and configure them to solve your problem?

One of the places where I have noted this trend is in the Java world.  It seems there are millions of pre-built Java components out there.  The trick is not to create new ones, but to configure the existing ones to accomplish the task at hand.  Java programming has become less about the language and more about manipulating huge undecipherable XML files.

In Python, we've got a number of tools to help get configuration information from various sources: getopt, argparse, ConfigParser, ConfigObj, etc.  To my eyes, however, these tools provide only partial implementations of configuration because each focuses on just one dimension of the problem.
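
To make the "one dimension" point concrete, here is a small sketch using only the standard library.  It is written for Python 3, where the ConfigParser module is spelled configparser.  The option names and the ini text are made up for illustration.  argparse knows only about the command line, configparser knows only about ini-style text, and neither is aware of the other.

```python
import argparse
import configparser

# argparse only knows about the command line.
parser = argparse.ArgumentParser(prog="sample")
parser.add_argument("--host", default="localhost", help="the host name")
parser.add_argument("-D", "--debug", action="store_true", help="use debug mode")
# An explicit argument list stands in for sys.argv so the sketch is repeatable.
cli = parser.parse_args(["--host", "example.org"])

# configparser only knows about ini-style text.
ini = configparser.ConfigParser()
ini.read_string("[top_level]\nhost = db.internal\ndebug = True\n")

# Each tool answers independently; reconciling them is left to the programmer.
print(cli.host)
print(ini.get("top_level", "host"))
print(ini.getboolean("top_level", "debug"))
```

The two tools report different hosts, and nothing in either one decides which should win.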

What is configuration, anyway?  I define it as the set of values that remain constant throughout run time, but can be varied prior to run time.  There are lots of ways to get configuration information: ini files, flat config files, the command line, the operating system environment, json files, XML, etc.  I've not found an overall system that can gather configuration information from all these sources.

Back in '05, when I was at the OSUOSL, I took my first swipe at this grand unified configuration manager.  Six years later, I'm using the latest generation of the tool in the Socorro project.

It works like this: you define the configuration parameters required by a program using a homebrewed definition language consisting of Namespaces and Options.  An Option is an object that defines a single configuration parameter: its name, documentation, default value and a function that can convert a string into the appropriate type.  A Namespace is just a collection of Options.  The end result is a dictionary of key-value pairs accessible using dot notation.

    import config_manager as cm

    # declare the options this program needs
    n = cm.Namespace()
    n.option(name='host',
             doc='the host name',
             default='localhost')
    # short_form adds the -D switch; the converter turns a string into a bool
    n.option(name='debug',
             doc='use debug mode',
             default=False,
             short_form='D',
             from_string_converter=cm.boolean_converter)

    # gather the final values and read them back with dot notation
    conf_man = cm.ConfigurationManager([n],
                                       application_name='sample')
    config = conf_man.get_config()
    print(config.host)
    print(config.debug)
 
If this program were saved as sample.py and just run from the command line, it would print this:

    $ python ./sample.py
    localhost
    False
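
For readers who want a feel for the moving parts, here is a toy sketch of the Option and Namespace idea, written for Python 3.  It is not the Socorro code.  The DotDict class, the build_config helper, the fallback rule for converters, and this version of boolean_converter are all inventions for illustration.

```python
class Option:
    """One configuration parameter: name, doc, default, string converter."""

    def __init__(self, name, doc="", default=None, from_string_converter=None):
        self.name = name
        self.doc = doc
        self.default = default
        # Hypothetical rule: fall back to the default's type as the converter.
        if from_string_converter is None:
            from_string_converter = str if default is None else type(default)
        self.from_string_converter = from_string_converter

    def convert(self, text):
        return self.from_string_converter(text)


def boolean_converter(text):
    """Toy converter: treat a few common spellings as True."""
    return text.strip().lower() in ("1", "true", "t", "yes", "y")


class Namespace(dict):
    """A collection of Options, keyed by option name."""

    def option(self, name, **kwargs):
        self[name] = Option(name, **kwargs)


class DotDict(dict):
    """Dictionary whose keys can also be read as attributes."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key)


def build_config(namespace, overrides=None):
    """Start from the defaults, then convert and apply string overrides."""
    overrides = overrides or {}
    config = DotDict()
    for name, opt in namespace.items():
        if name in overrides:
            config[name] = opt.convert(overrides[name])
        else:
            config[name] = opt.default
    return config


n = Namespace()
n.option("host", doc="the host name", default="localhost")
n.option("debug", doc="use debug mode", default=False,
         from_string_converter=boolean_converter)

# The override arrives as a string, as it would from a file or command line.
config = build_config(n, overrides={"debug": "true"})
print(config.host)
print(config.debug)
```

The converter is what lets every source hand over plain strings while the program still receives properly typed values.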


Invoke it like this and you'll get:

    $ python ./sample.py --help
    sample
      --_write
        write config file to stdout (conf, ini, json) (default: None)
      --config_path
        path for config file (not the filename) (default: ./)
      -D, --debug
        use debug mode
      --help
        print this
      --host
        the host name (default: localhost)


This exposes some hidden options.  Let's say that we want an ini file for this program.  We can get the script to write it for us:

    $ python ./sample.py --_write=ini
    $ cat ./sample.ini
    [top_level]
    # name: debug
    # doc: use debug mode
    # converter: socorro.lib.config_manager.boolean_converter
    debug=False

    # name: host
    # doc: the host name
    # converter: str
    host=localhost


Just as easily, you could have written a flat conf file, a json file, or, Lord help us, XML.  You can now edit your ini file to your heart's content.  This isn't a reimplementation of the ini support in Python; under the covers, ConfigurationManager uses the existing ConfigParser module.  So go ahead and use all the macro and substitution features available in that module.
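
As a reminder of what those substitution features look like, here is a short sketch using the module directly, under its Python 3 name configparser.  The port and url entries are invented for the example and are not part of the generated sample.ini.

```python
import configparser

# %(name)s refers to another value in the same section (or in DEFAULT).
ini_text = """
[top_level]
host = localhost
port = 5432
url = http://%(host)s:%(port)s/
"""

parser = configparser.ConfigParser()
parser.read_string(ini_text)

# By default the substitutions are expanded when the value is read.
print(parser.get("top_level", "url"))
# raw=True returns the text exactly as written in the file.
print(parser.get("top_level", "url", raw=True))
```

The first line printed is the expanded URL, the second is the unexpanded template.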

The changes that you make to the values in this file become the new "defaults" reported by the --help feature.
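
The same effect can be imitated with the standard library alone.  The sketch below is only an analogy for Python 3, not how ConfigurationManager is implemented: values read from ini text replace the hard-coded argparse defaults, so the help text reports the edited value.  The host name is made up.

```python
import argparse
import configparser

# Pretend this is the edited ini file.
ini = configparser.ConfigParser()
ini.read_string("[top_level]\nhost = db.example.org\n")

parser = argparse.ArgumentParser(
    prog="sample",
    formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument("--host", default="localhost", help="the host name")

# Values from the file replace the defaults declared in the code.
parser.set_defaults(**dict(ini["top_level"]))

help_text = parser.format_help()
print(help_text)
```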

In my next post on this topic, I'll show how the configuration manager overlays values from various sources to produce its final configuration values.

This class is currently in an experimental branch of the Socorro SVN tree.  I'll announce when it gets merged into trunk and is readily available.