Configuration - part 5

In the previous installment of this series about my grand unified configuration system, I showed how namespaces can be used to resolve conflicts for parameters with the same name. These namespaces can be nested arbitrarily deeply.

This system was designed for use in the Socorro Project, Mozilla's crash reporting system. Socorro's ecosystem consists of a flock of thirty small scripts run periodically by cron and some long running daemon processes. Each of these scripts has its own executable file. If you examine these files, you'll see that they're almost all exactly the same: they set up logging and a set of configuration variables, import a class that defines the business logic of the script, instantiate the class, and call the class' "main" function. Aside from the business logic class, it's all boilerplate. Since we've already got all these classes, it's a simple step to instrument them so that the configuration manager can dynamically load them. That eliminates all but one copy of the boilerplate code and drops the number of executable scripts from thirty to one. Everything is controlled by configuration.

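import ConfigParser
import getopt
import logging
import os

import socorro.configman as cm

logger = logging.getLogger(__name__)

app_definition = cm.Namespace()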
app_definition.option(
    "_application",
    doc="the fully qualified class of the application",
    default=some_default_class,
    from_string_converter=cm.class_converter,
)

config_manager = cm.ConfigurationManager((app_definition,), (ConfigParser, os.environ, getopt))
config = config_manager.get_config()

application_class = config._application
app_instance = application_class(config)
logger.info("starting %s", application_class.app_name)
app_instance.main()

For clarity, error handling has been stripped from this example.

The configuration manager treats an option definition with the name '_application' in a special way. It assumes that the class will have three attributes: two strings called app_name and app_version, and a main function. It makes the further assumption that the constructor for the class needs the DOM-like config dictionary for initialization.
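To make that contract concrete, here is a minimal sketch of how a dotted name given as a string could be turned into a class and then checked for the three attributes. This is not the actual cm.class_converter; the names class_converter_sketch, missing_app_attributes and ExampleApp are made up for illustration, and the sketch assumes the string always contains at least one dot.

```python
import importlib


def class_converter_sketch(dotted_name):
    """Turn a string such as 'package.module.ClassName' into the object it names.

    Hypothetical stand-in for a class converter; assumes dotted_name
    contains at least one dot.
    """
    module_name, _, attr_name = dotted_name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


def missing_app_attributes(app_class):
    """Return the names from the expected app interface that app_class lacks."""
    expected = ("app_name", "app_version", "main")
    return [name for name in expected if not hasattr(app_class, name)]


class ExampleApp(object):
    """A made-up class that satisfies the interface described above."""

    app_name = "example"
    app_version = "0.1"

    def __init__(self, config):
        self.config = config

    def main(self):
        return "ran %s with %r" % (self.app_name, self.config)


# any importable dotted name works; a stdlib class keeps the sketch self-contained
loaded_class = class_converter_sketch("collections.OrderedDict")
problems = missing_app_attributes(ExampleApp)
```

A generic launcher could refuse to start when the list returned by missing_app_attributes is not empty, which gives a much friendlier failure than an AttributeError halfway through startup.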

With a properly instrumented cooperating class, that is all the code required to load the app, read the configuration ini file, override any config file option values with values from the environment, and then override those with any provided on the command line.
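The order of those overrides is easy to picture with plain dictionaries. The following is only a sketch of the layering idea, not the configuration manager's implementation: overlay_sources and the sample values are invented for the example, and real sources would of course need parsing and type conversion first.

```python
def overlay_sources(defaults, *sources):
    """Overlay each source on the defaults; later sources win.

    Keys that were never declared in defaults are ignored, so unrelated
    environment variables do not leak into the result.
    """
    merged = dict(defaults)
    for source in sources:
        for key in defaults:
            if key in source:
                merged[key] = source[key]
    return merged


# hypothetical values standing in for the three value sources
defaults = {"product": "Firefox", "version": "4.%", "outputPath": "."}
ini_values = {"product": "Thunderbird", "outputPath": "/tmp"}
environment = {"version": "5.%", "HOME": "/home/someone"}
command_line = {"outputPath": "/data/csv"}

config = overlay_sources(defaults, ini_values, environment, command_line)
```

Here the ini file changes product and outputPath, the environment changes version, and the command line has the last word on outputPath.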

Here's an edited version of a class that the generic application code above can load.

import datetime as dt
import os.path
import socorro.configman as cm
import socorro.database as sdb
import socorro.lib.gzip_csv as gzcsv


class DailyCsvApp(object):

    app_name = "daily_csv"
    app_version = "1.1"
    app_doc = "This app produces a csv file of the current day's crash data"

    required_config = cm.Namespace()
    required_config.option(
        name="day",
        doc="the date to dump (YYYY-MM-DD)",
        default=dt.date.today().isoformat(),
        from_string_converter=cm.date_converter,
    )
    required_config.option(
        name="outputPath", doc="the path of the gzipped csv output file", default="."
    )
    required_config.option("product", doc="the name of the product to dump", default="Firefox")
    required_config.option(name="version", doc="the name of the version to dump", default="4.%")
    # get database connection option definitions from the database module
    required_config.update(sdb.get_required_config())

    def __init__(self, config):
        self.config = config

    def main(self):
        with self.config.database.transaction() as db_conn:
            # str.join takes a single iterable, so the parts go in a tuple
            output_filename = ".".join(
                (self.config.product, self.config.version, self.config.day.isoformat())
            )
            csv_pathname = os.path.join(self.config.outputPath, output_filename)
            db_query = self.construct_query(
                self.config.day, self.config.product, self.config.version
            )
            with gzcsv(csv_pathname) as csv_fp:
                for a_row in sdb.query(db_conn, db_query):
                    csv_fp.write_row(a_row)

    def construct_query(self, day, product, version):
        # implementation omitted for brevity
        pass
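The required_config.update(...) call is what lets a class borrow option definitions from another module. Here is a minimal sketch of that composition, assuming a namespace behaves like a dictionary of option definitions. NamespaceSketch and database_required_config are hypothetical stand-ins, not the real cm.Namespace or sdb.get_required_config.

```python
class NamespaceSketch(dict):
    """Hypothetical stand-in for a namespace: option name -> definition."""

    def option(self, name, doc=None, default=None, from_string_converter=str):
        self[name] = {
            "doc": doc,
            "default": default,
            "from_string_converter": from_string_converter,
        }


def database_required_config():
    """Pretend this lives in the database module and describes its own needs."""
    namespace = NamespaceSketch()
    namespace.option("databaseHost", doc="the host name of the database", default="localhost")
    namespace.option("databaseUser", doc="the username on the database", default="breakpad_rw")
    return namespace


required_config = NamespaceSketch()
required_config.option("product", doc="the name of the product to dump", default="Firefox")
# pull in the definitions owned by the database module
required_config.update(database_required_config())
```

The app class never has to know which options the database code wants; it just merges them in and lets the configuration manager sort out the values.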

In Socorro, the generic app is called simply "app.py". To see the options brought in by the DailyCsvApp class:

$ python app.py --_application=socorro.cron.dailycsv --help
  daily_csv 1.1
      This app produces a csv file of the current day's crash data
      --_write
          write config file to stdout (conf, ini, json) (default: None)
      --config_path
          path for config file (not the filename) (default: ./)
      --databaseHost
          the host name of the database (default: localhost)
      --databaseUser
          the username on the database (default: breakpad_rw)
      --databasePassword
          the user password (default: *******)
      --day
          the date to dump (YYYY-MM-DD) (default: 2011-07-04)
      --help
          print this
      --outputPath
          the path of the gzipped csv output file (default: .)
      --product
          the name of the product to dump (default: Firefox)
      --version
          the name of the version to dump (default: 4.%)
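A listing like that can be generated entirely from the option definitions. The following sketch shows the idea under simplifying assumptions: Option and render_help are names I made up, options are sorted alphabetically, and every option is given a default, which is not exactly how the real output above is produced.

```python
from collections import namedtuple

# hypothetical, simplified option definition
Option = namedtuple("Option", ["name", "doc", "default"])


def render_help(app_name, app_version, app_doc, options):
    """Build help text from the app's metadata and its option definitions."""
    lines = ["%s %s" % (app_name, app_version), "    " + app_doc]
    for option in sorted(options, key=lambda opt: opt.name):
        lines.append("    --%s" % option.name)
        lines.append("        %s (default: %s)" % (option.doc, option.default))
    return "\n".join(lines)


help_text = render_help(
    "daily_csv",
    "1.1",
    "This app produces a csv file of the current day's crash data",
    [
        Option("product", "the name of the product to dump", "Firefox"),
        Option("outputPath", "the path of the gzipped csv output file", "."),
    ],
)
```

Because the text comes from the same definitions that drive the ini file and the environment lookup, the help can never drift out of date with what the app actually accepts.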

Of course, it is easy to write out an ini file by just specifying the _write directive on the command line. The configuration manager will write out the ini to the directory specified by the config_path using the name of the app as the basename of the ini file.
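For a rough idea of what writing the ini file involves, here is a sketch built on the standard library's configparser. It is not the configuration manager's own writer: write_ini_sketch and the section name "top_level" are assumptions made for the example. Interpolation is switched off because a default such as "4.%" contains a bare percent sign.

```python
import configparser
import os
import tempfile


def write_ini_sketch(app_name, config_path, values):
    """Write values to <config_path>/<app_name>.ini and return the pathname."""
    # interpolation=None lets values such as "4.%" pass through untouched
    parser = configparser.ConfigParser(interpolation=None)
    # keep option names such as outputPath case-sensitive
    parser.optionxform = str
    # "top_level" is a made-up section name for this sketch
    parser["top_level"] = {key: str(value) for key, value in values.items()}
    ini_pathname = os.path.join(config_path, app_name + ".ini")
    with open(ini_pathname, "w") as ini_fp:
        parser.write(ini_fp)
    return ini_pathname


target_dir = tempfile.mkdtemp()
written = write_ini_sketch(
    "daily_csv",
    target_dir,
    {"product": "Firefox", "version": "4.%", "outputPath": "."},
)
```

The basename comes from the app's name and the directory from config_path, which matches the naming rule described above.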

In the next installment of this multipart missive, I'll talk about using other sources like argparse as the source of option definitions.