Monday, June 09, 2014

Crontabber and Postgres


This essay is about Postgres and Crontabber, the we-need-something-more-robust-than-cron job runner that Mozilla uses in Socorro, the crash reporting system.

Sloppy database programming in an environment where autocommit is turned off leads to very sad DBAs. There are a lot of programmers out there who cut their teeth on databases that either had autocommit on by default or didn't implement transactions at all. Programmers who are used to working with relational databases in autocommit mode miss out on one of the most powerful features of relational databases. However, bringing the cavalier attitude of autocommit into a transactional world will lead to pain.

In autocommit mode, every statement given to the database is committed as soon as it is done. That isn't always the best way to interact with a database, especially if there are multiple steps to a database related task.

For example, say we've got database tables representing monetary accounts. Moving money from one account to another requires two steps: deduct from the first account and add to the other. In autocommit mode, there is a danger that the accounts could get out of sync if some disaster happens between the two steps.

To counter that, transactions allow the two steps to be linked together. If something goes wrong during the two steps, we can roll back any changes and not let the accounts get out of sync. However, manual transactions require the programmer to be more careful and make sure that there is no execution path out of the database code that doesn't pass through either a commit or a rollback. Failing to do so may leave connections idle in transaction. The risk is critical consumption of resources and impending deadlocks.
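
Here is a minimal sketch of that two-step transfer done as one transaction. It uses Python's built-in sqlite3 module rather than Postgres so that it runs anywhere. The table layout, the helper names and the "insufficient funds" check are assumptions made up for the illustration.

```python
import sqlite3


def make_accounts():
    """Build a throwaway in-memory database with two accounts (hypothetical data)."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "create table accounts (acc_num text primary key, total integer)"
    )
    connection.executemany(
        "insert into accounts values (?, ?)", [("1", 50), ("2", 0)]
    )
    connection.commit()
    return connection


def totals(connection):
    """Return the balances as a dict keyed by account number."""
    return dict(connection.execute("select acc_num, total from accounts"))


def transfer(connection, from_acc, to_acc, amount):
    """Move money between two accounts; both updates happen or neither does."""
    cursor = connection.cursor()
    try:
        cursor.execute(
            "update accounts set total = total - ? where acc_num = ?",
            (amount, from_acc),
        )
        cursor.execute(
            "select total from accounts where acc_num = ?", (from_acc,)
        )
        if cursor.fetchone()[0] < 0:
            # the disaster between the two steps
            raise ValueError("insufficient funds")
        cursor.execute(
            "update accounts set total = total + ? where acc_num = ?",
            (amount, to_acc),
        )
    except Exception:
        # every abnormal exit passes through a rollback
        connection.rollback()
        raise
    else:
        # the only normal exit passes through a commit
        connection.commit()
```

Every path out of `transfer` goes through either the commit or the rollback, which is exactly the discipline that manual transactions demand.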

Crontabber provides a feature to help make sure that database transactions get closed properly and still allow the programmer to be lazy.

When writing a Crontabber application that accesses a database, there are a number of helpers. Let's jump directly to the one that guarantees proper transactional behavior.

# sets up postgres
@using_postgres()
# tells crontabber to control transactions
@as_single_postgres_transaction()
def run(self, connection):
    # connection is a standard psycopg2 connection instance.
    # use it to do the two steps:
    cursor = connection.cursor()
    cursor.execute(
        "update accounts set total = total - 10 where acc_num = '1'"
    )
    do_something_dangerous_that_could_cause_an_exception()
    cursor.execute(
        "update accounts set total = total + 10 where acc_num = '2'"
    )

In this contrived example, the method decorator gave the crontabber job a connection to the database and will ensure that if the job runs to completion, the transaction will be committed. It also guarantees that if the 'run' method exits abnormally (with an exception), the transaction will be rolled back.

Using this class decorator is declaring that this Crontabber job represents a single database transaction.  Needless to say, if the job takes twenty minutes to run, you may not want it to be a single transaction.  

Say you have a collection of periodic database related scripts that have evolved over years, written by Python programmers long gone. Some of the older, crustier ones from the murky past are really bad about leaving database connections “idle in transaction”. In porting one of them to crontabber, call the ill-behaved function from within the context of a construct like the previous example. Crontabber will take on the responsibility of transactions for that function with these simple rules:

  • If the method ends normally, crontabber will issue the commit on the connection.
  • If an exception escapes from the scope of the function, crontabber will rollback the database connection.
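
Those two rules can be sketched as a plain Python decorator. This is not Crontabber's implementation, just an illustration of the idea. The names `single_transaction` and `RecordingConnection` are made up here, and the stand-in connection only records which call was made.

```python
import functools


def single_transaction(function):
    """Commit if `function` returns normally, roll back if an exception escapes."""
    @functools.wraps(function)
    def wrapper(connection, *args, **kwargs):
        try:
            result = function(connection, *args, **kwargs)
        except BaseException:
            # rule 2: an exception escaped, so undo everything
            connection.rollback()
            raise
        # rule 1: the function ended normally
        connection.commit()
        return result
    return wrapper


class RecordingConnection:
    """Hypothetical stand-in for a database connection; it only records calls."""

    def __init__(self):
        self.calls = []

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")


@single_transaction
def well_behaved(connection):
    return "done"


@single_transaction
def ill_behaved(connection):
    raise RuntimeError("left the connection idle in transaction")
```

The wrapped function never has to mention commit or rollback, which is what lets an old script stay lazy.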

Crontabber provides three dedicated class decorators to assist in handling periodic Postgres tasks. Their documentation can be found here: Read The Docs: Postgres Decorators.  The @with_postgres_connection_as_argument decorator will pass the connection to the run method, but does not handle commit or rollback.  Use that decorator if you'd like to manage transactions manually within the Crontabber job.

Transactional behavior contributes to making Crontabber robust.  Crontabber is also robust because of its self healing behaviors. If a given job fails, dependent jobs will not be run. The next time the periodic job's time to execute comes around, the 'backfill' mechanism will make sure that it makes up for the previous failure. See Read The Docs: Backfill for more details.

The transactional system can also contribute to self healing by retrying failed transactions, if those failures were caused by transient issues. Temporary network glitches can cause failure. If your periodic job runs only once every 24 hours, maybe you'd rather your app retry a few times before giving up and waiting for the next scheduled run time.

Through configuration, the transactional behavior of Postgres, embodied by Crontabber's TransactionExecutor class, can do a “backing off retry”. Here's the log of an example of backoff retry; my commentary is on the lines that begin with #:

# we cannot seem to connect to Postgres
2014-06-08 03:23:53,101 CRITICAL - MainThread - ... transaction error eligible for retry
OperationalError: ERROR: pgbouncer cannot connect to server
# the TransactionExecutor backs off, retrying in 10 seconds
2014-06-08 03:23:53,102 DEBUG - MainThread - retry in 10 seconds
2014-06-08 03:23:53,102 DEBUG - MainThread - waiting for retry ...: 0sec of 10sec
# it fails again, this time scheduling a retry in 30 seconds;
2014-06-08 03:24:03,159 CRITICAL - MainThread - ... transaction error eligible for retry
OperationalError: ERROR: pgbouncer cannot connect to server
2014-06-08 03:24:03,160 DEBUG - MainThread - retry in 30 seconds
2014-06-08 03:24:03,160 DEBUG - MainThread - waiting for retry ...: 0sec of 30sec
2014-06-08 03:24:13,211 DEBUG - MainThread - waiting for retry ...: 10sec of 30sec
2014-06-08 03:24:23,262 DEBUG - MainThread - waiting for retry ...: 20sec of 30sec
# it fails a third time, now opting to wait for a minute before retrying
2014-06-08 03:24:33,319 CRITICAL - MainThread - ... transaction error eligible for retry
2014-06-08 03:24:33,320 DEBUG - MainThread - retry in 60 seconds
2014-06-08 03:24:33,320 DEBUG - MainThread - waiting for retry ...: 0sec of 60sec
...
2014-06-08 03:25:23,576 DEBUG - MainThread - waiting for retry ...: 50sec of 60sec
2014-06-08 03:25:33,633 CRITICAL - MainThread - ... transaction error eligible for retry
2014-06-08 03:25:33,634 DEBUG - MainThread - retry in 120 seconds
2014-06-08 03:25:33,634 DEBUG - MainThread - waiting for retry ...: 0sec of 120sec
...
2014-06-08 03:27:24,205 DEBUG - MainThread - waiting for retry ...: 110sec of 120sec
# finally it works and the app goes on its way
2014-06-08 03:27:34,989 INFO  - Thread-2 - starting job: 065ade70-d84e-4e5e-9c65-0e9ec2140606
2014-06-08 03:27:35,009 INFO  - Thread-5 - starting job: 800f6100-c097-440d-b9d9-802842140606
2014-06-08 03:27:35,035 INFO  - Thread-1 - starting job: a91870cf-4d66-4a24-a5c2-02d7b2140606
2014-06-08 03:27:35,045 INFO  - Thread-9 - starting job: a9bfe628-9f2e-4d95-8745-887b42140606
2014-06-08 03:27:35,050 INFO  - Thread-7 - starting job: 07c55898-9c64-421f-b1b3-c18b32140606

The TransactionExecutor can be set to retry as many times as you'd like, with retries at whatever intervals you desire.  The default is to try only once.  If you'd like the backing off retry behavior, change TransactionExecutor in the Crontabber config file to TransactionExecutorWithLimitedBackOff or TransactionExecutorWithInfiniteBackOff.
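
The backing off retry idea can be sketched in a few lines of Python. This is not the TransactionExecutor's code. The function name, the choice of `ConnectionError` as the retryable error and the injectable `sleep` are assumptions for the illustration; the delays are the ones seen in the log above.

```python
import time


def run_with_backoff(action, delays=(10, 30, 60, 120),
                     retryable=(ConnectionError,), sleep=time.sleep):
    """Call action(); after a retryable failure, wait and then try again.

    Each failure consumes the next delay in `delays`.  Once the delays
    are used up, one final attempt is made and its exception, if any,
    is allowed to escape to the caller.
    """
    for delay in delays:
        try:
            return action()
        except retryable:
            # transient trouble: back off before the next attempt
            sleep(delay)
    # last chance: no more waiting, failures propagate
    return action()
```

Passing `sleep` as a parameter is only there so the waiting can be swapped out when trying the sketch; a limited version gives up after the last delay, while an infinite one would keep reusing the final delay.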

While Crontabber supports Postgres by default, Socorro, the Mozilla Crash Reporter, extends the support of the TransactionExecutor to HBase, RabbitMQ, and Ceph.  It would not be hard to get it to work for MySQL or,  really, any connection based resource.

With the TransactionExecutor coupled with Crontabber's backfilling capabilities, nobody has to get out of bed at 3am because the crons have failed again.  They can take care of themselves.

On Tuesday, June 10, Peter Bengtsson of Mozilla will give a presentation about Crontabber to the SFPUG.  The presentation will be broadcast on AirMozilla.

SFPUG June: Crontabber manages ALL the tasks








Sunday, May 04, 2014

Crouching Argparse, Hidden Configman

I've discovered that people that persist in being programmers over age fifty do not die.  Wrapped in blankets woven from their own abstractions, they're immune to the forces of the outside world. This is the first posting in a series about a pet hacking project of mine so deep in abstractions that not even light can escape.

I've written about Configman several times over the last couple of years as it applies to the Mozilla Socorro Crash Stats project.  It is unified configuration.  Configman strives to wrap all the different ways that configuration information can be injected into a program.  In doing so, it handily passes the event threshold and becomes a configuration manager, a dependency injection framework, a dessert topping and a floor wax.

In my experimental branch of Configman, I've finally added support for argparse.  That's the canonical Python module for parsing the command line into key/value pairs, presumably as configuration.  It includes its own data definition language in the form of calls to a function called add_argument.  Through this method, you define what information you'll accept from the command line.

argparse only deals with command lines.  It won't help you with environment variables, ini files, json files, etc.  There are other libraries that handle those things.  Unfortunately, they don't integrate at all with argparse and may include their own data definition system or none at all.

Integrating Configman with argparse was tough.  argparse doesn't lend itself to being extended in the manner that I want.  Configman employs argparse but resorts to deception to get the work done.  Take a look at this classic first example from the argparse documentation.

from configman import ArgumentParser

parser = ArgumentParser(description='Process some integers.')
parser.add_argument('integers', metavar='N', type=int, nargs='+',
                    help='an integer for the accumulator')
parser.add_argument('--sum', dest='accumulate', action='store_const',
                    const=sum, default=max,
                    help='sum the integers (default: find the max)')

args = parser.parse_args()
print(args.accumulate(args.integers))

Instead of importing argparse from its own module, I import it from Configman.  That just means that we're going to use my subclass of the argparse parser class.  Otherwise it looks, acts and tastes just like argparse: I don't emulate it or try to reimplement anything that it does, I use it to do what it does best.  Only at the command line, running the 'help' option, is the inner Configman revealed.

    $ ./x1.py 0 0 --help
    usage: x1.py [-h] [--sum] [--admin.print_conf ADMIN.PRINT_CONF]
                 [--admin.dump_conf ADMIN.DUMP_CONF] [--admin.strict]
                 [--admin.conf ADMIN.CONF]
                 N [N ...]

    Process some integers.

    positional arguments:
      N                     an integer for the accumulator

    optional arguments:
      -h, --help            show this help message and exit
      --sum                 sum the integers (default: find the max)
      --admin.print_conf ADMIN.PRINT_CONF
                            write current config to stdout (json, py, ini, conf)
      --admin.dump_conf ADMIN.DUMP_CONF
                            a file system pathname for new config file (types:
                            json, py, ini, conf)
      --admin.strict        mismatched options generate exceptions rather than
                            just warnings
      --admin.conf ADMIN.CONF
                            the pathname of the config file (path/filename)

There's a bunch of options with "admin" in them.  Suddenly, argparse supports all the different configuration libraries that Configman understands: that brings a rainbow of configuration files to the argparse world.  While this little toy program hardly needs them, wouldn't it be nice to have a complete system of "ini" or "json" files with no more work than your original argparse argument definitions? 

using argparse through Configman means getting ConfigObj for free

Let's make our example write out its own ini file:

    $ ./x1.py --admin.dump_conf=x1.ini
    $ cat x1.ini
    # sum the integers (default: find the max)
    #accumulate=max
    # an integer for the accumulator
    #integers=

Then we'll edit that file and make it automatically use the sum function instead of the max function.  Uncomment the "accumulate" line and replace the "max" with "sum".  Configman automatically loads an ini file that has the same base name as the program file.  From that point on, invoking the program means loading the ini file.  That means the "--sum" switch is no longer necessary on the command line.  Rather not have a secret automatically loaded config file? Give it a different name.

    $ ./x1.py 1 2 3
    6
    $ ./x1.py 4 5 6
    15

I can even make the integer arguments get loaded from the ini file.  Revert the "sum" line change and instead change the "integers" line to be a list of numbers of your own choice.

    $ cat x1.ini
    # sum the integers (default: find the max)
    #accumulate=max
    # an integer for the accumulator
    integers=1 2 3 4 5 6
    $ ./x1.py
    6
    $ ./x1.py --sum
    21

By the way, making argparse not have a complete conniption fit over the missing command line arguments was quite the engineering effort.  I didn't change it, I fooled it into thinking that the command line arguments are there.


Ini files are supported in Configman by ConfigObj.  Want json files instead of ini files?  Configman figures out what you want by the file extension and searches for an appropriate handler.  Specify that you want a "py" file and Configman will write a Python module of values.  Maybe I'll write an XML reader/writer next time I'm depressed.

Configman does environment variables, too:
    $ export accumulate=sum
    $ ./x1.py 1 2 3
    6
    $ ./x1.py 1 2 3 4 5 6
    21

There is a hierarchy to all this.  Think of it as layers: at the bottom you have the defaults expressed or implied by the arguments defined for argparse.  The next layer up is the environment.  Anything that appears in the environment will override the defaults.  The next layer up is the config file.  Values found there will override both the defaults and the environment.  Finally, the arguments supplied on the command line override everything else.

This hierarchy is configurable; you can make it any order that you want.  In fact, you can put anything that conforms to the collections.Mapping api into that hierarchy.  However, for this example, as a drop-in augmentation of argparse, the api to adjust the "values source list" is not exposed.
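
The layering can be sketched with the standard library's collections.ChainMap, which looks a key up in each mapping in turn and returns the first hit. This is not how Configman is implemented; the function name and the sample dictionaries are made up to mirror the x1.py example.

```python
from collections import ChainMap


def layer(command_line, config_file, environment, defaults):
    """Stack the value sources so that earlier ones override later ones."""
    return ChainMap(command_line, config_file, environment, defaults)


# hypothetical values in the spirit of the x1.py example
settings = layer(
    command_line={"integers": "1 2 3"},
    config_file={"integers": "1 2 3 4 5 6"},
    environment={"accumulate": "sum"},
    defaults={"accumulate": "max", "integers": ""},
)

# the command line wins over the config file
print(settings["integers"])
# the environment wins over the defaults
print(settings["accumulate"])
```

Reordering the arguments to ChainMap reorders the hierarchy, and any mapping-like object can take a place in the stack.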

In the next installment, I'll show a more interesting example where I play around with the type in the definition of the argparse arguments.  By putting a function there that will dynamically load a class, we suddenly have a poor man's dependency injection framework.  That idea is used extensively in Mozilla's Socorro to allow us to switch out storage schemes on the fly.
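
As a rough preview of that idea, here is a sketch using plain argparse with no Configman involved. The function name `class_from_name` and the "--container" option are made up for the illustration; argparse simply calls whatever is given as type on the string from the command line.

```python
import argparse
import importlib


def class_from_name(dotted_name):
    """Turn a string like 'collections.OrderedDict' into the class itself."""
    module_name, _, class_name = dotted_name.rpartition(".")
    try:
        module = importlib.import_module(module_name)
        return getattr(module, class_name)
    except (ImportError, AttributeError, ValueError) as error:
        # argparse turns this into a normal usage error message
        raise argparse.ArgumentTypeError(
            "cannot load %r: %s" % (dotted_name, error)
        )


def build_parser():
    parser = argparse.ArgumentParser(
        description="Pick a container class at run time."
    )
    parser.add_argument(
        "--container",
        type=class_from_name,
        # a string default is passed through the type function, too
        default="collections.OrderedDict",
        help="dotted name of the class to instantiate",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    # the program receives a class, not a string, and never imports it itself
    print(args.container())
```

The program body depends only on the interface of whatever class arrives, so swapping implementations becomes a matter of configuration.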

If you want to play around with this, you can pip install Configman.  However, what I've talked about here today with argparse is not part of the current release.  You can get this version of configman from my github repo: Configman pretends to be argparse - source from github.  Remember, this branch is not production code.  It is an interesting exercise in wrapping myself in yet another layer of abstraction.

My somewhat outdated previous postings on this topic begin with Configuration, Part 1

Sunday, April 13, 2014

Pegboard Tool Storage - circuit tester & grounding adapter

(this post is part 8 of a longer series.  See Pegboard Tool Storage, Part 1 for the beginning, or the whole series)

These are the most commonly lost tools that I own.  In fact, I finally found the left grounding adapter on the floor behind the shelf in the garage, nowhere near an outlet.  That's the problem: make these tools easy to find.


The solution is to make fake electrical outlets to put on the pegboard, of course.  Now all I have to do is remember to put them away when I'm done with them.  If I can manage to do that, finding them again will be trivial because they'll be in plain sight.


This simple design is available for download at: pegboard outlet storage

Saturday, April 12, 2014

Pegboard Tool Storage - hex bits

(this post is part 7 of a longer series.  See  


I've got lots of hex bits for the electric drill, the electric screw driver and various interchangeable tip tools.  I've been keeping them in a jar on a shelf.  I've gotten sick of dumping them out to search for the one that I want.


This design has given me instant visual access to the whole array of hex bits.  The photo is of only one of them, I've actually made a bunch for all the different types:


This design is available for 3D printing at: Pegboard Hex Bit Holder

Next in this series: Circuit Tester and Grounding Adapter

Wednesday, April 09, 2014

Pegboard Tool Storage - the goat hook

(this post is part 6 of a longer series.  See  

 And why shouldn't pegboard storage systems be whimsical?   This design for a screwdriver holder came as a sudden inspiration while browsing thingiverse.com. I saw the goat head sculpture and decided to scale it down and mix it up with my pegboard blank:



Adding the funnel shaped hole through the top of the head and emerging from the mouth makes for a silly and slightly disturbing tool holder.






This design is available for 3D printing at: http://www.thingiverse.com/thing:293801

Next in this series, the ever exciting Hex Bits

Monday, April 07, 2014

Pegboard Tool Storage - Batteries


(this post is part 5 of a longer series.  See  

I am a heavy user of rechargeable batteries.  There is a trick to the successful use of rechargeables. If you need just a handful of batteries, then you should have twice as many rechargeables as you have spaces that need batteries.  If you need more than a handful, you can get away with about 1.5 times the number of spaces requiring batteries.

The overstock is so that you never have the experience of running out of power and having to wait for batteries to charge.

I keep the battery charger in the breezeway into the old cottage, just beneath the pegboard tool storage.  I keep fully charged batteries on the pegboard so they can just be grabbed as needed.  Batteries in need of a charge go directly into the charger sitting on the shelf below the pegboard.  When I walk by, if I notice that there are fully charged batteries in the charger, I just transfer them up to the pegboard holders.

Here's my design of the pegboard battery station:




Those designs translate into reality as these:



You can download this design at my Pegboard Battery Storage entry on thingiverse.com


Saturday, April 05, 2014

Back Into the Light

The whole Mozilla/Brendan Eich affair has been traumatic from day one.  For me, superimposed over that trauma was the death, just a few days earlier, of my former partner Richard in Montana.

Please, indulge me for a moment as an old man tells a story. 

Richard was diagnosed with an aggressive untreatable cancer just two weeks before he died.  Called from my home in Oregon, I rushed to Montana to say good bye.  I got there in time for him to open his eyes, smile and say my name.  Two days later, Richard died with me, his sister, his adult children, his former wife, and her husband gathered in a circle around him.

I led the funeral, attended by our close family group, a factional extended family and a parade of friends from his life.  I found the tension between the factions to be untenable and I struggled to think of how I could help break down the barriers.

In the eulogy I talked about our unconventional family bound together by love.  I told the story of how Linda, Richard and I lived together as one family supporting each other through graduate school and raising two wonderful kids.  The abiding theme was an encompassing circle of love.

I spoke extemporaneously and I heard myself saying, "We are a diverse group here today with different beliefs and many disagreements.  Today, though, we unite to celebrate the love for our lost brother, cousin, nephew, partner, husband, friend.  With love comes forgiveness: with forgiveness comes grace.  I am an atheist, but I bring this from my deepest childhood memories, I reach out to everyone here regardless of faith or lack thereof, please join me..."  Then somehow, spilling forth from me came the Lord's Prayer.  I had not recited those words in decades.

With that prayer, I felt connected to everyone in the room, and I dare say we all connected and the factionalism melted away. Maybe it didn't, but from my perspective anyway, people mixed and talked more freely than before the eulogy.

I flew from Montana directly to California for Mozilla work week.  In the first hours of being there, the Mozilla CEO trauma started.  I struggled with my own emotions over what was happening with the Mozilla CEO selection.  My gut reaction was to call for resignation, but I kept silent.  It took me a week of thinking of my own words, "We are a diverse group here today with different beliefs and many disagreements. -- With love comes forgiveness: with forgiveness comes grace" before my own attitude snapped into focus.  Sitting in the airport just before boarding a plane home, I spilled my feelings in a blog posting.  That post brought both support and condemnation.  On Brendan's resignation, in a fit of pique, I deleted that blog.

I bring it back now, as I think it is important to me and it was, apparently, important to others.  I support tolerance.  I support forgiveness.  I support grace.

I am a gay employee of the Mozilla Corporation, and I support my company's decisions regarding the selection of CEO. This doesn't mean that I'm entirely comfortable with the selection: not because I think Brendan Eich is a threat, but because of the public relations repercussions.

The CEO of a corporation is the public face of the company. It is easy for the public to conflate the personal beliefs of the person with the mission of the company. For this reason, I see that the selection of Brendan is a public relations disaster. I'm sad that it appears this firestorm was not foreseen. However, the decision is made; we must move on to focus on the real work.

Mozilla's mission is to defend and nurture the free Web. If we're not going to do it, who is? The fervor of indignation regarding our new CEO is a distraction that we do not need. Our energy should be going to support our mission, not to spin the personal beliefs of the CEO. These are difficult times for the Web, with threats from large corporations pushing us into silos and from government overreach. The energy that we expend defending our selection of CEO is energy taken from our real mission.

I have friends who hold political opinions that are antithetical to mine. I do not exclude them from my life; I embrace my friends. I neither support nor understand their beliefs, but that doesn't mean that I throw them away. I cannot condone holding a grudge in perpetuity. To do so would be leaving a wake of enemies behind me.  Instead, I could have them as allies beside me where we do agree.

I do not agree with Brendan's support of Prop 8. However, that particular battle is one that Brendan lost. It's over. I don't know if his opinions have changed nor do I feel that I need to know. Technically, Brendan is a good choice for CEO: we need to be a technically driven company.

Mozilla has a vocal LGBT community. Brendan could not derail us if he wanted to. I don't think that he does want to because he's focused on the real mission: the free Web. He's working with us; I, for one, am willing to set aside my trepidation and work with him, too.

I say to the larger community calling for the ouster of Brendan Eich, “please don't succumb to the knee-jerk reaction.” I did at first, but with some thought, I realize that we need to focus on the future, not exact retribution for the past.

There is no time in life to draw exclusionary circles.  We must find where we do agree and focus on those, for that is the only route that I see to grace.