Tuesday, May 21, 2013

Advanced Argument Parsing in Python: Reading Args from a File

Python is a great environment for writing small scripts that make life a little easier.  In no time these scripts can grow into a full-fledged tool used by a lot of people, and then you have to keep things organized in your script/tool.  One of the most important parts of a tool's interface is how it handles command line arguments.  Python's standard library provides the feature-rich argparse module, which gives your tool a standardized way to manage command line arguments.  Some of the key features of the argparse module are:
  • Parse positional (required) arguments and optional arguments
  • Generate a nicely formatted -h (help) or usage output and customize it
  • Parse not only from sys.argv but also from a list of strings built in your script
  • Read arguments from one or multiple files
  • Custom argument types, for example reading csv style arguments or dictionary style arguments
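
As a rough illustration of a few of these features, here is a minimal sketch that parses a positional and an optional argument from a list of strings instead of sys.argv, and uses a custom type for csv style values.  The names csv_list, build_parser, --tags and data.txt are made up for this example and are not part of argparse.

```python
import argparse


def csv_list(text):
    """Hypothetical custom type: turn 'a,b,c' into ['a', 'b', 'c']."""
    return [item.strip() for item in text.split(",") if item.strip()]


def build_parser():
    # Illustrative parser; the argument names are assumptions for this sketch.
    parser = argparse.ArgumentParser(description="Demo tool (illustrative only)")
    parser.add_argument("input", help="positional (required) argument")
    parser.add_argument("--verbose", type=int, default=0, help="optional argument")
    parser.add_argument("--tags", type=csv_list, default=[], help="comma separated list")
    return parser


# Parse from a list of strings instead of sys.argv.
args = build_parser().parse_args(["data.txt", "--verbose", "2", "--tags", "a, b,c"])
print(args.input, args.verbose, args.tags)  # data.txt 2 ['a', 'b', 'c']
```

Calling build_parser().print_help() would print the generated usage and help text for the same parser.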

There are many more features not mentioned here, and you can always look at the argparse documentation to find out more.  In this blog post I am going to cover one feature that is really useful - reading arguments from a file.  In my experience, as a script/tool keeps growing we keep adding more and more arguments, until the final command line becomes bloated and spans several lines of the terminal.

Enabling Argparse to read from file

The argparse module provides built-in support for reading arguments from a file.  To enable this feature, pass the fromfile_prefix_chars argument with a valid character when the ArgumentParser object is created, as shown in the example below:
parser = argparse.ArgumentParser(fromfile_prefix_chars='@')
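This lets the parser object treat any argument starting with the '@' character as a file name and read arguments from that file.  For example, a default.args file has the following content (by default each line is read as a single argument, so an option and its value are joined with '='):
--verbose=1
--debug=0
--enable-all
--out=output.txt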
This default.args file can be passed to your tool with the '@' prefix, and the ArgumentParser object will read all the arguments in the file as if they were passed on the command line:
$ ./my_tool.py @default.args
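Running the above command sets verbose to 1 and debug to 0 in the namespace returned by parser.parse_args() (for example, args.verbose and args.debug).  Internally, the ArgumentParser object reads the file and substitutes its arguments in place of the @default.args entry, so arguments from the file and arguments from the command line end up in one list that is parsed in order.  One shortcoming of this feature is that, by default, each line of the file is treated as exactly one argument, so an option and its value must either be joined as --verbose=1 or be placed on separate lines (you can easily overcome this limitation by extending the ArgumentParser class, just keep reading..).  Because everything is parsed as one list and a later value replaces an earlier one for ordinary options, you can override a value from the file on the command line.  For example,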
$ ./my_tool.py @default.args --verbose 2
The above command sets verbose to 2 in the parsed arguments instead of the 1 that was specified in default.args.  This lets users play with the arguments without changing the default argument files.
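
The behavior described above can be checked with a small self-contained sketch.  It writes a temporary default.args file (one argument per line, in --name=value form) and parses it once on its own and once with an override.  The option definitions (type=int, store_true) are assumptions about what my_tool.py might declare; they are not taken from a real tool.

```python
import argparse
import os
import tempfile

# Assumed option definitions for the hypothetical tool.
parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
parser.add_argument("--verbose", type=int, default=0)
parser.add_argument("--debug", type=int, default=0)
parser.add_argument("--enable-all", action="store_true")
parser.add_argument("--out")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "default.args")
    with open(path, "w") as f:
        # One argument per line, as the default file reader expects.
        f.write("--verbose=1\n--debug=0\n--enable-all\n--out=output.txt\n")

    # Equivalent to: ./my_tool.py @default.args
    from_file = parser.parse_args(["@" + path])
    # Equivalent to: ./my_tool.py @default.args --verbose 2
    overridden = parser.parse_args(["@" + path, "--verbose", "2"])

print(from_file.verbose, from_file.enable_all, from_file.out)  # 1 True output.txt
print(overridden.verbose)  # 2
```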

Reading Multiple Files

Because all arguments are treated equally (whether they come from the command line or from a file), you can pass multiple file arguments or create a hierarchy by adding a file argument within a file.  As an example, let's create a debug.args file with the following content:
--debug=1
--loglevel=3
Now you can pass two files on the command line like below:
$ ./my_tool.py @default.args @debug.args
This reads arguments from both default.args and debug.args, and because debug.args comes last, its value overrides debug to be 1.  Another option is to include @default.args at the top of debug.args as shown below:
@default.args
--debug=1
--loglevel=3
And now you have to pass only one file on the command line, like this:
$ ./my_tool.py @debug.args
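
Both approaches can be tried out with the following sketch.  It uses temporary files and assumed option definitions, and the file name nested.args is made up here so that both variants can live side by side.  Note that the nested reference is written with a full path: a relative name such as @default.args is opened relative to the current working directory, not relative to the file that mentions it.

```python
import argparse
import os
import tempfile

# Assumed option definitions for the hypothetical tool.
parser = argparse.ArgumentParser(fromfile_prefix_chars="@")
parser.add_argument("--verbose", type=int, default=0)
parser.add_argument("--debug", type=int, default=0)
parser.add_argument("--loglevel", type=int, default=0)

with tempfile.TemporaryDirectory() as tmp:
    default_path = os.path.join(tmp, "default.args")
    debug_path = os.path.join(tmp, "debug.args")
    nested_path = os.path.join(tmp, "nested.args")  # hypothetical file name

    with open(default_path, "w") as f:
        f.write("--verbose=1\n--debug=0\n")
    with open(debug_path, "w") as f:
        f.write("--debug=1\n--loglevel=3\n")
    with open(nested_path, "w") as f:
        # The first line pulls in default.args; later lines override it.
        f.write("@" + default_path + "\n--debug=1\n--loglevel=3\n")

    # Two files on the command line: the later file wins for --debug.
    two_files = parser.parse_args(["@" + default_path, "@" + debug_path])
    # One file that includes the other.
    nested = parser.parse_args(["@" + nested_path])

print(two_files.verbose, two_files.debug, two_files.loglevel)  # 1 1 3
print(nested.verbose, nested.debug, nested.loglevel)  # 1 1 3
```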

Supporting Comments in a File

Now that you have created multiple files, you'll find yourself adding and removing arguments from them while you play with your tool.  It would be nice if argparse natively supported comments, but it does the next best thing: it lets users easily extend the ArgumentParser class to support comments and many more features.

The ArgumentParser class implements a convert_arg_line_to_args(..) method that contains the default implementation of ArgumentParser's file reading feature.  The default implementation is very simple: it returns the whole line as a single argument, which is why each line may hold only one argument.  In order to support comments in an args file, we can create our own parser class that extends ArgumentParser and overrides the convert_arg_line_to_args method.
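import argparse

class CustomArgumentParser(argparse.ArgumentParser):
    def __init__(self, *args, **kwargs):
        super(CustomArgumentParser, self).__init__(*args, **kwargs)

    def convert_arg_line_to_args(self, line):
        # Called once per line of the args file; yields one argument at a time.
        for arg in line.split():
            if not arg.strip():
                continue
            # An element starting with '#' begins a comment: skip the rest of the line.
            if arg[0] == '#':
                break
            yield arg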
In the above code, the CustomArgumentParser class overrides the convert_arg_line_to_args method.  This method is called for each line in a file.  The example treats everything after a '#' as a comment, just like the Python language.  To detect comments, convert_arg_line_to_args checks the first character of each whitespace-separated element.  If it is '#', the rest of the elements on the line are ignored, hence treated as a comment.  Because the line is split on whitespace, a line may now also hold several arguments, such as --verbose 1.
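
A quick way to check this is the sketch below.  It repeats a condensed version of the class so that it runs on its own, then parses a temporary file containing full-line comments, trailing comments, a blank line and several arguments on one line.  The file name commented.args and the option definitions are assumptions made for illustration.

```python
import argparse
import os
import tempfile


class CustomArgumentParser(argparse.ArgumentParser):
    def convert_arg_line_to_args(self, line):
        # Split on whitespace and stop at the first element starting with '#'.
        for arg in line.split():
            if arg.startswith("#"):
                break
            yield arg


parser = CustomArgumentParser(fromfile_prefix_chars="@")
parser.add_argument("--verbose", type=int, default=0)
parser.add_argument("--debug", type=int, default=0)
parser.add_argument("--out")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "commented.args")  # hypothetical file name
    with open(path, "w") as f:
        f.write("# defaults for local runs\n")
        f.write("--verbose 1   # option and value on one line\n")
        f.write("\n")
        f.write("--debug 0 --out output.txt\n")
    args = parser.parse_args(["@" + path])

print(args.verbose, args.debug, args.out)  # 1 0 output.txt
```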

Now create your parser object from CustomArgumentParser and your tool will have one more feature - comments in args files.  Comment support is just one example; you can also extend the class to support multi-line comments, quoted string arguments that contain spaces, etc.
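
As one possible direction for string based arguments, here is a sketch that swaps the whitespace split for the standard library's shlex.split with comments=True.  This is an alternative technique, not the approach shown above: shlex follows shell-like quoting rules, so a quoted value may contain spaces, and a '#' outside quotes starts a comment even in the middle of a word.  The class name ShlexArgumentParser and the --title option are made up for this example.

```python
import argparse
import shlex


class ShlexArgumentParser(argparse.ArgumentParser):
    """Hypothetical variant: shell-like quoting and '#' comments in args files."""

    def convert_arg_line_to_args(self, arg_line):
        # shlex.split honors quotes; comments=True drops everything after an unquoted '#'.
        return shlex.split(arg_line, comments=True)


parser = ShlexArgumentParser(fromfile_prefix_chars="@")
parser.add_argument("--title")

# Call the hook directly to see how a single line of a file would be tokenized.
tokens = parser.convert_arg_line_to_args('--title "nightly build"  # quoted value with a space')
comment_only = parser.convert_arg_line_to_args("# nothing but a comment")
args = parser.parse_args(tokens)

print(tokens)        # ['--title', 'nightly build']
print(comment_only)  # []
print(args.title)    # nightly build
```

One trade-off of this choice is that a line with an unbalanced quote makes shlex.split raise ValueError, so a real tool may want to catch that and report the offending line.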

Python's standard library modules have lots of little gems like this that allow users to extend basic functionality and create robust and powerful tools.