Monday, October 15, 2012

Bundling Python files into a stand-alone executable


One of the problems with building a medium to large sized program in Python (or similar scripting languages) is distributing it to users. When a Python script grows beyond a couple hundred lines, most programmers prefer to split that single script file into multiple Python modules and packages. For an individual developer, modules and packages are primarily an aid in mental organization, though they also ease navigating around the project. For a large Python program being developed by a team, modules and packages are an important way to communicate the structure and intent of the code.

Unfortunately, distributing a multi-module Python program has a number of problems. First, you must carefully assemble all your program's dependencies in a single directory tree. Second, you need to make a zip or tarball of the directory tree for distribution. Third, you need to instruct your end users on how to unpack the zipped or tarballed program and how to correctly set their PYTHONPATH and which Python file or shell script in the directory tree to invoke to run your program.

Python has long included the distutils module to help developers distribute Python code. Distutils is focused on distributing Python modules and packages for use by other Python developers and is great for its intended purpose; it can also install shell scripts in the standard operating system command directory (such as /usr/local/bin on most UNIX-derived systems). It has a big problem though: Python libraries installed by distutils are made available to all Python code unless special care is taken. If you include any common third-party libraries in your program, you run the risk that your end user may have a different, possibly incompatible version of that library already on their system. You risk breaking other Python programs, and being broken in turn if you share libraries. Windows users have long dealt with DLL Hell, a similar problem where different Windows applications would install incompatible versions of shared libraries.

Today even the computer in your pocket has dozens of gigabytes of storage so modern development has moved away from sharing library code between programs. For Python developers, virtualenv allows you to quickly and easily create separate virtual Python installations on a single computer, each one isolated from the others and from the "real" Python installation. You can install Python modules and packages in one virtualenv without affecting the others. Used along with the pip package manager, it's easy to document and recreate a virtualenv Python environment, which is a boon to Python web developers.

Virtualenv is still overkill for end users, technical or not, who simply want to run your program in order to get their work done. Fortunately, Python quietly added a new feature in 2.6 that makes it possible to bundle up a directory full of Python code into a single executable file. I say "quietly" because Python 2.6 was released in 2008 and I only heard about this feature now in 2012, four years later. (Okay, it's possible I wasn't paying close attention. :-) Typical of Python, the feature isn't pretty but it has a certain elegance to it: the __main__.py file.

How to use a __main__.py file
The Python documentation for the __main__.py file explains its purpose succinctly but barely hints at the possibilities. I'll try to do a better job. Let's start by creating a directory for our Python application named app:
$ mkdir app
Now open your favorite text editor and create the file app/__main__.py. Add the following code to it:
# file app/__main__.py

def main():
  print('The rain in Spain falls mainly in the plain.')

if __name__ == '__main__':
  main()
If you've done some Python programming, you'll recognize the __name__ == '__main__' idiom used to determine whether a Python module is being executed directly rather than imported as a module. When it's executed directly, the example simply calls the main() function, which prints "The rain in Spain falls mainly in the plain." to standard output.

Now let's run this program. Instead of calling __main__.py directly, we can treat the app directory as our Python program:
$ python app
The Python interpreter sees that app is a directory and checks for a __main__.py file inside it. Note that Python only checks the top level of the directory; it doesn't search subdirectories. Since there is a __main__.py directly in app, the interpreter runs it and the output is:
The rain in Spain falls mainly in the plain.

In addition, the Python interpreter adds the directory to the start of sys.path, so all imports check that directory first. By placing all of the modules and packages that our program depends on in the directory, we stay isolated from whatever versions the end user may have installed, and our dependencies don't affect the rest of the end user's system.
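
To see the sys.path behavior for yourself, here is a minimal sketch. It builds a throwaway bundle directory and runs it the same way `python app` does. The names demo_app, bundle_helper, make_demo_bundle and run_directory_bundle are made up for this illustration and are not part of the app example.

```python
import os
import subprocess
import sys


def make_demo_bundle(parent_dir):
    """Create a throwaway bundle whose __main__.py imports a sibling module."""
    bundle_dir = os.path.join(parent_dir, 'demo_app')
    os.mkdir(bundle_dir)
    # bundle_helper.py sits next to __main__.py, so it is found via sys.path[0].
    with open(os.path.join(bundle_dir, 'bundle_helper.py'), 'w') as f:
        f.write("MESSAGE = 'imported from the bundle'\n")
    with open(os.path.join(bundle_dir, '__main__.py'), 'w') as f:
        f.write(
            "import os, sys\n"
            "import bundle_helper\n"
            "print(os.path.basename(sys.path[0]))\n"
            "print(bundle_helper.MESSAGE)\n"
        )
    return bundle_dir


def run_directory_bundle(bundle_dir):
    """Run the equivalent of `python <bundle_dir>` and return its output."""
    return subprocess.check_output([sys.executable, bundle_dir]).decode()
```

Calling run_directory_bundle(make_demo_bundle(some_empty_dir)) should report the bundle directory as the first sys.path entry, followed by the message imported from the sibling module.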

Zip it up
Python has supported loading modules and packages out of a zip file since 2.3. Just as it now looks in a directory for __main__.py, Python will also look in a zip file for __main__.py. Let's zip up the app directory and test this.

Note that the __main__.py file needs to be at the top level in the zip container, not in a subdirectory. This makes creating the zip file a little tricky. We want to recursively zip up everything in our app directory, but not include the app directory itself. (Windows users will need a command line zip program to follow along.)
$ cd app
$ zip -r ../app.zip *
$ cd ..
(Use *.* instead of * on Windows.)

To test that you've zipped things up correctly, run your Python program directly from the zip file:
$ python app.zip
You should see the expected output:
The rain in Spain falls mainly in the plain.

Python will place the zip file first on sys.path just as it does for a directory; all module and package imports will search the zip file first. Be sure to place your modules and packages at the top level of your directory, alongside the __main__.py file.
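
If you don't have a command line zip program handy, the standard zipfile module can build the same archive. This is only a sketch of one way to do it: zip_bundle is a hypothetical helper name, and it assumes the zip file is written outside the directory being zipped.

```python
import os
import zipfile


def zip_bundle(source_dir, zip_path):
    """Zip the contents of source_dir so __main__.py lands at the archive root."""
    if not os.path.isfile(os.path.join(source_dir, '__main__.py')):
        raise ValueError('no __main__.py at the top level of %s' % source_dir)
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, filenames in os.walk(source_dir):
            for filename in filenames:
                full_path = os.path.join(folder, filename)
                # Store each path relative to source_dir, so the directory
                # itself is not part of the names inside the archive.
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return zip_path
```

For the example in this post, zip_bundle('app', 'app.zip') should produce an archive that `python app.zip` can run.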

Load a resource
If you've put all your Python code in the right place using this scheme, everything pretty much just works as you expect it to. But some programs depend on resources aside from Python code, and need to load various data files that come bundled with the program. The easiest way to find and load resources from a program bundle like this is to use the pkg_resources module. The pkg_resources module does a lot of things, but you'll want to look first at the ResourceManager API, which has the most common functions for finding and loading resource files.

Let's add a resource file to our little app and load it using the pkg_resources.resource_string function. Create a subdirectory under app called resources.
$ mkdir app/resources
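pkg_resources looks up resources relative to an importable package, so on Python versions before 3.3 (which lack namespace packages) also create an empty file named app/resources/__init__.py. Then, using your favorite text editor again, create the file app/resources/inFrance.txt and add some text to it: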
But the ants in France are mainly in your pants.
Now edit app/__main__.py so that it looks like this:
# file app/__main__.py

import pkg_resources

def main():
  print('The rain in Spain falls mainly in the plain.')
  print(pkg_resources.resource_string('resources', 'inFrance.txt'))

if __name__ == '__main__':
  main()
You may already have pkg_resources.py installed on your system. If you don't, you'll find it's part of the distribute package. Download the latest version of distribute, unpack the tarball and find pkg_resources.py inside. Copy pkg_resources.py to app/pkg_resources.py. (Even if you already have the pkg_resources module on your system, if you use it in your program, you should add it to your bundle before distributing it to others.)

Now when you run the program:
$ python app
You should see this output:
The rain in Spain falls mainly in the plain.
But the ants in France are mainly in your pants.
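
If you would rather not bundle pkg_resources, the standard library's pkgutil.get_data function (available since Python 2.6) can read a data file from a package, including one inside a zip file. This is a different technique from the one used above, shown only as a sketch. The load_text_resource name is made up for this illustration, and the package directory must contain an __init__.py file for it to work.

```python
import pkgutil


def load_text_resource(package, resource_name, encoding='utf-8'):
    """Return the decoded text of a data file stored inside an importable package."""
    # get_data returns the raw bytes of the file, or None if the package
    # cannot be located or its loader cannot read data files.
    data = pkgutil.get_data(package, resource_name)
    if data is None:
        raise IOError('cannot load %r from package %r' % (resource_name, package))
    return data.decode(encoding)
```

Inside app/__main__.py, load_text_resource('resources', 'inFrance.txt') would then take the place of the pkg_resources.resource_string call.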

Make it executable
Finally, you can turn your zipped program bundle into a stand-alone executable on UNIX-like systems using a couple of commands. Zip up the latest version of the program in the app directory and name it app2.zip.
$ cd app
$ zip -r ../app2.zip *
$ cd ..
Now use a bit of UNIX magic to turn app2.zip into an executable.
$ echo '#!/usr/bin/env python' | cat - app2.zip > app2
$ chmod +x app2
The first command inserts a UNIX shebang at the start of the zip file and writes it to a new file called simply app2. The zip file format is designed to allow a small executable program to be inserted at the front (that's how self-extracting zip files are created), so this is kosher and doesn't corrupt the zip file. The second command sets the executable bits on app2.

Now you can simply run app2 like any executable.
$ ./app2
And you should see the expected output.
The rain in Spain falls mainly in the plain.
But the ants in France are mainly in your pants.
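
The echo, cat and chmod steps above can also be sketched in Python, which is handy if you want to script your build. The make_executable_bundle name and its interpreter parameter are assumptions made for this illustration.

```python
import os
import stat


def make_executable_bundle(zip_path, target_path,
                           interpreter='/usr/bin/env python'):
    """Write a shebang line followed by the zip bytes, then mark it executable."""
    with open(zip_path, 'rb') as source, open(target_path, 'wb') as target:
        # The shebang goes first; the zip data follows unchanged.
        target.write(('#!' + interpreter + '\n').encode('ascii'))
        target.write(source.read())
    mode = os.stat(target_path).st_mode
    # Roughly what chmod +x does: add execute permission for user, group, other.
    os.chmod(target_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return target_path
```

For the example in this post, make_executable_bundle('app2.zip', 'app2') should leave you with a file you can run as ./app2 on a UNIX-like system.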

Wednesday, October 10, 2012

BSD-style license for code

We have published a fair amount of source code on the Able Pear Software blog over the past few years, but we neglected to specify any kind of software license to go along with it. We occasionally get asked about an open source license (the UrlEncoding category for NSDictionary is particularly popular).

All the code we publish on the Able Pear Blog is free for you to use under a BSD-style license, as is our Autoindigestion tool. We like the simplicity of unrestrictive open source licenses like the BSD and MIT licenses. If you decide to incorporate some of our code into your project or product, we'd love to know. Modern software is so complex that it's simply not practical to write everything from scratch. Many parts of OS X and iOS have their roots in the FreeBSD project, which itself is a descendant of BSD UNIX. Even Microsoft has incorporated some BSD code in Windows.

Here is the Able Pear Software blog open source code license. Adjust the copyright year to match the date on the blog post. (Please note that the full text of the blog is not included in this license, only the source code.)
Copyright (c) 2012, Able Pear Software Inc.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

- Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

- Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Monday, October 8, 2012

Smart App Banners

The new version of Mobile Safari that ships with iOS 6 has a great new feature for app publishers: Smart App Banners.
When a user visits your site in Mobile Safari on iOS, you can now add a pop-up banner to promote your iOS app, which includes a direct link to the app in the App Store and optionally your iTunes affiliate information. David Smith has a great overview on his blog, and you can find all the details of Smart App Banners in Apple's Safari Web Content Guide on the Apple developer site.

Though I doubt that this will do away with all those annoying full-page "get our iPad app" pop-ups, I hope that at least some sites will start to use this instead. Thanks, Apple, for providing something to help sites promote their apps in a less annoying way.