Python Packaging Overview

Thilina Madumal
3 min read · Jun 8, 2018

At the time of writing, I have been coding in Python for five straight months. I came from a Java/JavaScript background, and I was frustrated with the Python packaging system and complained about it constantly. That was mainly because I did not understand the simple but effective concepts behind Python packaging, and because I was still thinking in terms of Java/JavaScript. Now I vouch for Python, Python packaging, and the Pythonic way.

Without further ado, let’s get to our topic. Python packaging is based entirely on the directories you have within a Python project. Python setuptools is used to distribute a project and its contents as one complete package or as a set of packages.

__init__.py

Each directory within a project acts as a package or a sub-package, depending on the directory structure. As a best practice, each package or sub-package should have its own __init__.py file.

__init__.py should import the components that are meant to be exported from that package. Here we can also rename components, so that whoever uses the package sees the new name.

Python access control is governed by convention

For good or bad, there is no real access control mechanism in Python. We cannot prevent an outsider from using components that we do not intend them to use. That is to say, access control in Python is governed by convention. It is the user’s responsibility to depend only on the components that have been exposed via __init__.py; going beyond that is at the user’s own risk. The following is a simple example of an __init__.py file.

from .timeseries import Timeseries as TimeseriesFactory
from .curw_schema import Data, Run, RunView
from .exceptions import DataLayerError, InconsistencyError

Directory structure of a Python project

The best way to see the community’s packaging best practices is to create a virtualenv (say venv), pip install some well-known Python packages (e.g. numpy, pandas), and then look at the directory structure of those packages under the virtualenv’s site-packages directory (typically venv/lib/pythonX.Y/site-packages on Linux and macOS, or venv\Lib\site-packages on Windows).
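
If you would rather not hunt for that directory by hand, the following sketch asks the running interpreter where it keeps third-party packages and lists the top-level packages found there. It uses only the standard library; the helper name top_level_packages is made up for this example, and the output depends entirely on what is installed in your environment.

```python
import sysconfig
from pathlib import Path

# "purelib" is where pure-Python third-party packages are installed for the
# running interpreter (inside the virtualenv when one is active).
site_packages = Path(sysconfig.get_paths()["purelib"])
print(site_packages)


def top_level_packages(directory):
    """Return the names of sub-directories that contain an __init__.py."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_dir() and (entry / "__init__.py").is_file()
    )


for name in top_level_packages(site_packages):
    print(name)
```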

Encapsulate all the packages in a single package.

In Python this is a bit tricky. You need another directory, in addition to the project directory, to hold all the packages together. If you want to export your entire project as a single package, you need to follow a structure similar to the one shown in Figure 1.

Figure 1

Under the project directory (i.e. data_layer) you need another directory (also named data_layer) that encapsulates all the other packages of the project. When you then install this package in another project, you can access its components in the following manner:

from data_layer import base
from data_layer.timeseries import TimeseriesFactory
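
As a rough illustration, the sketch below recreates such a nested layout in a temporary directory and runs the two imports. The exact contents of the sub-packages are assumptions made for this example, and putting the project root on sys.path is only a stand-in for installing the project with pip.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Outer directory = project root; inner directory of the same name = the package.
project = Path(tempfile.mkdtemp()) / "data_layer"
package = project / "data_layer"

# Assumed layout: two sub-packages, each with its own __init__.py.
for sub in ("base", "timeseries"):
    (package / sub).mkdir(parents=True)

(package / "__init__.py").write_text("")
(package / "base" / "__init__.py").write_text("")
(package / "timeseries" / "__init__.py").write_text(
    "class TimeseriesFactory:\n"
    "    pass\n"
)

# Putting the project root on sys.path stands in for `pip install`.
sys.path.insert(0, str(project))
importlib.invalidate_caches()

from data_layer import base
from data_layer.timeseries import TimeseriesFactory

# Everything is reachable only under the single top-level name `data_layer`.
print(base.__name__)
print(TimeseriesFactory.__module__)
```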

Exporting as a collection of independent packages

Figure 2

If you have just one directory (that is, only the project directory, say data_layer; see Figure 2), then once the project is installed in another project you would see independent packages instead of a single entity. This kind of packaging is typically discouraged in the community. However, it is up to you.

With this layout, the imports would look as follows:

import base
from timeseries import TimeseriesFactory
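
The same experiment with a flat layout can be sketched as follows. Again the sub-package contents are assumptions, and sys.path stands in for a real installation. Note how each sub-directory becomes its own top-level name, which is one likely reason the layout is discouraged: generic names such as base can easily collide with other installed packages.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Only one directory level: the project root directly contains the packages.
project = Path(tempfile.mkdtemp()) / "data_layer"

for sub in ("base", "timeseries"):
    (project / sub).mkdir(parents=True)

(project / "base" / "__init__.py").write_text("")
(project / "timeseries" / "__init__.py").write_text(
    "class TimeseriesFactory:\n"
    "    pass\n"
)

# Putting the project root on sys.path stands in for `pip install`.
sys.path.insert(0, str(project))
importlib.invalidate_caches()

import base
from timeseries import TimeseriesFactory

# There is no common `data_layer` prefix: each package stands alone.
print(base.__name__)
print(TimeseriesFactory.__module__)
```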

Excluding packages

In your project you might have packages that you do not want to distribute. Python setuptools comes to the rescue. When you call the setup function in setup.py, you can either explicitly list the packages to be exported or instruct setuptools to find them for you while excluding some, as follows.

from setuptools import setup, find_packages

setup(
    name='data_layer',
    version='0.0.1',
    packages=find_packages(exclude=['triggering_api'])
)
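
To get a feel for how name-based exclusion behaves, here is a simplified stand-in for package discovery written with the standard library only. It is not setuptools' implementation, just a sketch under the assumption that every directory with an __init__.py is a package and that exclude entries are matched against full dotted names, with * as a wildcard (which is how the setuptools documentation describes them). The handlers sub-package is hypothetical.

```python
import tempfile
from fnmatch import fnmatchcase
from pathlib import Path


def discover_packages(where, exclude=(), prefix=""):
    """Simplified stand-in for package discovery (not setuptools' code)."""
    found = []
    for entry in sorted(Path(where).iterdir()):
        if not (entry.is_dir() and (entry / "__init__.py").is_file()):
            continue
        name = prefix + entry.name
        if not any(fnmatchcase(name, pattern) for pattern in exclude):
            found.append(name)
        # Sub-packages are matched against the patterns on their own.
        found.extend(discover_packages(entry, exclude, name + "."))
    return found


# Throwaway project tree; `triggering_api.handlers` is a hypothetical sub-package.
root = Path(tempfile.mkdtemp())
for package_dir in (
    "data_layer",
    "data_layer/base",
    "data_layer/timeseries",
    "triggering_api",
    "triggering_api/handlers",
):
    (root / package_dir).mkdir(parents=True, exist_ok=True)
    (root / package_dir / "__init__.py").write_text("")

everything = discover_packages(root)
name_only = discover_packages(root, exclude=["triggering_api"])
with_wildcard = discover_packages(
    root, exclude=["triggering_api", "triggering_api.*"]
)

print(everything)
print(name_only)
print(with_wildcard)
```

In this sketch, excluding only the bare name leaves triggering_api.handlers in the result, while adding the triggering_api.* pattern removes the sub-packages as well. If the package you exclude has sub-packages, it is worth checking what find_packages actually returns for your project before publishing.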

Of course, there is a lot more to Python packaging and its best practices. However, in my opinion, the above explanations are enough to get going with Python in the right direction (i.e. the Pythonic way). Happy packaging :)
