Python has been around for a long time, yet it still lacks a de facto standard for project management and build tooling. As a result, Python project structures and build methods vary widely, perhaps reflecting Python's spirit of freedom.
Contrast this with Java, which evolved from manual builds to the semi-automated Ant, and then to Maven, which is now essentially the de facto standard. Along the way, Maven has faced challenges from other tools such as Gradle (promoted primarily for Android projects), SBT (mainly for Scala projects), Ant+Ivy, and Buildr, but none has seriously shaken Maven's dominance. Moreover, these other tools largely follow Maven's directory layout.
Back to Python: tools like pip, pipenv, and conda have emerged as package managers, but they don’t impose any conventions on project directory structure.
Regarding builds, many still use the traditional Makefile approach, supplemented with setup.py and build.py scripts for installation and building via code. For project directory structure, some have created project templates and then built tools to apply those templates.
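As a minimal illustration of the script-driven approach, a classic setup.py might look like the following. This is a hypothetical sketch: the package name and dependency are placeholders, not taken from any particular project.

```python
# setup.py -- a minimal, classic build/install script (illustrative sketch)
from setuptools import setup, find_packages

setup(
    name="sample",                  # placeholder package name
    version="0.1.0",
    packages=find_packages(),       # auto-discover packages in the project
    install_requires=["requests"],  # placeholder runtime dependency
)
```

With such a file in place, `python setup.py sdist` builds a source distribution and `pip install .` installs the package.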
Below, we’ll briefly examine the usage of four tools:
- CookieCutter
- PyScaffold
- PyBuilder
- Poetry
CookieCutter – A Classic Python Project Directory Structure
```bash
$ pip install cookiecutter
$ cookiecutter gh:audreyr/cookiecutter-pypackage
# Uses the GitHub template audreyr/cookiecutter-pypackage, asks a series of questions, and generates a Python project
......
project_name [Python Boilerplate]: sample
......
```
The final project template generated by cookiecutter looks like this:
```bash
$ tree sample
sample
├── AUTHORS.rst
├── CONTRIBUTING.rst
├── HISTORY.rst
├── LICENSE
├── MANIFEST.in
├── Makefile
├── README.rst
├── docs
│   ├── Makefile
│   ├── authors.rst
│   ├── conf.py
│   ├── contributing.rst
│   ├── history.rst
│   ├── index.rst
│   ├── installation.rst
│   ├── make.bat
│   ├── readme.rst
│   └── usage.rst
├── requirements_dev.txt
├── sample
│   ├── __init__.py
│   ├── cli.py
│   └── sample.py
├── setup.cfg
├── setup.py
├── tests
│   ├── __init__.py
│   └── test_sample.py
└── tox.ini

3 directories, 26 files
```
This represents the main framework of a currently popular directory structure. The core elements are:
```bash
$ tree sample
sample
├── Makefile
├── README.rst
├── docs
│   └── index.rst
├── requirements.txt
├── sample
│   ├── __init__.py
│   └── sample.py
├── setup.cfg
├── setup.py
└── tests
    ├── __init__.py
    └── test_sample.py
```
The project root sample contains a subdirectory also named sample for Python source files, a tests directory for test files, a docs directory for documentation, a README.rst, and other files for building: setup.py, setup.cfg, and Makefile.
This is indeed a classic Python project structure. The build process then uses the make command. Entering make shows the commands defined in the Makefile:
```bash
$ make
clean                remove all build, test, coverage and Python artifacts
clean-build          remove build artifacts
clean-pyc            remove Python file artifacts
clean-test           remove test and coverage artifacts
lint                 check style
test                 run tests quickly with the default Python
test-all             run tests on every Python version with tox
coverage             check code coverage quickly with the default Python
docs                 generate Sphinx HTML documentation, including API docs
servedocs            compile the docs watching for changes
release              package and upload a release
dist                 builds source and wheel package
install              install the package to the active Python's site-packages
```
To use the above build process, you need to install packages such as tox, wheel, coverage, sphinx, and flake8, all of which can be installed via pip. Then you can run make test, make coverage, make docs, make dist, and so on. The make docs command can generate very nice web documentation.
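Under the hood, these are ordinary Makefile targets. A few of them might look roughly like the following simplified sketch (not the template's exact contents; the `sample` package name is a placeholder):

```makefile
test:  ## run tests quickly with the default Python
	pytest

coverage:  ## check code coverage quickly with the default Python
	coverage run --source sample -m pytest
	coverage report -m

docs:  ## generate Sphinx HTML documentation
	$(MAKE) -C docs html
```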
PyScaffold – Creating a Project
As the name suggests, PyScaffold is a tool for creating Python project scaffolding.
Installation and usage:
```bash
$ pip install pyscaffold
$ putup sample
```
This creates a Python project. The directory structure is similar to the cookiecutter template mentioned earlier, except it places source files in a src directory rather than a sample directory.
```bash
$ tree sample
sample
├── AUTHORS.rst
├── CHANGELOG.rst
├── CONTRIBUTING.rst
├── LICENSE.txt
├── README.rst
├── docs
│   ├── Makefile
│   ├── _static
│   ├── authors.rst
│   ├── changelog.rst
│   ├── conf.py
│   ├── contributing.rst
│   ├── index.rst
│   ├── license.rst
│   ├── readme.rst
│   └── requirements.txt
├── pyproject.toml
├── setup.cfg
├── setup.py
├── src
│   └── sample
│       ├── __init__.py
│       └── skeleton.py
├── tests
│   ├── conftest.py
│   └── test_skeleton.py
└── tox.ini
```
The entire project build now uses the tox tool. Tox is an automation tool for testing and building. It creates Python virtual environments during the build process, ensuring a clean environment for testing and building.
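A tox.ini in this style might look roughly like the following simplified sketch (the file PyScaffold actually generates contains more settings):

```ini
[tox]
envlist = default

[testenv]
description = Invoke pytest to run automated tests
deps = pytest
commands = pytest {posargs}

[testenv:docs]
description = Invoke sphinx-build to build the docs
deps = sphinx
commands = sphinx-build -b html docs docs/_build/html
```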
tox -av displays all tasks defined in tox.ini:
```bash
$ tox -av
default environments:
default   -> Invoke pytest to run automated tests

additional environments:
build     -> Build the package in isolation according to PEP517, see https://github.com/pypa/build
clean     -> Remove old distribution files and temporary build artifacts (./build and ./dist)
docs      -> Invoke sphinx-build to build the docs
doctests  -> Invoke sphinx-build to run doctests
linkcheck -> Check for broken links in the documentation
publish   -> Publish the package you have been developing to a package index server. By default, it
             uses testpypi. If you really want to publish your package to be publicly accessible in
             PyPI, use the `-- --repository pypi` option.
```
Execute commands using tox -e build, tox -e docs, etc.
In my experience, each tox step seems relatively slow, likely due to the time needed to create virtual environments.
PyBuilder
Let's also look at another build tool, PyBuilder, whose directory structure is very close to Maven's:
```bash
$ pip install pybuilder
$ mkdir sample && cd sample   # Project directory needs to be created manually
$ pyb --start-project         # After answering some questions, creates the necessary directories and files
```
Afterwards, examine its directory structure:
```bash
$ tree
.
├── build.py
├── docs
├── pyproject.toml
├── setup.py
└── src
    ├── main
    │   ├── python
    │   └── scripts
    └── unittest
        └── python
```
The build process still uses the pyb command. Use pyb -h for help, pyb -t to list all tasks. PyBuilder’s tasks are added as plugins, configured in the build.py file.
```bash
$ pyb -t
Tasks found for project "sample":
    analyze - Execute analysis plugins.
              depends on tasks: prepare run_unit_tests
    clean - Cleans the generated output.
    compile_sources - Compiles source files that need compilation.
              depends on tasks: prepare
    coverage - <no description available>
              depends on tasks: verify
    install - Installs the published project.
              depends on tasks: package publish(optional)
    package - Packages the application. Package a python application.
              depends on tasks: compile_sources run_unit_tests(optional)
    prepare - Prepares the project for building. Creates target VEnvs
    print_module_path - Print the module path.
    print_scripts_path - Print the script path.
    publish - Publishes the project.
              depends on tasks: package verify(optional) coverage(optional)
    run_integration_tests - Runs integration tests on the packaged application.
              depends on tasks: package
    run_unit_tests - Runs all unit tests. Runs unit tests based on Python's unittest module
              depends on tasks: compile_sources
    upload - Upload a project to PyPi.
    verify - Verifies the project and possibly integration tests.
              depends on tasks: run_integration_tests(optional)
$ pyb run_unit_tests
```
PyBuilder also creates virtual environments before building or testing. Starting from version 0.12.9, you can skip virtual environment creation with the --no-venvs parameter. With --no-venvs, Python code will execute in the current Python environment where pyb is run, and required dependencies must be installed manually.
Project dependencies are also defined in the build.py file:
```python
@init
def set_properties(project):
    project.depends_on('boto3', '>=1.18.52')
    project.build_depends_on('mock')
```
Then, when executing pyb to create the virtual environment, the above dependencies will be installed, and tests and builds will run within it.
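Putting it together, a small build.py might look like the following sketch. The plugin names follow PyBuilder's common conventions, but treat the exact set of plugins as illustrative rather than required:

```python
# build.py -- a minimal PyBuilder build script (illustrative sketch)
from pybuilder.core import use_plugin, init

use_plugin("python.core")       # basic Python build support
use_plugin("python.unittest")   # run unit tests from src/unittest/python
use_plugin("python.distutils")  # generate setup.py and build packages

name = "sample"
default_task = "publish"

@init
def set_properties(project):
    project.depends_on("boto3", ">=1.18.52")
    project.build_depends_on("mock")
```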
Poetry
Finally, Poetry. This feels like a more mature, actively developed Python build tool with powerful dependency management. poetry add boto3 adds a dependency, and poetry show --tree displays the dependency tree. Let’s see how to install and create a project:
```bash
$ pip install poetry
$ poetry new sample
```
The project it creates is simpler than the ones above:
```bash
$ tree sample
sample
├── README.rst
├── pyproject.toml
├── sample
│   └── __init__.py
└── tests
    ├── __init__.py
    └── test_sample.py
```
If you add the --src parameter to poetry new, the source file directory sample will be placed under src, i.e., sample/src/sample.
poetry init generates a pyproject.toml file in the current directory; directories need to be created manually.
Poetry doesn't concern itself with documentation generation, code style checks, or code coverage. Its project configuration is more centralized: everything lives in the pyproject.toml file. What is TOML? It's a configuration file format: Tom's Obvious, Minimal Language (https://github.com/toml-lang/toml).
pyproject.toml is somewhat similar to NodeJS’s package.json file. For example:
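For reference, the pyproject.toml that poetry new generates looks roughly like this (a sketch; exact contents vary by Poetry version, and the author/version values are placeholders):

```toml
[tool.poetry]
name = "sample"
version = "0.1.0"
description = ""
authors = ["Your Name <you@example.com>"]

[tool.poetry.dependencies]
python = "^3.9"

[tool.poetry.dev-dependencies]
pytest = "^5.2"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
```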
```bash
# Adds a dependency on boto3 to pyproject.toml and installs it
# (add can also install dependencies from a local path or a git repository)
$ poetry add boto3

# Installs the dependencies defined in pyproject.toml into the current Python virtual environment,
# e.g. into <test-venv>/lib/python3.9/site-packages. After installation, the modules are also
# available to test cases.
$ poetry install
```
Other main commands:
```bash
poetry build        # Builds installable *.whl and tar.gz files
poetry shell        # Creates and uses a virtual environment based on dependencies defined in pyproject.toml
poetry run pytest   # Runs test cases using pytest, like tests/test_sample.py
poetry run python -m unittest tests/sample_tests.py        # Runs unittest test cases
poetry export --without-hashes --output requirements.txt   # Exports a requirements.txt file; use --dev to include dev dependencies
poetry export --without-hashes > requirements.txt          # Equivalent, via shell redirection
```
poetry run can execute any system command, but it runs it inside the project's virtual environment. So for a Poetry project to generate documentation or coverage reports, tools such as sphinx, coverage, or flake8 must be installed as (dev) dependencies so that poetry run ... can invoke them.
Create a file my_module.py in the sample directory (same level as pyproject.toml) with the content:
```python
def main():
    print('hello poetry')
```
Then, in pyproject.toml, write:
```toml
[tool.poetry.scripts]
my-script = "sample.my_module:main"
```
Then execute:
```bash
$ poetry run my-script
```
It will output “hello poetry”.
Based on this overview of the four tools, project-structure complexity decreases in the order cookiecutter-pypackage -> PyScaffold -> PyBuilder -> Poetry, and difficulty of use roughly follows the same order.