
Modern Python Part 3: Run a CI Pipeline and Publish Your Package to PyPI
To offer a well-maintained and useful Python package to the open source community, or even within your company, you are expected to complete a set of critical steps. First, make sure your code is unit tested. Second, respect the common writing and formatting styles. Automate these steps and integrate them into a continuous integration pipeline to avoid regressions caused by modifications to your source code. Finally, provide sufficient documentation for future users. Once that is done, it is common to publish your Python package on the Python Package Index (PyPI). Here we will see how to accomplish each of these steps using Poetry, tox and GitHub Actions. The code used for our use case can be found in our repository.
This article is the last in a series of three where we share our best practices.
Automate linter checks and tests with tox
If you have not done so already, activate your virtual environment.
To check the conformance of our code, we use a couple of packages that will evaluate whether the code respects the standard Python writing guidelines. Then, to automate their execution as well as our unit tests, we use tox. To install them, run:
poetry add black flake8 pylint tox --dev
tox and poetry do not work well together by default. They are somewhat redundant. To use them together, we need to implement a few tricks (see issues 1941 and 1745). tox installs its own environment and dependencies to run its tasks. But to install the dependencies, you have to declare the poetry install command in the tox configuration. This creates redundancy and can lead to some issues. Additionally, it does not allow installing the developer dependencies needed to run our tests. It is more productive to let tox use the poetry.lock file to install the required dependencies. For this, I advise you to use the tox-poetry-installer package, developed to solve these problems:
poetry add tox-poetry-installer[poetry] --dev
Now we declare our tox configuration in a tox.ini file whose content is:
[tox]
envlist = py38
isolated_build = true
[testenv]
description = Linting, checking syntax and running tests
require_locked_deps = true
install_dev_deps = true
commands =
    poetry run black summarize_dataframe/summarize_df.py
    poetry run flake8 summarize_dataframe/summarize_df.py
    poetry run pylint summarize_dataframe/summarize_df.py
    poetry run pytest -v
You can see two sections here:
- [tox]: defines the global settings for your tox automation pipeline, including the Python version of the test environments.
- [testenv]: defines the test environments. In our case we have two extra variables, require_locked_deps and install_dev_deps, which are provided by the tox-poetry-installer package. require_locked_deps chooses whether tox uses the poetry.lock file to install dependencies. install_dev_deps chooses whether tox installs the developer dependencies.
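To make the structure of this file concrete, here is a minimal Python sketch that reads a shortened copy of the configuration with the standard library configparser module. This is only an illustration of how the sections and keys are laid out; it is not how tox itself loads its configuration, and the read_tox_settings helper is a hypothetical name introduced for this example.

```python
import configparser

# A shortened copy of the tox.ini shown above (only two commands kept).
TOX_INI = """\
[tox]
envlist = py38
isolated_build = true

[testenv]
require_locked_deps = true
install_dev_deps = true
commands =
    poetry run black summarize_dataframe/summarize_df.py
    poetry run pytest -v
"""


def read_tox_settings(text):
    """Hypothetical helper: extract a few settings from a tox.ini string."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    testenv = parser["testenv"]
    return {
        # envlist is a comma-separated list of environment names
        "envlist": [e.strip() for e in parser["tox"]["envlist"].split(",")],
        # the two options provided by tox-poetry-installer
        "require_locked_deps": testenv.getboolean("require_locked_deps"),
        "install_dev_deps": testenv.getboolean("install_dev_deps"),
        # one command per indented continuation line
        "commands": [
            line.strip()
            for line in testenv["commands"].splitlines()
            if line.strip()
        ],
    }


settings = read_tox_settings(TOX_INI)
print(settings["envlist"])
print(settings["commands"])
```

Note that the commands must be indented under the commands key: an INI parser treats indented lines as the continuation of the previous value.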
Refer to the tox documentation to learn more about the configuration, and to the tox-poetry-installer documentation to learn more about the additional options.
Run the tox pipeline:
tox
py38 run-test: commands[0] | poetry run black summarize_dataframe/summarize_df.py
All done! ✨ 🍰 ✨
1 file left unchanged.
py38 run-test: commands[1] | poetry run flake8 summarize_dataframe/summarize_df.py
py38 run-test: commands[2] | poetry run pylint summarize_dataframe/summarize_df.py
************* Module summarize_dataframe.summarize_df
summarize_dataframe/summarize_df.py:1:0: C0114: Missing module docstring (missing-module-docstring)
summarize_dataframe/summarize_df.py:4:0: C0103: Argument name "df" doesn't conform to snake_case naming style (invalid-name)
summarize_dataframe/summarize_df.py:11:4: C0103: Argument name "df" doesn't conform to snake_case naming style (invalid-name)
summarize_dataframe/summarize_df.py:23:4: C0103: Argument name "df" doesn't conform to snake_case naming style (invalid-name)
summarize_dataframe/summarize_df.py:43:0: C0103: Argument name "df" doesn't conform to snake_case naming style (invalid-name)
------------------------------------------------------------------
Your code has been rated at 7.62/10 (previous run: 7.62/10, +0.00)
ERROR: InvocationError for command /home/fbraza/Documents/python_project/summarize_dataframe/.tox/py38/bin/poetry run pylint summarize_dataframe/summarize_df.py (exited with code 16)
________________________________________________________ summary ________________________________________________________________
ERROR: py38: commands failed
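The exit code 16 in the log above is informative. According to the pylint documentation, the exit status is a bit field in which each bit stands for a category of message. The sketch below decodes such a status. The decode_pylint_exit helper is a hypothetical name introduced for this example, and the table reflects the values listed in the pylint documentation at the time of writing.

```python
# Bit flags of the pylint exit status, as listed in the pylint documentation.
PYLINT_FLAGS = {
    1: "fatal",
    2: "error",
    4: "warning",
    8: "refactor",
    16: "convention",
    32: "usage error",
}


def decode_pylint_exit(code):
    """Hypothetical helper: list the message categories encoded in an exit status."""
    return [name for bit, name in PYLINT_FLAGS.items() if code & bit]


print(decode_pylint_exit(16))  # ['convention']
print(decode_pylint_exit(20))  # ['warning', 'convention']
```

Here only convention messages (the C0114 and C0103 codes) were issued, which matches the exit code 16.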
An error occurs because pylint highlights some style inconsistencies. By default, tox stops as soon as one of the commands exits with a non-zero status, which is why pytest was not run. The messages themselves are quite explicit. After correcting them, run the pipeline again:
tox
py38 run-test: commands[0] | poetry run black summarize_dataframe/summarize_df.py
All done! ✨ 🍰 ✨
1 file left unchanged.
py38 run-test: commands[1] | poetry run flake8 summarize_dataframe/summarize_df.py
py38 run-test: commands[2] | poetry run pylint summarize_dataframe/summarize_df.py
--------------------------------------------------------------------
Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)
py38 run-test: commands[3] | poetry run pytest -v
================================================= test session starts =============================================================
platform linux -- Python 3.8.7, pytest-5.4.3, py-1.10.0, pluggy-0.13.1 -- /home/fbraza/Documents/python_project/summarize_dataframe/.tox/py38/bin/python
cachedir: .tox/py38/.pytest_cache
rootdir: /home/fbraza/Documents/python_project/summarize_dataframe
collected 2 items
tests/test_summarize_dataframe.py::TestDataSummary::test_data_summary PASSED [ 50%]
tests/test_summarize_dataframe.py::TestDataSummary::test_display PASSED [100%]
================================================= 2 passed in 0.30s ===============================================================
______________________________________________________ summary ____________________________________________________________________
py38: commands succeeded
congratulations :)
Perfect. The tox automation pipeline succeeds locally. The next step is to implement the CI pipeline with GitHub Actions.
Continuous integration with GitHub Actions
GitHub Actions makes it easy to automate all of your software workflows. The service is event driven, which means that a set of commands is triggered when a specific event occurs. Such an event can be a commit pushed to a branch or a pull request. GitHub Actions is quite convenient for running all the necessary tests against your code.
Most importantly, GitHub Actions provides the ability to test your Python package with multiple Python versions and on different operating systems (Linux, macOS, and Windows). All you need is an existing repository and a workflow file in the .github/workflows/ directory:
mkdir -p .github/workflows
touch .github/workflows/ci.yml
The content of the .github/workflows/ci.yml file is:
name: CI Pipeline for summarize_df
on:
  - push
  - pull_request
jobs:
  build:
    runs-on: ${{matrix.platform}}
    strategy:
      matrix:
        platform: [ubuntu-latest, macos-latest, windows-latest]
        python-version: [3.7, 3.8, 3.9]
    steps:
      - uses: actions/checkout@v1
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          python -m pip install poetry
          poetry install
      - name: Test with tox
        run: poetry run tox
A few words about the different fields:
- on: this field defines the types of event that trigger the pipeline.
- jobs: this field defines the multiple steps of your pipeline. They run in an instance of a virtual environment.
- build: this is where all the magic happens:
  - The strategy.matrix.platform field defines the different operating systems you want to use to test your package. Use templating to pass these values to the build.runs-on field (${{matrix.platform}}).
  - The strategy.matrix.python-version field defines the different versions of Python you want to use to test your package.
  - The steps field allows you to specify which actions you use (steps.uses) and which commands you want to run (steps.run).
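The matrix strategy described above runs one job per combination of its values. As a rough illustration, the following Python sketch enumerates the combinations for the two lists used in our workflow. It only mimics the expansion and does not call any GitHub API.

```python
from itertools import product

# Values copied from the strategy.matrix section of the workflow.
platforms = ["ubuntu-latest", "macos-latest", "windows-latest"]
python_versions = ["3.7", "3.8", "3.9"]

# One job per (platform, python-version) pair.
jobs = [
    {"platform": platform, "python-version": version}
    for platform, version in product(platforms, python_versions)
]

for job in jobs:
    print(job["platform"], job["python-version"])

print(len(jobs))  # 9
```

With three operating systems and three Python versions, each push or pull request triggers nine jobs.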
Before finishing, update the tox.ini and pyproject.toml files accordingly. Initially we chose Python 3.8 for tox, but we want the package to be compatible with 3.7 and 3.9 as well. In the pyproject.toml file, declare the range of Python versions your package is expected to support. Here we choose to make our package compatible with 3.7.1 and above. Below are the changes made to our files:
[tox]
envlist = py37,py38,py39
isolated_build = true
skip_missing_interpreters = true
[...]
[...]
[tool.poetry.dependencies]
python = "^3.7.1"
[...]
When you change the pyproject.toml file, always run the poetry update command, which can detect unexpected incompatibilities between your dependencies and the version of Python you want to use.
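The caret constraint "^3.7.1" used above accepts any version from 3.7.1 up to, but not including, 4.0.0. The sketch below is a simplified illustration of that rule for versions whose major number is greater than zero. It is not Poetry's implementation, and the satisfies_caret helper is a hypothetical name introduced for this example.

```python
def parse_version(text):
    """Turn '3.7.1' into (3, 7, 1), padding missing parts with zeros."""
    parts = [int(part) for part in text.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def satisfies_caret(version, base):
    """Hypothetical helper: check version against ^base (major > 0 only)."""
    candidate = parse_version(version)
    lower = parse_version(base)
    # The caret allows changes that keep the major number unchanged.
    upper = (lower[0] + 1, 0, 0)
    return lower <= candidate < upper


for candidate in ["3.7.0", "3.7.1", "3.8.7", "3.9", "4.0.0"]:
    print(candidate, satisfies_caret(candidate, "3.7.1"))
```

Under this rule, 3.7.0 and 4.0.0 are rejected while 3.7.1, 3.8.7 and 3.9 are accepted.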
To finish, we install a package called tox-gh-actions to run tox in parallel on GitHub while using multiple versions of Python:
poetry add tox-gh-actions --dev
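The idea behind this plugin is that each job of the matrix runs only the tox environment matching its own Python version, instead of the whole envlist. The sketch below illustrates that selection under a simplifying assumption: the environment name is derived from the version number. The select_envs helper is hypothetical, and the plugin itself reads its mapping from its own configuration, as described in the tox-gh-actions documentation.

```python
def select_envs(python_version, envlist):
    """Hypothetical helper: keep the tox environments matching a Python version."""
    # Assumption for this sketch: '3.8' maps to the 'py38' factor.
    factor = "py" + python_version.replace(".", "")
    return [env for env in envlist if factor in env.split("-")]


envlist = ["py37", "py38", "py39"]

print(select_envs("3.8", envlist))  # ['py38']
print(select_envs("3.10", envlist))  # []
```

Combined with skip_missing_interpreters, this keeps each job focused on the single interpreter installed by the setup-python action.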
The pipeline is ready. Add, commit and push your changes to see the pipeline run:
echo "!.github/" >> .gitignore
git add .gitignore
git commit -m "build: update .gitignore to unmask .github/ folder"
git add pyproject.toml tox.ini poetry.lock .github/workflows/ci.yml
git commit -m "build: tox pipeline + github actions CI pipeline"
Go to your GitHub repository and click the Actions tab.
You see all past and ongoing pipelines.
Let's click on the ongoing pipeline. The pipeline runs on every OS and for every Python version. Wait a few minutes to see the result.
All pipelines succeed! We are ready to publish our package to the PyPI registry.
Publish packages on PyPI with poetry
To make your package publishable, add some details in the [tool.poetry] section of your pyproject.toml file:
[tool.poetry]
name = "summarize_dataframe"
version = "0.1.0"
description = "A package to provide summary data about pandas DataFrame"
license = "MIT"
authors = ["fbraza " ]
keywords = ["pandas", "dataframe"]
readme = "README.md"
homepage = "https://github.com/fbraza/summarize_dataframe"
repository = "https://github.com/fbraza/summarize_dataframe"
include = ["CHANGELOG.md"]
[...]
All the variables here are quite explicit. They are the metadata needed to publish the package. The include variable is useful to add any extra files you want. In our case, we add a CHANGELOG.md file. Do you remember commitizen? If not, take the time to read our article about commitizen and conventional commits. Use the following command:
cz bump
It prints the semantic version from your pyproject.toml file and asks you to create a Git tag. The version is updated based on your Git commits. Then we create the CHANGELOG.md file:
cz changelog
cat CHANGELOG.md
- correct pylint warnings
- split the function into two: one returning df other for output
- implementation of the summary function to summarize dataframe
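Both the version bump and the changelog rely on commit messages that follow the conventional commits format. To give an idea of how a version can be derived from such messages, here is a simplified sketch. It is not commitizen's implementation: the next_version helper and the exact mapping of commit types to increments are assumptions made for this example.

```python
def next_version(current, commit_messages):
    """Hypothetical helper: derive the next semantic version from commit messages."""
    major, minor, patch = (int(part) for part in current.split("."))
    increments = set()
    for message in commit_messages:
        lines = message.splitlines()
        if not lines:
            continue
        commit_type = lines[0].split(":")[0]
        if "BREAKING CHANGE" in message or commit_type.endswith("!"):
            increments.add("major")
        elif commit_type.startswith("feat"):
            increments.add("minor")
        elif commit_type.startswith(("fix", "perf", "refactor")):
            increments.add("patch")
    # The largest increment wins.
    if "major" in increments:
        return f"{major + 1}.0.0"
    if "minor" in increments:
        return f"{major}.{minor + 1}.0"
    if "patch" in increments:
        return f"{major}.{minor}.{patch + 1}"
    return current


print(next_version("0.1.0", ["fix: correct pylint warnings"]))
print(next_version("0.1.0", ["feat: add summary function", "fix: typo"]))
```

Commits of other types, such as build or docs, leave the version unchanged in this sketch.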
Your CHANGELOG.md has been created based on the Git history you generated thanks to commitizen. Pretty neat, isn't it? Once that is done, let's focus on building and publishing our package:
poetry build
Building summarize_dataframe (0.1.0)
- Building sdist
- Built summarize_dataframe-0.1.0.tar.gz
- Building wheel
- Built summarize_dataframe-0.1.0-py3-none-any.whl
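The name of the wheel in the output above follows the convention defined by the wheel specification: distribution name, version, an optional build tag, then the Python, ABI and platform tags. The sketch below splits such a name into its parts. The parse_wheel_name helper is a hypothetical name introduced for this example, and it assumes a well-formed file name.

```python
def parse_wheel_name(filename):
    """Hypothetical helper: split a wheel file name into its components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename}")
    return {
        "name": name,
        "version": version,
        "build_tag": build_tag,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


info = parse_wheel_name("summarize_dataframe-0.1.0-py3-none-any.whl")
print(info)
```

The py3-none-any tags indicate a pure Python wheel: it targets any Python 3 interpreter, does not depend on a specific ABI and can be installed on any platform.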
The build creates a folder called dist where the built package is located. To test that everything works, you can install the wheel with pip. Do this outside of your virtual environment so as not to pollute it:
pip install path/to/your/package/summarize_dataframe-0.1.0-py3-none-any.whl
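Once the wheel is installed, a quick way to confirm it is to ask Python for the version of the installed distribution. The sketch below uses importlib.metadata from the standard library, which assumes Python 3.8 or later. The installed_version helper is a hypothetical name introduced for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Hypothetical helper: return the installed version, or None if absent."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


# Prints the version declared in pyproject.toml if the wheel was installed
# in the current environment, and None otherwise.
print(installed_version("summarize_dataframe"))
```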
Now we need to create an account on PyPI. Just fill in the requested details, validate your email and run:
poetry publish
Username: ***********
Password: ***********
Publishing summarize_dataframe (0.1.0) to PyPI
- Uploading summarize_dataframe-0.1.0-py3-none-any.whl 100%
- Uploading summarize_dataframe-0.1.0.tar.gz 100%
The package is now online and shared with the community.
Conclusion
tox provides a nice interface to automate all your unit tests and validation checks. The ecosystem around poetry is maturing and offers solutions to work with tox without too much hassle. Together, these two tools make it possible to establish an efficient and coherent CI pipeline. To run the pipeline and test your package against different operating systems or versions of Python, you can leverage GitHub Actions as described above.
poetry was at the center of our approach, from project initiation to publication, including the management of its packages and dependencies. poetry demonstrated an ease of use and efficiency that will make life easier for developers, data scientists or data engineers developing projects in Python.
Our articles describe a complete setup that you can use to build your own Python project while respecting good software engineering practices.
Cheat sheet
tox
- Run your tox pipeline: tox
poetry
- Build your package: poetry build
- Publish your package: poetry publish