Modern Python Part 3: Run a CI Pipeline and Publish Your Package to PyPI

To propose a well-maintained and useful Python package to the open source community or even within your company, you are expected to perform a set of critical steps. First, make sure your code is unit tested. Second, respect the common writing and formatting styles. Automate these steps and integrate them into a continuous integration pipeline to avoid regressions caused by changes to your source code. Finally, provide sufficient documentation for future users. Once that's done, it's common to publish your Python package on the Python Package Index (PyPI). Here we will see how to accomplish each of these steps using Poetry, tox and GitHub Actions. The code used for our use case can be found in our repository.

This article is the last in a series of three where we share our best practices.

Automate linter checks and tests with tox

If you have not already done so, activate your virtual environment.

To check the conformance of our code, we use a couple of packages that will evaluate whether the code respects the standard Python writing guidelines. Then, to automate their execution as well as our unit tests, we use tox. To install them, run:

poetry add black flake8 pylint tox --dev

tox and poetry do not work well together by default. They are somewhat redundant. To use them together, we need to implement a few tricks (see issues 1941 and 1745). tox installs its own environment and dependencies to run its tasks. But to install the dependencies, you need to declare the poetry install command in your tox configuration. This creates redundancy and can lead to some issues. Additionally, it does not allow you to install the developer dependencies needed to run our tests. It is more productive to let tox use the poetry.lock file to install the required dependencies. For this, I advise you to use the tox-poetry-installer package, developed to solve these problems:

poetry add tox-poetry-installer[poetry] --dev

Now we declare our tox configuration in a tox.ini file whose content is:

[tox]
envlist = py38
isolated_build = true

[testenv]
description = Linting, checking syntax and running tests
require_locked_deps = true
install_dev_deps = true
commands =
    poetry run black summarize_dataframe/summarize_df.py
    poetry run flake8 summarize_dataframe/summarize_df.py
    poetry run pylint summarize_dataframe/summarize_df.py
    poetry run pytest -v

You can see two sections here:

  • [tox]: Defines the global settings of your tox automation pipeline, including the Python version of the test environments.
  • [testenv]: Defines the test environments. In our case, we have two extra variables, require_locked_deps and install_dev_deps, which are provided by the tox-poetry-installer package. require_locked_deps chooses whether tox uses the poetry.lock file to install dependencies. install_dev_deps chooses whether tox installs the developer dependencies.

Refer to the tox documentation to learn more about the configuration, and to the tox-poetry-installer documentation to learn more about the extra variables.

Run the tox pipeline:

tox
py38 run-test: commands[0] | poetry run black summarize_dataframe/summarize_df.py
All done! ✨ 🍰 ✨
1 file left unchanged.
py38 run-test: commands[1] | poetry run flake8 summarize_dataframe/summarize_df.py
py38 run-test: commands[2] | poetry run pylint summarize_dataframe/summarize_df.py
************* Module summarize_dataframe.summarize_df
summarize_dataframe/summarize_df.py:1:0: C0114: Missing module docstring (missing-module-docstring)
summarize_dataframe/summarize_df.py:4:0: C0103: Argument name "df" doesn't conform to snake_case naming style (invalid-name)
summarize_dataframe/summarize_df.py:11:4: C0103: Argument name "df" doesn't conform to snake_case naming style (invalid-name)
summarize_dataframe/summarize_df.py:23:4: C0103: Argument name "df" doesn't conform to snake_case naming style (invalid-name)
summarize_dataframe/summarize_df.py:43:0: C0103: Argument name "df" doesn't conform to snake_case naming style (invalid-name)

------------------------------------------------------------------
Your code has been rated at 7.62/10 (previous run: 7.62/10, +0.00)

ERROR: InvocationError for command /home/fbraza/Documents/python_project/summarize_dataframe/.tox/py38/bin/poetry run pylint summarize_dataframe/summarize_df.py (exited with code 16)
________________________________________________________ summary ________________________________________________________________
ERROR:   py38: commands failed

An error occurs because pylint highlights some style inconsistencies. By default, tox stops if any warning or error occurs while executing the commands. The errors themselves are quite clear. After correcting them, run the pipeline again:

tox

py38 run-test: commands[0] | poetry run black summarize_dataframe/summarize_df.py
All done! ✨ 🍰 ✨
1 file left unchanged.
py38 run-test: commands[1] | poetry run flake8 summarize_dataframe/summarize_df.py
py38 run-test: commands[2] | poetry run pylint summarize_dataframe/summarize_df.py

--------------------------------------------------------------------
Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

py38 run-test: commands[3] | poetry run pytest -v
================================================= test session starts =============================================================
platform linux -- Python 3.8.7, pytest-5.4.3, py-1.10.0, pluggy-0.13.1 -- /home/fbraza/Documents/python_project/summarize_dataframe/.tox/py38/bin/python
cachedir: .tox/py38/.pytest_cache
rootdir: /home/fbraza/Documents/python_project/summarize_dataframe
collected 2 items                                                                                                                                                            

tests/test_summarize_dataframe.py::TestDataSummary::test_data_summary PASSED                                                                                           [ 50%]
tests/test_summarize_dataframe.py::TestDataSummary::test_display PASSED                                                                                                [100%]

================================================= 2 passed in 0.30s ===============================================================
______________________________________________________ summary ____________________________________________________________________
  py38: commands succeeded
  congratulations :)

Perfect. The tox automation pipeline succeeds locally. The next step is to implement the CI pipeline with GitHub Actions.
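As a side note on the failing run earlier, the exit code 16 reported for pylint is not arbitrary. pylint documents its exit status as a bit field in which each bit stands for a message category, and 16 corresponds to convention messages such as C0114 and C0103. The sketch below decodes such a status; the function name decode_pylint_status is hypothetical and only serves the illustration.

```python
# Bit flags that pylint documents for its exit status.
PYLINT_FLAGS = {
    1: "fatal",
    2: "error",
    4: "warning",
    8: "refactor",
    16: "convention",
    32: "usage error",
}


def decode_pylint_status(status):
    """List the message categories encoded in a pylint exit status.

    Hypothetical helper written for this article, not part of pylint.
    """
    return [name for bit, name in PYLINT_FLAGS.items() if status & bit]


print(decode_pylint_status(16))  # ['convention']
print(decode_pylint_status(20))  # ['warning', 'convention']
print(decode_pylint_status(0))   # []
```

Since tox treats any non-zero exit status as a failure, even convention-only messages are enough to stop the pipeline.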

Continuous integration with GitHub Actions

GitHub Actions makes it easy to automate all of your software workflows. This service is event driven, which means that a set of commands is triggered when a specific event occurs. Such an event can be a commit pushed to a branch or a pull request. GitHub Actions is quite convenient for running all the necessary tests against your code.

Most importantly, GitHub Actions provides the ability to test your Python package with multiple Python versions and on different operating systems (Linux, macOS, and Windows). All you need is an existing repository and a YAML workflow file in the .github/workflows/ folder:

mkdir -p .github/workflows
touch .github/workflows/ci.yml

The content of the .github/workflows/ci.yml file is:

name: CI Pipeline for summarize_df

on:
  - push
  - pull_request

jobs:
  build:
    runs-on: ${{matrix.platform}}
    strategy:
      matrix:
        platform: [ubuntu-latest, macos-latest, windows-latest]
        python-version: [3.7, 3.8, 3.9]

    steps:
    - uses: actions/checkout@v1
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v2
      with:
        python-version: ${{ matrix.python-version }}
    - name: Install dependencies
      run: |
        python -m pip install poetry
        poetry install
    - name: Test with tox
      run: poetry run tox

A few words about the different fields:

  • on: this field defines the types of event that trigger the pipeline.
  • jobs: this field defines the multiple steps of your pipeline. They run in an instance of a virtual environment.
  • build: this is where all the magic happens:
    • The strategy.matrix.platform field defines the different operating systems on which you want to test your package. Use the template syntax to pass these values to the build.runs-on field (${{matrix.platform}}).
    • The strategy.matrix.python-version field defines the different versions of Python with which you want to test your package.
    • The steps field lets you specify which actions you use (steps.uses) and which commands you want to run (steps.run).
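To make the effect of the matrix more concrete, the sketch below reproduces the combination logic in plain Python: every platform is paired with every Python version, which gives nine jobs for the workflow shown above. This only illustrates the idea and is not how GitHub Actions is implemented.

```python
from itertools import product

# Values copied from the strategy.matrix section of the workflow above.
platforms = ["ubuntu-latest", "macos-latest", "windows-latest"]
python_versions = ["3.7", "3.8", "3.9"]

# One job per (platform, python-version) pair.
jobs = [
    {"platform": platform, "python-version": version}
    for platform, version in product(platforms, python_versions)
]

print(len(jobs))  # 9
for job in jobs:
    print(job["platform"], job["python-version"])
```

Adding one more operating system or Python version to the matrix therefore adds a full row or column of jobs, which is worth keeping in mind for the duration of the pipeline.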

Before going further, update the tox.ini and pyproject.toml files accordingly. Initially we chose Python 3.8 for tox, but we want our package to be compatible with 3.7 and 3.9 as well. In the pyproject.toml file, select the Python versions your package is expected to be compatible with. Here we choose to make our package compatible with Python 3.7.1 and above. Below are the changes made to our files:


[tox]
envlist = py37,py38,py39
isolated_build = true
skip_missing_interpreters = true

[...]

[...]

[tool.poetry.dependencies]
python = "^3.7.1"

[...]

When you change the pyproject.toml file, always run the poetry update command, which checks for unexpected incompatibilities between your dependencies and the version of Python you want to use.
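The caret in python = "^3.7.1" deserves a short explanation. In Poetry, a caret constraint accepts any version that does not change the leftmost non-zero component, so ^3.7.1 means at least 3.7.1 and below 4.0.0. The following sketch works that rule out for three-component versions only; caret_bounds and allows are hypothetical helpers, not part of Poetry.

```python
def caret_bounds(version):
    """Return the (lower, upper) bounds of a caret constraint such as ^3.7.1.

    Simplified illustration: expects exactly three numeric components.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    lower = (major, minor, patch)
    if major > 0:
        upper = (major + 1, 0, 0)
    elif minor > 0:
        upper = (0, minor + 1, 0)
    else:
        upper = (0, 0, patch + 1)
    return lower, upper


def allows(constraint_version, candidate):
    """Tell whether a candidate version tuple satisfies ^constraint_version."""
    lower, upper = caret_bounds(constraint_version)
    return lower <= candidate < upper


print(allows("3.7.1", (3, 7, 0)))  # False
print(allows("3.7.1", (3, 9, 2)))  # True
print(allows("3.7.1", (4, 0, 0)))  # False
```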

To finish, we install a package called tox-gh-actions to run tox in parallel on GitHub while using multiple versions of Python:

poetry add tox-gh-actions --dev
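tox-gh-actions selects the tox environments to run from a [gh-actions] section in tox.ini that maps each Python version of the workflow matrix to one or more tox environments. The configuration shown in this article does not include that section, so the one embedded below is an assumption based on the format documented by the plugin and should be checked against its documentation. The Python sketch parses it only to make the mapping explicit; python_to_toxenv is a hypothetical helper.

```python
import configparser

# Assumed [gh-actions] section for tox.ini (not shown in the article).
GH_ACTIONS_INI = """\
[gh-actions]
python =
    3.7: py37
    3.8: py38
    3.9: py39
"""


def python_to_toxenv(text):
    """Map each Python version to the tox environments declared for it."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    mapping = {}
    for line in parser["gh-actions"]["python"].splitlines():
        if line.strip():
            version, envs = line.split(":", 1)
            mapping[version.strip()] = [env.strip() for env in envs.split(",")]
    return mapping


mapping = python_to_toxenv(GH_ACTIONS_INI)
print(mapping["3.8"])  # ['py38']
```

With such a mapping, a job running on Python 3.8 would only execute the py38 environment instead of the whole envlist.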

The pipeline is ready. Add, commit and push your changes to see the pipeline run:

echo "!.github/" >> .gitignore
git add .gitignore
git commit -m "build: update .gitignore to unmask .github/ folder"
git add pyproject.toml tox.ini poetry.lock .github/workflows/ci.yml
git commit -m "build: tox pipeline + github actions CI pipeline"

Go to your GitHub repository and click the Actions tab:

[Figure: the GitHub Actions tab]

You see all past and ongoing pipelines:

[Figure: the workflow is running]

Let's click on the ongoing pipeline. The pipeline runs on every OS and for every Python version. Wait a few minutes to see the result:

[Figure: job completed]

All pipelines succeed! We are ready to publish our package to the PyPI registry.

Publish packages on PyPI with poetry

To make your package publishable, add some details in the [tool.poetry] section of your pyproject.toml file:

[tool.poetry]
name = "summarize_dataframe"
version = "0.1.0"
description = "A package to provide summary data about pandas DataFrame"
license = "MIT"
authors = ["fbraza "]
keywords = ["pandas", "dataframe"]
readme = "README.md"
homepage = "https://github.com/fbraza/summarize_dataframe"
repository = "https://github.com/fbraza/summarize_dataframe"
include = ["CHANGELOG.md"]

[...]

All the variables here are fairly self-explanatory. They are the metadata needed to publish the package. The include variable lets you add any extra files you want. In our case, we add a CHANGELOG.md file. Do you remember commitizen? If not, take the time to read our article about commitizen and conventional commits. Use its version bump command (cz bump).

It prints the semantic version from your pyproject.toml file and asks you to create a Git tag. The version is updated based on your Git commits. Then we create the CHANGELOG.md file:

cz changelog
cat CHANGELOG.md

- correct pylint warnings
- split the function into two: one returning df other for output

- implementation of the summary function to summarize dataframe
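Before commenting on the result, here is a minimal sketch of how a version can be derived from commit messages under the Conventional Commits convention: a fix triggers a patch release, a feat a minor release, and a breaking change a major release. It is a simplified illustration, not commitizen's actual implementation, and the commit messages in the usage lines are hypothetical.

```python
import re


def next_version(current, commit_messages):
    """Compute the next semantic version from conventional commit messages.

    Simplified rules: breaking change -> major, feat -> minor, fix -> patch.
    """
    major, minor, patch = (int(part) for part in current.split("."))
    increments = set()
    for message in commit_messages:
        header = message.split("\n", 1)[0]
        if "BREAKING CHANGE" in message or re.match(r"^\w+(\([^)]*\))?!:", header):
            increments.add("major")
        elif re.match(r"^feat(\([^)]*\))?:", header):
            increments.add("minor")
        elif re.match(r"^fix(\([^)]*\))?:", header):
            increments.add("patch")
    if "major" in increments:
        return f"{major + 1}.0.0"
    if "minor" in increments:
        return f"{major}.{minor + 1}.0"
    if "patch" in increments:
        return f"{major}.{minor}.{patch + 1}"
    return current


print(next_version("0.1.0", ["feat: add summary function", "fix: correct display"]))  # 0.2.0
print(next_version("0.1.0", ["fix: correct display"]))  # 0.1.1
print(next_version("0.1.0", ["docs: update readme"]))  # 0.1.0
```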

Your CHANGELOG.md has been created based on the Git history you generated thanks to commitizen. Pretty neat huh?! Once that's done, let's focus on publishing our package:

poetry build
Building summarize_dataframe (0.1.0)
  - Building sdist
  - Built summarize_dataframe-0.1.0.tar.gz
  - Building wheel
  - Built summarize_dataframe-0.1.0-py3-none-any.whl

This creates a folder called dist where the built package is located. To test if everything works, you can use pip:

Do this outside of your virtual environment so as not to pollute it.

pip install path/to/your/package/summarize_dataframe-0.1.0-py3-none-any.whl
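The name of the wheel file carries information of its own. Wheel file names follow the pattern name-version-python tag-abi tag-platform tag (with an optional build tag after the version), so py3-none-any indicates a pure Python 3 package that is not tied to any ABI or platform. The sketch below splits such a name into its parts; parse_wheel_filename is a hypothetical helper written for this illustration.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename}")
    return {
        "name": name,
        "version": version,
        "build_tag": build_tag,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


info = parse_wheel_filename("summarize_dataframe-0.1.0-py3-none-any.whl")
print(info["name"], info["version"])  # summarize_dataframe 0.1.0
print(info["python_tag"], info["abi_tag"], info["platform_tag"])  # py3 none any
```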

Now we need to create an account on PyPI. Fill in the required details, validate your email address and run:

poetry publish
Username: ***********
Password: ***********
Publishing summarize_dataframe (0.1.0) to PyPI
 - Uploading summarize_dataframe-0.1.0-py3-none-any.whl 100%
 - Uploading summarize_dataframe-0.1.0.tar.gz 100%

The package is now online and shared with the community.

[Figure: the project published on PyPI]

Conclusion

tox provides a nice interface to automate all your unit tests and validation checks. The ecosystem around poetry is maturing and offers solutions to work with tox without too much hassle. Together, these two tools make it possible to set up an efficient and coherent CI pipeline. To run the pipeline and test your packages against different operating systems or versions of Python, you can leverage GitHub Actions as described above.

poetry was at the center of our approach, from project initiation to publication, including the management of packages and dependencies. poetry demonstrated an ease of use and efficiency that will definitely make life easier for developers, data scientists and data engineers who develop projects in Python.

Our articles describe a complete setup that you can use to build your own Python project while respecting good software engineering practices.

Cheat sheet

tox

  • Run your tox pipeline:

    tox

poetry

  • Build your package:

    poetry build

  • Publish your package:

    poetry publish
