Contribution Guide¶
Quick Setup¶
Use tox¶
To start developing, the recommended way is to use tox. This way, your development environment is automatically
prepared by tox, including virtual environment setup, dependency management, installing nourish in development mode.
Install
tox:$ pip install -U -r requirements/tox.txt # If you are inside a virtual environment, conda environment $ pip3 install --user -U -r requirements/tox.txt # If you are outside any virtual environment or conda environment and don't have tox installed
At the root directory of
nourish, run:$ tox -e devTo force updating development environment in the future, run
tox --recreate -e devwhen the development environment is not activated.To activate the development environment, run:
$ . .tox/dev/bin/activate
Traditional Method¶
Alternatively, after cloning this repository (and preferably having created and activated a virtual environment), run the following command to install all dependencies:
$ pip install -U -e .
Install all required development packages:
$ pip install -U -r requirements-dev.txt
Tests & CI¶
Run All Tests¶
Before and after one stage of development, you may want to try whether the code would pass all tests.
To run all tests on the Python versions that are supported by Nourish and available on your system, run:
$ tox -s
When you are brave, to force running all tests on all Python versions that are supported by Nourish, run:
$ tox
By default tox uses virtualenv for creating each test environment (such as for each Python version). If you
would prefer tox to use conda, run:
$ pip install tox-conda
Running Part of the Tests¶
During development, you likely would like to run only part of the tests to save time.
To run all static tests, run:
$ tox -e lint
To run all runtime tests on the Python version in the development environment, run:
$ tox -e py
To run only a specific runtime test, run:
$ pytest tests -vk [test_name] # e.g., pytest tests -vk test_default_data_dir
Read pytest command line document for its more advanced usage.
To run document generation tests, run:
$ tox -e docs
Continuous Integration (CI)¶
We prefer keeping CI configuration files, namely .github/workflows/* and .gitlab-ci.yml, simple and unscrambled.
Normally, only test environment, such as Python version, OS and tox environmental variables, or anything that is
specific to the CI system, such as failure notification. Complicated test dependencies and other test dealings should go
to tox.ini and their respective test files in tests/.
Development¶
Schema vs Schemata¶
When creating variables that hold schema information, we use the following naming conventions to minimize confusion between
what we typically refer to as a schema, and what we typically refer to as a schemata:
- schemata = refers to the YAML file or the YAML file’s full contents
A schemata is a YAML file composed of some header information and content in the form of schemas, e.g. the dataset schemata is composed of some header metadata (i.e.
api_name,name, andlast_updated) and a collection of dataset schemas (i.e. the value returned by thedatasetskey).When referring to the YAML file or its full contents, we use
schematain the name, e.g.dataset_schemata,license_schemata, orformat_schemata.
- schema = refers to a unit within the YAML file’s contents
A schema is a unit of the content of a schemata, e.g. the metadata of a particular versioned dataset from a dataset schemata, used to instantiate a
Datasetclass.When referring to the schema for a certain (or generic) dataset, license, or format, we use
schemain the name, e.g.dataset_schema, orgmb_schema.Schemas in the plural refers to a dictionary composed of all the schemas in a schemata without the schemata’s header information, e.g.
license_schemaswould refer to the value returned by thelicenseskey from the licenses schemata.
Note: for type checking purposes, both a schema and a schemata in dictionary form can be of custom type SchemaDict
Where to Expose a Symbol (Function, Class, etc.)?¶
Generally speaking:
If a symbol is likely used by a casual user regularly, it should be exposed in
nourish/__init__.py. This gives casual users the cleanest and the most direct access.If a symbol is used only by a power user, but is unlikely used by a casual user regularly, it should be exposed in a file that does not start with an underscore, such as
nourish/schema.py; or in the__init__.pyfile in a subdirectory that does not start with an underscore, such asnourish/loaders/__init__.py. The rationale is that the amount of such symbols is usually large and if we expose them at the root level of the package, it would be messy and likely confuse casual users.If a symbol is solely used for internal purpose, it should be exposed only in files starting with a single underscore, such as
nourish/_dataset.py.
Please keep in mind that the criteria above are not meant to be rigid: They should be applied flexibly in light of factors such as where existing symbols are placed and other potentially important considerations (if any).
Where to Import a Symbol?¶
When referencing a symbol that is exposed to a user, in general, we prefer importing the symbol from where the package
publicly exposes it over importing from where the source code of the symbol is defined, e.g., use from .schema import
SchemaDict rather than from ._schema import SchemaDict. This way we have more code paths that would go through
what the user would actually experience and hopefully would give us more chances to discover bugs.
Docs¶
The easiest way to generate the docs is to run the tox docs test environment. The html index file generates at
.tox/docs/out/index.html:
$ tox -e docs
To run docs tests individually or to generate the docs, cd into the docs/ directory and run any of the commands below:
$ cd docs
To generate the HTML files for the docs to the build directory (note: this will automatically regenerate the
stubfiles used by autosummary prior to generating the html files):
$ sphinx-build -d build/doctrees source build/html -b html
To check reST code style compliance, run:
$ rstcheck -r docs/source/miscellaneous docs/source/user_guide docs/source/api-references/*.rst
The reST code style compliance is also checked by the tox lint test environment if you prefer to use that:
$ tox -e lint
Dependency Version Pinning Policy¶
We should pin the versions of all Python packages that we are using solely for testing and doc generating for a stable test and doc env (e.g., future incompatibility, regression, etc.). We want to pin these because, in this project, we use these packages solely for deployment of our development environment (i.e., running tests and generating docs) and we want stable packages that are used by us for these purposes. We let Renovate verify that bumping the versions won’t break anything before we actually upgrade any of these dependencies.
We should not pin the actual dependencies of Nourish (as specified in setup.py), because Nourish is an intermediate
software layer – those should be pinned only by the actual deployed application that depends on Nourish. We should only
code the info of supported versions of these dependencies. If there is some regression or incompatibilities in the
latest versions of our dependencies, we should either work around them, or update setup.py to avoid depending on
those versions.
Releases & Publishing to PyPI¶
To make a new release:
Tag the commit following PEP 0440, e.g.
git tag -a v0.1a1 -m "First Alpha Release".Push the tag, e.g.
git push origin v0.1a1.On the GitHub Releases page, convert the tag into a release by adding a title, description, and marking whether or not it is a pre-release.
To publish to PyPI:
When a full release (not a pre-release) is published,
.github/workflows/publish-to-pypi.ymlwill automatically upload the release to PyPI.However to manually upload a release, run
python setup.py sdist bdist_wheelthentwine upload -r pypi dist/nourish-x.x*.
Pull Request & Issues¶
Developer’s Certificate of Origin (DCO)¶
To ensure licensing criteria are followed, Nourish requires all contributors to agree to the Developer Certificate of Origin
(DCO). The DCO is an attestation attached to every contribution made by every developer. In the commit message of the
contribution, the developer simply adds a Signed-off-by: Name <Email> statement and thereby agrees to the DCO.
The DCO is a commitment that the contributor has the right to submit the patch per the license. The DCO agreement can be found at http://developercertificate.org/.
The DCO sign-off can either be added manually to your commit body, or you can add either -s or --signoff to your usual
Git commit commands. If you forget to add the sign-off you can amend a previous commit with the sign-off by running
git commit --amend -s.