Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revise "Testing" and "Software Maturity" standards #35

Open
jtniehof opened this issue Aug 26, 2024 · 8 comments
Open

Revise "Testing" and "Software Maturity" standards #35

jtniehof opened this issue Aug 26, 2024 · 8 comments

Comments

@jtniehof
Copy link
Contributor

jtniehof commented Aug 26, 2024

(Broader context at the end).

This issue is to discuss a potential PHEP to replace the "testing" and "software maturity" portions of the existing PyHC standards: to lay out what looks like the current standards, and allow for suggested changes.

I am lumping these together as potentially related. I am not sure what the appropriate division of these existing standards is. I suggest we focus the conversation in this issue on scoping: what standards belong together in a future document? Then we can have new issues to discuss the actual standards.

The relevant standards are:

    1. Packaging: All code must be organized and provided as part of installable Python packages.
    1. Releases: Projects should strive to have consistent and stable releases. Those releases should be made available through PyPI and Conda. Code that is not yet stable must have a release number that is less than 1.0 (e.g. 0.x). Packages should consider using semantic versioning.
    1. Operating System Support: Packages must strive to support all major operating systems (e.g., OS X, Linux, Windows).
    1. Version control: All code must use version control. It is strongly recommended that projects make use of a distributed version control system (e.g., git).
    1. Coding Style: Projects must adopt the basic style recommendations of PEP 8 and static analysis tools should be used to identify deviations from the basic style recommendations (e.g. pylint, flake8, pycodestyle).
    1. Testing: Stable packages must provide unit tests of individual components (e.g. functions, classes) as well as integration tests that test the interaction between components that covers most of the code. Testing coverage should be measured. Automated testing is recommended, in which tests are run before any code is merged. System[link] and acceptance[link] testing are also recommended.
    1. Dependencies: Projects should import the minimum number of packages necessary. Adding new dependencies should be a considered decision.
    1. Binaries: Binary files should be added to the package repository only when necessary in order to keep packages as light as possible. Jupyter notebooks can be binary files and should not be committed to the package repository but can be provided in other repositories.

(bullet list to fight the renumbering, apologies for roman numerals and suggestions welcome)

I am not seeing any relevant suggested updates in #16.

Standard 10 is somewhat in tension with Standard 12 (duplication of effort); 12 is discussed in #33.

Broader context: I envision a process of "patriating the constitution" where we revisit the existing standards documents and incorporate them into PHEPs, potentially updated, that are explicitly noted as replacing the relevant standards. We probably do not want one PHEP per standard. Our previous grouping in the review guidelines seems a good start.

@jtniehof
Copy link
Contributor Author

jtniehof commented Aug 26, 2024

My feeling here is that 1, 3, 4 relate to the release process and supported environments and belong together, along with other changes we've discussed (e.g. binary wheels and the like). They're user-facing.

I could see 6 potentially moving into the community section (#33) or staying here.

7, 9, 10, 14 are all pretty clearly development and repository practices (with 6, if it stays here). "Maturity" may not be the best name for this set.

@jtniehof
Copy link
Contributor Author

jtniehof commented Sep 11, 2024

In the interests of prodding discussion, I'm throwing out a strawman:

A "packaging standards" PHEP would incorporate (changes are bold):

  • Packaging (standard 1), must be installable Python package. At least should use a PEP517/518 compliant build system. (Perhaps a must).
  • Releases (standard 3), must make releases on a regular basis (not necessarily scheduled, wordsmithing welcome on this...basically not "disappear and tell people to build from main for years"). Must be on PyPI. Should be on conda. Should deliver wheels, including binary wheels if needs compiled code. Must have a consistent and documented versioning scheme.
  • OS support (standard 4): Must support Linux, OSX, Windows. Must support current architectures. This needs some tuning--requiring support for Mac ARM is reasonable. Windows ARM isn't.

EDIT: I can see a "reference implementation" of a PyHC community-supported package that's just the framework for this....

Standard 6 moves to community (#33) and I'll explain why there.

A "development practices" PHEP would incorporate:

  • Coding standards (standard 7): Must adopt PEP8, should static analysis, should static analysis in CI
  • Testing (standard 9): Must include unit tests in repository. Must document preferred way of running unit tests. Should measure coverage and target reasonable code coverage while considering data coverage. Should use CI. Should have procedures for acceptance and system testing (this should be fleshed out more than in the previous standard, also see Missing link in standards doc #18)
  • Dependencies (standard 10): I'm really at a loss at the way forward here. There's a lot to be said for minimum dependencies. There's also a lot to be said for not reinventing the wheel. What is the goal here? How can it be evaluated? How can it be described?
  • Binaries (standard 14): Should not commit binaries (not commit large binaries? how big is large?) Should include relevant source where possible. Should use jupytext for notebooks, either instead of or in addition to binary notebook formats.

@rebeccaringuette
Copy link

rebeccaringuette commented Oct 7, 2024

Throwing in some ideas on how this might relate to the package tiers.

  • packaging is already listed in @ #PHEP 4: PyHC Package Tiering #31, but specifically for pip and conda. I am assuming the git clone method is implied for the copper tier. With other installation methods becoming more popular (e.g. mamba), perhaps the packaging item should be separated from the tier description for easier updating? Or is such a separation too granular?
  • I think the compliance of the build system to PEP517/518 could come in at the silver level or perhaps the bronze level depending on how hard this is across the PyHC landscape. A quick scan of PEP 518 turned up a mention of the setup.cfg option as a rejected idea and no mention in PEP 517. Perhaps the pyproject.toml approach becomes a higher tier "must" (silver or gold?), the setup.cfg is allowed at the next lower level, and only as a "should" for lower levels.
  • The complexity of binary wheels likely means this would be a silver or gold "must" and a "should" for lower levels.
  • for the release schedule, could put that requirement as a "must" at the silver or gold level, especially since that can be a significant demand. I think this component is part of the pyOpenSci maintenance reviews that happen long term. Perhaps @jibarnum would know more on this. The related PHEP has a home for that discussion, although the title of it should perhaps be made more clear: Define requirements for PyHC "affiliation"  #37
  • architecture support: suggest "must support common architectures available at the time of release, including Windows, Linux, and Mac" to pull this requirement down, otherwise an "all" seems to be implied. Not sure if this should be a bronze or silver "must", but it seems basic enough to be a requirement earlier than gold.
  • it makes sense to me to have versioning documentation items be included in whatever PHEP talks about versioning. If versioning moves to Revise "community" and "license" standards #33, then perhaps the versioning documentation should move there too. However, versioning seems quite different from the licensing discussion.
  • The PEP8 line is a bit cryptic. PEP8 is simple enough to be a "must" at bronze. I'm not sure what you mean by "should static analysis".
  • The unit coverage / testing item may be something that pyOpenSci already incorporates. Should check with them on this to see if we like what they already do or have enough objections to request a custom change for the PyHC/pyOpenSci review process (to be created).
  • Suggest moving the dependencies item to a PHEP focused on interoperability. For now, suggesting a temp rewording to be something like "Adding new dependencies should be a considered decision. However, dependencies are encouraged to avoid reinventing in one package what is already accomplished in another, especially if that dependency is already supported in a recent PyHC software environment used for the PyHC Summer School." Likely missing important pieces here, but there's a start.
  • I hesitate on including a specific method for notebooks (e.g. jupytext) since that may make the PHEP become too quickly outdated. I don't have enough background to further comment on binaries, but perhaps it is a good idea to include that notebooks should be committed to a repository in a cleaned state (no outputs saved) to decrease the size.

Let the tomato throwing contest begin! :D

@sapols
Copy link
Contributor

sapols commented Nov 13, 2024

Good "Style & static checks" guide from Scientific Python: https://learn.scientific-python.org/development/guides/style/

@sapols
Copy link
Contributor

sapols commented Nov 13, 2024

Codecov is a quite useful CI tool for measuring code coverage.

Although Stuart pointed out this How to Ditch Codecov for Python Projects

And pySPEDAS uses coveralls.io

@sapols
Copy link
Contributor

sapols commented Nov 13, 2024

The pyOpenSci packaging guide has been undergoing a lot of growth.

@jtniehof
Copy link
Contributor Author

Notes regarding a potential development practices PHEP:

  • Tweak the binaries-in-repository requirement. Have some sort of size limit, suggestion that it not be installed if just needed for tests, and a suggestion to generate on the fly at install time if it's possible.
  • If do static analysis, "should" do it in CI
  • Potential tweaks to testing requirement regarding acceptance and system testing, mention concept of data coverage, probably no %age coverage yet (but encourage coverage, in CI)
  • Do include documentation on how to run tests
  • Some discussion of how to ship tests--do the tests (and test data) come in with pip install, how else can we make it easy for Shawn to run the tests?
  • codecov can check coverage of changed lines, although we may not want to recommend codecov
  • Use reference implementations and "how to teach this" to point people at good examples of doing static checks, testing, and coverage. PySPEDAS example; @tech3371 has an IMAP example of checking coverage; other links that @sapols posted. Do this instead of specifying particular static analysis tools in main spec. (Similar for typehints).
  • We may want to remove the dependencies requirement. We probably have a bigger issue with duplication than with people importing too many dependencies. Potentially a note about not bundling dependencies

@jtniehof
Copy link
Contributor Author

Notes regarding packaging PHEP:

  • Again rely on packaging guides in "how to teach this" and existing reference implementations, see Shawn's links, rather than being exhaustive
  • pyOpenSci has packaging standards, as Shawn linked, we should look at those and either use in "how to teach" or make adopting them a "must"
  • Scientific Python simple packaging guide (more complicated examples linked same page)
  • find guides for building wheels in CI
  • In general, community support and documentation (doesn't have to be PHEPs, just docs!) more important than standards

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants
@jtniehof @sapols @rebeccaringuette and others