Overall goals/plan #1

Open
yarikoptic opened this issue Mar 30, 2021 · 2 comments

@yarikoptic

copy/pasted from the original hackathon project page

  • Cover the following testing scenarios:
    • save/load cycle resulting in an identical data structure
      • have a helper to produce "saved" .nwb files with clear "provenance" records to test against
      • each extension should provide an interface to produce "an example (lean but representative)" file
    • ability to load NWB files saved by prior versions of hdmf/pynwb(/extension)
    • ability to work with extension version(s) in the declared "compatibility" range
    • absence of side effects from having other extensions loaded
  • Have tests run regularly and update status
  • Have a (subset?) of testing across extensions run for PyNWB PRs
  • Borrow ideas/setup from possible existing setups:
@jwodder

jwodder commented Mar 30, 2021

@yarikoptic More specific ideas:

  • We write scripts for generating the following:

    • One or more basic, extensionless NWB files
    • For each extension, one or more NWB files that use that extension

    Each script also generates, alongside each NWB file, a provenance record/sidecar file listing the package versions used to create it (see sidecar file for each sample file #6).

    The scripts should probably also read the NWB files back in after writing, as a sanity check that they contain what they should (a minimal sketch of such a script follows).
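
    A minimal sketch, assuming only pynwb's standard write/read API; the file name, identifier, and sidecar layout are placeholders rather than a settled format:

        # generate_samples.py (hypothetical name): write a basic sample NWB file,
        # write a version sidecar next to it, and read it back as a sanity check.
        import json
        from datetime import datetime, timezone

        import hdmf
        import pynwb
        from pynwb import NWBFile, NWBHDF5IO


        def make_basic_nwb(path="core-basic.nwb"):
            nwbfile = NWBFile(
                session_description="minimal extensionless sample",
                identifier="core-basic",
                session_start_time=datetime(2021, 3, 30, tzinfo=timezone.utc),
            )
            with NWBHDF5IO(path, "w") as io:
                io.write(nwbfile)

            # provenance sidecar listing the package versions used (cf. #6)
            with open(path + ".json", "w") as fp:
                json.dump({"pynwb": pynwb.__version__, "hdmf": hdmf.__version__}, fp)

            # sanity check: read the file back and verify a field or two
            with NWBHDF5IO(path, "r") as io:
                assert io.read().identifier == "core-basic"


        if __name__ == "__main__":
            make_basic_nwb()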

  • We compile a list of sets of pynwb, hdmf, and extension versions to use to generate the sample NWBs (See specification for the setup #5).

  • We write & run a script that creates a virtualenv for each version set, installs the given versions of the packages in the venv, and then runs the appropriate NWB-generating script(s) in the venv to produce NWBs generated by all of the package combinations.

    • Do we also want to run this script in different Python versions? Docker would help with that.
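
    A rough sketch of such a driver, assuming each version set is a plain list of pip requirement pins (the real structure is to be settled in #5) and that the generation script above is called generate_samples.py:

        # POSIX-only sketch: one venv per version set, pinned installs, then run
        # the sample-generation script inside that venv.
        import subprocess
        import venv
        from pathlib import Path

        # hypothetical version sets; the real list would come from the #5 config
        VERSION_SETS = [
            ["pynwb==1.4.0", "hdmf==2.4.0"],
            ["pynwb==1.4.0", "hdmf==2.4.0", "ndx-events==0.2.0"],
        ]


        def build_samples(pins, workdir: Path):
            env_dir = workdir / "venv"
            venv.create(env_dir, with_pip=True)
            pip = env_dir / "bin" / "pip"
            python = env_dir / "bin" / "python"
            subprocess.run([str(pip), "install", *pins], check=True)
            # where exactly the generated samples land is elided here
            subprocess.run([str(python), "generate_samples.py"], check=True)


        for i, pins in enumerate(VERSION_SETS):
            build_samples(pins, Path(f"build/set{i}"))
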
  • Assuming that the sample-production environments are parametrized by nothing more than a pynwb version, an hdmf version, and at most one extension + extension version, we could arrange the layout of the sample repository like so:

    pynwb-{version}/
        hdmf-{version}/
            core/
                # Sample NWBs using the given pynwb & hdmf versions and no extensions
            {extension1}-{version1}/
                # Sample NWBs using the given pynwb & hdmf versions and the given version of the given extension
            {extension1}-{version2}/
            {extension2}-{version1}/
            # etc.
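
    A tiny helper mirroring that layout (function and directory names illustrative only):

        from pathlib import Path


        def sample_dir(pynwb_version, hdmf_version, extension=None, ext_version=None):
            """Directory holding the samples for one package combination."""
            base = Path(f"pynwb-{pynwb_version}") / f"hdmf-{hdmf_version}"
            if extension is None:
                return base / "core"
            return base / f"{extension}-{ext_version}"

        # sample_dir("1.4.0", "2.4.0")                         -> pynwb-1.4.0/hdmf-2.4.0/core
        # sample_dir("1.4.0", "2.4.0", "ndx-events", "0.2.0")  -> pynwb-1.4.0/hdmf-2.4.0/ndx-events-0.2.0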
    
  • We write tests that iterate through the sample NWBs and test that they can be loaded successfully and hold the expected data.

    • We can manage which sample NWBs get loaded by passing a list of installed extensions to a custom pytest command-line option (sketched below); the tests can then examine the provenance files and skip any NWBs generated by extensions that aren't available (and perhaps also skip any files generated by package versions later than those installed?).
      • Alternatively, each test could use pytest.importorskip() (assuming it's applicable to pynwb extensions) to avoid being run if the relevant extension isn't installed, and then each extension-specific test could be parametrized by the extension-specific NWB files (either a hardcoded list or generated via filepath iteration & provenance inspection).
    • Question: How should the expected data be stored? Normally, I'd suggest storing the expected data in JSON files, but the presence of datetime fields makes that impractical. Should we pickle the expected values? Or will they all be small enough that it's practical to store them in the Python source?
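
    One possible shape for the custom-option route described above; the --extensions option name, sidecar layout, and samples/ directory are all assumptions:

        # conftest.py
        import json
        from pathlib import Path

        import pytest
        from pynwb import NWBHDF5IO


        def pytest_addoption(parser):
            parser.addoption("--extensions", default="",
                             help="comma-separated list of installed extensions")


        def pytest_generate_tests(metafunc):
            # parametrize any test with an `nwb_path` argument over all sample files
            if "nwb_path" in metafunc.fixturenames:
                metafunc.parametrize("nwb_path",
                                     sorted(Path("samples").rglob("*.nwb")), ids=str)


        # test_samples.py
        def test_load(nwb_path, request):
            available = set(filter(None,
                                   request.config.getoption("--extensions").split(",")))
            sidecar = nwb_path.with_name(nwb_path.name + ".json")
            provenance = json.loads(sidecar.read_text())
            needed = set(provenance.get("extensions", []))
            if not needed <= available:
                pytest.skip(f"missing extensions: {sorted(needed - available)}")
            with NWBHDF5IO(str(nwb_path), "r") as io:
                nwbfile = io.read()
            assert nwbfile.identifier  # plus extension-specific content checks

    Run as, e.g., pytest --extensions ndx-events.
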
  • We compile a list of sets of pynwb, hdmf, and extension versions to test the sample NWBs against (See specification for the setup #5).

  • We use Jinja2 templates to generate GitHub Actions workflows from the version sets (one workflow per extension) that install the specified package combinations and then run the tests against the relevant sample NWBs.
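
    A sketch of that generation step, assuming each version set has already been flattened into a list of pip requirement pins; the template content, file names, and trigger schedule are placeholders:

        from pathlib import Path

        from jinja2 import Template

        # one job per version set; a matrix would work too
        WORKFLOW = Template("""\
        name: test-{{ extension }}
        on:
          push:
          schedule:
            - cron: "0 6 * * *"
        jobs:
        {% for pins in version_sets %}
          test-{{ loop.index }}:
            runs-on: ubuntu-latest
            steps:
              - uses: actions/checkout@v2
              - uses: actions/setup-python@v2
              - run: pip install {{ pins | join(' ') }}
              - run: pytest --extensions {{ extension }}
        {% endfor %}
        """)


        def write_workflow(extension, version_sets):
            out = Path(".github/workflows") / f"test-{extension}.yml"
            out.parent.mkdir(parents=True, exist_ok=True)
            out.write_text(WORKFLOW.render(extension=extension, version_sets=version_sets))


        # e.g. write_workflow("ndx-events", [["pynwb==1.4.0", "hdmf==2.4.0", "ndx-events==0.2.0"]])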

  • We use Jinja2 templates to generate a README showing a grid of CI status badges for each workflow.
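
    And a companion sketch for the README; this renders a simple one-badge-per-extension table rather than a full grid, and the repository slug is a placeholder:

        from jinja2 import Template

        README = Template("""\
        # Sample NWB health status

        | Extension | Status |
        | --- | --- |
        {% for ext in extensions -%}
        | {{ ext }} | ![{{ ext }}](https://github.com/{{ repo }}/actions/workflows/test-{{ ext }}.yml/badge.svg) |
        {% endfor %}""")

        print(README.render(repo="ORG/REPO", extensions=["ndx-events"]))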

@jwodder

jwodder commented Mar 30, 2021

@yarikoptic Possible alternative way of structuring the production & testing code that may have been more in line with what you were getting at in #2:

  • For each extension (and also for "no extension"), we create a Python module containing one or more classes (possibly inheriting from some ABC to serve as a marker). Each class uses that extension and provides one method for producing a sample NWB and another method that takes a loaded NWB file produced by the first method and asserts that it contains the expected contents (see the sketch after this list).
  • In our specification for the setup #5 config file, we map each extension to its corresponding Python module.
  • The sample NWB production script then loads the modules for the available extensions and instantiates & calls the classes within.
  • The testing code likewise uses the modules to check the corresponding sample files. (I still need to work out the details of how the tests would be organized for this.)
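
A minimal sketch of what one such module could look like, assuming nothing beyond the plain pynwb API; the base-class and method names are illustrative only:

    from abc import ABC, abstractmethod
    from datetime import datetime, timezone

    from pynwb import NWBFile


    class SampleCase(ABC):
        """Marker base class: one sample NWB plus its content check."""

        EXTENSION = None  # e.g. "ndx-events"; None for the extensionless core case

        @abstractmethod
        def make_nwbfile(self) -> NWBFile:
            """Build the in-memory NWBFile to be written out as a sample."""

        @abstractmethod
        def check_nwbfile(self, nwbfile: NWBFile) -> None:
            """Assert that a loaded sample contains the expected contents."""


    class CoreBasic(SampleCase):
        def make_nwbfile(self) -> NWBFile:
            return NWBFile(
                session_description="minimal extensionless sample",
                identifier="core-basic",
                session_start_time=datetime(2021, 3, 30, tzinfo=timezone.utc),
            )

        def check_nwbfile(self, nwbfile: NWBFile) -> None:
            assert nwbfile.identifier == "core-basic"
            assert nwbfile.session_description == "minimal extensionless sample"

The production script would then discover the SampleCase subclasses in each available module, call make_nwbfile() and write the results out, and the tests would call check_nwbfile() on whatever they read back.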
