Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "canary" workflow for HMR simulations [WIP] #22

Merged
merged 10 commits into from
Oct 13, 2020
Merged
2 changes: 1 addition & 1 deletion .github/workflows/CI.yml
Original file line number Diff line number Diff line change
Expand Up @@ -62,4 +62,4 @@ jobs:
- name: Test the package
shell: bash -l {0}
run: |
pytest -v
pytest -v openforcefields/tests/
76 changes: 76 additions & 0 deletions .github/workflows/canary.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
name: Canary tests

on:
push:
branches:
- "master"
pull_request:
branches:
- "master"

jobs:
test:
name: Canary tests on ${{ matrix.cfg.os }}, Python ${{ matrix.python-version }}
runs-on: ${{ matrix.cfg.os }}
strategy:
fail-fast: false
matrix:
python-version:
- 3.6
- 3.7
cfg:
- os: ubuntu-latest
- os: macOS-latest

steps:
- uses: actions/checkout@v2
with:
fetch-depth: 0

- name: Find if any OFFXML files were added
shell: bash -l {0}
run: |
if [[ $(git diff origin/master HEAD --name-only | grep "\.offxml") ]]; then
echo "new_off_xml=true" >> $GITHUB_ENV
fi
echo $GITHUB_ENV

- name: Setup conda
if: ${{ env.new_off_xml == 'true' }}
uses: conda-incubator/setup-miniconda@v1
with:
python-version: ${{ matrix.python-version }}
environment-file: devtools/conda-envs/canary_env.yaml
channels: conda-forge,defaults,omnia
activate-environment: test
auto-update-conda: true
auto-activate-base: false
show-channel-urls: true

- name: Environment Information
if: ${{ env.new_off_xml == 'true' }}
shell: bash -l {0}
run: |
conda info
conda list

- name: Additional info about the build
if: ${{ env.new_off_xml == 'true' }}
shell: bash
run: |
uname -a
df -h
ulimit -a

- name: Install package
if: ${{ env.new_off_xml == 'true' }}
shell: bash -l {0}
run: |
conda remove --force openforcefields
python -m pip install -e .

- name: Run HMR canary tests
if: ${{ env.new_off_xml == 'true' }}
shell: bash -l {0}
run: |
git diff origin/master --name-only | grep "\.offxml" | python canary/scripts/test_hmr.py
24 changes: 24 additions & 0 deletions canary/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
# Canary tests
### Scripts and data files used for canary tests.
These tests are designed to be quick checks for basic behavior to safeguard
against releasing bad force fields. These are **not** rigorous benchmarks or
thorough tests of correct behavior across general uses.

For implementation, see .github/workflows/canary.yaml

This is stored in a directory separate from `openforcefields/tests` because
these are **not** unit tests. If you intended to run the test suite for the
`openforcefields` package, point `pytest` to the unit test directory:

```python3
pytest -v openforcefields/tests/
```

### Data sources

* `data/coverage.smi`: Seeded from [here](https://raw.githubusercontent.com/openforcefield/open-forcefield-data/master/Utilize-All-Parameters/selected/chosen.smi)
* `data/propynes.smi`: Curated from [reported HMR failure](https://github.com/openforcefield/openforcefields/issues/19)

### Scripts

* `scripts/run_hmr.py`: Runs short (10 ps) HMR simulations with a 4 fs timestep
71 changes: 71 additions & 0 deletions canary/data/coverage.smi
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
COC(O)OC
CS
CO
CCO
C#Cc1ccccc1
C(=[NH2+])N
CS(=N)(=O)NC1CC1
CC1(CC(=O)N1)c2nccs2
# Cl
c1conc1C2(CC2)C(=O)O
C(N)P(=O)O
C(CBr)c1[nH]nnn1
C(Cl)I
[N+](=O)(O)[O-]
C1C(O1)CCCF
# [CH3-]
c1c(snn1)CNN
COC(=O)C[N+]#[C-]
c1cnc(cc1Cl)S(=O)O
CSSCCN=C=S
CNCC(Cl)Cl
Cc1c(cc(s1)Br)C2CC2
C1CS(=O)(=O)CCC12CN=C(O2)N
CCOP(=O)(N)OCC
CC(=O)OO
CN1C(=O)C=C(S1(=O)=O)Cl
COc1cc(c(cc1I)O)F
CSC(=S)NN
C(C#N)N=[N+]=[N-]
CC(=C)C1(CC1)CON
COC(=O)Cc1cs/c(=N/S(=O)(=O)C)/[nH]1
C1CC1NC(=O)CN2CC(CO2)O
CCC1CC1(c2ccc(nn2)OC)N
CO/N=[N+](/C(Br)Br)\[O-]
CC(=O)[O-]
CS(=O)Cl
C(C(F)(Cl)Br)(F)(F)Br
CC1(C2C1CCC2)O
CNC(=O)NCOC
c1cc(ccc1[N+](=O)[O-])S(=O)(=O)NCl
CC1CN(S(=O)(=O)C1)NC
CNC(=O)S
CC1(COP(=S)(OC1)S)C
C[N+](C)(C)C
c1ccc(cc1)P(=O)(Cl)Cl
CCc1ccc(cc1)N=S=O
CC(c1[nH][nH]c(=O)n1)NC#N
CCS(=O)(=O)n1cc(cn1)O
c1cc(ccc1OC2CC2)I
c1ccc(cc1)S(F)(F)(F)(F)F
CCC(CCl)Cl
CC(=O)c1ccc(cc1)S(=C)(=C)C
COc1ccc(cc1OC)c2ccc3ccccc3[o+]2
F[P-](F)(F)(F)(F)F
CC(C)S(=O)(=O)Br
C1CC(=O)N(C1=O)Br
C=C=CN1CCCC1=O
CC(C)C(=O)NNC=O
CC1(C(=O)OOC1(C)C)C
CC(=O)OC#COC(=O)C
C1[C@@H]2[C@@H]([C@H]2O)CN1
C[n+]1ccc(cc1)C(=O)N
CC1(CC1(C(=O)N)C(=O)O)C
c1cc(nc(c1N(=O)O)N)Cl
COP(=O)(NC(=O)C(Cl)(Cl)Cl)OC
c1ccc(c(c1)S(=O)(=O)N=N#N)S(=O)(=O)N=N#N
Cn1cccc1C(=S)N=P(N(C)C)(N(C)C)N(C)C
c1ccc(cc1)S(=O)(=O)N(F)S(=O)(=O)c2ccccc2
c1ccc(cc1)P(c2ccccc2)(c3ccccc3)(I)I
c1ccc(cc1)CP(Cc2cccc(c2)F)(c3ccccc3)(c4ccccc4)Br
c1cc(ccc1c2ccc(cc2)OSOc3ccc(cc3)Cl)OSOc4ccc(cc4)Cl
2 changes: 2 additions & 0 deletions canary/data/propynes.smi
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
C#CC
c1ccccc1C#CC
97 changes: 97 additions & 0 deletions canary/scripts/test_hmr.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
import sys
from pathlib import Path

import numpy as np
from openforcefield.topology import Molecule
from openmmforcefields.generators import SystemGenerator
from pkg_resources import resource_filename
from simtk import openmm, unit
from simtk.openmm import app

DATA_PATH = Path(resource_filename("openforcefields", "../canary/data/")).resolve()

coverage_mols = DATA_PATH / "coverage.smi"
propyne_mols = DATA_PATH / "propynes.smi"


class NANEnergyError(Exception):
"""Base exception for a system with NaN-like potential energy"""


class CanaryError(Exception):
"""Base exception for canary"""


class HMRCanaryError(CanaryError):
"""Exception for an HMR canary test failing"""


def hmr_driver(mol, ff_name):
"""Given an OpenFF Molecule, run a short 4 fs HMR simulation. This function is adapted from
https://github.com/openforcefield/openforcefields/issues/19#issuecomment-689816995"""
print(
f"Running HMR with force field {ff_name} and molecule with SMILES {mol.to_smiles()}"
)

forcefield_kwargs = {
"constraints": app.HBonds,
"rigidWater": True,
"removeCMMotion": False,
"hydrogenMass": 4
* unit.amu, # Does this also _subtract_ mass from heavy atoms?:w
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. So carbons get really light sometimes.

}

system_generator = SystemGenerator(
small_molecule_forcefield=ff_name,
forcefield_kwargs=forcefield_kwargs,
molecules=mol,
)
system = system_generator.create_system(mol.to_topology().to_openmm())

temperature = 300 * unit.kelvin
collision_rate = 1.0 / unit.picoseconds
timestep = 4.0 * unit.femtoseconds

integrator = openmm.LangevinIntegrator(temperature, collision_rate, timestep)
context = openmm.Context(system, integrator)
mol.generate_conformers(n_conformers=1)
context.setPositions(mol.conformers[0])

# Run for 10 ps
integrator.step(2500)

state = context.getState(getEnergy=True)
pot = state.getPotentialEnergy()
# OpenMM will silenty "fail" if energies aren't explicitly checked
if np.isnan(pot / pot.unit):
raise NANEnergyError()


if __name__ == "__main__":
"""This function expects to be called with a list of OFFXML files passed to it,
i.e. piped from git diff upstream/master --name-only"""
# Read force field filenames from stdin
for line in sys.stdin:
if "_unconstrained" in line:
continue
ff_name = line.split("/")[-1][:-8]
# Molecule.from_file fails on pathlib.Path ojects, despite being str-like
hmr_mols = Molecule.from_file(
str(propyne_mols), file_format="smi", allow_undefined_stereo=True
)
# TODO: Add coverage set, with known failures stripped out
# hmr_mols = Molecule.from_file(str(coverage_mols), file_format='smi', allow_undefined_stereo=True)
failed_runs = []
for mol in hmr_mols:
try:
hmr_driver(mol, ff_name)
except NANEnergyError:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On linux, NaN failures don't seem to be getting caught. Not a huge deal since failing is the whole point of the canary, but it may be helpful to make it so that multiple NaNs can be caught and a summary of failures can be shown to users.

Copy link
Member Author

@mattwthompson mattwthompson Oct 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, we run into one of those "un-Pythonic" behaviors in OpenMM, as the triggered exception cannot be caught.

>>> from simtk.openmm import OpenMMException
>>> try:
...     raise OpenMMException
... except OpenMMException:
...     print('caught!')
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: exceptions must derive from BaseException

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
TypeError: catching classes that do not inherit from BaseException is not allowed

I can just catch it later, but not with the specificity we'd normally prefer (catching more than less is better for these tests, anyway):

(test) root@af8cfd33010c:/openforcefields# echo openforcefields/offxml/openff-1.2.0.offxml | python canary/scripts/test_hmr.py -vs
Warning: Unable to load toolkit 'OpenEye Toolkit'. The Open Force Field Toolkit does not require the OpenEye Toolkits, and can use RDKit/AmberTools instead. However, if you have a valid license for the OpenEye Toolkits, consider installing them for faster performance and additional file format support: https://docs.eyesopen.com/toolkits/python/quickstart-python/linuxosx.html OpenEye offers free Toolkit licenses for academics: https://www.eyesopen.com/academic-licensing
[16:05:40] WARNING: no name column found on line 0
[16:05:40] WARNING: no name column found on line 1
Running HMR with force field openff-1.2.0 and molecule with SMILES [H][C]#[C][C]([H])([H])[H]
/bin/bash: /opt/conda/envs/test/lib/libtinfo.so.6: no version information available (required by /bin/bash)
/bin/bash: /opt/conda/envs/test/lib/libtinfo.so.6: no version information available (required by /bin/bash)
/bin/bash: /opt/conda/envs/test/lib/libtinfo.so.6: no version information available (required by /bin/bash)
/bin/bash: /opt/conda/envs/test/lib/libtinfo.so.6: no version information available (required by /bin/bash)
/bin/bash: /opt/conda/envs/test/lib/libtinfo.so.6: no version information available (required by /bin/bash)
/bin/bash: /opt/conda/envs/test/lib/libtinfo.so.6: no version information available (required by /bin/bash)
Running HMR with force field openff-1.2.0 and molecule with SMILES [H][c]1[c]([H])[c]([H])[c]([C]#[C][C]([H])([H])[H])[c]([H])[c]1[H]
/bin/bash: /opt/conda/envs/test/lib/libtinfo.so.6: no version information available (required by /bin/bash)
/bin/bash: /opt/conda/envs/test/lib/libtinfo.so.6: no version information available (required by /bin/bash)
/bin/bash: /opt/conda/envs/test/lib/libtinfo.so.6: no version information available (required by /bin/bash)
/bin/bash: /opt/conda/envs/test/lib/libtinfo.so.6: no version information available (required by /bin/bash)
/bin/bash: /opt/conda/envs/test/lib/libtinfo.so.6: no version information available (required by /bin/bash)
/bin/bash: /opt/conda/envs/test/lib/libtinfo.so.6: no version information available (required by /bin/bash)
Traceback (most recent call last):
  File "canary/scripts/test_hmr.py", line 94, in <module>
    raise HMRCanaryError(failed_runs)
__main__.HMRCanaryError: [[Molecule with name '' and SMILES '[H][C]#[C][C]([H])([H])[H]', 'openff-1.2.0', 'NaN Position(s)'], [Molecule with name '' and SMILES '[H][c]1[c]([H])[c]([H])[c]([C]#[C][C]([H])([H])[H])[c]([H])[c]1[H]', 'openff-1.2.0', 'NaN Position(s)']]

failed_runs.append([mol, ff_name, "NaN energy"])
except Exception:
# OpenMM's OpenMMException cannot be caught as it does not
# inherit from BaseException; therefore this clause may
# hit other errors than NaN positions
failed_runs.append([mol, ff_name, "NaN position(s)"])

if len(failed_runs) > 0:
raise HMRCanaryError(failed_runs)
18 changes: 18 additions & 0 deletions devtools/conda-envs/canary_env.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
name: test

channels:
- conda-forge
- omnia

dependencies:
# Base depends
- python
- pip

# Testing
- pytest
- openmm
- openmmforcefields

# Standard dependencies
- openforcefield