Integrating physics-ml potentials (#27)

* Remove unused implementation parameter in YAML files and potential factory class
* Refactor potential initialization code
* Refactor test functions to use PotentialFactory
* Refactor test functions to use parameterized inputs
* allow multiple potentials
* Fix potential typo and update physicsml-model configuration
* tests are passing locally
* add pretrained mace
* adopt names
* refactor
* Refactor potential initialization code and update YAML files
* update yaml
* update name
* update path
* update docstring
* add docs
* add type hint
* Update README.md
* Update README.md
* Update README.md
* Update setup.py
* Update setup.py
* Update setup.py
* Update setup.py
* Update setup.py
* Removed eval and added device in yaml and setup.
* bugfix
* Update README.md
* expression to float
* expression to float
* cast precision to str
* expression to float
* fix passing of platform properties
* better visualization
* Fixed multiplication error
* Fixed output file name bug
  Added conditions for the name of the nnp to avoid the openmmml pointer name being used in the file name.
* Added experimental water rdf to output
  Useful for comparing against the NNP calculated water rdfs
* Fixed DOF test
  PDB writer fixed, positions set correctly and system generated from opt not name.
* First round of fixing tests
* Removed stability tests output data and more test fixing
* Removed generic .dcd and .pdb from gitignore
  We need to upload some pdb and dcd files for testing purposes and cannot ignore them globally.
* Second round of fixing tests
* Third round of fixing tests.
* Added maximum bond length user option for DOF scan
* Removed exs physicsml and fixed typo

---------

Co-authored-by: exs-adambaskerville <[email protected]>
Exscientia · Jun 5, 2024 · 40b7065
1 parent c636097
commit 40b7065
Showing 108 changed files with 1,934 additions and 5,839 deletions.
diff --git a/.gitignore b/.gitignore
@@ -108,8 +108,6 @@ ENV/
 # In-tree generated files
 */_version.py
 scripts/test_stability_protocol/
-*pdb
-*dcd
 *pt
 test_stability_protocol/
 guardowl/data/drugbank/
diff --git a/README.md b/README.md
@@ -20,10 +20,21 @@
 
 StableNetGuardOwl provides a robust suite for conducting stability tests on Neural Network Potentials (NNPs) used in molecular simulations. These tests are critical in validating NNPs for research and industrial applications, ensuring accuracy and reliability.
 
+
+## Installation
+
+Since `openMM` and `physics-ml` are installed with different package managers (conda or mamba, and pip), obtaining an environment with the correct packages is not trivial.
+The following sequence has worked in the past; the installation order is critical for a working environment:
+```bash
+mamba create --name owl python=3.11
+mamba activate owl
+pip install "physicsml[openmm, openeye, rdkit]"
+mamba install openmm-ml pytorch-gpu -c conda-forge
+mamba install openmmtools loguru typer openff-toolkit 
+```
 ## Features
 
 StableNetGuardOwl supports stability tests for NNPs integrated with `openMM` and those implemented within `openmm-ml` or the Exscientia `physics-ml` package.  
-Currently this supports a range of NNPs including but not limited to `SchNET`, `PaiNN`, `MACE`, and `nequip`.
 
 ## Test Matrix
 
@@ -67,16 +78,15 @@ There is an example `test_config.yaml` file provided in the `scripts` directory
 For a stability test using a pure 15 Angstrom waterbox the `config.yaml` file may look like this
 ```
 tests:
-  - protocol: "waterbox_protocol"  # which protocol is performed
+  - protocol: "waterbox_test"  # which protocol is performed
     edge_length: 15                # waterbox edge length in Angstrom
     ensemble: "NVT"                # thermodynamic ensemble that is used. Other options are 'NpT' and 'NVE'.
     nnp: "ani2x"                   # the NNP used
-    implementation: "nnpops"       # the implementation if multiple are available
     annealing: false               # simulated annealing to slowly reheat the system at the beginning of a simulation
     nr_of_simulation_steps: 10_000 # number of simulation steps
     temperature: 300               # in Kelvin
 ```
-It defines the potential (nnp and implementation), the number of simulation steps, temperature in Kelvin, and edge length of the waterbox in Angstrom as well as the thermodynamic ensemble (`NVT`). Passing this to the `perform_guardowls.py` script runs the tests
+It defines the potential, the number of simulation steps, the temperature in Kelvin, the edge length of the waterbox in Angstrom, and the thermodynamic ensemble (`NVT`). Passing this file to the `perform_guardowls.py` script runs the tests.
 
 To visualize the results, use the `visualize_results.ipynb` notebook.
 
@@ -105,7 +115,6 @@ To perform a DOF scan over a bond in ethanol you need to generate a yaml file co
 tests:
   - protocol: "perform_DOF_scan"
     nnp: "ani2x"
-    implementation: "torchani"
     DOF_definition: { "bond": [0, 2] }
     molecule_name: "ethanol"
 ```
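
Since this commit drops the `implementation` key from both example configurations, older config files may still carry it. The sketch below shows one way such entries could be checked before a run. The helper `check_test_entry`, the set of required keys, and the wording of the messages are assumptions made for illustration and are not part of the package; the ensemble names come from the README comment above. The entries are written as the dictionaries a YAML loader would be expected to produce.

```python
from typing import Any, Dict, List

# Ensemble names as listed in the README comment
ALLOWED_ENSEMBLES = {"NVT", "NpT", "NVE"}
# Key removed from the YAML files by this commit
REMOVED_KEYS = {"implementation"}
# Assumption: every test entry needs at least these two keys
REQUIRED_KEYS = ("protocol", "nnp")


def check_test_entry(entry: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list the problems found in one `tests:` entry."""
    problems = []
    for key in sorted(REMOVED_KEYS.intersection(entry)):
        problems.append(f"'{key}' is no longer a supported key")
    for key in REQUIRED_KEYS:
        if key not in entry:
            problems.append(f"missing required key '{key}'")
    ensemble = entry.get("ensemble")
    if ensemble is not None and ensemble not in ALLOWED_ENSEMBLES:
        problems.append(f"unknown ensemble '{ensemble}'")
    return problems


# The waterbox example from the README after this commit
waterbox = {
    "protocol": "waterbox_test",
    "edge_length": 15,
    "ensemble": "NVT",
    "nnp": "ani2x",
    "annealing": False,
    "nr_of_simulation_steps": 10_000,
    "temperature": 300,
}

# The DOF scan example as it looked before this commit
old_dof_scan = {
    "protocol": "perform_DOF_scan",
    "nnp": "ani2x",
    "implementation": "torchani",
    "DOF_definition": {"bond": [0, 2]},
    "molecule_name": "ethanol",
}

print(check_test_entry(waterbox))
print(check_test_entry(old_dof_scan))
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in a config file at once.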

diff --git a/devtools/conda-envs/test_env.yaml b/devtools/conda-envs/test_env.yaml
@@ -8,6 +8,7 @@ dependencies:
   - python
   - pip
   - openmm>=8.0
+  - openmm-ml
   - openmm-torch
   - openff-toolkit
   - openmmtools
@@ -27,6 +28,7 @@ dependencies:
   - pytest-cov
   - codecov
   - black
+  - rdkit
 
     # Testing
   - pytest
@@ -35,6 +37,4 @@ dependencies:
 
     # Pip-only installs
   - pip:
-      - nvidia-ml-py3
-      - nptyping
-      - git+https://github.com/openmm/openmm-ml.git
+      - physicsml
diff --git a/guardowl/analysis.py b/guardowl/analysis.py
@@ -2,6 +2,7 @@
 import numpy as np
 from typing import List, Tuple
 from loguru import logger as log
+from pathlib import Path
 
 
 class PropertyCalculator:
@@ -86,6 +87,31 @@ def calculate_water_rdf(self) -> np.ndarray:
         )
         return rdf_result
 
+    def experimental_water_rdf(self) -> Tuple[np.ndarray, np.ndarray]:
+        """
+        Returns the experimental O-O radial distribution function (RDF) of
+        water, read from the file data/experimental_water_rdf.txt.
+
+        Returns
+        -------
+        Tuple[np.ndarray, np.ndarray]
+            The distances in nm and the corresponding RDF values.
+        """
+        # locate the data file relative to this module (not the cwd)
+        base_path = Path(__file__).parent
+        exp_rdf_path = (base_path / "data/experimental_water_rdf.txt").resolve()
+
+        # load experimental water rdf data (column 0: r in Angstrom, column 1: g(r))
+        rdf_data = np.loadtxt(exp_rdf_path)
+
+        # convert Angstrom to nm for use with mdtraj
+        rdf_x = rdf_data[:, 0] / 10
+
+        rdf_y = rdf_data[:, 1]
+
+        # return O-O data
+        return rdf_x, rdf_y
+
     def _extract_water_bonds(self) -> List[Tuple[int, int]]:
         bond_list = []
         for bond in self.md_traj.topology.bonds:
@@ -217,5 +243,5 @@ def monitor_phi_psi(self) -> Tuple[np.ndarray, np.ndarray]:
 
         """
         _, phi_angles = md.compute_phi(self.md_traj)
-        _, psi_angle = md.compute_psi(self.md_traj)
+        _, psi_angles = md.compute_psi(self.md_traj)
         return (phi_angles, psi_angles)
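
The `experimental_water_rdf` method added in this file reads a two-column text file and converts the distance column from Angstrom to nm so that it lines up with `mdtraj` output. A minimal sketch of that load-and-convert step on flat 1-D arrays is shown below. The in-memory file and its three rows are placeholder numbers chosen for illustration, not experimental values, and the column layout (distance first, g(r) second) is an assumption read off the method's comments.

```python
import io

import numpy as np

# Placeholder two-column data (r in Angstrom, g(r)); NOT experimental values
placeholder_file = io.StringIO("2.5 0.0\n2.8 2.5\n3.5 0.8\n")

# np.loadtxt accepts a file-like object and returns an array of shape (n_points, 2)
rdf_data = np.loadtxt(placeholder_file)

# Plain integer column indices give 1-D arrays; 1 nm = 10 Angstrom
r_nm = rdf_data[:, 0] / 10.0
g_r = rdf_data[:, 1]

print(r_nm.shape, g_r.shape)
```

Indexing with `rdf_data[:, 0]` instead of `rdf_data[:, [0]]` yields a flat array of shape `(n_points,)` rather than a column of shape `(n_points, 1)`, which is usually the more convenient form for plotting against a computed RDF.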
diff --git a/guardowl/benchmark.py b/guardowl/benchmark.py
@@ -65,7 +65,6 @@ def __init__(  # type: ignore
         platform: str,
         qml_timing,
         reference_timing,
-        implementation: str = "",
     ) -> None:
         Process.__init__(self)
         self.simulation_factory = SimulationFactory()
@@ -74,7 +73,6 @@ def __init__(  # type: ignore
         self.nnp = nnp
         self.remove_constraints = remove_constraints
         self.platform = platform
-        self.implementation = implementation
         self.qml_timing = qml_timing
         self.reference_timing = reference_timing
 
@@ -97,14 +95,13 @@ def get_timing_for_spe_calculation(
     def run(self) -> None:
         # this is executed as soon as the process is started
 
-        print(f"{self.implementation=} {self.platform=}")
+        print(f"{self.platform=}")
         potential = MLPotential(self.nnp)
 
         system = self.system_factory.initialize_system(
             potential,
             self.testsystem.topology,
             self.remove_constraints,
-            implementation=self.implementation,
         )
 
         psim = self.simulation_factory.create_simulation(
@@ -164,7 +161,6 @@ def run_benchmark(
         testsystems: Generator[TestSystem, None, None],
         remove_constraints: bool,
         platform: str,
-        implementation: str = "",
     ) -> None:
         self.reference_timing, self.qml_timing, self.gpu_memory = [], [], []
         # start memory logger
@@ -186,7 +182,6 @@ def run_benchmark(
                 platform,
                 qml_timing,
                 reference_timing,
-                implementation,
             )
             simulation_test.start()
             log.info("Started simulation")