Skip to content

Commit 26deeb8

Browse files
authored
atomate2 / OpenMM OPLS-AA Enhancements (#1111)
* added opls_lj function; pre-commit fails for (1) eps._value call, perhaps this is fine? and (2) usused eps14 variable (need to check that eps == eps14) * eps14 is not the correct value, off by 0.5 and eps should be used instead; solution: comment out eps14 line to match original ligpargen code * initial commit to implementing opls_lj with error checking for proper 14 interaction scale values (i.e., 0.5); changed create_system_from_xml to create_ff_from_xml to enable generate_openmm_interchange to have access to combination rules only available via the forcefield...still experiencing NoneType error when creating OpenMMInterchange object which prohibits inspection of OpenMMInterchange in generate_openmm_interchange * the use of io.StringIO is seemingly a bug? works fine without needing to convert xml to string, leaving this commented for now before checking other cases * added try/except to handle io.StringIO failure * added 1-4 scaling conditional statement; fized opls nonbonded method to cutoff; incorporated diffusedxml to follow linter guidelines * try/except failure should specify selenium or webdriver_manager packages missing * added generate_opls_xml function with placeholder for ligpargen commands * fixed shutil.copy to shutil.move to avoid redundant files in tmpdir * fixed linter issues (namely, subprocess.run(...shell=True) was unsafe and solution was to .split() command into a list of strings * completed OPLS Docker documentation + except for shifter issues, generate_opls_xml(...) is complete * completed boss->ligpargen->docker->podman-hpc->subprocess pipeline to generate OPLS .xml files * minor typo in docs * updated tests for download_ and generate_opls_xml functions * removed defusedxml dependency * removed textwrap dependency * replaced tag-based opls flag to ff_kwargs for opls * added names_params documentation * test_create_ff_from_xml * pre-commit * ff_kwargs is not None * all new tests pass pytest * added monkeypatching env variables * fixed docker and/or podman-hpc edge case * removed rogue file
1 parent 9487bf6 commit 26deeb8

File tree

6 files changed

+399
-30
lines changed

6 files changed

+399
-30
lines changed

docs/user/codes/openmm.md

Lines changed: 138 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -446,6 +446,144 @@ run_locally(flows[rank], ensure_success=True)
446446

447447
</details>
448448

449+
### Varying Forcefields: OPLS
450+
451+
<details>
452+
<summary>Learn to generate OPLS Forcefield Parameters</summary>
453+
454+
The OpenFF Force Fields provide a powerful starting point to simulate a variety of organic materials using general forcefields like [Parsley](https://doi.org/10.1021/acs.jctc.1c00571) and [Sage](https://pubs.acs.org/doi/10.1021/acs.jctc.3c00039). Just as is done through the OpenFF Toolkit and Interchange machinery, one can automate force field generation for custom force fields. For instance, LigParGen is an automatic OPLS-AA parameter generator for small organic molecules with both a [online server](https://traken.chem.yale.edu/ligpargen/) and open-source [repository](https://traken.chem.yale.edu/ligpargen/). You will see that for any custom parameter generation tool, one can create a container environment as a wrapper to plug into the workflow described up until now.
455+
456+
To do so, you will use the `generate_opls_xml(...)` function in `atomate2/openmm/utils`. This function runs a subprocess to call an image of the LigParGen repository (and all of its respective dependencies). Thus, this requires a local installation of [Docker](https://docs.docker.com/get-started/get-docker/) (otherwise, `download_opls_xml` can be run via the LigParGen website server instead). Once you have docker installed locally, `generate_opls_xml(...)` can be unlocked in three steps:
457+
458+
#### 1. Create a Private LigParGen Image
459+
460+
You will need to install [BOSS](https://zarbi.chem.yale.edu/software.html)--once you receive the email, follow the instructions, LICENSE guidelines, and save the `boss` directory in the same directory as the following `Dockerfile`:
461+
462+
```bash
463+
FROM ubuntu:20.04
464+
465+
LABEL org.opencontainers.image.version="20.04"
466+
LABEL org.opencontainers.image.ref.name="ubuntu"
467+
468+
ARG LAUNCHPAD_BUILD_ARCH
469+
ARG RELEASE
470+
471+
RUN dpkg --add-architecture i386 && \
472+
apt-get update && \
473+
apt-get install -y \
474+
libc6:i386 \
475+
libncurses5:i386 \
476+
libstdc++6:i386 \
477+
zlib1g:i386 \
478+
gcc-multilib \
479+
g++-multilib \
480+
binutils \
481+
git \
482+
curl \
483+
libxrender1 \
484+
csh && \
485+
apt-get clean
486+
487+
RUN curl -L -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
488+
bash Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda && \
489+
rm Miniconda3-latest-Linux-x86_64.sh && \
490+
/opt/conda/bin/conda init
491+
492+
RUN /opt/conda/bin/conda create -n ligpargen -y python=3.7 && \
493+
/opt/conda/bin/conda install -n ligpargen -y -c rdkit rdkit && \
494+
/opt/conda/bin/conda install -n ligpargen -y -c conda-forge openbabel
495+
496+
ENV PATH="/opt/conda/envs/ligpargen/bin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
497+
498+
RUN git clone https://github.com/Isra3l/ligpargen.git /opt/ligpargen && \
499+
cd /opt/ligpargen && \
500+
/opt/conda/envs/ligpargen/bin/pip install -e .
501+
502+
COPY ./boss /opt/BOSSdir
503+
504+
RUN chmod +x /opt/BOSSdir/*
505+
506+
ENV BOSSdir="/opt/BOSSdir"
507+
508+
WORKDIR /opt/output
509+
510+
RUN echo "source activate ligpargen" > ~/.bashrc
511+
512+
SHELL ["/bin/bash", "-c"]
513+
514+
CMD ["/bin/bash"]
515+
```
516+
517+
It will help to have an account either via [DockerHub](https://hub.docker.com/) or the [NERSC registry](https://registry.nersc.gov/account/sign-in?redirect_url=%2Fharbor%2Fprojects) to *privately* upload your image. Next, follow the docker commands to upload an image to your registry of choice:
518+
519+
```bash
520+
docker build -t USERNAME/ligpargen .
521+
docker login
522+
docker push USERNAME/ligpargen
523+
```
524+
Note: Be sure to check that on DockerHub, the image `Visibility` is set to `Private`.
525+
526+
Now, you can simply pull your image to which ever HPC cluster environment you choose to proceed with,
527+
528+
```bash
529+
docker pull USERNAME/ligpargen:latest
530+
```
531+
532+
On NERSC, users have the option of using [Shifter](https://docs.nersc.gov/development/containers/shifter/how-to-use/) or [Podman](https://docs.nersc.gov/development/containers/podman-hpc/overview/). We recommend Podman in this case to circumvent additional user-level permission requirements. The following Podman commands will work:
533+
534+
```bash
535+
podman-hpc login docker.io
536+
Username: USERNAME
537+
Password:
538+
539+
podman-hpc pull docker.io/USERNAME/ligpargen:latest
540+
```
541+
542+
#### 2. Set environment variables
543+
544+
Set the image name and container software (Docker, Shifter, Apptainer, etc.) to environment variables (consider adding these to your `~/.bashrc`):
545+
546+
```bash
547+
export LPG_IMAGE_NAME="USERNAME/ligpargen:latest"
548+
export CONTAINER_SOFTWARE="podman-hpc" # e.g.
549+
```
550+
551+
#### 3. Run `generate_opls_xml`
552+
553+
A simple function call will create your desired .XML force field file (e.g., `EC.xml`):
554+
555+
```python
556+
from atomate2.openmm.utils import generate_opls_xml
557+
558+
mols = {
559+
"EC": {
560+
"smiles": "C1COC(=O)O1",
561+
"charge": "0", # default_value=0
562+
"checkopt": 3, # default_value=3
563+
"charge_method": "CM1A", # default_value="CM1A"
564+
},
565+
}
566+
generate_opls_xml(mols)
567+
```
568+
569+
Functionally, this is equivalent to running the following LigParGen command:
570+
571+
```bash
572+
ligpargen -n EC -p EC -r EC -c 0 -o 3 -cgen CM1A -s C1COC(=O)O1
573+
```
574+
575+
Now, just like before, you can create an `Interchange` object. Be sure to include `opls` as an `ff_kwargs` so the correct [geometric combination rules](https://traken.chem.yale.edu/ligpargen/openMM_tutorial.html) for OPLS force fields are invoked,
576+
577+
```python
578+
elyte_interchange_job = generate_openmm_interchange(
579+
mol_specs_dicts, ff_xmls=["EC.xml"], ff_kwargs=["opls"]
580+
)
581+
```
582+
583+
See that this general process can work transferably for *any* parameter generation software given you (1) create an image, (2) set the image name as an environment variable, and (3) minimally modify `generate_opls_xml(...)` to your own requirements. In future work, we'll improve this black-box type functionality to support wider parameter generation methods.
584+
585+
</details>
586+
449587
## Analysis with Emmet
450588

451589
For now, you'll need to make sure you have a particular emmet branch installed.

src/atomate2/openmm/jobs/generate.py

Lines changed: 31 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -17,13 +17,15 @@
1717
from jobflow import Response
1818
from openmm import Context, LangevinMiddleIntegrator, System, XmlSerializer
1919
from openmm.app import PME, ForceField
20+
from openmm.app.forcefield import NonbondedGenerator
2021
from openmm.app.pdbfile import PDBFile
2122
from openmm.unit import kelvin, picoseconds
2223
from pymatgen.core import Element
2324
from pymatgen.io.openff import get_atom_map
2425

2526
from atomate2.openff.utils import create_mol_spec, merge_specs_by_name_and_smiles
2627
from atomate2.openmm.jobs.base import openmm_job
28+
from atomate2.openmm.utils import opls_lj
2729

2830
try:
2931
import openff.toolkit as tk
@@ -40,7 +42,10 @@ class XMLMoleculeFF:
4042

4143
def __init__(self, xml_string: str) -> None:
4244
"""Create an XMLMoleculeFF object from a string version of the XML file."""
43-
self.tree = ET.parse(io.StringIO(xml_string)) # noqa: S314
45+
try:
46+
self.tree = ET.parse(io.StringIO(xml_string)) # noqa: S314
47+
except ET.ParseError:
48+
self.tree = ET.parse(xml_string) # noqa: S314
4449

4550
root = self.tree.getroot()
4651
canonical_order = {}
@@ -153,11 +158,10 @@ def from_file(cls, file: str | Path) -> XMLMoleculeFF:
153158
return cls(xml_str)
154159

155160

156-
def create_system_from_xml(
157-
topology: tk.Topology,
161+
def create_ff_from_xml(
158162
xml_mols: list[XMLMoleculeFF],
159163
) -> System:
160-
"""Create an OpenMM system from a list of molecule specifications and XML files."""
164+
"""Create OpenMM forcefield from a list of molecule specifications and XML files."""
161165
io_files = []
162166
for i, xml in enumerate(xml_mols):
163167
xml_copy = copy.deepcopy(xml)
@@ -168,7 +172,7 @@ def create_system_from_xml(
168172
for i, xml in enumerate(io_files[1:]): # type: ignore[assignment]
169173
ff.loadFile(xml, resname_prefix=f"_{i + 1}")
170174

171-
return ff.createSystem(topology.to_openmm(), nonbondedMethod=PME)
175+
return ff
172176

173177

174178
@openmm_job
@@ -179,6 +183,7 @@ def generate_openmm_interchange(
179183
xml_method_and_scaling: tuple[str, float] = None,
180184
pack_box_kwargs: dict = None,
181185
tags: list[str] = None,
186+
ff_kwargs: list[str] = None,
182187
) -> Response:
183188
"""Generate an OpenMM Interchange object from a list of molecule specifications.
184189
@@ -216,6 +221,8 @@ def generate_openmm_interchange(
216221
toolkit.interchange.components._packmol.pack_box. Default is an empty dict.
217222
tags : List[str], optional
218223
A list of tags to attach to the task document.
224+
ff_kwargs : List[str], optional
225+
A list of additional keyword arguments for force field specification.
219226
220227
Returns
221228
-------
@@ -280,7 +287,25 @@ def generate_openmm_interchange(
280287
**pack_box_kwargs,
281288
)
282289

283-
system = create_system_from_xml(topology, xml_mols)
290+
ff = create_ff_from_xml(xml_mols)
291+
292+
# obtain 14 scaling values from forcefield
293+
generator = ff.getGenerators()
294+
for gen in generator:
295+
if isinstance(gen, NonbondedGenerator):
296+
c14 = gen.coulomb14scale
297+
lj14 = gen.lj14scale
298+
299+
system = ff.createSystem(topology.to_openmm(), nonbondedMethod=PME)
300+
if (ff_kwargs is not None) and ("opls" in ff_kwargs):
301+
if (c14 != 0.5) or (lj14 != 0.5):
302+
raise ValueError(
303+
f"NonbondedForce class in XML,"
304+
f"<NonbondedForce coulomb14scale='0.5' lj14scale='0.5'>,"
305+
f"does not match OPLS convention,"
306+
f"<NonbondedForce coulomb14scale='{c14}' lj14scale='{lj14}'>."
307+
)
308+
system = opls_lj(system)
284309

285310
# these values don't actually matter because integrator is only
286311
# used to generate the state

0 commit comments

Comments
 (0)