Large DFT forces and strange geometries #105

Open
jharrymoore opened this issue Jul 9, 2024 · 7 comments

@jharrymoore

Hi,

Whilst inspecting some of the new subsets added in version 2, I came across some configurations where the hydrogens appear to have been ripped off their heavy atoms and the DFT forces are extremely high. I have attached some examples that appear when filtering the amino acid-ligand subset by maximum force. My understanding was that some high-force configurations were present in the original dataset because of the Psi4 bug, but I was not expecting them in the more recently computed values.

spice_2_amino_acid_ligand_high_dft_forces.tar.gz

@peastman
Member

peastman commented Jul 9, 2024

Where did you get these from? Anything with a force >1 was stripped out when we generated the HDF5 file for the dataset.

@jharrymoore
Author

These came from the latest SPICE v2.0.1 HDF5 file on Zenodo; the positions and arrays were extracted to XYZ.

@peastman
Member

peastman commented Jul 9, 2024

Can you provide the group names and conformation indices so I can look them up?

@peastman
Member

peastman commented Jul 9, 2024

> Just confirming: by group names, do you mean the SPICE subset they belong to?

I mean the name of the top-level group within the HDF5 file, so I can look them up in the file.
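
For anyone who wants to produce such a list, here is a minimal sketch. It assumes one top-level group per molecule and a per-group `dft_total_gradient` dataset of shape (conformations, atoms, 3) in hartree/bohr. The dataset name, the function name `find_high_force_conformations`, and the threshold are assumptions for illustration, not something confirmed in this thread. The function accepts any mapping, so a plain dict works for testing and an open `h5py.File` should work the same way.

```python
import math

# CODATA 2018 values (assumed here): 1 hartree/bohr is about 51.42 eV/Å.
HARTREE_TO_EV = 27.211386245988
BOHR_TO_ANGSTROM = 0.529177210903
HARTREE_BOHR_TO_EV_ANGSTROM = HARTREE_TO_EV / BOHR_TO_ANGSTROM


def find_high_force_conformations(groups, threshold_ev_per_angstrom,
                                  gradient_key="dft_total_gradient"):
    """Return (group_name, conformation_index, max_force_in_eV_per_A) tuples
    for conformations whose largest per-atom force norm exceeds the threshold.

    `groups` is any mapping of group name -> mapping of dataset name -> data.
    `gradient_key` is an assumed dataset name; adjust it to match your file.
    """
    hits = []
    for name in groups:
        gradients = groups[name][gradient_key]
        for index, conformation in enumerate(gradients):
            # The norm of the gradient equals the norm of the force.
            max_norm = max(
                math.sqrt(sum(float(c) ** 2 for c in atom))
                for atom in conformation
            )
            max_norm_ev = max_norm * HARTREE_BOHR_TO_EV_ANGSTROM
            if max_norm_ev > threshold_ev_per_angstrom:
                hits.append((name, index, max_norm_ev))
    return hits


# Toy data standing in for the HDF5 file (hypothetical group names).
demo = {
    "mol_a": {"dft_total_gradient": [
        [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)],
        [(0.1, 0.0, 0.0), (0.0, 0.0, 0.0)],
    ]},
    "mol_b": {"dft_total_gradient": [[(0.0, 0.2, 0.0)]]},
}
for name, index, force in find_high_force_conformations(demo, 30.0):
    print(name, index, f"{force:.1f} eV/Å")
```

Reporting the group name together with the conformation index is what makes an entry easy to look up in the file.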

@jharrymoore
Author

jharrymoore commented Jul 9, 2024

Attached is a set I extracted from the amino acid-ligand set with a force norm greater than 30 eV/Å. Many of the geometries look reasonable, but it appears that certain heavy atoms are being replaced with hydrogens.
image

high_force_configs_aa_ligand.txt

@peastman
Member

peastman commented Jul 9, 2024

That makes sense. I assumed your file had the same units as the original dataset. The cutoff we applied to forces is 1 hartree/bohr, which is 51.4 eV/Å. Anything less than that is expected to still be present.
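
The conversion behind those numbers can be checked with a short sketch. The constants are CODATA 2018 values and the function names are hypothetical; neither comes from the dataset's own tooling.

```python
# CODATA 2018 values (assumed here; not quoted in the thread).
HARTREE_TO_EV = 27.211386245988
BOHR_TO_ANGSTROM = 0.529177210903


def hartree_per_bohr_to_ev_per_angstrom(value):
    """Convert a force or gradient from hartree/bohr to eV/Å."""
    return value * HARTREE_TO_EV / BOHR_TO_ANGSTROM


def ev_per_angstrom_to_hartree_per_bohr(value):
    """Convert a force or gradient from eV/Å to hartree/bohr."""
    return value * BOHR_TO_ANGSTROM / HARTREE_TO_EV


# The dataset cutoff expressed in eV/Å, and the 30 eV/Å filter in atomic units.
cutoff_ev = hartree_per_bohr_to_ev_per_angstrom(1.0)
threshold_au = ev_per_angstrom_to_hartree_per_bohr(30.0)
print(f"{cutoff_ev:.1f} eV/Å")             # 51.4 eV/Å
print(f"{threshold_au:.3f} hartree/bohr")  # 0.583 hartree/bohr
```

A 30 eV/Å filter therefore corresponds to roughly 0.58 hartree/bohr, which is below the 1 hartree/bohr cutoff.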

I looked through a few of the molecules you listed and didn't see any detached hydrogens like that. But I did see some mangled-looking molecules, like this distorted ring in XEN HIS.

image

You might choose to apply a lower cutoff to forces to get rid of things like this. Strictly speaking they're still correct: the DFT calculation was run correctly for the given conformations. But you might decide you don't want to train on conformations that are that unrealistic.
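
A lower cutoff of that kind could be applied along these lines. This is only a sketch: the function names are hypothetical, the forces are assumed to already be in the units of the cutoff, and the value of 10 used below is an arbitrary illustration rather than a recommendation.

```python
import math


def max_force_norm(forces):
    """Largest per-atom force norm in one conformation (same units as input)."""
    return max(math.sqrt(fx * fx + fy * fy + fz * fz) for fx, fy, fz in forces)


def keep_below_cutoff(conformations, cutoff):
    """Return indices of conformations whose largest per-atom force norm
    does not exceed `cutoff`."""
    return [
        index
        for index, forces in enumerate(conformations)
        if max_force_norm(forces) <= cutoff
    ]


# Toy forces in eV/Å, one list of (fx, fy, fz) tuples per conformation.
conformations = [
    [(0.5, 0.0, 0.0), (0.0, -0.5, 0.0)],    # max norm 0.5
    [(3.0, 4.0, 0.0), (0.1, 0.0, 0.0)],     # max norm 5.0
    [(0.0, 0.0, 40.0), (0.0, 0.0, -40.0)],  # max norm 40.0
]
kept = keep_below_cutoff(conformations, cutoff=10.0)
print(kept)  # [0, 1]
```

Filtering on the largest per-atom norm, rather than an average over atoms, is one possible choice; it rejects a conformation as soon as a single atom is badly strained.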

@tamaswells

This molecule also looks strange: pubchem_id=135091982
image
