Skip to content

Commit

Permalink
more docs
Browse files Browse the repository at this point in the history
  • Loading branch information
svandenhaute committed Jun 5, 2024
1 parent 96dd426 commit 5b3da30
Show file tree
Hide file tree
Showing 4 changed files with 190 additions and 103 deletions.
73 changes: 65 additions & 8 deletions docs/hamiltonian.md
Original file line number Diff line number Diff line change
@@ -1,20 +1,77 @@
In Born-Oppenheimer-based molecular simulation, atomic nuclei are treated as classical particles that are subject to *effective* interactions which are determined by the quantum mechanical behavior of the electrons.
In addition to the atomic interactions, it is often useful to define additional biasing forces on the system, e.g. in order to drive a rare event or to prevent the system from exploring undesired regions in phase space.
In addition, there exist various alchemical free energy techniques which rely on systematic changes in the hamiltonian ( = potential energy) of the system to derive free energy differences between different states.
In Born-Oppenheimer-based molecular simulation, atomic nuclei are treated as classical particles that are subject to *effective* interactions, which are determined by the quantum mechanical behavior of the electrons.
These interactions determine the interatomic forces which are used in a dynamic simulation to propagate the atomic positions from one timestep to the next.
In more advanced schemes, researchers may modify these effective interactions to include biasing forces (e.g. in order to induce a phase transition), or perform an alchemical transformation between two potential energy surfaces (when computing relative free energies).

The ability to combine various energy contributions in an arbitrary manner allows for a very natural definition of many algorithms in computational statistical physics.
To accomodate for all these use cases, psiflow provides a simple abstraction for *"a function which accepts an atomic geometry and returns energies and forces"*: the `Hamiltonian` class.
Examples of Hamiltonians are a specific ML potential, a bias potential on a collective variable, or a quadratic approximation to a potential energy minimum.

By far the simplest hamiltonian is the Einstein crystal, which binds atoms to a certain reference position using harmonic springs with a single, fixed force constant.

To accomodate for all these use cases, psiflow provides a simple abstraction for *a function which accepts an atomic geometry and returns energies and forces*: the `Hamiltonian` class.
The simplest hamiltonian (which is only really useful for testing purposes) is the Einstein crystal, which binds atoms using harmonic springs to a certain reference position.
```py
from psiflow.geometry import Geometry
from psiflow.hamiltonians import EinsteinCrystal


geometry = Geometry.from_string('''
2
H 0.0 0.0 0.0
H 0.0 0.0 0.8
''')

hamiltonian = EinsteinCrystal(
reference_geometry=geometry.positions,
force_constant=0.1,
einstein = EinsteinCrystal(
reference_geometry=geometry.positions, # positions at which all springs are at rest
force_constant=0.1, # force constant, in eV / A**2
)

```
As mentioned earlier, the key feature of hamiltonians is that they take as input an atomic geometry, and spit out an energy, a set of forces, and optionally also virial stress.
Because hamiltonians might require specialized resources for their evaluation (e.g. an ML potential which gets executed on a GPU), evaluation of a hamiltonian does not necessarily happen instantly (e.g. if a GPU node is not immediately available). Similar to how `Dataset` instances return futures of a `Geometry` when a particular index is queried, hamiltonians return a future when asked to evaluate the energy/forces/stress of a particular `Geometry`:

```py
future = einstein.evaluate(geometry) # returns an AppFuture of the Geometry; evaluates instantly
evaluated = future.result() # calling result makes us wait for it to actually complete

assert evaluated.energy is not None # the energy of the hamiltonian
assert not np.any(np.isnan(evaluated.per_atom.forces)) # a (N, 3) array with forces
```
One of the most commonly used hamiltonians will be that of MACE, one of the most ubiquitous ML potentials.
There exist reasonably accurate pretrained models which can be used for exploratory purposes.
These are readily available in psiflow:

```py
from psiflow.hamiltonians import get_mace_mp0


mace = get_mace_mp0() # downloads MACE-MP0 from github
future = mace.evaluate(geometry) # evaluates the MACE potential on the geometry

evaluated = future.result()
forces = evaluated.per_atom.forces # forces on each atom, in float32

assert np.sum(np.dot(forces[0], forces[1])) < 0 # forces in H2 always point opposite of each other
assert np.allclose(np.sum(forces), 0.0) # forces are conservative --> sum to zero
```
As alluded to earlier, hamiltonians can be combined in arbitrary ways to create new hamiltonians.
Psiflow supports a concise syntax for basic arithmetic operations on hamiltonians, such as
multiplication by a scalar or addition of two hamiltonians:

```py
data = Dataset.load('train.xyz')
mix = 0.5 * einstein + 0.5 * mace # MixtureHamiltonian with E = 0.5 * E_einstein + 0.5 * E_mace
energies_mix = mix.evaluate(data).get('energy')

energies_einstein = einstein.evaluate(data).get('energy')
energies_mace = mace.evaluate(data).get('energy')
assert np.allclose(
energies_mix.result(),
0.5 * energies_einstein.result() + 0.5 * energies_mace.result(),
)
```
This makes it very easy to introduce bias potentials into your simulations -- see for example the formic acid transition state [example](https://github.com/molmod/psiflow/tree/main/examples/formic_acid_transition.py).
The following is a list of all available hamiltonians in psiflow:

- `EinsteinCrystal`: A simple harmonic potential which binds atoms to a reference position.
- `MACE`: ML potential, either pretrained as available on GitHub, or trained within psiflow (see later sections)
- `Harmonic`: A general quadratic potential based on a Hessian matrix and an optimized geometry.
- `PlumedHamiltonian`: a bias contribution based on a PLUMED input file.
92 changes: 0 additions & 92 deletions docs/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -109,98 +109,6 @@ In what follows, we assume that a suitable `context` has been initialized.
--->


## Atomic data
In psiflow, a set of atomic configurations is represented using the `Dataset` class.
It may represent training/validation data for model development, or
a trajectory of snapshots that was generated using molecular dynamics.
A `Dataset` instance mimics the behavior of a list of ASE `Atoms` instances:
```py
from psiflow.data import Dataset


data_train = Dataset.load('train.xyz') # create a psiflow Dataset from a file
data_subset = data_train[:10] # create a new Dataset instance with the first 10 states
data_train = data_subset + data_train[10:] # combining two datasets is easy

data = Dataset.load('lots_of_data.xyz')
train, valid = data.shuffle().split(0.9) # shuffle structures and partition into train/valid sets
type(train) # psiflow Dataset
type(valid) # psiflow Dataset

```
The main difference between a psiflow `Dataset` instance and an actual Python `list` of
`Atoms` is that a `Dataset` can represent data __that will be generated in the future__.

!!! note "Parsl 101: Apps and Futures"
To understand what is meant by 'generating data in the future', it is necessary
to introduce the core concepts in Parsl: apps and futures. In their simplest
form, apps are just functions, and futures are the result of an app given
a set of inputs. Importantly, a Future already exists before the actual calculation
is performed. In essence, a Future _promises_ that, at some time in the future, it will
contain the actual result of the function evaluation. Take a look at the following
example:

```py
from parsl.app.app import python_app


@python_app # convert a regular Python function into a Parsl app
def sum_integers(a, b):
return a + b


sum_future = sum_integers(3, 4) # tell Parsl to generate a future that represents the sum of integers 3 and 4
print(sum_future) # is an AppFuture, not an integer

print(sum_future.result()) # now compute the actual result; this will print 7 !

```
The return value of Parsl apps is not the actual result (in this case, an integer), but
an AppFuture that will store the result of the function evaluation after it has completed.
The main reason for doing things this way is that this allows for asynchronous execution.
For more information, check out the [Parsl documentation](https://parsl.readthedocs.io/en/stable/).

The actual atomic configurations are stored __as a Parsl future, in an attribute of the Dataset
object__.
Actually getting the data would require the user to make a `.result()` call similar
to the trivial Parsl example above.
Let's go back to the first example and try and get the actual list of `Atoms` instances:
```py
data_train = Dataset.load('train.xyz')
atoms_list = data_train.as_list() # returns AppFuture

isinstance(atoms_list, list) # returns False!

atoms_list.result() # this is the actual list


data_train[4] # AppFuture representing the configuration at index 4
data_train[4].result() # actual Atoms instance

```
If the initial XYZ file was formatted in extended XYZ format and contained the potential
energy, forces, and stress of each atomic configuration,
they are also loaded in the dataset:
```py
data_train[4].result() # actual Atoms instance
data_train[4].result().info['energy'] # potential energy, float
data_train[4].result().info['stress'] # virial stress, 2darray of shape (3, 3)
data_train[4].result().arrays['forces'] # forces, 2darray of shape (natoms, 3)

```
While not that important for the user, it is worth mentioning that psiflow
extends ASE's `Atoms` functionality with a few additional features, mostly
for internal convenience. Practically speaking, this does not really change anything for the user,
but we mention it for completeness.
```py
from ase import Atoms
from psiflow.data import FlowAtoms

snapshot = data_train[4].result() # convert Future of snapshot to actual snapshot
isinstance(snapshot, Atoms) # True; FlowAtoms subclasses Atoms
type(snapshot) == Atoms # False; it is not actually an Atoms instance
type(snapshot) == FlowAtoms # True

```
## Trainable potentials
Expand Down
114 changes: 114 additions & 0 deletions docs/sampling.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
In the Born-Oppenheimer philosophy, we explore the phase space of a molecule or a material and generate samples using molecular dynamics simulations.
Those samples are then used to evaluate time averages of some property of interest in order to predict physical observables.
In psiflow, such simulations are executed within [i-PI](https://ipi-code.org/), a versatile and efficient code which supports an impressive number of [features](https://ipi-code.org/i-pi/features.html).
We mention the most important ones below

- **molecular dynamics in various ensembles**: most notably NVE, NVT, and fully anisotropic NPT. There exist a variety of thermostat and barostat options, the default being Langevin. Together with the ability to combine arbitrary hamiltonians, this includes biased molecular dynamics simulations using e.g. harmonic restraints (umbrella sampling).
- **path-integral molecular dynamics** (PIMD): allows for the simulation of the quantum behavior of light atomic nuclei. This is important for many systems involving hydrogen atoms at relatively low temperatures (<=room temperature). Importantly these simulations can also be performed in any of the aforementioned ensembles.
- **geometry optimizations**: i-PI can be used to optimize the geometry of a molecule or a material using a variety of optimization algorithms.
- **replica exchange** (parallel tempering): dramatically improves the sampling efficiency and ergodicity whenever nontrivial free energy barriers are present in the phase space of the system. In this approach, one considers replicas of the system at various temperatures and/or pressures, or optionally even with different hamiltonians.
- **multiple walker metadynamics**: simple but powerful method to overcome free energy barriers when a suitable collective variable is known for the system of interest.


## the `Walker` class
Psiflow is essentially a convenient wrapper around most of i-PI's features.
The key object which enables the execution of these simulations is the `Walker` class.
A single walker describes a single replica of the system which evolves through phase space.
It is initialized with a `Geometry` instance which describes the start of the simulation, and can be assigned a particular hamiltonian, a temperature and/or pressure, and a timestep.

```py
from psiflow.sampling import Walker
from psiflow.geometry import Geometry
from psiflow.hamiltonians import get_mace_mp0


start = Geometry.load("start.xyz")
walker = Walker(
start,
hamiltonian=get_mace_mp0(),
temperature=300.0,
pressure=None, # NVT simulation
timestep=0.5, # in femtoseconds, the default value
)
```
In the vast majority of cases, it is necessary to run mutiple simulations at slightly different conditions.
For example, let us create ten of these walkers which are identical except for the temperature:

```py
walkers = walker.multiply(10)
for i, walker in enumerate(walkers):
w.temperature = 300 + i * 10
```
When propagated, each of these walkers will generate trajectories in phase space which correspond to their own temperature.
In the case of temperature, such trajectories can be used to e.g. evaluate variation of the mean energy with respect to temperature
(and therefore, the heat capacity of the system).

## generating trajectories

Walkers can be propagated in time by using the `sample` function.
It accepts a list of walkers, each of which will be propagated in phase space according to its own parameters.
Importantly, there are *no restriction* on the type of walkers in this list.
Users can mix regular NVT walkers, with PIMD NVE walkers, and a list of N replica exchange walkers.
Internally, psiflow will recognize which walkers are independent and parallelize the execution as much as possible
Consider the following example:
```py
from psiflow.sampling import sample

outputs = sample(
walkers,
steps=1e6, # total number of timesteps to run the simulation for; this translates to 500 ps in this case
step=1e3, # sample every 1000 timesteps
start=1e5, # start sampling after 50 ps
)
print(outputs) # list of `SimulationOutput` instances
```
In this example, the sample function will return a list of `Output` instances, each of which contains the trajectory of a single walker.
The outputs are ordered in the same way as the input walkers (i.e. `outputs[0]` corresponds to the output from `walkers[0]`).
They provide access to the sampled trajectory of the simulation, the elapsed simulation time, and importantly, a number of *observable properties*
which have been written out by i-PI. These properties can be used to compute averages of physical observables, such as the internal energy or the virial stress tensor.
A full list of available properties is given in the [i-PI documentation](https://ipi-code.org/i-pi/output-tags.html). Note that psiflow adheres to the same naming convention as adopted in i-PI:

- `energy`: the total energy of the system. The actual name of this quantity is `potential{electronvolt}`
- `temperature`: the instantaneous temperature of the system. The actual name of this quantity is `temperature{kelvin}`
- `time`: the elapsed simulation time. The actual name of this quantity is `time{picosecond}`
- `volume`: the volume of the simulation cell (only for periodic systems). The actual name of this quantity is `volume{angstrom3}`

Similarly to the evaluation of a `Hamiltonian` or the querying of a snapshot in a `Dataset`, simulation outputs are returned as futures.
For example, say we wanted to compute the average energy for each of the simulations:
```py
import numpy as np

energy_futures = [output["potential{electronvolt}"] for output in outputs]
energies = [future.result() for future in energy_futures]
mean_energy = np.array([np.mean(energy) for energy in energies])
```
This example extracts the futures which contain the potential energies of all simulations, waits for them to complete (via `result()`), and then computes the mean energy for each simulation. In a very similar fashion, we can compute the bulk modules of bcc iron simply by constructing walkers at various pressures and extracting the corresponding `volume{angstrom3}` observable -- see [here](https://github.com/molmod/psiflow/tree/main/examples/iron_bulk_modulus.py).

In many cases, it is useful to save the trajectory of a simulation to disk.
Trajectories are essentially just a series of snapshots, and as such, psiflow represents them as `Dataset` instances.
Each of the outputs has an attribute `trajectory` which is a `Dataset` instance.
Let us save the trajectory of the first simulation to disk:

```py
outputs[0].trajectory.save("300K.xyz")
```

As a sanity check, let us recompute the potential energies which were stored at each snapshot during the simulation using the `evaluate` functionality of our MACE hamiltonian:
```py
mace = walkers[0].hamiltonian # the hamiltonian used in the simulations
future = mace.evaluate(outputs[0].trajectory).get('energy') # future of the recomputed energies as an array

manual_energies_0 = future.result() # get the actual numpy array

assert np.allclose(
manual_energies_0,
energies[0],
)
```
## walker utilities

## PIMD simulations

## replica exchange

## metadynamics
14 changes: 11 additions & 3 deletions mkdocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,15 @@ markdown_extensions:
emoji_index: !!python/name:materialx.emoji.twemoji
emoji_generator: !!python/name:materialx.emoji.to_svg

#extra_javascript:
# - javascripts/mathjax.js
# - https://polyfill.io/v3/polyfill.min.js?features=es6
# - https://unpkg.com/mathjax@3/es5/tex-mml-chtml.js

extra_javascript:
- javascripts/mathjax.js
- https://polyfill.io/v3/polyfill.min.js?features=es6
- https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js
- javascripts/katex.js
- https://unpkg.com/katex@0/dist/katex.min.js
- https://unpkg.com/katex@0/dist/contrib/auto-render.min.js

extra_css:
- https://unpkg.com/katex@0/dist/katex.min.css

0 comments on commit 5b3da30

Please sign in to comment.