14. Testing and Validation of OpenMM

The goal of testing and validation is to make sure that OpenMM works correctly. That means that it runs without crashing or otherwise failing, and that it produces correct results. Furthermore, it must work correctly on a variety of hardware platforms (e.g. different models of GPU), software platforms (e.g. operating systems and OpenCL implementations), and types of simulations.

OpenMM includes unit tests, which are small tests designed to exercise specific features or pieces of code in isolation. For example, a test of HarmonicBondForce might create a System with just a few particles and bonds, compute the forces and energy, and compare them to the analytically expected values. There are thousands of unit tests that collectively cover all of OpenMM.
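
As an illustration of the kind of check a unit test performs, here is a minimal sketch written against the Python API (the real unit tests are separate C++ executables; the module is named openmm in recent versions and simtk.openmm in older ones). It builds a two-particle System containing a single HarmonicBondForce and compares the computed energy and force to the analytic values:

    import openmm as mm
    import openmm.unit as unit

    # Two particles joined by a harmonic bond with r0 = 0.15 nm and k = 100 kJ/mol/nm^2.
    system = mm.System()
    system.addParticle(1.0)
    system.addParticle(1.0)
    bond = mm.HarmonicBondForce()
    bond.addBond(0, 1, 0.15, 100.0)
    system.addForce(bond)

    # Place the particles 0.25 nm apart, so the bond is stretched by 0.1 nm.
    integrator = mm.VerletIntegrator(0.001)
    platform = mm.Platform.getPlatformByName('Reference')
    context = mm.Context(system, integrator, platform)
    context.setPositions([mm.Vec3(0, 0, 0), mm.Vec3(0.25, 0, 0)] * unit.nanometers)

    state = context.getState(getEnergy=True, getForces=True)
    energy = state.getPotentialEnergy().value_in_unit(unit.kilojoule_per_mole)
    fx = state.getForces()[0][0].value_in_unit(unit.kilojoule_per_mole/unit.nanometer)

    # Analytic values: E = 0.5*k*(r-r0)^2 = 0.5 kJ/mol, and the x component of the
    # force on particle 0 is k*(r-r0) = 10 kJ/mol/nm.
    assert abs(energy - 0.5) < 1e-5
    assert abs(fx - 10.0) < 1e-4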

In addition, the individual forces computed by OpenMM have been compared directly to those computed by other programs for a collection of biomolecules.

Each of these types of tests is outlined in greater detail below; a discussion of the current status of the tests is then given.

14.1. Description of Tests

14.1.1. Unit tests

The unit tests are included with the source code, so if you build from source you can run them yourself. See Section 9.1.4 for details. When you run the tests (for example, by typing “make test” on Linux or Mac), the output should look something like this:

      Start   1: TestReferenceAndersenThermostat
1/317 Test   #1: TestReferenceAndersenThermostat .............. Passed  0.26 sec
      Start   2: TestReferenceBrownianIntegrator
2/317 Test   #2: TestReferenceBrownianIntegrator .............. Passed  0.13 sec
      Start   3: TestReferenceCheckpoints
3/317 Test   #3: TestReferenceCheckpoints ..................... Passed  0.02 sec
... <many other tests> ...

Each line represents a test suite, which may contain multiple unit tests. If all tests within a suite pass, the line shows the word “Passed” and how long the suite took to execute; otherwise it shows an error message. If any tests fail, you can then run them individually (each one is a separate executable) to get more details on what went wrong.

14.1.2. Direct comparisons between OpenMM and other programs

As another check, identical systems are set up in OpenMM and in another program (Gromacs 4.5 or Tinker 6.1), each program is used to compute the forces on all atoms, and the results are compared directly to each other.

14.2. Test Results

In this section, we highlight the major results obtained from the tests described above. These results are not exhaustive, but they should give a reasonable idea of the level of accuracy you can expect from OpenMM.

14.2.1. Comparison to Reference Platform

The differences between forces computed with the Reference platform and those computed with the OpenCL or CUDA platform are shown in Table 14-1. For every atom, the relative difference between platforms was computed as 2·|Fref–Ftest|/(|Fref|+|Ftest|), where Fref is the force computed by the Reference platform and Ftest is the force computed by the platform being tested (OpenCL or CUDA). The median over all atoms in a given system was computed to estimate the typical force errors for that system. Finally, the median of those values for all test systems was computed to give the value shown in the table.
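
This statistic is straightforward to reproduce when per-atom force arrays are available (forces can be extracted from a Context with getState(getForces=True)). The following NumPy sketch is illustrative only, not the harness actually used for these tests:

    import numpy as np

    def median_relative_force_difference(f_ref, f_test):
        """Median over atoms of 2*|Fref - Ftest| / (|Fref| + |Ftest|).

        f_ref and f_test are (N, 3) arrays of per-atom force vectors
        computed by the two platforms being compared.
        """
        diff = 2.0 * np.linalg.norm(f_ref - f_test, axis=1)
        scale = np.linalg.norm(f_ref, axis=1) + np.linalg.norm(f_test, axis=1)
        return np.median(diff / scale)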

Force                                 | OpenCL (single) | OpenCL (double) | CUDA (single) | CUDA (double)
--------------------------------------|-----------------|-----------------|---------------|--------------
Total Force                           | 2.53·10⁻⁶       | 1.44·10⁻⁷       | 2.56·10⁻⁶     | 8.78·10⁻⁸
HarmonicBondForce                     | 2.88·10⁻⁶       | 1.57·10⁻¹³      | 2.88·10⁻⁶     | 1.57·10⁻¹³
HarmonicAngleForce                    | 2.25·10⁻⁵       | 4.21·10⁻⁷       | 2.27·10⁻⁵     | 4.21·10⁻⁷
PeriodicTorsionForce                  | 8.23·10⁻⁷       | 2.44·10⁻⁷       | 9.27·10⁻⁷     | 2.56·10⁻⁷
RBTorsionForce                        | 4.86·10⁻⁶       | 1.46·10⁻⁷       | 4.72·10⁻⁶     | 1.4·10⁻⁸
NonbondedForce (no cutoff)            | 1.49·10⁻⁶       | 6.49·10⁻⁸       | 1.49·10⁻⁶     | 6.49·10⁻⁸
NonbondedForce (cutoff, nonperiodic)  | 9.74·10⁻⁷       | 4.88·10⁻⁹       | 9.73·10⁻⁷     | 4.88·10⁻⁹
NonbondedForce (cutoff, periodic)     | 9.82·10⁻⁷       | 4.88·10⁻⁹       | 9.8·10⁻⁷      | 4.88·10⁻⁹
NonbondedForce (Ewald)                | 1.33·10⁻⁶       | 5.22·10⁻⁹       | 1.33·10⁻⁶     | 5.22·10⁻⁹
NonbondedForce (PME)                  | 3.99·10⁻⁵       | 4.08·10⁻⁶       | 3.99·10⁻⁵     | 4.08·10⁻⁶
GBSAOBCForce (no cutoff)              | 3.0·10⁻⁶        | 1.76·10⁻⁷       | 3.09·10⁻⁶     | 9.4·10⁻⁸
GBSAOBCForce (cutoff, nonperiodic)    | 2.77·10⁻⁶       | 1.76·10⁻⁷       | 2.95·10⁻⁶     | 9.33·10⁻⁸
GBSAOBCForce (cutoff, periodic)       | 2.61·10⁻⁶       | 1.78·10⁻⁷       | 2.77·10⁻⁶     | 9.24·10⁻⁸

Table 14-1: Median relative difference in forces between Reference platform and OpenCL/CUDA platform

14.2.2. Energy Conservation

Figure 14-1 shows the total system energy versus time for three simulations of ubiquitin in OBC implicit solvent. All three simulations used the CUDA platform, a Verlet integrator, a time step of 0.5 fs, no constraints, and no cutoff on the nonbonded interactions. They differ only in the level of numeric precision that was used for calculations (see Chapter 11).


Figure 14-1: Total energy versus time for simulations run in three different precision modes.

For the mixed and double precision simulations, the change in energy is almost entirely diffusive, with negligible systematic drift. The single precision simulation has a more significant upward drift with time, though the rate of drift is still small compared to the rate of short-term fluctuations. Fitting a straight line to each curve gives a long-term rate of energy drift of 3.98 kJ/mole/ns for single precision, 0.217 kJ/mole/ns for mixed precision, and 0.00100 kJ/mole/ns for double precision. In the more commonly reported units of kT/ns/dof, these correspond to 4.3·10⁻⁴ for single precision, 2.3·10⁻⁵ for mixed precision, and 1.1·10⁻⁷ for double precision.
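
The conversion between the two sets of units is simply division by kT and by the number of degrees of freedom. The following sketch shows the arithmetic; the temperature and atom count are assumptions made for illustration (they are not stated above), so the printed values only approximately reproduce the quoted numbers:

    # Convert an energy-drift rate from kJ/mole/ns to kT/ns/dof.
    kB = 0.0083144621      # Boltzmann constant, kJ/mol/K
    T = 300.0              # assumed temperature, K
    n_atoms = 1231         # assumed atom count for ubiquitin (approximate)
    n_dof = 3 * n_atoms    # no constraints, so 3 degrees of freedom per atom

    for label, drift in [('single', 3.98), ('mixed', 0.217), ('double', 0.00100)]:
        print(label, drift / (kB * T) / n_dof, 'kT/ns/dof')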

Be aware that different simulation parameters will give different results. These simulations were designed to minimize all sources of error except those inherent in OpenMM. There are other sources of error that may be significant in other situations. In particular:

  • Using a larger time step increases the integration error (roughly proportional to dt²).

  • If a system involves constraints, the level of error will depend strongly on the constraint tolerance specified by the Integrator.

  • When using Ewald summation or Particle Mesh Ewald, the accuracy will depend strongly on the Ewald error tolerance.

  • Applying a distance cutoff to implicit solvent calculations will increase the error, and the shorter the cutoff is, the greater the error will be.

As a result, the rate of energy drift may be much greater in some simulations than in the ones shown above.
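
As a rough guide to where these settings live in the Python API, the sketch below sets each of them on a deliberately trivial periodic system. The values are purely illustrative (not recommendations), and running it requires the CUDA platform to be available:

    import openmm as mm
    import openmm.unit as unit

    # A trivial periodic system, just to have something to attach the settings to.
    system = mm.System()
    system.setDefaultPeriodicBoxVectors(mm.Vec3(3, 0, 0), mm.Vec3(0, 3, 0), mm.Vec3(0, 0, 3))
    nonbonded = mm.NonbondedForce()
    nonbonded.setNonbondedMethod(mm.NonbondedForce.PME)
    nonbonded.setCutoffDistance(1.0*unit.nanometer)
    nonbonded.setEwaldErrorTolerance(1e-5)          # Ewald/PME accuracy
    for i in range(2):
        system.addParticle(18.0)
        nonbonded.addParticle(0.0, 0.3, 0.5)        # charge, sigma, epsilon
    system.addForce(nonbonded)

    integrator = mm.VerletIntegrator(1.0*unit.femtoseconds)   # time step
    integrator.setConstraintTolerance(1e-6)                    # constraint accuracy

    # Precision mode of the CUDA platform (the property is named 'CudaPrecision'
    # in older releases and 'Precision' in recent ones).
    platform = mm.Platform.getPlatformByName('CUDA')
    context = mm.Context(system, integrator, platform, {'Precision': 'mixed'})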

14.2.3. Comparison to Gromacs

OpenMM and Gromacs 4.5.5 were each used to compute the atomic forces for dihydrofolate reductase (DHFR) in implicit and explicit solvent. The implicit solvent calculations used the OBC solvent model and no cutoff on nonbonded interactions. The explicit solvent calculations used Particle Mesh Ewald and a 1 nm cutoff on direct space interactions. For OpenMM, the Ewald error tolerance was set to 10⁻⁶. For Gromacs, fourierspacing was set to 0.07 and ewald_rtol to 10⁻⁶. No constraints were applied to any degrees of freedom. Both programs used single precision. The test was repeated for the OpenCL, CUDA, and CPU platforms.
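
For orientation, the snippet below shows how the explicit-solvent side of such a comparison could be set up through the Python application layer. The input file name and force field choice are hypothetical (the exact files used for this benchmark are not specified here); the settings mirror those described above:

    import openmm.unit as unit
    from openmm import app

    pdb = app.PDBFile('dhfr.pdb')                      # hypothetical input file
    forcefield = app.ForceField('amber99sb.xml', 'tip3p.xml')

    # PME with a 1 nm direct-space cutoff, a 1e-6 Ewald error tolerance, and no
    # constraints on any degrees of freedom (including water).
    system = forcefield.createSystem(pdb.topology,
                                     nonbondedMethod=app.PME,
                                     nonbondedCutoff=1.0*unit.nanometer,
                                     ewaldErrorTolerance=1e-6,
                                     constraints=None,
                                     rigidWater=False)

    # The implicit-solvent case would instead load an OBC parameter file such as
    # 'amber99_obc.xml' and use nonbondedMethod=app.NoCutoff.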

For every atom, the relative difference between OpenMM and Gromacs was computed as 2·|FMM–FGro|/(|FMM|+|FGro|), where FMM is the force computed by OpenMM and FGro is the force computed by Gromacs. The median over all atoms is shown in Table 14-2.

Solvent Model | OpenCL    | CUDA      | CPU
--------------|-----------|-----------|----------
Implicit      | 7.66·10⁻⁶ | 7.68·10⁻⁶ | 1.94·10⁻⁵
Explicit      | 6.77·10⁻⁵ | 6.78·10⁻⁵ | 9.89·10⁻⁵

Table 14-2: Median relative difference in forces between OpenMM and Gromacs