6. Calculators

Exploring potential energy surfaces (PESs) requires the calculation of energies and their derivatives (gradients, Hessian matrices).

For testing purposes and method development, pysisyphus implements various analytical 2D potentials, as they allow fast evaluation of the aforementioned quantities. Actual (production) calculations are carried out by wrapping existing quantum chemistry (QC) codes (ORCA, TURBOMOLE, Gaussian etc.) and delegating the required calculations to them. Pysisyphus generates all necessary inputs, executes the QC code and parses its output for the requested quantities.
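To illustrate why analytical potentials are convenient for testing, the following minimal sketch evaluates the energy, gradient and Hessian of a simple double-well model in closed form. The potential and all function names are illustrative assumptions and are not taken from pysisyphus.

```python
import numpy as np


def energy(x, y):
    """Double-well model potential E(x, y) = (x^2 - 1)^2 + y^2 (illustrative)."""
    return (x**2 - 1.0) ** 2 + y**2


def gradient(x, y):
    """Analytical first derivatives (dE/dx, dE/dy)."""
    return np.array([4.0 * x * (x**2 - 1.0), 2.0 * y])


def hessian(x, y):
    """Analytical second derivatives as a 2x2 matrix."""
    return np.array([[12.0 * x**2 - 4.0, 0.0], [0.0, 2.0]])


# Forces are the negative gradient; they vanish at the minimum (1, 0).
forces_at_minimum = -gradient(1.0, 0.0)
# The stationary point at the origin is a first-order saddle point:
# its Hessian has exactly one negative eigenvalue.
saddle_eigenvalues = np.linalg.eigvalsh(hessian(0.0, 0.0))
```

All three quantities are available at negligible cost, which makes such models useful for checking optimizers before running them with a real QC code.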

Furthermore, pysisyphus provides several "meta"-calculators, which wrap other (actual) calculators to modify the calculated energies and forces. Examples are the Dimer calculator, used for transition state searches, and the ONIOM calculator, which allows multi-level calculations comprising different levels of theory.

External forces, e.g., a restraining spherical potential or harmonic restraints on primitive internal coordinates (stretches, bends, torsions), can be applied with ExternalPotential.

6.1. YAML input

Possible keywords for the YAML input can be derived by inspecting the arguments of the Calculator base class (see below) and the arguments of the respective calculator, e.g., ORCA or XTB.

The most commonly used keywords, derived from the Calculator base class, are mem, the requested memory per core in MB; pal, the number of requested CPU cores; charge, the total charge of the system; and mult, the system's multiplicity.

For excited state (ES) calculations carried out with calculators derived from OverlapCalculator, additional keywords are possible. The most common keywords controlling ES calculations are track, activating ES tracking; ovlp_type, selecting the tracking algorithm; and ovlp_with, selecting the reference cycle.

An example input highlighting the most important keywords is shown below.

geom:
 [... omitted ...]
calc:
 type: orca                     # Calculator type, e.g. g09/g16/openmolcas/
                                # orca/pyscf/turbomole/dftb+/mopac/psi4/xtb

 pal: 1                         # Number of CPU cores
 mem: 1000                      # Memory per core in MB
 charge: 0                      # Charge
 mult: 1                        # Multiplicity
 # Keywords for ES calculations
 track: False                   # Activate ES tracking
 ovlp_type: tden                # Tracking algorithm
 ovlp_with: previous            # Reference cycle selection
 # Additional calculator specific keywords
 [... omitted ...]
opt:
 [... omitted ...]
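The relation between YAML keywords and constructor arguments can be sketched in Python. The class below is only a stand-in that copies the base-class signature documented in section 6.2; with pysisyphus installed, one would presumably inspect pysisyphus.calculators.Calculator.Calculator instead. The helper possible_keywords and the dictionary handling are illustrative assumptions, not the way pysisyphus parses its input.

```python
import inspect


class Calculator:
    """Stand-in with the __init__ signature documented in section 6.2."""

    def __init__(self, calc_number=0, charge=0, mult=1, base_name="calculator",
                 pal=1, mem=1000, keep_kind="all", check_mem=True, retry_calc=0,
                 last_calc_cycle=None, clean_after=True, out_dir="qm_calcs",
                 force_num_hess=False, num_hess_kwargs=None):
        self.charge, self.mult, self.pal, self.mem = charge, mult, pal, mem


def possible_keywords(cls):
    """Return {keyword: default} for all arguments of cls.__init__ except self."""
    params = inspect.signature(cls.__init__).parameters
    return {name: p.default for name, p in params.items() if name != "self"}


keywords = possible_keywords(Calculator)

# The calc block of the YAML input, written as a plain dict (hypothetical values).
calc_block = {"type": "orca", "pal": 1, "mem": 1000, "charge": 0, "mult": 1}
# 'type' selects the calculator class; everything else becomes a keyword argument.
calc_kwargs = {key: val for key, val in calc_block.items() if key != "type"}
unknown = sorted(set(calc_kwargs) - set(keywords))
calc = Calculator(**calc_kwargs)
```

An empty unknown list indicates that every keyword in the block is accepted by the constructor.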

6.2. Calculator base classes

class pysisyphus.calculators.Calculator.Calculator(calc_number=0, charge=0, mult=1, base_name='calculator', pal=1, mem=1000, keep_kind='all', check_mem=True, retry_calc=0, last_calc_cycle=None, clean_after=True, out_dir='qm_calcs', force_num_hess=False, num_hess_kwargs=None)[source]
__init__(calc_number=0, charge=0, mult=1, base_name='calculator', pal=1, mem=1000, keep_kind='all', check_mem=True, retry_calc=0, last_calc_cycle=None, clean_after=True, out_dir='qm_calcs', force_num_hess=False, num_hess_kwargs=None)[source]

Base-class of all calculators.

Meant to be extended.

Parameters:
  • calc_number (int, default=0) -- Identifier of the Calculator. Used in distinguishing it from other Calculators, e.g. in ChainOfStates calculations. Also used in the creation of filenames.

  • charge (int, default=0) -- Molecular charge.

  • mult (int, default=1) -- Molecular multiplicity (1 = singlet, 2 = doublet, ...)

  • base_name (str, default=calculator) -- Generated filenames will start with this string.

  • pal (int, default=1) -- Positive integer that gives the number of physical cores to use on 1 node.

  • mem (int, default=1000) -- Memory per core in MB. The total amount of memory is given as mem*pal.

  • check_mem (bool, default=True) -- Whether to adjust the requested memory if too much is requested.

  • retry_calc (int, default=0) -- Number of additional retries when calculation failed.

  • last_calc_cycle (int) -- Internal variable used in restarts.

  • clean_after (bool) -- Delete the temporary directory where the calculation was executed after the calculation has finished.

  • out_dir (str) -- Path that is prepended to generated filenames.

  • force_num_hess (bool, default=False) -- Force numerical Hessians.

  • num_hess_kwargs (dict) -- Keyword arguments for finite difference Hessian calculation.

apply_keep_kind()[source]
apply_set_plans(kept_fns, set_plans=None)[source]
build_set_plans(_set_plans=None)[source]
clean(path)[source]

Delete the temporary directory.

Parameters:

path (Path) -- Directory to delete.

conf_key = None
force_num_hessian()[source]

Always calculate numerical Hessians.

classmethod geom_from_fn(fn, **kwargs)[source]
get_cmd(key='cmd')[source]
get_energy(atoms, coords, **prepare_kwargs)[source]

Meant to be extended.

get_forces(atoms, coords, **prepare_kwargs)[source]

Meant to be extended.

get_hessian(atoms, coords, **prepare_kwargs)[source]

Get Hessian matrix. Fall back to numerical Hessian, if not overridden.

Preferably, this method should provide an analytical Hessian.
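The numerical fallback can be pictured as a finite-difference scheme applied to the gradient. The sketch below uses central differences with a fixed step; the function names, the step size and the symmetrization are assumptions for illustration and do not reproduce the pysisyphus implementation or its num_hess_kwargs options.

```python
import numpy as np


def finite_difference_hessian(grad_func, coords, step=1e-3):
    """Central-difference Hessian built from a gradient function.

    grad_func(coords) must return the gradient as a 1d array.
    """
    coords = np.asarray(coords, dtype=float)
    size = coords.size
    hessian = np.zeros((size, size))
    for i in range(size):
        displacement = np.zeros(size)
        displacement[i] = step
        grad_plus = grad_func(coords + displacement)
        grad_minus = grad_func(coords - displacement)
        # Row i holds the derivative of the gradient w.r.t. coordinate i.
        hessian[i] = (grad_plus - grad_minus) / (2.0 * step)
    # Symmetrize to remove numerical noise.
    return (hessian + hessian.T) / 2.0


def model_gradient(coords):
    """Gradient of the illustrative potential E = (x^2 - 1)^2 + y^2."""
    x, y = coords
    return np.array([4.0 * x * (x**2 - 1.0), 2.0 * y])


num_hessian = finite_difference_hessian(model_gradient, [1.0, 0.0])
```

For N coordinates this scheme needs 2N gradient evaluations, which is why an analytical Hessian is preferable whenever the QC code provides one.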

get_num_hessian(atoms, coords, **prepare_kwargs)[source]
get_relaxed_density(atoms, coords, root, **prepare_kwargs)[source]

Meant to be extended.

get_restart_info()[source]

Return a dict containing chkfiles.

Returns:

restart_info -- Dictionary holding the calculator state. Used for restoring calculators in restarted calculations.

Return type:

dict

get_stored_wavefunction(**kwargs)[source]
get_wavefunction(atoms, coords, **prepare_kwargs)[source]

Meant to be extended.

keep(path)[source]

Backup calculation results.

Parameters:

path (Path) -- Temporary directory of the calculation.

Returns:

kept_fns -- Dictionary holding the filenames that were backed up. The keys correspond to the type of file.

Return type:

dict

load_wavefunction_from_file(fn, **kwargs)[source]
log(message='')[source]

Write a log message.

Wraps the logger variable.

Parameters:

message (str) -- Message to be logged.

make_fn(name, counter=None, return_str=False)[source]

Make a full filename.

Return a full filename including the calculator name and the current counter given a suffix.

Parameters:
  • name (str) -- Suffix of the filename.

  • counter (int, optional) -- If not given use the current calc_counter.

  • return_str (bool, optional) -- Return a string instead of a Path when True.

Returns:

fn -- Filename.

Return type:

str

property name
popen(cmd, cwd=None)[source]
prepare(inp)[source]

Prepare a temporary directory and write input.

Similar to prepare_path, but the input is also written into the prepared directory.

Parameters:

inp (str) -- Input to be written into the file self.inp_fn in the prepared directory.

Returns:

path -- Prepared directory.

Return type:

Path

prepare_coords(atoms, coords)[source]

Get 3d coords in Angstrom.

Reshape internal 1d coords to 3d and convert to Angstrom.

Parameters:
  • atoms (iterable) -- Atom descriptors (element symbols).

  • coords (np.array, 1d) -- 1D-array holding coordinates in Bohr.

Returns:

coords -- 3D-array holding coordinates in Angstrom.

Return type:

np.array, 3d
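The reshaping and unit conversion described here amount to two NumPy operations. The following sketch assumes that "3d" refers to an array of shape (number of atoms, 3); the function name and the conversion constant (CODATA 2018 Bohr radius) are assumptions and need not match the values used inside pysisyphus.

```python
import numpy as np

# Bohr radius in Angstrom (CODATA 2018); assumed conversion factor.
BOHR2ANG = 0.529177210903


def coords_to_angstrom_3d(coords):
    """Reshape flat Bohr coordinates to shape (N, 3) and convert to Angstrom."""
    return np.asarray(coords, dtype=float).reshape(-1, 3) * BOHR2ANG


# H2 with a bond length of 1.4 Bohr along the z-axis.
coords_bohr = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.4])
coords3d = coords_to_angstrom_3d(coords_bohr)
```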

prepare_input(atoms, coords, calc_type)[source]

Meant to be extended.

prepare_path(use_in_run=False)[source]

Get a temporary directory handle.

Create a temporary directory that can later be used in a calculation.

Parameters:

use_in_run (bool, optional) -- Sets the internal variable self.path_already_prepared that is later read by self.run(). No new temporary directory will be created in self.run().

Returns:

path -- Prepared directory.

Return type:

Path

prepare_pattern(raw_pat)[source]

Prepare globs.

Transforms an entry of self.to_keep into a glob and a key suitable for the use in self.keep().

Parameters:

raw_pat (str) -- Entry of self.to_keep

Returns:

  • pattern (str) -- Glob that can be used in Path.glob()

  • multi (bool) -- Flag if glob may match multiple files.

  • key (str) -- A key to be used in the kept_fns dict.

prepare_turbo_coords(atoms, coords)[source]

Get a Turbomole coords string.

Parameters:
  • atoms (iterable) -- Atom descriptors (element symbols).

  • coords (np.array, 1d) -- 1D-array holding coordinates in Bohr.

Returns:

coords -- String holding coordinates in Turbomole coords format.

Return type:

str

prepare_xyz_string(atoms, coords)[source]

Return an xyz string in Angstrom.

Parameters:
  • atoms (iterable) -- Atom descriptors (element symbols).

  • coords (np.array, 1d) -- 1D-array holding coordinates in Bohr.

Returns:

xyz_str -- Coordinates in .xyz format.

Return type:

string
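A string in .xyz format consists of the atom count, a comment line and one line per atom. The sketch below builds such a string from element symbols and flat Bohr coordinates; the function name, the column widths and the conversion constant are illustrative assumptions rather than the formatting pysisyphus actually uses.

```python
import numpy as np

# Bohr radius in Angstrom (CODATA 2018); assumed conversion factor.
BOHR2ANG = 0.529177210903


def xyz_string(atoms, coords, comment=""):
    """Build an .xyz string (Angstrom) from atoms and flat Bohr coordinates."""
    coords3d = np.asarray(coords, dtype=float).reshape(-1, 3) * BOHR2ANG
    # Header: number of atoms, followed by a (possibly empty) comment line.
    lines = [str(len(atoms)), comment]
    for atom, (x, y, z) in zip(atoms, coords3d):
        lines.append(f"{atom:<2s} {x: 14.8f} {y: 14.8f} {z: 14.8f}")
    return "\n".join(lines)


xyz = xyz_string(("H", "H"), [0.0, 0.0, 0.0, 0.0, 0.0, 1.4])
```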

print_capabilities()[source]
print_out_fn(path)[source]

Print calculation output.

Prints the output of a calculator after a calculation.

Parameters:

path (Path) -- Temporary directory of the calculation.

restore_org_hessian()[source]

Restore the original 'get_hessian' method, which may also fall back to numerical Hessians, if not implemented.

run(inp, calc, add_args=None, env=None, shell=False, hold=False, keep=True, cmd=None, inc_counter=True, run_after=True, parser_kwargs=None, symlink=True)[source]

Run a calculation.

The bread-and-butter method to actually run an external quantum chemistry code.

Parameters:
  • inp (str) -- Input for the external program that is written to the temp-dir.

  • calc (str, hashable) -- Key (and more or less type of calculation) to select the right parsing function from self.parser_funcs.

  • add_args (iterable, optional) -- Additional arguments that will be appended to the program call.

  • env (Environment, optional) -- A potentially modified environment for the subprocess call.

  • shell (bool, optional) -- Use a shell to execute the program call. Needed for Turbomole, where we chain program calls like dscf; escf.

  • hold (bool, optional) -- Whether to remove the temporary directory after the calculation.

  • keep (bool, optional) -- Whether to back up files as specified in self.to_keep(). Usually you want this.

  • cmd (str or iterable, optional) -- Overwrites self.base_cmd.

  • inc_counter (bool, optional) -- Whether to increment the counter after a calculation.

Returns:

results -- Dictionary holding all applicable results of the calculations like the energy, a forces vector and/or excited state energies from TDDFT.

Return type:

dict
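The overall flow of such a run (prepare a temporary directory, write the input, call the external program, parse the output, clean up) can be sketched with the standard library alone. Everything below is a hypothetical stand-in: the "QC code" is a Python one-liner that writes a fixed energy, the file names and the parser are invented for illustration, and it is assumed here that hold=True keeps the directory. None of this reproduces the actual pysisyphus implementation.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in for an external QC code: writes a fixed energy to calc.out.
FAKE_QC_CMD = [
    sys.executable,
    "-c",
    "from pathlib import Path; Path('calc.out').write_text('ENERGY -1.5')",
]


def parse_energy(path):
    """Hypothetical parser that reads 'ENERGY <value>' from calc.out."""
    for line in (path / "calc.out").read_text().splitlines():
        if line.startswith("ENERGY"):
            return {"energy": float(line.split()[1])}
    raise ValueError("No energy found in output.")


# Maps the calculation type to the matching parsing function.
PARSER_FUNCS = {"energy": parse_energy}


def run(inp, calc, cmd=FAKE_QC_CMD, hold=False):
    """Write inp to a temporary directory, run cmd there and parse the output."""
    path = Path(tempfile.mkdtemp(prefix="calculator_"))
    try:
        (path / "calc.inp").write_text(inp)
        # The input filename is passed as last argument to the program call.
        subprocess.run(cmd + ["calc.inp"], cwd=path, check=True)
        results = PARSER_FUNCS[calc](path)
    finally:
        # Assumption: hold=True keeps the directory for later inspection.
        if not hold:
            shutil.rmtree(path)
    return results


results = run("hypothetical input", "energy")
```

Selecting the parser through a dictionary keyed by the calculation type mirrors the role the documentation ascribes to self.parser_funcs.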

run_after(path)[source]

Meant to be extended.

This method is called after a calculation was done, but before entering self.keep() and self.clean(). Can be used to call tools like formchk or ricctools.

set_restart_info(restart_info)[source]

Sets restart information (chkfiles etc.) on the calculator.

Parameters:

restart_info (dict) -- Dictionary holding the calculator state. Used for restoring calculators in restarted calculations.

verify_chkfiles(chkfiles)[source]

Check if the given chkfiles exist and return them as Paths.

Parameters:

chkfiles (dict) -- Dictionary holding the chkfiles. The keys correspond to the attribute names, the values are strs holding the (potentially full) filename (path).

Returns:

paths -- Dictionary of Paths.

Return type:

dict

class pysisyphus.calculators.Calculator.HessKind(value)

An enumeration.

NUMERICAL = 2
ORG = 1
class pysisyphus.calculators.Calculator.KeepKind(value)

An enumeration.

ALL = 1
LATEST = 2
NONE = 3
class pysisyphus.calculators.Calculator.SetPlan(key, name=None, condition=<function SetPlan.<lambda>>, fail=None)[source]
condition()
fail: Optional[Callable] = None
key: str
name: Optional[str] = None

6.3. OverlapCalculator base class

6.4. Calculators with Excited state capabilities

6.4.1. Gaussian09

6.4.2. Gaussian16

6.4.3. OpenMolcas

Pysisyphus currently supports energy and gradient calculations utilizing the &rasscf and/or the &mcpdft sections. Neither analytical nor numerical Hessians are yet implemented for the OpenMolcas-calculator.

Two keywords are always required: inporb and basis, with the former pointing to a .RasOrb file and the latter containing the selected atomic orbital basis, e.g., ano-rcc-vdzp. Additional input for the &gateway, &rasscf and &mcpdft sections can be given via the respective keyword(s).

Due to restrictions of the current design, simple keywords that don't take further arguments, such as cmsi in &rasscf or grad and mspdft in &mcpdft, must still be given with a trailing colon. See below for an example.

Listing 6.1 Optimization using compressed-multi-state PDFT.
geom:
 type: redund
 fn: |
  4
  
  C -1.0398336639  0.0         0.0
  S  0.6002429216  0.0         0.0
  H -1.6321592382 -0.94139864  0.0
  H -1.6321592382  0.94139864  0.0
calc:
 type: openmolcas
 basis: cc-pvdz
 charge: 0
 mult: 1
 inporb: /home/johannes/tmp/359_cmspdft/359_cmspdft.RasOrb
 rasscf:
  ciroot: 3 3 1
  mdrlxroot: 2
  cmsi:
 mcpdft:
  ksdft: t:pbe
  grad:
  mspdft:
opt:
 thresh: gau

6.4.4. ORCA 4.2.1 / 5.0.1

6.4.5. PySCF 1.7.6

6.4.6. Turbomole 7.x

Pysisyphus does not implement a wrapper for define, so the user has to manually prepare a directory containing a valid control file. An automated define wrapper, restricted to ground state functionality, is available via the QCEngine project, to which I contributed the Turbomole harness.

Care should be taken to include only the minimum number of necessary files in the control_path directory, e.g., (auxbasis, basis, control, coord, mos) for a closed-shell calculation using RI. Neither a gradient file, nor subdirectories, nor files with an .out extension may be present in control_path. The coord file, while not strictly required, should be kept too, to facilitate testing of the setup with standalone Turbomole.
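These requirements can be checked before starting a calculation. The helper below is a hypothetical sketch, not part of pysisyphus; it only tests the conditions stated above (control file present, no gradient file, no subdirectories, no .out files).

```python
import tempfile
from pathlib import Path


def check_control_path(control_path):
    """Return a list of problems found in a Turbomole control_path directory."""
    control_path = Path(control_path)
    problems = []
    if not (control_path / "control").is_file():
        problems.append("missing control file")
    if (control_path / "gradient").exists():
        problems.append("gradient file present")
    for entry in sorted(control_path.iterdir()):
        if entry.is_dir():
            problems.append(f"subdirectory present: {entry.name}")
        elif entry.suffix == ".out":
            problems.append(f"output file present: {entry.name}")
    return problems


# Demonstration with a throwaway directory containing three offending entries.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("control", "coord", "mos", "gradient", "dscf.out"):
        (Path(tmp) / name).write_text("")
    (Path(tmp) / "old_run").mkdir()
    problems = check_control_path(tmp)
```

An empty list means that none of the stated restrictions is violated; it does not guarantee that the control file itself is valid.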

It may be a good idea to pre-converge the calculation in control_path, to see if the setup is correct and actually works. Resulting files like energy and statistics can be deleted; mos should be kept, as the converged MOs are reused in pysisyphus.

If an excited-state optimization is desired, care has to be taken to include $exopt [n] for TD-DFT/TDA, or geoopt state=([n]) for ricc2. Tracking of excited states is currently possible for closed-shell egrad and ricc2 calculations.

The current implementation was tested against Turbomole 7.4.1 and QCEngine 0.19.0. Please see examples/complex/11_turbomole_gs_tsopt for a full example where Turbomole is utilized in a growing string calculation. The same example, using QCEngine, is found in examples/complex/12_qcengine_turbomole_gs_tsopt. MOs are not reused with the QCEngine calculator, so the native pysisyphus calculator is probably faster.

6.4.7. DFTB+ 20.x

6.5. Calculators with Ground state capabilities

6.5.1. MOPAC 2016

6.5.2. Psi4

6.5.3. QCEngine

6.5.4. XTB 6.x

6.5.5. Dalton

6.5.6. OpenBabel

6.5.7. CFOUR

6.6. Meta (wrapping) Calculators

6.6.1. ExternalPotential

6.6.2. Restraint

# General input structure for restraints
calc:
 type: ext
 # Multiple potentials could be specified here as a list
 potentials:
   # Primitive internal coordinate restraint
   - type: restraint
     # List of restraints; could also be multiple restraints. Each restraint is given as
     # list of 2 or 3 items.
     #
     # The first item always specifies an internal coordinate,
     # whereas the second argument is a force constant (in atomic units; actual units
     # depend on the coordinate). Optionally a reference value (third argument) can be
     # given. If omitted, the initial coordinate value is used as reference value.
     restraints: [[[BOND, 0, 1], 10, 3.0]]
     # The commented out input below would restrain the bond at its initial value.
     #restraints: [[[BOND, 0, 1], 10]]
     # Multiple restraints are specified as given below.
     #restraints: [[[BOND, 0, 1], 10], [[BEND, 0, 1, 2], 1.0]]
 calc:
  type: [actual calculator that is wrapped by ExternalPotential]
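The effect of a bond restraint such as [[BOND, 0, 1], 10, 3.0] can be sketched as a harmonic penalty on the interatomic distance. The functional form E = k (r - r_ref)^2 used below is an assumption; whether pysisyphus includes an additional prefactor such as 1/2 is not stated here. The function name is hypothetical as well.

```python
import numpy as np


def bond_restraint(coords3d, i, j, force_const, ref_val=None):
    """Energy and Cartesian gradient of E = k * (r_ij - r_ref)**2 (assumed form)."""
    coords3d = np.asarray(coords3d, dtype=float)
    vec = coords3d[i] - coords3d[j]
    dist = np.linalg.norm(vec)
    # Without a reference value the current distance is used, which equals
    # the initial value only when called at the initial geometry.
    if ref_val is None:
        ref_val = dist
    diff = dist - ref_val
    energy = force_const * diff**2
    grad = np.zeros_like(coords3d)
    # Chain rule: dE/dx_i = dE/dr * dr/dx_i, with dr/dx_i = vec / dist.
    grad[i] = 2.0 * force_const * diff * vec / dist
    grad[j] = -grad[i]
    return energy, grad


# Two atoms 2.5 Bohr apart, restrained to 3.0 Bohr with k = 10 au.
coords3d = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 2.5]])
energy, grad = bond_restraint(coords3d, 0, 1, force_const=10, ref_val=3.0)
```

The resulting forces (the negative gradient) push the two atoms apart, towards the reference distance.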

6.6.3. HarmonicSphere

6.6.4. LogFermi

6.6.5. RMSD

6.6.6. DFT-D3

Adds DFT-D3 dispersion corrections as an external potential via the program developed by the Grimme group <https://www.chemie.uni-bonn.de/grimme/de/software/dft-d3/get_dft-d3>.

This is for use with calculators that do not natively provide D3 corrections (e.g. OpenMolcas). Usage mirrors that of other external potentials, with an example given below.

# General input structure for DFT-D3 corrections
calc:
 type: ext
 # Multiple potentials could be specified here as a list
 potentials:
   # Add atom-pairwise D3 dispersion correction as a differentiable, external potential
   - type: d3
     # Functional is specified in TURBOMOLE format, all lower case.
     functional: pbe
     # Optional Becke-Johnson damping, default false, recommended true
     bjdamping: true

 calc:
  type: [actual calculator that is wrapped by ExternalPotential]

6.6.7. AFIR

class pysisyphus.calculators.AFIR.AFIR(calculator, fragment_indices, gamma, rho=1, p=6, ignore_hydrogen=False, zero_hydrogen=True, complete_fragments=True, dump=True, h5_fn='afir.h5', h5_group_name='afir', **kwargs)[source]

Bases: Calculator

__init__(calculator, fragment_indices, gamma, rho=1, p=6, ignore_hydrogen=False, zero_hydrogen=True, complete_fragments=True, dump=True, h5_fn='afir.h5', h5_group_name='afir', **kwargs)[source]

Artificial Force Induced Reaction calculator.

Currently, there are no automated drivers to run large-scale AFIR calculations with many different initial orientations and/or increasing collision energy parameter γ. Nonetheless, selected AFIR calculations can be carried out by hand. After convergence, artificial potential and forces, as well as real energies and forces, can be plotted with 'pysisplot --afir'. The highest energy point along the AFIR path can then be selected for a subsequent TS optimization, e.g., via 'pysistrj --get [index] optimization.trj'.

Future versions of pysisyphus may provide drivers for more automated AFIR calculations.

Parameters:
  • calculator (Calculator) -- Actual QC calculator, e.g., ORCA or Psi4, that provides the energies and their derivatives, which are then modified by the AFIR calculator.

  • fragment_indices (List[List[int]]) -- List of lists of integers, specifying the separate fragments. If the indices in these lists don't comprise all atoms in the molecule, the remaining indices will be added as a separate fragment. If an AFIR calculation is carried out with 2 fragments and 'complete_fragments' is True (see below), it is enough to specify only the indices of one fragment, e.g., for a system of 10 atoms 'fragment_indices=[[0,1,2,3]]' is enough. The second fragment will be set up automatically with indices [4,5,6,7,8,9].

  • gamma (ArrayLike) -- Collision energy parameter γ in au. For 2 fragments it can be a single value, while for > 2 fragments a list of gammas must be given, specifying the pair-wise collision energy parameters. For 3 fragments, 3 gammas must be given [γ_01, γ_02, γ_12]; for 4 fragments, 6 gammas are required [γ_01, γ_02, γ_03, γ_12, γ_13, γ_23]; and so on.

  • rho (ArrayLike, default: 1) -- Direction of the artificial force, either 1 or -1. The same comments as for gamma apply. For 2 fragments a single integer is enough, for > 2 fragments a list of rhos must be given (see above). For rho=1 fragments are pushed together, for rho=-1 fragments are pulled apart.

  • p (int, default: 6) -- Exponent p used in the calculation of the weight function ω. Defaults to 6 and probably does not have to be changed.

  • ignore_hydrogen (bool, default: False) -- Whether hydrogens are ignored in the calculation of the artificial force. All weights between atom pairs containing hydrogen will be set to 0.

  • zero_hydrogen (bool, default: True) -- Whether to use 0.0 as covalent radius for hydrogen in the weight function. In contrast to 'ignore_hydrogen', which results in zero weights for all atom pairs involving hydrogen, the weights obtained with 'zero_hydrogen' may be non-zero, depending on the covalent radius of the second atom in the pair.

  • complete_fragments (bool, default: True) -- Whether an incomplete specification in 'fragment_indices' is automatically completed.

  • dump (bool, default: True) -- Whether an HDF5 file is created.

  • h5_fn (str, default: 'afir.h5') -- Filename of the HDF5 file used for dumping.

  • h5_group_name (str, default: 'afir') -- HDF5 group name used for dumping.

  • **kwargs -- Keyword arguments passed to the Calculator baseclass.
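The automatic completion of 'fragment_indices' and the ordering of the pair-wise gammas can be sketched as follows. Both helper functions are hypothetical illustrations of the rules described above, not the code used by the AFIR calculator.

```python
from itertools import combinations


def complete_fragments(fragment_indices, atom_num):
    """Gather all unspecified atoms in one additional fragment."""
    specified = {i for frag in fragment_indices for i in frag}
    remaining = [i for i in range(atom_num) if i not in specified]
    fragments = [list(frag) for frag in fragment_indices]
    if remaining:
        fragments.append(remaining)
    return fragments


def pair_gammas(fragments, gamma):
    """Map gammas to fragment pairs in the order 01, 02, ..., 12, ..."""
    pairs = list(combinations(range(len(fragments)), 2))
    # A single number is sufficient for two fragments.
    gammas = [gamma] if isinstance(gamma, (int, float)) else list(gamma)
    if len(gammas) != len(pairs):
        raise ValueError(f"Expected {len(pairs)} gammas, got {len(gammas)}.")
    return dict(zip(pairs, gammas))


# 10 atoms, only the first fragment is specified explicitly.
fragments = complete_fragments([[0, 1, 2, 3]], atom_num=10)
two_frag = pair_gammas(fragments, 0.1)
# Three fragments require three gammas (hypothetical values).
three_frag = pair_gammas([[0], [1], [2]], [0.1, 0.2, 0.3])
```

In general, n fragments require n(n-1)/2 gammas, one for every fragment pair.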

afir_fd_hessian_wrapper(coords3d, afir_grad_func)[source]
property charge
dump_h5(atoms, coords, results)[source]
get_energy(atoms, coords, **prepare_kwargs)[source]

Meant to be extended.

get_forces(atoms, coords, **prepare_kwargs)[source]

Meant to be extended.

get_hessian(atoms, coords, **prepare_kwargs)[source]

Get Hessian matrix. Fall back to numerical Hessian, if not overridden.

Preferably, this method should provide an analytical Hessian.

init_h5_group(atoms, max_cycles=None)[source]
log_fragments()[source]
property mult
set_atoms_and_funcs(atoms, coords)[source]

Initially atoms was also an argument to the constructor of AFIR. I removed it so creation becomes easier. The first time a calculation is requested with a proper atom set everything is set up (cov. radii, afir function and corresponding gradient). Afterwards there is only a check if atoms != None and it is expected that all functions are properly set.

fragment_indices can also be incomplete w.r.t. the number of atoms. If the number of specified fragment atoms is less than the number of atoms present, then all remaining unspecified atoms will be gathered in one fragment.

write_fragment_geoms(atoms, coords)[source]
exception pysisyphus.calculators.AFIR.CovRadiiSumZero[source]

Bases: Exception

pysisyphus.calculators.AFIR.afir_closure(fragment_indices, cov_radii, gamma, rho=1, p=6, prefactor=1.0, logger=None)[source]

rho=1 pushes fragments together, rho=-1 pulls fragments apart.

pysisyphus.calculators.AFIR.get_data_model(atoms, max_cycles)[source]

6.6.8. ONIOM

6.6.9. Dimer

6.7. Pure Python calculators

6.7.1. Sympy 2D Potentials

6.7.2. Lennard-Jones

6.7.3. FakeASE

6.7.4. TIP3P