The input uses YAML, a user-friendly structured data format. In addition, Cuby checks the type of the data entered.
In the simplest form, the input is just a list of keywords and their values:
job: energy
interface: turbomole
method: hf
basisset: cc-pVDZ
charge: 0
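To illustrate what the type check has to work with, the sketch below parses the same input with Ruby's standard yaml library and prints the type the parser assigns to each value. It is only an illustration, not Cuby's own input reader; the variable names are made up for this example.

```ruby
require 'yaml'

# The simple input from above, as a string
input_text = <<~INPUT
  job: energy
  interface: turbomole
  method: hf
  basisset: cc-pVDZ
  charge: 0
INPUT

# The parser returns a hash of keyword => value pairs
input = YAML.safe_load(input_text)

# The values already carry a type: text becomes a String, 0 becomes an Integer
input.each do |keyword, value|
  puts "#{keyword}: #{value} (#{value.class})"
end
```

A program reading the input can then compare these types with what each keyword expects.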
When a keyword used by the calculation is missing from the input, Cuby either stops with an error message or continues with a default value (optionally printing a warning). This behavior, as well as the respective default value, is defined separately for each keyword.
YAML uses the character '#' to start a comment, either at the beginning of a line or within it (in the latter case, the '#' must be preceded by a space).
# HF energy calculation
job: energy
method: hf # the method is set here!
More complex computational protocols consist of multiple calculations; the input for each of them is provided in a separate block. Each block has a name, and its contents are indented. The block ends when the indentation returns to the original level. Blocks can be nested arbitrarily. The blocks required or accepted by individual interfaces and protocols are listed in their descriptions.
interface: mixer # Mixes two calculations
calculation_a: # Name of the block defining the first calculation
  method: HF # Contents of the block are indented
  modifiers: restraints
  modifier_restraint: # A nested block
    restraints_setup: yes # Contents of the nested block
  charge: 0 # This belongs to the block calculation_a, as defined by the indentation
calculation_b: # A second block at the root level
  method: MP2
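The way indentation maps to nesting can be checked with a short script. The following sketch parses the mixer input, written with two-space indentation, using Ruby's standard yaml library; it shows only how the parser groups the keywords, not how Cuby processes them.

```ruby
require 'yaml'

input_text = <<~INPUT
  interface: mixer
  calculation_a:
    method: HF
    modifiers: restraints
    modifier_restraint:
      restraints_setup: yes
    charge: 0
  calculation_b:
    method: MP2
INPUT

input = YAML.safe_load(input_text)

# Each block becomes a nested hash stored under the name of the block
block_a = input['calculation_a']
puts block_a['method']                                  # keyword inside calculation_a
puts block_a['charge']                                  # also inside calculation_a
puts block_a['modifier_restraint']['restraints_setup']  # keyword in the nested block
puts input['calculation_b']['method']                   # second root-level block
```

Note that charge ends up in calculation_a and not in the nested block, because its indentation returns to the level of calculation_a's other keywords.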
There is an important difference between Cuby 3 and Cuby 4: In Cuby 3, most of the information was read at the root level of the input and only some setup was read from the respective block (e.g. the charge of the molecule was listed only once at the root level and was used in all the child calculations). In Cuby 4, each calculation has to be completely defined in its own block; Cuby does not look elsewhere (e.g. in a job with multiple calculations, the charge has to be entered in each block defining a calculation). This requires some more typing, but it makes it possible to create very complex inputs systematically.
The YAML language specification does not allow duplicate keys, but the Ruby parser does not complain when it encounters this problem and just overwrites the value. Therefore, the input
key: "ABC"
key: "xyz"
sets the value of the key to "xyz". Unfortunately, this is the behavior of the YAML parser itself, so it is not possible to warn the user when such a duplicate is encountered.
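The overwriting can be reproduced directly with Ruby's standard yaml library. This is a minimal sketch of the parser behavior only; it does not involve Cuby.

```ruby
require 'yaml'

input_text = <<~INPUT
  key: "ABC"
  key: "xyz"
INPUT

# No error is raised; the second occurrence silently replaces the first
input = YAML.safe_load(input_text)

puts input['key']   # prints the value that survived
puts input.size     # number of keys left in the hash
```

Only one entry remains in the resulting hash, so a program that receives it cannot tell that the key was entered twice.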
In more complex inputs, parts of the setup can be reused in multiple calculations. Some protocols support a block 'calculation_common' that applies to all the calculations. The same can be achieved at the level of the YAML language, where parts of the input can be named and then reused.
# Definition of the shared settings
# In YAML, the name of the block can be arbitrary, but in cuby input, we use
# a convention that it should start with prefix 'shared_'. Then, a label starting
# with '&' is used to name the block for further use.
shared_mopac: &mopac_setup
  interface: mopac
  method: pm6
  charge: 0
# Another block of shared settings, these can be mixed as needed
shared_job: &job_setup
  job: interaction
# The job itself
job: multistep
steps: methanol, methylamine
calculation_methanol:
  <<: *mopac_setup # This merges in the shared settings defined above
  <<: *job_setup # This merges in the shared settings defined above
  geometry: S66:02 # Water ... MeOH
calculation_methylamine:
  job: interaction
  <<: *mopac_setup
  geometry: S66:03 # Water ... MeNH2
While it is not required in YAML, Cuby expects all the blocks of shared settings to be named 'shared_...' in order to distinguish them from other input blocks.
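To check what a calculation block contains after the shared settings are merged in, the input can be parsed with Ruby's standard yaml library. The sketch below is an illustration outside Cuby; it assumes a Ruby version whose safe_load accepts the aliases option, which is needed because references such as '*mopac_setup' are rejected by default.

```ruby
require 'yaml'

input_text = <<~INPUT
  shared_mopac: &mopac_setup
    interface: mopac
    method: pm6
    charge: 0
  shared_job: &job_setup
    job: interaction
  job: multistep
  steps: methanol, methylamine
  calculation_methanol:
    <<: *mopac_setup
    <<: *job_setup
    geometry: S66:02
  calculation_methylamine:
    job: interaction
    <<: *mopac_setup
    geometry: S66:03
INPUT

# aliases: true allows the '*name' references used by the merge keys
input = YAML.safe_load(input_text, aliases: true)

# After parsing, the merged keywords appear directly in each block
methanol = input['calculation_methanol']
methylamine = input['calculation_methylamine']

puts methanol.keys.sort.join(', ')
puts methylamine.keys.sort.join(', ')
```

Both blocks end up with the same set of keywords: the first through two merges, the second through one merge plus an explicit job keyword.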
It is possible to modify the input depending on a condition evaluated when the calculation is run (more precisely, when the computational protocol used for the calculation is being initialized). The conditional input has to be placed in a separate block named condition_... and the condition must be defined in this block using the condition keyword. If the condition evaluates to true, the contents of the conditional block are copied one level higher, that is, into the job setup. Multiple conditional blocks can be present.
The condition is evaluated as Ruby code, in the context of the initialization of the computational protocol. Apart from any general code (e.g. testing for the presence of files), the current settings for the calculation are available as the @settings object. No more information on the calculation is available at this point because the condition should be able to overwrite the settings before they are used to build the calculation.
The following example shows a condition dependent on the interface used to perform the calculation. If the same input were used with another interface, the part of the input specific to MOPAC would not be used:
job: energy
interface: mopac
method: am1
geometry: A24:water
# A condition inserts a setup specific to MOPAC
condition_interface:
  condition: "@settings[:interface] == :mopac"
  mopac_precise: yes
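The mechanism described above can be sketched in a few lines of Ruby. The class ProtocolSetup and its method apply_conditions! are hypothetical names invented for this illustration and are not part of Cuby; the settings are modeled as a plain hash with symbol keys, and removing a conditional block after it is processed is a choice made in this sketch only.

```ruby
# Hypothetical stand-in for the protocol initialization (not Cuby's code)
class ProtocolSetup
  attr_reader :settings

  def initialize(settings)
    @settings = settings
  end

  # Copies the contents of every condition_* block whose condition holds
  # one level higher, into the settings themselves
  def apply_conditions!
    block_names = @settings.keys.select { |key| key.to_s.start_with?('condition_') }
    block_names.each do |name|
      block = @settings.delete(name)
      # The condition string is evaluated as Ruby code with @settings visible
      next unless instance_eval(block[:condition])

      block.each { |key, value| @settings[key] = value unless key == :condition }
    end
    self
  end
end

settings = {
  job: :energy,
  interface: :mopac,
  method: :am1,
  condition_interface: {
    condition: '@settings[:interface] == :mopac',
    mopac_precise: true
  }
}

setup = ProtocolSetup.new(settings).apply_conditions!
puts setup.settings.key?(:mopac_precise)
```

Because the condition string is executed as code, such input should only come from a trusted source.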
It is, however, often useful to apply conditions based on the geometry. Although the geometry is not yet defined when the calculation is being prepared, it can be loaded just for the purpose of evaluating the condition, provided it is defined in the settings. In the following example, the number of optimization cycles is increased if the geometry has more than 6 atoms:
job: optimize
interface: mopac
method: pm6
maxcycles: 5
condition_molsize:
  condition: "Geometry.load_from_settings(@settings).size > 6"
  maxcycles: 100
geometry: S66:10