In the simplest form, the input is just a list of keywords and their values:
job: energy
interface: turbomole
method: hf
basisset: cc-pVDZ
charge: 0
When a keyword used by the calculation is missing from the input, Cuby either stops with an error message or continues with a default value (optionally printing a warning). This behavior, as well as the respective default value, is defined separately for each keyword.
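The per-keyword behavior described above can be sketched in Ruby as follows. This is only an illustration, not Cuby's actual code: the `KEYWORDS` table, the `:when_not_present` field, the `lookup` helper, and the default values listed are all hypothetical and made up for the example.

```ruby
# Hypothetical per-keyword definitions (NOT Cuby's real keyword table;
# the policies and defaults below are invented for this sketch).
KEYWORDS = {
  job:    { when_not_present: :default, default: "energy" },
  charge: { when_not_present: :warning, default: 0 },
  method: { when_not_present: :die }
}.freeze

# Return the value of a keyword, applying the policy defined for that
# keyword when it is missing from the input.
def lookup(input, keyword)
  return input[keyword] if input.key?(keyword)

  definition = KEYWORDS.fetch(keyword)
  case definition[:when_not_present]
  when :die
    raise ArgumentError, "Keyword '#{keyword}' must be set in the input"
  when :warning
    warn "Warning: keyword '#{keyword}' not set, " \
         "using default #{definition[:default].inspect}"
    definition[:default]
  else
    # Silent fallback to the default value
    definition[:default]
  end
end

input = { job: "energy", interface: "turbomole" }
charge = lookup(input, :charge) # warns on stderr, then uses the default
```

Keeping the policy next to the default in one table is what allows the behavior to differ from keyword to keyword, as the text describes.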
YAML uses the character '#' to start a comment, either at the beginning or within a line.
# HF energy calculation
job: energy
method: hf # the method is set here!
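As a quick check of how comments are treated, the input above can be loaded with Ruby's standard `yaml` library. This is a minimal sketch that reads the text directly; it does not go through Cuby's own input reader, and the second input string is invented for the demonstration.

```ruby
require "yaml"

text = <<~'INPUT'
  # HF energy calculation
  job: energy
  method: hf # the method is set here!
INPUT

# Comments are discarded by the parser; only the keywords remain.
settings = YAML.safe_load(text)
p settings

# A '#' inside a line starts a comment only when it is preceded by
# whitespace; otherwise it is part of the value.
inline = YAML.safe_load("method: hf#not-a-comment")
p inline
```

The first hash contains only `job` and `method`; in the second, the value of `method` keeps the `#` and everything after it.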
More complex computational protocols consist of multiple calculations. The input for each of them is provided in a separate block of the input. Each block has a name, and the contents of the block are indented. The block ends when the indentation returns to the original level. Blocks can be nested arbitrarily. The blocks required or accepted by the individual interfaces and protocols are listed in their descriptions.
interface: mixer # Mixes two calculations
calculation_a: # Name of the block defining the first calculation
  method: HF # contents of the block is indented
  modifiers: restraints
  modifier_restraint: # A nested block
    restraints_setup: yes # Contents of the nested block
  charge: 0 # This belongs to the block calculation_a, as defined by the indentation
calculation_b: # A second block at the root level
  method: MP2
There is an important difference between Cuby 3 and Cuby 4. In Cuby 3, most of the information was read at the root level of the input and only some setup was read from the respective block (e.g. the charge of the molecule was listed only once at the root level and was used in all the child calculations). In Cuby 4, each calculation has to be completely defined in its own block; Cuby does not look elsewhere (e.g. in a job with multiple calculations, the charge has to be entered in each block defining a calculation). This requires some more typing, but it makes it possible to create very complex inputs systematically.
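The consequence of the Cuby 4 rule can be illustrated with a small Ruby sketch that inspects each calculation block on its own. It is not part of Cuby: the `REQUIRED` list is hypothetical, and the sketch assumes, purely for illustration, that both keywords are mandatory in every block.

```ruby
require "yaml"

text = <<~'INPUT'
  interface: mixer
  charge: 0            # root level: not used by the child calculations
  calculation_a:
    method: HF
    charge: 0
  calculation_b:
    method: MP2
INPUT

input = YAML.safe_load(text)

# Hypothetical list of keywords every calculation block must define itself.
REQUIRED = %w[method charge].freeze

# For each calculation_... block, list the required keywords it lacks.
# Only the block's own keys are consulted, never the root level.
missing = input
  .select { |name, value| name.start_with?("calculation_") && value.is_a?(Hash) }
  .transform_values { |block| REQUIRED - block.keys }
  .reject { |_name, keys| keys.empty? }

missing.each { |name, keys| puts "#{name}: missing #{keys.join(', ')}" }
```

Here `calculation_b` is reported as lacking `charge`, even though a charge is given at the root level, because each block is checked in isolation.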
The YAML language specification does not allow duplicate keys, but the Ruby parser does not complain when it encounters this problem and just overwrites the value. Therefore, the input
key: "ABC"
key: "xyz"
sets the value of the key to "xyz". Unfortunately, this is the behavior of the YAML parser itself, so it is not possible to warn the user when such a duplicate is encountered.
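The overwriting can be reproduced directly with Ruby's standard `yaml` library. This is a minimal sketch of the parser behavior described above, independent of Cuby itself.

```ruby
require "yaml"

text = <<~'INPUT'
  key: "ABC"
  key: "xyz"
INPUT

# No error or warning is raised for the duplicate key;
# the later value silently replaces the earlier one.
settings = YAML.safe_load(text)
puts settings["key"]
puts settings.size
```

The resulting hash holds a single entry for `key`, so nothing in the loaded data shows that a duplicate was present in the file.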
In more complex inputs, parts of the setup can be reused in multiple calculations. Some protocols support a block 'calculation_common' that applies to all the calculations. The same can be achieved at the level of the YAML language where some parts of the input can be named and then reused.
# Definition of the shared settings
# In YAML, the name of the block can be arbitrary, but in cuby input, we use
# a convention that it should start with prefix 'shared_'. Then, a label starting
# with '&' is used to name the block for further use.
shared_mopac: &mopac_setup
  interface: mopac
  method: pm6
  charge: 0

# Another block of shared settings, these can be mixed as needed
shared_job: &job_setup
  job: interaction

# The job itself
job: multistep
steps: methanol, methylamine
calculation_methanol:
  <<: *mopac_setup # This merges in the shared settings defined above
  <<: *job_setup # This merges in the shared settings defined above
  geometry: S66:02 # Water ... MeOH
calculation_methylamine:
  job: interaction
  <<: *mopac_setup
  geometry: S66:03 # Water ... MeNH2
While it is not required in YAML, Cuby expects all the blocks of shared settings to be named 'shared_...' in order to distinguish them from other input blocks.
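To see what the merged blocks look like after parsing, a shortened version of the example can be loaded with Ruby's standard `yaml` library. This sketch only shows the YAML-level expansion; how Cuby then treats the `shared_...` block is not covered here. Note that `safe_load` needs `aliases: true` to accept the `*mopac_setup` references (keyword form available in recent Ruby versions).

```ruby
require "yaml"

text = <<~'INPUT'
  shared_mopac: &mopac_setup
    interface: mopac
    method: pm6
    charge: 0

  job: multistep
  steps: methanol, methylamine
  calculation_methanol:
    <<: *mopac_setup
    geometry: S66:02
  calculation_methylamine:
    <<: *mopac_setup
    geometry: S66:03
INPUT

# aliases: true permits the use of anchors (&name) and aliases (*name)
input = YAML.safe_load(text, aliases: true)

# Each calculation block now carries its own copy of the shared keywords.
p input["calculation_methanol"]
p input["calculation_methylamine"]
```

After loading, both calculation blocks contain `interface`, `method`, and `charge` in addition to their own `geometry`, so each block is complete on its own, as Cuby 4 requires.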
It is possible to modify the input depending on a condition evaluated when the calculation is run (more precisely, when the computational protocol used for the calculation is being initialized). The conditional input has to be placed in a separate block named condition_..., and the condition must be defined in this block using the condition keyword. If the condition evaluates to true, the contents of the conditional block are copied one level higher, that is, into the job setup. Multiple conditional blocks can be present.
The condition is evaluated as Ruby code in the context of the initialization of the computational protocol. Apart from any general code (e.g. testing for the presence of files), the current settings for the calculation are available as the @settings object. No more information on the calculation is available at this point, because the condition must be able to overwrite the settings before they are used to build the calculation.
The following example shows a condition that depends on the interface used to perform the calculation. If the same input were used with another interface, the part of the input specific to MOPAC would not be used:
job: energy
interface: mopac
method: am1
geometry: A24:water
# A condition inserts a setup specific to MOPAC
condition_interface:
  condition: "@settings[:interface] == :mopac"
  mopac_precise: yes
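The mechanism can be sketched in a few lines of Ruby. This is a simplified stand-in, not Cuby's implementation: the `ProtocolContext` class and its `apply_conditions!` method are hypothetical, a plain hash with symbol keys takes the place of the real `@settings` object, and the conversion of the interface name to a symbol is done by hand.

```ruby
require "yaml"

# Hypothetical stand-in for the protocol being initialized; it only holds
# the settings so that the condition string can refer to @settings.
class ProtocolContext
  def initialize(settings)
    @settings = settings
  end

  # Evaluate every condition_... block; when its condition is true, copy
  # the other keywords of the block one level up into the settings.
  def apply_conditions!
    names = @settings.keys.select { |key| key.to_s.start_with?("condition_") }
    names.each do |name|
      block = @settings[name]
      # The condition string is run as Ruby code with self set to this object
      next unless instance_eval(block[:condition])

      block.each { |key, value| @settings[key] = value unless key == :condition }
    end
    @settings
  end
end

text = <<~'INPUT'
  job: energy
  interface: mopac
  method: am1
  geometry: A24:water
  condition_interface:
    condition: "@settings[:interface] == :mopac"
    mopac_precise: yes
INPUT

settings = YAML.safe_load(text, symbolize_names: true)
# The condition compares against the symbol :mopac, so the value is
# converted by hand here (an assumption of this sketch).
settings[:interface] = settings[:interface].to_sym

result = ProtocolContext.new(settings).apply_conditions!
p result[:mopac_precise]
```

Because the condition is executed as code, an input file containing conditions has to come from a trusted source.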
It is, however, useful to apply conditions based on the geometry. While it is not defined yet at the time of the preparation of the calculation, it can be loaded just for the purpose of the evaluation of the condition if it is defined in the settings. In the following example, the number of optimization cycles is increased if the geometry has more than 6 atoms:
job: optimize
interface: mopac
method: pm6
maxcycles: 5
condition_molsize:
  condition: "Geometry.load_from_settings(@settings).size > 6"
  maxcycles: 100
geometry: S66:10