Molecular geometry


In the input, the geometry is specified by the geometry keyword. Several ways of providing it are available; they are described below.

Specifying the geometry from the command line

The value of the geometry keyword can also be set from the command line using the -g shortcut. It is therefore possible to run a Cuby calculation on any geometry file by typing e.g.

cuby4 input.yaml -g geometry.xyz

File formats

Cuby supports the following file formats:

File format   Filename          Read  Write  Comments
XYZ           *.xyz             yes   yes    with extensions described elsewhere in the documentation
PDB           *.pdb             yes   yes    a modified version with more precise coordinates can be used as *.lpdb
Tripos MOL2   *.mol2            yes   yes
SDF           *.sdf             yes   no
Z-matrix      *.zmat, *.gzmat   yes   no     format compatible with Gaussian, converted to Cartesian coordinates upon reading
Turbomole     *.coord           yes   yes    Cartesian coordinates in atomic units
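
As an illustration of how the table can be used, for instance in a script that prepares files for Cuby, the following minimal Python sketch maps a filename extension to the format name and its read/write support. It is not part of Cuby; the names FORMATS and describe_format are hypothetical, and treating *.lpdb as readable and writable like *.pdb is an assumption.

```python
from pathlib import Path

# (format name, can be read, can be written) keyed by filename extension,
# transcribed from the table of supported formats.
# Assumption: ".lpdb" has the same read/write support as ".pdb".
FORMATS = {
    ".xyz": ("XYZ", True, True),
    ".pdb": ("PDB", True, True),
    ".lpdb": ("PDB", True, True),
    ".mol2": ("Tripos MOL2", True, True),
    ".sdf": ("SDF", True, False),
    ".zmat": ("Z-matrix", True, False),
    ".gzmat": ("Z-matrix", True, False),
    ".coord": ("Turbomole", True, True),
}


def describe_format(filename):
    """Return (name, can_read, can_write), or None for an unknown extension."""
    return FORMATS.get(Path(filename).suffix.lower())
```

For example, describe_format("ligand.sdf") reports that SDF files can be read but not written.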

Geometry databases

The geometry databases can be found in the directory cuby4/data/geometries. Each database consists of a tarball containing the geometry files and a YAML file that assigns names to the filenames.

The geometry databases are often associated with data sets of benchmark data; for each of the data sets listed, there is a database of geometries of the same name.
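
The tarball-plus-index layout can be illustrated with a short Python sketch. It is not Cuby's own code: the function read_named_geometry is hypothetical, and the name-to-filename mapping is passed in as a plain dictionary because the exact structure of the YAML file is not described here.

```python
import io
import tarfile


def read_named_geometry(tar_fileobj, name_to_file, name):
    """Return the text of the geometry file registered under `name`.

    `name_to_file` stands in for the mapping stored in the YAML file
    (assumed here to be a flat name -> filename dictionary).
    """
    filename = name_to_file[name]  # raises KeyError for an unknown name
    # "r:*" lets tarfile detect the compression of the archive itself.
    with tarfile.open(fileobj=tar_fileobj, mode="r:*") as tar:
        member = tar.extractfile(filename)
        if member is None:  # the entry exists but is not a regular file
            raise FileNotFoundError(filename)
        return member.read().decode("utf-8")
```

The lookup goes through the index first, so a geometry is always addressed by its name and never by its filename inside the archive.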


The molecular geometry can be built from the SMILES notation.

Cuby uses an external tool for this, the Balloon program. To use this feature, the path to Balloon has to be configured using the keyword balloon_dir.


After the geometry is loaded, it can be modified by the following keywords, which are applied in this order:

  1. geometry_update_coordinates - The coordinates in the geometry are updated with coordinates from another file. This is useful e.g. for loading modified coordinates from an .xyz file into a geometry in PDB format that contains additional information.
  2. geometry_rotate - The geometry is rotated around the origin about the x, y and z axes.
  3. geometry_reorder - The order of the atoms is changed.
  4. ghost_atoms - Selected atoms can be labeled as "ghost atoms", which possess a basis set but no charge in QM calculations.
  5. selection - Only the selected atoms are cut from the geometry.

More options for modifying a geometry are provided by the geometry protocol.

Multiple geometries in file

Some protocols, such as the scan protocol, read multiple geometries from one file. In that case, the .xyz format with multiple entries is used.
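
To make the multi-entry layout concrete, the Python sketch below splits such a file into individual geometries. It assumes the common convention that each entry consists of an atom-count line, a comment line and one line per atom; the function name split_xyz_entries is hypothetical, and this is not how Cuby itself reads the file.

```python
def split_xyz_entries(text):
    """Split multi-entry .xyz text into a list of (comment, atoms) tuples."""
    lines = text.splitlines()
    entries = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between entries
            continue
        n_atoms = int(lines[i])  # first line of an entry: number of atoms
        comment = lines[i + 1] if i + 1 < len(lines) else ""
        atoms = []
        for line in lines[i + 2 : i + 2 + n_atoms]:
            element, x, y, z = line.split()[:4]
            atoms.append((element, float(x), float(y), float(z)))
        if len(atoms) != n_atoms:
            raise ValueError("truncated .xyz entry")
        entries.append((comment, atoms))
        i += 2 + n_atoms  # jump to the start of the next entry
    return entries
```

Each returned tuple holds the comment line of one entry and its atoms as (element, x, y, z).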

Additional data in geometry files

To simplify mass processing of different molecules using a single input file, Cuby can read the charge and multiplicity of the system from the geometry file. This feature has to be enabled in the input using the keyword geometry_setup_from_file. When one or both values are omitted, charge 0 and multiplicity 1 are assumed.

In an .xyz file, the data should be present on the second line of the file in the format 'charge=X multiplicity=Y'. Here is an example for the hydrogen molecule anion:

2
charge=-1 multiplicity=2
H 0.000 0.000 0.000
H 0.000 0.000 0.600
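
The rule for the second line, including the defaults, can be sketched in Python as follows. This is only an illustration of the described behaviour, not Cuby's parser; the function name and the regular expressions are assumptions.

```python
import re


def charge_and_multiplicity_from_xyz(text):
    """Read 'charge=X multiplicity=Y' from the second line of .xyz text.

    Missing values fall back to charge 0 and multiplicity 1.
    """
    lines = text.splitlines()
    comment = lines[1] if len(lines) > 1 else ""
    charge = re.search(r"charge=([+-]?\d+)", comment)
    multiplicity = re.search(r"multiplicity=(\d+)", comment)
    return (
        int(charge.group(1)) if charge else 0,
        int(multiplicity.group(1)) if multiplicity else 1,
    )
```

Because the two values are searched for independently, a line that gives only one of them still yields the default for the other.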

In PDB files, the charge and multiplicity are read from REMARK statements that can be placed anywhere in the file. For example:

REMARK charge -1
REMARK multiplicity 2
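
A comparable Python sketch for PDB text scans every line, since the REMARK statements may appear anywhere in the file. The function name is hypothetical, and the matching of exactly three whitespace-separated fields is an assumption based on the example above.

```python
def charge_and_multiplicity_from_pdb(text):
    """Collect charge and multiplicity from REMARK lines in PDB text.

    Values that are not found default to charge 0 and multiplicity 1.
    """
    charge, multiplicity = 0, 1
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: "REMARK <key> <integer>", as in the example.
        if len(fields) == 3 and fields[0] == "REMARK":
            if fields[1] == "charge":
                charge = int(fields[2])
            elif fields[1] == "multiplicity":
                multiplicity = int(fields[2])
    return charge, multiplicity
```

Other REMARK lines are ignored, so ordinary PDB annotations do not interfere with the lookup.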

In .mol2 files, the total charge is calculated as the sum of the atomic charges. Reading the multiplicity is not supported.