Reparametrization Instructions #41

jolayfield · 2020-07-23T20:23:58Z

My group has been looking to re-parametrize GFN2-xTB to calculate the vibrational structure of a specific class of small molecules. We would be happy to contribute a section to the documentation on how to do this, since it appears to be missing. I am not sure whether the omission is intentional, but if a section is wanted we can add it.

awvwgk · 2020-07-24T09:58:14Z

We did not add information on the parametrisation so far since this is usually not the primary concern for users of xtb. Of course, any contributions to this document are highly appreciated.

The parametrisation tools we use internally are somewhat involved and not available with xtb, so creating a more standard way to access and manipulate the xTB Hamiltonian would certainly be necessary. Also, certain features of the xTB Hamiltonian, especially the self-consistent D4 in GFN2-xTB, add a caveat to any parametrisation attempt, as they require a specially modified xtb binary for this purpose.

RaphaelRobidas · 2021-12-21T18:37:24Z

I would also be interested. Si was recently reparameterized for GFN1-xTB (10.1021/acs.jcim.1c01170). I am guessing that GFN1-xTB is easier to reparameterize than GFN2-xTB? That could be a starting point for this section of the documentation.

awvwgk · 2021-12-21T18:40:01Z

I created some basic instructions for parametrization of the xTB Hamiltonian at https://tblite.readthedocs.io/en/latest/tutorial/fitting.html. The base parametrization doesn't really matter, as long as the infrastructure can handle the different parametrizations on an equal footing.

RaphaelRobidas · 2021-12-21T19:21:00Z

This is great, thank you. The procedure seems to work so far. There is only one detail which is unclear to me in that tutorial, namely the format of the reference data. Is this described a bit more elsewhere?

It is mentioned that the structures must be in Turbomole format. I am guessing that each directory thus contains the reference energy as a Turbomole output file? Unfortunately, I do not have access to Turbomole and would benefit from details on how to convert my reference data into the correct format.

awvwgk · 2021-12-21T21:40:32Z

> It is mentioned that the structures must be in Turbomole format.

The geometry input format doesn't really matter: coord, xyz, gen, ein, mol, sdf, pdb, and vasp are currently supported. It should, however, be consistent across the data set to allow automatic processing.

> I am guessing that each directory thus contains the reference energy as Turbomole output file?

Currently I'm using a format from DFTB+ called tagged data (see https://github.com/tblite/tblite/blob/main/man/tblite-tag.5.adoc). Using a more standardized format in the future would be preferable.

Feedback is welcome.

RaphaelRobidas · 2021-12-21T23:42:31Z

Thanks, I can get the fitting process running. The file format is not a huge issue in my opinion, since the calculation output files need to be parsed and formatted anyway. cclib actually parses the gradients for most packages. It would be fairly straightforward to write a helper script which parses the necessary information from raw output files and generates reference data in the required format. I'll code it, if you'd like.
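A minimal sketch of the writing half of such a helper script is given below. It assumes the tagged data layout is a header line of the form `name :type:rank:shape` followed by the values, as I understand it from the tblite-tag man page linked above; the exact column widths and the tag names `energy` and `gradient` are assumptions and should be checked against that page. The function names `format_tag` and `write_reference` are hypothetical. Parsing the raw output (for example with cclib) and converting to atomic units is left to the caller.

```python
def format_tag(name, values, shape=()):
    """Format one tagged entry: a header line followed by the values.

    Assumed header layout: '<name padded to 20 chars>:real:<rank>:<shape>'.
    Values are written three per line in scientific notation.
    """
    values = [float(v) for v in values]
    expected = 1
    for dim in shape:
        expected *= dim
    if len(values) != expected:
        raise ValueError(f"{name}: expected {expected} values, got {len(values)}")
    dims = ",".join(str(dim) for dim in shape)
    lines = [f"{name:<20}:real:{len(shape)}:{dims}"]
    for start in range(0, len(values), 3):
        chunk = values[start:start + 3]
        lines.append("".join(f"{v:24.15E}" for v in chunk))
    return "\n".join(lines) + "\n"


def write_reference(path, energy, gradient):
    """Write energy (Hartree) and per-atom gradient (Hartree/Bohr) to path.

    gradient is a list of [gx, gy, gz] rows, one per atom. Flattening it
    atom by atom matches column-major storage of a (3, natoms) array.
    """
    flat = [component for atom in gradient for component in atom]
    with open(path, "w") as handle:
        handle.write(format_tag("energy", [energy]))
        handle.write(format_tag("gradient", flat, shape=(3, len(gradient))))
```

For a two-atom structure, `write_reference("reference.tag", -1.5, [[0.0, 0.0, 0.01], [0.0, 0.0, -0.01]])` would write one `energy` entry and one `gradient` entry with shape `3,2`; the file name here is only an example.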

Also, the fitting requires the virial. I'm not sure what that corresponds to, and it does not seem to be a term used in the output files (or at least it does not refer to a 3x3 matrix there). Does it require a certain keyword, or am I just missing the right synonym?

awvwgk · 2021-12-22T07:59:35Z

The virial (pressure) is related to the stress tensor / lattice gradient and is usually only available from periodic DFT programs. tblite always calculates and prints it, but entries which don't have an equivalent in the reference will be ignored.
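For orientation, one common set of definitions connecting these quantities is sketched below, with $E$ the energy, $\varepsilon$ the strain, and $V$ the cell volume. Sign and normalization conventions differ between programs, so which convention a given reference code (or tblite) follows is an assumption to verify, not something stated in this thread.

```latex
\sigma_{\alpha\beta} = \frac{1}{V}\,\frac{\partial E}{\partial \varepsilon_{\alpha\beta}},
\qquad
P = -\frac{1}{3}\sum_{\alpha} \sigma_{\alpha\alpha}
```

In this picture the virial is the un-normalized strain derivative $\partial E / \partial \varepsilon_{\alpha\beta}$ (up to sign), which is a 3x3 matrix in energy units.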

RaphaelRobidas · 2021-12-23T16:34:28Z

Thanks a lot for the help! The reparameterization runs as expected when the virial is omitted. The fitting tends either to diverge to very large parameter values (which become NaN) or to stay quite close to the initial parameters. I'm guessing this depends on the initial guess and is an expected challenge of the fitting procedure.

RaphaelRobidas · 2021-12-24T14:59:23Z

Actually, I just realized that the reference energies need to be normalized in some way before they can be compared with the calculated xTB energies. Best practices for this step would be useful to have in the documentation as well.
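One simple way to do this, sketched here as an assumption and not as the procedure used in the tblite tutorial, is to compare relative energies: within a set of structures that share the same composition, subtract the energy of one anchor structure from both the reference and the xTB energies, so that the method-dependent absolute offset cancels. The function names and the dictionary layout are hypothetical.

```python
def relative_to(energies, anchor):
    """Return energies relative to the entry named anchor (same units)."""
    base = energies[anchor]
    return {name: value - base for name, value in energies.items()}


def relative_energy_errors(reference, calculated, anchor):
    """Error of each calculated relative energy against the reference.

    Both inputs map structure names to absolute energies. They must share
    the same keys and describe structures with identical composition.
    """
    if reference.keys() != calculated.keys():
        raise ValueError("reference and calculated sets differ")
    ref_rel = relative_to(reference, anchor)
    calc_rel = relative_to(calculated, anchor)
    return {name: calc_rel[name] - ref_rel[name] for name in reference}
```

For sets with varying composition this does not apply directly; reaction or atomization energies would be the analogous quantities there.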

awvwgk added the labels enhancement (New feature or request) and xtb (Related to the extended tight binding program package) on Jul 24, 2020.
