Reparametrization Instructions #41

Open
jolayfield opened this issue Jul 23, 2020 · 9 comments
Labels
enhancement New feature or request xtb Related to the extended tight binding program package

Comments

@jolayfield

My group has been looking to re-parametrize GFN2-xTB to calculate the vibrational structure of a specific class of small molecules. The documentation does not appear to cover how to do this, and we would be happy to contribute a section on it. I am not sure whether the omission is intentional, but we can add that section if it is desired.

@awvwgk awvwgk added enhancement New feature or request xtb Related to the extended tight binding program package labels Jul 24, 2020
@awvwgk
Member

awvwgk commented Jul 24, 2020

We did not add information on the parametrisation so far since this is usually not the primary concern for users of xtb. Of course, any contributions to this document are highly appreciated.

The parametrisation tools we use internally are somewhat involved and not available with xtb, so creating a more standard way to access and manipulate the xTB Hamiltonian would certainly be necessary. Also, certain features of the xTB Hamiltonian, especially the self-consistent D4 in GFN2-xTB, add a caveat to any parametrisation attempt, as they require a specially modified xtb binary for this purpose.

@RaphaelRobidas

I would also be interested. Si was recently reparameterized for GFN1-xTB (DOI: 10.1021/acs.jcim.1c01170). I am guessing that GFN1-xTB is easier to reparameterize than GFN2-xTB? That could be a starting point for this section of the documentation.

@awvwgk
Member

awvwgk commented Dec 21, 2021

I created some basic instructions for parametrization of the xTB Hamiltonian at https://tblite.readthedocs.io/en/latest/tutorial/fitting.html. The base parametrization doesn't actually matter much, as long as the infrastructure allows handling them on equal footing.

@RaphaelRobidas

This is great, thank you. The procedure seems to work so far. There is only one detail which is unclear to me in that tutorial, namely the format of the reference data. Is this described a bit more elsewhere?

It is mentioned that the structures must be in Turbomole format. I am guessing that each directory thus contains the reference energy as a Turbomole output file? Unfortunately, I do not have access to Turbomole and would benefit from details on how to convert my reference data into the correct format.

@awvwgk
Member

awvwgk commented Dec 21, 2021

> It is mentioned that the structures must be in Turbomole format.

The geometry input format doesn't really matter; coord, xyz, gen, ein, mol, sdf, pdb, and vasp are currently supported. It should, however, be consistent to allow automatic processing.

> I am guessing that each directory thus contains the reference energy as Turbomole output file?

Currently I'm using a format from DFTB+ called tagged data (see https://github.com/tblite/tblite/blob/main/man/tblite-tag.5.adoc). Using a more standardized format in the future would be preferable.

Feedback is welcome.

@RaphaelRobidas

Thanks, I can get the fitting process running. The file format is not a huge issue in my opinion, since the calculation output files need to be parsed and formatted anyway. cclib actually parses the gradients for most packages. It would be fairly straightforward to write a helper script which parses the necessary information from raw output files and generates reference data in the required format. I'll code it, if you'd like.
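
A minimal sketch of what the writing half of such a helper script could look like is given below. It is not the tblite implementation: the function name `format_tagged` is hypothetical, and the header layout (`name :type:rank:shape`, with the name padded to 20 characters and the fastest-running dimension listed first) is an assumption based on the DFTB+ tagged-data convention, so it should be checked against tblite-tag(5) before use. The tag names `energy` and `gradient` are likewise assumed. Values are written exactly as passed in; unit conversion, and the sign change if a program reports forces rather than gradients, are left to the caller.

```python
from typing import Sequence


def format_tagged(energy: float, gradient: Sequence[Sequence[float]]) -> str:
    """Render an energy and an (natoms x 3) gradient as tagged-data text.

    ASSUMPTION: the header layout `name :type:rank:shape` and the tag names
    `energy` and `gradient` are not taken from this thread; verify them
    against tblite-tag(5).
    """
    natoms = len(gradient)
    # Scalar entry: rank 0, so the shape field after the last colon is empty.
    lines = [f"{'energy':<20}:real:0:", f"{energy:24.16E}"]
    # Rank-2 entry: three Cartesian components per atom, one atom per row.
    lines.append(f"{'gradient':<20}:real:2:3,{natoms}")
    for row in gradient:
        if len(row) != 3:
            raise ValueError("each gradient row needs three components")
        lines.append(" ".join(f"{value:24.16E}" for value in row))
    return "\n".join(lines) + "\n"
```

The energy and gradient passed to this function would come from whatever parser reads the raw output files, for example cclib as suggested above.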

Also, the fitting requires the virial. I'm not sure what that corresponds to, and it does not seem to be a term used in the output files (or it does not refer to a 3x3 matrix). Does it require a certain keyword, or am I just missing the right synonym?

@awvwgk
Member

awvwgk commented Dec 22, 2021

The virial (pressure) is related to the stress tensor / lattice gradient and is usually only available from periodic DFT programs. tblite always calculates and prints it, but entries which don't have an equivalent in the reference will be ignored.
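
For readers looking for the matching quantity in other programs, the relation between these terms can be sketched as follows. This is the general textbook relation, not a statement of tblite's exact definition. The symbols are assumptions introduced here: $\varepsilon$ is a homogeneous strain applied to all atomic positions $r_i$ and lattice vectors $L_k$, and $V$ is the cell volume. The sign and prefactor attached to the word "virial" differ between codes, so the convention should be checked before comparing numbers.

```latex
% Strain derivative of the energy (a 3x3 matrix) under the homogeneous
% deformation r -> (1 + \varepsilon) r of positions and lattice vectors
\frac{\partial E}{\partial \varepsilon_{\alpha\beta}}
  = \sum_{i} \frac{\partial E}{\partial r_{i\alpha}}\, r_{i\beta}
  + \sum_{k} \frac{\partial E}{\partial L_{k\alpha}}\, L_{k\beta},
\qquad
\sigma_{\alpha\beta}
  = \frac{1}{V}\, \frac{\partial E}{\partial \varepsilon_{\alpha\beta}}
```

The first sum contains the ordinary nuclear gradient, the second the lattice gradient, and dividing by the volume gives the stress tensor $\sigma$. This is why the quantity is normally reported only by periodic codes.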

@RaphaelRobidas

Thanks a lot for the help! The reparameterization runs as expected when the virial is omitted. The fitting has a tendency to either diverge to very high values (which become NaN) or stay quite close to the initial parameters. I'm guessing this depends on the initial guess and is an expected challenge of the fitting procedure.

@RaphaelRobidas

Actually, I just realized that the reference energies need to be normalized in some way to be compared with the calculated xTB energies. The best practices of this process could be useful to have in the documentation as well.
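
The thread does not say which normalization is appropriate, so the following is only one common choice, shown as a sketch: shifting both the reference and the xTB energies so that a chosen baseline structure sits at zero, which leaves relative energies to compare. The function name `relative_energies` and the structure labels are hypothetical, and this may not be how the tblite fitting workflow expects the data to be prepared.

```python
from typing import Dict


def relative_energies(energies: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Shift all energies so that the `baseline` structure is at zero.

    ASSUMPTION: comparing energies relative to one baseline structure is an
    illustrative choice, not a procedure confirmed in this thread.
    """
    if baseline not in energies:
        raise KeyError(f"baseline structure {baseline!r} not in energies")
    shift = energies[baseline]
    # Absolute energies from different methods differ by a large offset;
    # subtracting the same structure's energy in each set removes it.
    return {name: value - shift for name, value in energies.items()}
```

Applying the same function with the same baseline to the reference set and to the xTB set gives two dictionaries that can be compared entry by entry. Atomization or reaction energies would be alternative normalizations.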
