ML cookbook ideas (suggestions welcome) #10

Open
mattwelborn opened this issue Sep 4, 2019 · 0 comments

This issue proposes some ideas for ML tutorial content using the data on QCArchive. These examples should focus on use cases for quantum chemical (QC) data in ML, including, e.g., supervised learning of relationships between:

  • structure and QC properties
  • QC properties and function

and e.g. unsupervised learning of:

  • molecule clusters on the basis of QC properties
  • interest in different classes of molecules or theory methods based on distributions found in the QCArchive data

These examples should demonstrate the key advantages of QCArchive as a distribution method for ML data over the current model of supplementary information (SI) files and Figshare deposits: uniform data formats, interoperability/composability, trusted provenance, and discovery of new datasets.

Some specific ideas for examples:

  • Train a model on QM7b to predict DFT energy from molecular geometries using Coulomb, SLATM, and SOAP features with a kernel method. Test the model on QM9 or GDB-13. Train a model with a combination of datasets (e.g. QM7b + QM9).
  • Fit a water model to the TIP4P functional form using THG's water cluster dataset, perhaps using Bayesian regression.
  • Placeholder for something using a generative model.
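
As a starting point for the first idea, here is a minimal sketch of the Coulomb matrix featurization and a Gaussian kernel, in standard-library Python only. It assumes nuclear charges and Cartesian coordinates in Bohr are already in hand; the geometries below are illustrative hand-typed values, not data pulled from QCArchive, and the names `featurize`, `MAX_ATOMS`, and the kernel width are hypothetical choices for this example. A real tutorial would fetch geometries and energies from a QCArchive dataset and would likely lean on existing libraries for SLATM/SOAP features and kernel ridge regression.

```python
import math


def coulomb_matrix(charges, coords):
    """Coulomb matrix: 0.5 * Z_i**2.4 on the diagonal, Z_i * Z_j / |R_i - R_j| elsewhere."""
    n = len(charges)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                m[i][j] = 0.5 * charges[i] ** 2.4
            else:
                m[i][j] = charges[i] * charges[j] / math.dist(coords[i], coords[j])
    return m


def featurize(charges, coords, max_atoms):
    """Sort atoms by descending row norm, zero-pad to max_atoms, flatten the upper triangle.

    Sorting makes the feature vector independent of the input atom order;
    padding gives molecules of different sizes vectors of equal length.
    """
    m = coulomb_matrix(charges, coords)
    n = len(charges)
    order = sorted(range(n), key=lambda i: -math.sqrt(sum(v * v for v in m[i])))
    features = []
    for a in range(max_atoms):
        for b in range(a, max_atoms):
            inside = a < n and b < n
            features.append(m[order[a]][order[b]] if inside else 0.0)
    return features


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel between two feature vectors of equal length."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * sigma ** 2))


# Illustrative geometries in Bohr (assumed values, not taken from any dataset).
MAX_ATOMS = 3
h2 = featurize([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.4)], MAX_ATOMS)
water = featurize(
    [8, 1, 1],
    [(0.0, 0.0, 0.0), (1.43, 1.1, 0.0), (-1.43, 1.1, 0.0)],
    MAX_ATOMS,
)

# Kernel width of 100.0 is arbitrary here; in practice it is a tuned hyperparameter.
similarity = gaussian_kernel(h2, water, sigma=100.0)
print(len(h2), similarity)
```

From here, kernel ridge regression amounts to building the kernel matrix over the training molecules, adding a regularization term to its diagonal, and solving the resulting linear system against the reference energies. Training on one dataset and testing on another (e.g. QM7b, then QM9) only requires a shared `max_atoms` large enough for both.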