Unify the storage for examples #314

Open
ondrejkrejci opened this issue Nov 5, 2024 · 14 comments
@ondrejkrejci
Collaborator

While working on #233 and trying not to change the results there, I found that our example files are scattered all over the place: Zenodo, Dropbox, Mega ...
Most of all, I do not like that pyridineDensOverlap/run.sh needs sudo for the Mega tools.
I suggest moving everything to Zenodo and adjusting the examples accordingly. Let's discuss this.

@NikoOinonen
Collaborator

Another thought about this: a while ago I wrote a Python function for easily downloading datasets in the GPU scripts here:

ppafm/ppafm/data.py

Lines 47 to 49 in 12a9034

def download_dataset(name: str, target_dir: PathLike):
    """
    Download and unpack a dataset to a target directory.

It downloads and unpacks a named dataset to a chosen location while printing a progress percentage, and skips the download if the destination already exists.

Maybe we could make a CLI command for this, something like

ppafm-download [DATASET_NAME] [SAVE_PATH]

This would somewhat simplify the code and avoid repeated downloads in the CLI scripts.
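A minimal sketch of what such a command could look like, assuming an `argparse` front end around the existing `download_dataset` function. The argument names, the default save path, and the injectable `downloader` parameter are hypothetical and only serve the illustration; none of this is an existing ppafm entry point.

```python
import argparse
from pathlib import Path
from typing import Callable, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build the parser for the proposed (hypothetical) ppafm-download command."""
    parser = argparse.ArgumentParser(
        prog="ppafm-download",
        description="Download and unpack a named example dataset.",
    )
    parser.add_argument("dataset_name", help="Name of the dataset to download.")
    parser.add_argument(
        "save_path",
        nargs="?",
        default=None,
        help="Target directory (assumed default: ./<dataset_name>).",
    )
    return parser


def main(
    argv: Optional[Sequence[str]] = None,
    downloader: Optional[Callable[[str, Path], None]] = None,
) -> Path:
    """Parse the arguments and hand them to the download function."""
    args = build_parser().parse_args(argv)
    target = Path(args.save_path) if args.save_path else Path(args.dataset_name)
    if downloader is None:
        # Assumption: the helper quoted above is importable from ppafm.data.
        from ppafm.data import download_dataset as downloader
    downloader(args.dataset_name, target)
    return target
```

Passing the downloader as a parameter keeps the argument handling testable without network access; a real command would simply call `main()` with no arguments.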

@ProkopHapala
Collaborator

Question @ondrejkrejci @NikoOinonen @yakutovicha: should I upload the directory structure of the examples, with the downloaded files, directly to Zenodo? Perhaps that would be the easiest to navigate?

@ondrejkrejci
Collaborator Author

I am not sure if that is possible. If so, it is the easiest option; otherwise I would rename the files as 'directory-filename.tgz' so the naming is consistent and easy to understand.

@NikoOinonen
Collaborator

I think we want one compressed file per example, so that they can be downloaded individually. Also, we only need to upload the big files (.xsf), not the ones that only use an xyz geometry.
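As a rough sketch of this packing scheme: one archive per example, containing only the large volumetric files. The suffix list, the function name, and the choice of `.zip` are assumptions made for illustration.

```python
import zipfile
from pathlib import Path

# Assumption: these suffixes mark the large volumetric files worth uploading.
BIG_SUFFIXES = {".xsf", ".cube"}


def pack_examples(examples_dir: Path, out_dir: Path) -> list[Path]:
    """Write one <example>.zip per example directory that contains big files."""
    out_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    for example in sorted(p for p in examples_dir.iterdir() if p.is_dir()):
        big_files = sorted(
            f for f in example.rglob("*") if f.is_file() and f.suffix in BIG_SUFFIXES
        )
        if not big_files:
            continue  # examples that only use an xyz geometry need no upload
        archive = out_dir / f"{example.name}.zip"
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            for f in big_files:
                # Store paths relative to the examples root, e.g. C60/LOCPOT.xsf
                zf.write(f, f.relative_to(examples_dir).as_posix())
        archives.append(archive)
    return archives
```

Keeping the example name as the top-level folder inside each archive means the files unpack back into the same layout they came from.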

@ProkopHapala
Collaborator

Hi,

@NikoOinonen @ondrejkrejci @yakutovicha @mondracek

I downloaded all the files from the examples (unless I forgot something), and I'm prepared to upload them.

I think it would be better to put them in a flat directory structure and make a .zip from each directory. This means I extracted those from paper_figure.

I suggest these directories:

├── C60
│   ├── CHGCAR.xsf
│   ├── LOCPOT.xsf
│   └── mol.xyz
├── CH3Br_KPFM
│   ├── hartree_potential_V0.cube
│   └── hartree_potential_Vz.cube
├── CO_tip
│   ├── CO_delta_density_aims.xsf
│   └── density_CO.xsf
├── FAD
│   ├── CHGCAR.xsf
│   ├── LOCPOT.xsf
│   └── mol.xyz
├── FFPB
│   ├── CHGCAR.xsf
│   ├── LOCPOT.xsf
│   └── mol.xyz
├── FFPB_KPFM
│   ├── LOCPOT_V0.xsf
│   └── LOCPOT_Vz.xsf
├── Pentacene
│   ├── CHGCAR.xsf
│   ├── LOCPOT.xsf
│   └── mol.xyz
├── Phtalocyanine
│   ├── CHGCAR.xsf
│   ├── LOCPOT.xsf
│   └── mol.xyz
├── PTCDA
│   ├── CHGCAR.xsf
│   ├── LOCPOT.xsf
│   └── mol.xyz
├── PTCDA_Ag
│   └── LOCPOT.xsf
├── pyridine
│   ├── sample
│   │   ├── CHGCAR.xsf
│   │   └── LOCPOT.xsf
│   └── tip
│       └── CHGCAR.xsf
└── pyridineBrCl
    ├── density.xsf
    └── hartree.xsf

Please let me know if I forgot something, or if you have a different suggestion.

Some notes:

  • I added pyridineBrCl example from Niko - mentioned here
  • There are two distinct PTCDAs
    • PTCDA is from paper_figure
    • PTCDA_Ag is the original system of two PTCDA molecules on the Ag(111) substrate (used in PTCDA_Hartree and PTCDA_Hartree_dz2)
  • There are two distinct CO tips
    • that in directory CO_tip was used in paper_figure.
    • that in pyridine/tip was used in pyridine and is calculated for grid-shape matching pyridine sample.
      • In CPU version we need matching grids (as far as I know), in GPU version done by Niko there is some solution which allows different grid shapes for tip and sample (Am I right?). Maybe it would be worth implementing it also for CPU, at some point?

Just for comparison, here is the current directory structure of the examples (after downloading the files with the scripts):

├── benzeneBrCl2
│   ├── dichlor-brom-benzene.xyz
│   ├── params.ini
│   └── run.sh
├── CH3Br_KPFM
│   ├── hartree_potential_V0.cube
│   ├── hartree_potential_Vz.cube
│   ├── params.ini
│   ├── rhoTip.xsf
│   └── run.sh
├── CoPc-IETS
│   ├── answer.xyz
│   ├── atomtypes.ini
│   ├── cel.lvs
│   ├── params.ini
│   └── run.sh
├── CorrectionLoopGraphene
│   ├── clean.sh
│   ├── input.xyz
│   ├── ref.xyz
│   ├── simplePot.sh
│   ├── simplePotTest.xyz
│   └── _test_CorrectionLoop.py
├── FePc_Au-IETS
│   ├── clean.sh
│   ├── geom-cube.in
│   ├── input_plot.xyz
│   ├── params.ini
│   └── run_ppafm-iets.sh
├── FFPB_KPFM
│   ├── input_plot.xyz
│   ├── KPFM_hartree.tar.gz
│   ├── LOCPOT_V0.xsf
│   ├── LOCPOT_Vz.xsf
│   ├── params.ini
│   ├── rhoTip.xsf
│   └── run.sh
├── Generator
│   ├── example_molecules
│   │   ├── bcb.xyz
│   │   ├── formic_acid.xyz
│   │   ├── out2.xyz
│   │   └── out3.xyz
│   ├── generator_trainer.py
│   └── inverse_trainer_xyz.py
├── Graphene
│   ├── Gr6x6N3hole.xyz
│   ├── params.ini
│   ├── run.bat
│   ├── run_dev.sh
│   ├── run_gpu.py
│   └── run.sh
├── Graphene_mod
│   ├── AutoSegmentImage.py
│   ├── fitAtoms2.py
│   ├── fitAtoms.py
│   ├── Gr6x6N3hole-.xyz
│   ├── Gr6x6N3hole.xyz
│   ├── params.ini
│   ├── params.ini-bak
│   └── run.sh
├── Graphene-spline
│   ├── curve_points.ini
│   ├── Gr6x6N3hole.xyz
│   ├── params.ini
│   ├── run.sh
│   └── TipRSpline.ini
├── paper_figure
│   ├── CO-densities
│   │   ├── CO_delta_density_aims.xsf
│   │   └── density_CO.xsf
│   ├── dft-afm
│   │   ├── C60.npz
│   │   ├── FAD.npz
│   │   ├── FFPB.npz
│   │   ├── Pentacene.npz
│   │   ├── Phtalocyanine.npz
│   │   └── PTCDA.npz
│   ├── hartree-density
│   │   ├── C60
│   │   │   ├── CHGCAR.xsf
│   │   │   ├── LOCPOT.xsf
│   │   │   └── mol.xyz
│   │   ├── FAD
│   │   │   ├── CHGCAR.xsf
│   │   │   ├── LOCPOT.xsf
│   │   │   └── mol.xyz
│   │   ├── FFPB
│   │   │   ├── CHGCAR.xsf
│   │   │   ├── LOCPOT.xsf
│   │   │   └── mol.xyz
│   │   ├── Pentacene
│   │   │   ├── CHGCAR.xsf
│   │   │   ├── LOCPOT.xsf
│   │   │   └── mol.xyz
│   │   ├── Phtalocyanine
│   │   │   ├── CHGCAR.xsf
│   │   │   ├── LOCPOT.xsf
│   │   │   └── mol.xyz
│   │   └── PTCDA
│   │       ├── CHGCAR.xsf
│   │       ├── LOCPOT.xsf
│   │       └── mol.xyz
│   └── run_simulation.py
├── PTCDA_Hartree
│   ├── LOCPOT.xsf
│   ├── LOCPOT.xsf.zip
│   ├── params.ini
│   ├── run_gpu_easy.py
│   ├── run_gpu.py
│   └── run.sh
├── PTCDA_Hartree_dz2
│   ├── example_ptcda_hartree.py
│   ├── LOCPOT.xsf
│   ├── LOCPOT.xsf.zip
│   ├── params.ini
│   ├── run.sh
│   └── tip.py
├── PTCDA_single
│   ├── example_ptcda.py
│   ├── params.ini
│   ├── PTCDA.xyz
│   └── run.sh
├── pyridineDensOverlap
│   ├── params.ini
│   ├── run_gpu.py
│   ├── run.sh
│   ├── sample
│   │   ├── CHGCAR.xsf
│   │   └── LOCPOT.xsf
│   └── tip
│       └── CHGCAR.xsf
└── ToZenodo.md

@NikoOinonen
Collaborator

NikoOinonen commented Nov 19, 2024

that in pyridine/tip was used in pyridine and is calculated for grid-shape matching pyridine sample.

If I recall, the density_CO.xsf in the paper figure was the same density as in the pyridine example, so there might be some redundancy in there.

in GPU version done by Niko there is some solution which allows different grid shapes for tip and sample (Am I right?).

Yes, there is an interpolation step that resamples the input file grids to the specified FF grid.
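To illustrate the idea of such a resampling step, here is a small pure-Python sketch that resamples a 3D grid to a new shape by trilinear interpolation. It is not the GPU implementation: the choice of trilinear interpolation, the periodic wrap-around (a grid without a duplicated end point), and the function name are all assumptions made for this example.

```python
from itertools import product


def resample_periodic(grid, new_shape):
    """Resample a periodic 3D grid (nested lists, grid[ix][iy][iz]) to new_shape."""
    old_shape = (len(grid), len(grid[0]), len(grid[0][0]))

    def axis_weights(n_old, n_new):
        # For each new index: (lower old index, upper old index, fraction between them).
        weights = []
        for i in range(n_new):
            x = i * n_old / n_new  # same fractional position in the cell
            lo = int(x)
            weights.append((lo % n_old, (lo + 1) % n_old, x - lo))
        return weights

    wx, wy, wz = (axis_weights(o, n) for o, n in zip(old_shape, new_shape))
    result = [
        [[0.0] * new_shape[2] for _ in range(new_shape[1])]
        for _ in range(new_shape[0])
    ]
    for (i, (x0, x1, fx)), (j, (y0, y1, fy)), (k, (z0, z1, fz)) in product(
        enumerate(wx), enumerate(wy), enumerate(wz)
    ):
        value = 0.0
        # Weighted sum over the 8 corners of the enclosing old-grid cell.
        for xi, cx in ((x0, 1 - fx), (x1, fx)):
            for yi, cy in ((y0, 1 - fy), (y1, fy)):
                for zi, cz in ((z0, 1 - fz), (z1, fz)):
                    value += cx * cy * cz * grid[xi][yi][zi]
        result[i][j][k] = value
    return result
```

With a step like this applied to both inputs, tip and sample densities given on different grid shapes end up on one common grid before they are combined.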

@ProkopHapala
Collaborator

ProkopHapala commented Nov 25, 2024

If I recall, the density_CO.xsf in the paper figure was the same density as in the pyridine example, so there might be some redundancy in there.

Aha, I see, they are really the same (by diff). So I will probably reuse CO tip from pyridine, and reuse this one.
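A check like this can also be scripted when many large files need comparing. The sketch below groups files by SHA-256 digest; the function names are made up for the example, and hashing is just one way to do it (`diff` or `filecmp.cmp(a, b, shallow=False)` would work as well).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large .xsf files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Return groups of paths whose contents are byte-for-byte identical."""
    groups = {}
    for p in paths:
        groups.setdefault(sha256_of(p), []).append(p)
    return [group for group in groups.values() if len(group) > 1]
```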

And what is CO_delta_density_aims.xsf? It has a completely different size. Do you use both for the benchmark in the paper?

Anyway, if nobody (@ondrejkrejci @yakutovicha @mondracek) has any suggestions or objections, I will proceed with uploading.

@ondrejkrejci
Collaborator Author

I agree. One might ask why we have so many PTCDA examples now, but historically they have their place.

@yakutovicha
Collaborator

Go ahead @ProkopHapala. In the worst case (if something won't be organised optimally) we can always make v2 👍 .

@NikoOinonen
Collaborator

And what is CO_delta_density_aims.xsf? It has a completely different size. Do you use both for the benchmark in the paper?

It is the delta electron density, difference from free atom density, that I used for the electrostatics part in the FDBM simulations in the paper.
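Written out, the definition above reads roughly as follows. The symbols are my own notation, not taken from the paper: $\rho_{\mathrm{CO}}$ is the electron density of the CO tip, and $\rho_i^{\mathrm{free}}$ is the density of the isolated atom $i$ placed at its position $\mathbf{R}_i$.

$$
\Delta\rho(\mathbf{r}) = \rho_{\mathrm{CO}}(\mathbf{r}) - \sum_{i} \rho_i^{\mathrm{free}}(\mathbf{r} - \mathbf{R}_i)
$$

Since the free-atom densities integrate to the same number of electrons as the full density, $\Delta\rho$ integrates to zero for a neutral tip and describes only how charge is redistributed on bonding.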

@ProkopHapala
Collaborator

OK, so I did it. Now I should modify the examples to download it properly.

https://zenodo.org/records/14222456
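A sketch of how an example script could fetch its data from this record, skipping the download when the target directory already exists. The archive names (`<example>.zip`), the `/files/<name>?download=1` URL layout, and the function names are assumptions that have not been checked against the actual record contents.

```python
import tempfile
import zipfile
from pathlib import Path
from urllib.parse import quote
from urllib.request import urlretrieve

ZENODO_RECORD_ID = "14222456"  # record linked in this comment


def zenodo_file_url(filename, record_id=ZENODO_RECORD_ID):
    """Build a direct download URL (assumed layout: /records/<id>/files/<name>)."""
    return f"https://zenodo.org/records/{record_id}/files/{quote(filename)}?download=1"


def fetch_example(name, target_dir, fetch=urlretrieve):
    """Download and unpack <name>.zip into target_dir unless it already exists."""
    target_dir = Path(target_dir)
    if target_dir.exists():
        return False  # nothing to do, avoids repeated downloads
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / f"{name}.zip"  # hypothetical archive name
        fetch(zenodo_file_url(archive.name), archive)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target_dir)
    return True
```

The `fetch` parameter defaults to `urllib.request.urlretrieve` and exists only so that the logic can be exercised without network access.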

@yakutovicha
Collaborator

@ProkopHapala, I think the pyridine upload misses the tip data.

@ProkopHapala
Collaborator

@ProkopHapala, I think the pyridine upload misses the tip data.

I think that is OK, the tip data can be copied from the

CO-densities
├── CO_delta_density_aims.xsf
└── density_CO.xsf

The grids are the same; this is what we were discussing above with Niko.

But we need to modify the scripts.
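One possible shape for that script change, as a sketch: copy the shared CO tip density into the place where the pyridine example expects it. The file names follow the trees earlier in this thread (`density_CO.xsf` as the source, `tip/CHGCAR.xsf` as the target); the function name and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def provide_tip_density(
    co_tip_dir,
    example_dir,
    source_name="density_CO.xsf",
    target_name="CHGCAR.xsf",
):
    """Copy the shared CO tip density to <example_dir>/tip/<target_name>."""
    source = Path(co_tip_dir) / source_name
    target = Path(example_dir) / "tip" / target_name
    if not source.is_file():
        raise FileNotFoundError(source)
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():  # keep an existing copy untouched
        shutil.copyfile(source, target)
    return target
```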

@yakutovicha
Collaborator

I think that is OK, the tip data can be copied from the CO-densities (CO_delta_density_aims.xsf, density_CO.xsf)

The grids are the same

Thanks, @ProkopHapala, I will then update the part I am working on 👍
