Remove phys prop #164

ndaelman-hu · 2025-02-04T09:44:33Z

Make data rely solely on sections and quantities, reserving the dataframe-style (new) PhysicalProperty for results.
Regroup properties with a focus on shared normalized quantities, as well as visualization.

Closes #143 #85 #80

ndaelman-hu · 2025-02-04T10:48:49Z

Regarding re-grouping, I'm undecided on the final interface. Using electronic properties (e.g. eigenvalues, DOS, band structure), I'll illustrate the 2 options I see.

Generically, any of these electronic properties can be identified using any of the following attributes: particle kind*, semantic group**, k-point, spin. This list is not necessarily exhaustive.
*: different kinds of particle models computed by the various diagonalization routines, e.g. Kohn-Sham, self-energy exchange or correlation.
**: groupings typically used to project / decompose the electronic states into. These are quite varied, but examples include: element, ion, (atomic) orbital like Ce(4f).

Overall, we'd like to present any user with an interface that is as uniform as possible (the same idea as in the old PhysicalProperty, just executed differently). As you can see in the hierarchies below, these setups mostly follow the same ordering, though they may omit attributes if these are nonsensical:

eigenvalues: particle kind -> (semantic group?) -> k-point -> spin
DOS: semantic group -> spin
band structure: semantic group -> spin -> particle life time (k-point is on the x-axis)
Fermi surface: (semantic group?) -> band -> spin (mapped into k-space)

(note: will add example figures later on)

Hierarchies like these are easy to express as JSON key-value pairs, but they mean deeper paths that differ from property to property.
`total` would then stand out as a privileged group with a shortcut. This is most similar in setup to the current state of the schema (with the old PhysicalProperty).

{
  'band_structure': {
    'kpoints': [{...}],
    'total': [
      {
        'spin_channel': [
          {
            'spin': {...},
            'energies': [...],
            'occupations': [...]
          }
        ]
      }
    ],
    'groups': [
      {
        'group_label': {...},
        'spin_channel': [
          {
            'spin': {...},
            'energies': [...],
            'occupations': [...]
          }
        ]
      }
    ]
  }
}
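To make the "deeper and different paths" point concrete, here is a minimal sketch of walking the nested layout in plain Python. The sample values are made up, and `spin` and `group_label` are simplified to strings (the layout above shows them as sub-objects); `iter_channels` is a hypothetical helper, not part of the schema.

```python
# Hypothetical data mirroring the nested layout; all values are invented.
band_structure = {
    "kpoints": [{"label": "Gamma"}, {"label": "X"}],
    "total": [
        {
            "spin_channel": [
                {"spin": "up", "energies": [-1.0, 0.5], "occupations": [1.0, 0.0]}
            ]
        }
    ],
    "groups": [
        {
            "group_label": "Ce(4f)",
            "spin_channel": [
                {"spin": "up", "energies": [-0.8, 0.7], "occupations": [1.0, 0.0]}
            ],
        }
    ],
}


def iter_channels(bs):
    """Yield (label, spin, energies) for `total` and for every group.

    `total` and `groups` live under different keys, so each needs its own
    traversal branch even though the leaves look the same.
    """
    for entry in bs.get("total", []):
        for channel in entry["spin_channel"]:
            yield "total", channel["spin"], channel["energies"]
    for group in bs.get("groups", []):
        for channel in group["spin_channel"]:
            yield group["group_label"], channel["spin"], channel["energies"]


channels = list(iter_channels(band_structure))
```

Note that the two branches in `iter_channels` differ only in where the label comes from, which is the structural asymmetry this layout introduces.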

Alternatively, we could have a container section with repeating subsections that annotate all identifying attributes together, at the same level. This makes the structure more homogeneous. It also mimics the plots (which are just overlays) better.

It does make data traversal harder (especially without the metainfo browser to show names), as you have to scan the identifiers to know what you're dealing with. This can be mitigated if we ensure that groupings is properly sorted by the end of the normalization.

{
  'band_structure': {
    'kpoints': [{...}],
    'groupings': [
      {
        'identifiers': {...},
        'energies': [...],
        'occupations': [...]
      }
    ]
  }
}
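A minimal sketch of how the flat layout could be scanned and sorted, in plain Python. The identifier keys (`group`, `spin`), the sample values, and the helpers `sort_key` and `select` are assumptions for illustration only; the actual identifiers and the normalization order are still undecided in this thread.

```python
# Hypothetical flat layout: every entry carries all its identifiers.
groupings = [
    {"identifiers": {"group": "Ce(4f)", "spin": "up"}, "energies": [-0.8]},
    {"identifiers": {"group": "total", "spin": "down"}, "energies": [-1.1]},
    {"identifiers": {"group": "total", "spin": "up"}, "energies": [-1.0]},
    {"identifiers": {"group": "Ce(4f)", "spin": "down"}, "energies": [-0.9]},
]


def sort_key(entry):
    """Order entries with `total` first, then by group name, then by spin."""
    ids = entry["identifiers"]
    return (ids["group"] != "total", ids["group"], ids["spin"])


def select(entries, **criteria):
    """Return the entries whose identifiers match every given key/value."""
    return [
        entry
        for entry in entries
        if all(entry["identifiers"].get(k) == v for k, v in criteria.items())
    ]


# One possible "sorted by the end of normalization" state.
groupings_sorted = sorted(groupings, key=sort_key)
spin_up = select(groupings_sorted, spin="up")
```

With a fixed sort order, a consumer can rely on position as well as on the identifiers, which is the mitigation mentioned above.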

JosePizarro3 · 2025-02-04T11:03:14Z

> Regarding re-grouping, I'm undecided on the final interface. Using electronic properties (e.g. eigenvalues, DOS, band structure), I'll illustrate the 2 options I see.
>
> Generically, any of these electronic properties can be identified using any of the following attributes: particle kind*, semantic group**, k-point, spin. This list is not necessarily exhaustive. *: different kinds of particle models computed by the various diagonalization routines, e.g. Kohn-Sham, self-energy exchange or correlation. **: groupings typically used to project / decompose the electronic states into. These are quite varied, but examples include: element, ion, (atomic) orbital like Ce(4f).
>
> Overall, we'd like to present any user with an interface that is as uniform as possible (the same idea as in the old PhysicalProperty, just executed differently). As you can see in the hierarchies below, these setups mostly follow the same ordering, though they may omit attributes if these are nonsensical:
>
> eigenvalues: particle kind -> (semantic group?) -> k-point -> spin
>
> DOS: semantic group -> spin
>
> band structure: semantic group -> spin -> particle life time (k-point is on the x-axis)
>
> Fermi surface: (semantic group?) -> band -> spin (mapped into k-space)

I am not sure I understand what "particle kind" and "semantic group" mean. I think particle kind might be info stored in ModelMethod, is that so? Can you give an example?

The same goes for semantic group: are these the degrees of freedom? Does it only correspond to an orbital or plane-wave index, depending on the basis?

Furthermore, I don't follow your hierarchy for the different properties. The way I see it, purely from the perspective of the property itself, we have $E_{k \sigma m}$ for the eigenvalues, where $k$ is the k-point dof, $\sigma$ is the spin dof, and $m$ is the orbital/plane-wave dof. All the properties can then be derived from this information:

1. The DOS is the integral of that over $k$
2. The band structure is that over a specific $k$ path
3. The Fermi surface is those eigenvalues close to $E_F$
4. The spectral function (which you didn't add, though you pointed to something with the life time) is an intensity for each of the eigenvalues, $I(E_{k \sigma m})$
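The derivations listed above can be sketched in plain Python on a toy eigenvalue array. Everything here is an illustrative assumption: the numbers, the `eigenvalues[k][spin][m]` index order, the k-point weights, and the helper names. The DOS is approximated by a weighted histogram rather than a true integral.

```python
# Toy eigenvalues E[k][spin][m] in eV; all numbers are invented.
eigenvalues = [
    [[-1.2, 0.4], [-1.0, 0.6]],  # k-point 0: spin 0 bands, spin 1 bands
    [[-0.9, 0.3], [-0.7, 0.8]],  # k-point 1
]
k_weights = [0.5, 0.5]  # assumed to sum to 1
e_fermi = 0.0  # assumed Fermi energy


def dos_histogram(eigs, weights, spin, e_min, bin_width, n_bins):
    """Approximate the DOS of one spin channel as a k-weighted histogram."""
    dos = [0.0] * n_bins
    for e_k, weight in zip(eigs, weights):
        for energy in e_k[spin]:
            idx = int((energy - e_min) // bin_width)
            if 0 <= idx < n_bins:
                dos[idx] += weight / bin_width
    return dos


def band_along_path(eigs, k_path, spin, m):
    """Pick band `m` of one spin channel along a list of k-point indices."""
    return [eigs[k][spin][m] for k in k_path]


def near_fermi(eigs, e_f, window):
    """List (k, spin, m) indices of eigenvalues within `window` of e_f."""
    return [
        (k, s, m)
        for k, e_k in enumerate(eigs)
        for s, e_s in enumerate(e_k)
        for m, energy in enumerate(e_s)
        if abs(energy - e_f) <= window
    ]


dos_up = dos_histogram(eigenvalues, k_weights, spin=0, e_min=-2.0, bin_width=1.0, n_bins=3)
band = band_along_path(eigenvalues, k_path=[0, 1], spin=0, m=0)
fermi_states = near_fermi(eigenvalues, e_fermi, window=0.35)
```

In this sketch every derived property is a reduction or selection over the same three indices, which is why which indices survive differs from property to property.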

The first JSON looks good.

Now, also for @JFRudzinski: is the Variables idea of PhysicalProperty deprecated?

EBB2675 · 2025-02-04T11:23:57Z

I find it more intuitive and easier to follow when there is a structured tree where I can conceptually drill down.

ndaelman-hu · 2025-02-04T11:47:47Z

> I find it more intuitive and easier to follow when there is a structured tree where I can conceptually drill down.

Sure, but imagine now having 5 of these trees side-by-side that have mostly similar, but not identical structures.
The advantage of the 2nd approach is that the core structure remains identical, making some common normalization / plotting easier.
But I admit, both come with issues, hence why I'm collecting opinions.

ndaelman-hu · 2025-02-04T12:06:48Z

> > Regarding re-grouping, I'm undecided on the final interface. Using electronic properties (e.g. eigenvalues, DOS, band structure), I'll illustrate the 2 options I see.
> > Generically, any of these electronic properties can be identified using any of the following attributes: particle kind*, semantic group**, k-point, spin. This list is not necessarily exhaustive. *: different kinds of particle models computed by the various diagonalization routines, e.g. Kohn-Sham, self-energy exchange or correlation. **: groupings typically used to project / decompose the electronic states into. These are quite varied, but examples include: element, ion, (atomic) orbital like Ce(4f).
> > Overall, we'd like to present any user with an interface that is as uniform as possible (the same idea as in the old PhysicalProperty, just executed differently). As you can see in the hierarchies below, these setups mostly follow the same ordering, though they may omit attributes if these are nonsensical:
> >
> > eigenvalues: particle kind -> (semantic group?) -> k-point -> spin
> >
> > DOS: semantic group -> spin
> >
> > band structure: semantic group -> spin -> particle life time (k-point is on the x-axis)
> >
> > Fermi surface: (semantic group?) -> band -> spin (mapped into k-space)
>
> I am not sure I understand what "particle kind" and "semantic group" mean. I think particle kind might be info stored in ModelMethod, is that so? Can you give an example?
>
> The same goes for semantic group: are these the degrees of freedom? Does it only correspond to an orbital or plane-wave index, depending on the basis?

> Furthermore, I don't follow your hierarchy for the different properties. The way I see it, purely from the perspective of the property itself, we have $E_{k \sigma m}$ for the eigenvalues, where $k$ is the k-point dof, $\sigma$ is the spin dof, and $m$ is the orbital/plane-wave dof. All the properties can then be derived from this information:
>
> 1. The DOS is the integral of that over $k$
> 2. The band structure is that over a specific $k$ path
> 3. The Fermi surface is those eigenvalues close to $E_F$
> 4. The spectral function (which you didn't add, though you pointed to something with the life time) is an intensity for each of the eigenvalues, $I(E_{k \sigma m})$
>
> The first JSON looks good.
>
> Now, also for @JFRudzinski: is the Variables idea of PhysicalProperty deprecated?

You're making the same observations as I did:

- $E_{k \sigma m}$: there are several indices / identifiers / variables for a single property. Note that both spin and orbital (or rather, band) can be much more complex structures. Depending on the context (you know which, so I won't list them here), they may become sections in their own right.
- The relevant indices change between properties, as you (and I) showed in our listings.

The 1st approach imposes a preferential structure / order in which to run over these indices. The 2nd approach groups them all together at the same level, which helps consistency and visualization. That is their main difference.
Approach 2 is in that respect similar to the PhysicalProperty concept, but does not allow for open-ended contexts (such as potentially additional variables). Context is relevant both for normalization and visualization. Here, it is extended via inheritance.

PhysicalProperty is being reworked by Area D with feedback from us. For both technical reasons and, again, context, it will be used in results and workflow2, rather than data.

JFRudzinski · 2025-02-04T13:48:53Z

@JosePizarro3 Thank you for keeping an eye out and giving feedback.

Sorry for the delay in updating you about PhysicalProperty, I hope this MR did not come as too much of a shock. There has been a lot of movement in recent weeks and we are now trying to make quick progress on schema development and parser migration.

I just want to expand slightly on what Nathan shared: starting from your prototype, Markus made his own implementation of PhysicalProperty, a bit more in a dataframe-like style. If you are interested, we will be able to share more information in the near future, and there will also be a cafe about it.

Markus, Laurie, Hampus, and Nathan have been working to test and improve this implementation, and it is pretty close to usable now. Nathan ran many tests of the new implementation within our schema. In the end, we came to the conclusion that applying this structure to every single property in our detailed schema was both tedious and probably not practical in the long run.

Instead we focus for now on applying this new structure/tool to try to improve interoperability at a higher level, hence Nathan's mention of results and workflow2. In principle, it could also be used in data if an appropriate use case comes along. It's just that we don't implement it everywhere as default. This also helps us to make more progress on our schema immediately, which is our top priority.

I'm happy to discuss further with you, also to see if there are specific use cases you have in mind that could be useful for further testing.

Otherwise, we plan to tag you on any relevant MRs for potential input as we go through here. We appreciate your continued input!

JosePizarro3 · 2025-02-04T15:22:27Z

Sure, I just asked whether PhysicalProperty was being reworked and Variables deprecated. Whatever you and the others decide is ok for me.

So the idea is to just do whatever in data in order to push forward the schema development? And then results and workflow2.results are the ones responsible for interoperability? This sounds good to me anyway, just making sure I understand.

JFRudzinski · 2025-02-05T09:01:48Z

> Sure, I just asked whether PhysicalProperty was being reworked and Variables deprecated. Whatever you and the others decide is ok for me.
>
> So the idea is to just do whatever in data in order to push forward the schema development? And then results and workflow2.results are the ones responsible for interoperability? This sounds good to me anyway, just making sure I understand.

Of course, I want to keep you updated 👍

I wouldn't exactly say that the idea is to do "whatever" in data. We are trying to develop some standardization within the schema and also some quasi-templates for people to easily use when extending later. You will see here that Nathan kept several aspects of your PP implementation. It's just that we don't allow for that flexibility in data, at least not by default for all properties. If a use case arises, PP can also be used in data.

And yes, exactly, results and workflow2.results are responsible for interoperability. There are still some decisions to be made about exactly what is done with the results section.

JFRudzinski · 2025-02-05T09:03:02Z

From my side, the second structure that Nathan suggested has some advantages for standardization and plotting. I think this will become more clear with some concrete examples...

coveralls · 2025-02-05T12:02:53Z

Pull Request Test Coverage Report for Build 13209227390

Details

0 of 0 changed or added relevant lines in 0 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage decreased (-81.0%) to 0.0%

Totals
Change from base Build 13008515978:	-81.0%
Covered Lines:	0
Relevant Lines:	0

💛 - Coveralls

ndaelman-hu · 2025-02-07T22:51:40Z

Okay, feel free to give some preliminary feedback.
The main guiding approach is producing accompanying plots. This means grouping together the necessary data as well as the context (i.e. metadata) needed to understand / retrieve the plot. For example:

- Overlaying some kind of labelled plot several times is very common, and the base template for this is in SemanticGroup and SemanticGroupContainer.
- Most ground-state electronic structures require a common alignment, so they are grouped together. For testing purposes, I currently use this section directly under outputs. I also still have to fine-tune the normalization order there (actually, via deactivation + explicit calling, rather than levels).

The most fleshed-out example is DensityOfStates under properties/solid_state_electronics.py; you can trace the other changes back from there. There is a script for generating an example under tests/properties/visualize_electronic.py. If I upload it to a NOMAD server, I get the result below. Some corrections are still in order here (toggle legend, single plot, show on overview), but the skeleton stands.

JFRudzinski · 2025-02-08T13:10:37Z

src/nomad_simulations/schema_packages/base_sections.py

+from nomad.metainfo import Quantity, Reference
+
+
+class ModelBaseSection(ArchiveSection):


just a small note: the terminology "model" here is a bit confusing to me considering the other uses of model within our schema

JFRudzinski · 2025-02-08T13:43:47Z

I'm starting to set this up for testing but already having some import problems.

../nomad-simulations/src/nomad_simulations/schema_packages/properties/decomposable.py:1: in <module>
    from nomad.metainfo import placeholder, Quantity, SubSection, Reference

I can only find placeholder defined in the JavaScript part of nomad-FAIR. Perhaps it's due to the older mapping annotation branch. @ladinesa could you let me know when you rebase your nomad-FAIR branch? I tried myself, but the conflicts are too complicated/unknown to me. Also, any alternative insight you have into the placeholder import in general would be welcome 🙏
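One quick way to confirm which names an installed module actually exports is a small check like the sketch below. `missing_names` is a hypothetical helper written for this thread, not part of NOMAD; it only uses the standard library, and it is demonstrated on `math` so that it runs without nomad-FAIR installed.

```python
import importlib


def missing_names(module_name, names):
    """Return the names from `names` that `module_name` does not provide.

    If the module itself cannot be imported, every name counts as missing.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return list(names)
    return [name for name in names if not hasattr(module, name)]


# Demonstration on a stdlib module: `pi` exists, `placeholder` does not.
demo = missing_names("math", ["pi", "placeholder"])
```

Calling `missing_names("nomad.metainfo", ["placeholder", "Quantity", "SubSection", "Reference"])` in the failing environment would show whether only `placeholder` is absent from the installed branch.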

ndaelman added 15 commits January 24, 2025 13:50

Fix typo

74e0629

Remove spurious normalization

b11f5f9

- Deactivate EditQuantity annotations

55cc292

- Remove more spurious normalizers

Apply ruff

0e773dd

Remove annotations

f885f1e

- Remove ELNAnnotation imports

ea40f74

- Remove 2 extra editable annotations

Generate foundational base section, i.e. ModelBaseSection

f62eb98

Add new DOS interaction

95d403a

Add plotly visualization

66f234b

Remove spurious normalization

0d0fb7b

- Improve plotly visualization

5f01a65

- TODO: localize sub-sections

Set up test template generator

4430a3b

Add energy reference DOS

e4b2530

Generate container for solid state electronics

9d3d505

- Move out contribution setup to decomposable.py

05d6770

- Move out standalone property defintions to `properties/__init__.py`

ndaelman-hu added the improvement/fix Improvement or fix of a previous feature label Feb 4, 2025

ndaelman-hu self-assigned this Feb 4, 2025

ndaelman-hu marked this pull request as draft February 4, 2025 09:44

ndaelman added 2 commits February 4, 2025 14:32

Add todo to TotalForce

dd6b70a

- Define standard template

0b6f49f

- Apply standard template to `DensityOfStates` - Add naming convention to `OrbitalsState` - Add few comments + correct typos

- Add band structure

1699f6c

- Correct eigenvalues and DOS

ndaelman added 2 commits February 5, 2025 12:50

Buld out electronic eigenstates, basic solid state properties, cross-…

f47821d

…normalization, and plotting

Divide electronics between molecules and solid state

833c357

ndaelman added 7 commits February 5, 2025 17:10

- Solve normnalization bugs

ff3d388

- Update test case

Fix plotting

e9c016c

Test plotting

12c66f8

Fix OrbitalsState initialization

d76698a

- Fix dump error (missing metadata)

403a247

- Fix plotting legend

Make new definitions discoverable

e1f64ea

- Clean schema

206798f

- Remove re-use of quantities

ndaelman-hu requested a review from JFRudzinski February 7, 2025 22:40

JFRudzinski reviewed Feb 8, 2025
