
Extend parsing functionality#148

Merged
mbercx merged 3 commits into
aiidateam:mainfrom
mbercx:new/extend-parsing
May 1, 2026

Conversation

@mbercx
Member

@mbercx mbercx commented May 1, 2026

No description provided.

mbercx added 3 commits May 1, 2026 09:42
Add 14 Specs covering band eigenvalues, magnetism, run-status flags, and
HOMO/LUMO levels:

- `eigenvalues` and `occupations_kpoint` as numpy arrays of shape `(n_kpoints,
  n_spin, n_bands)`, derived from `band_structure.ks_energies`.
- Run-shape scalars from XML: `number_of_electrons`, `number_of_atoms`,
  `number_of_species`, `lsda`, `alat` (Å), `ibrav`.
- Magnetism (μB/cell): `total_magnetization`, `absolute_magnetization`.
- Run-status flags: `scf_converged` (from `convergence_info.scf_conv`) and
  `job_done` (presence of `<closed>`), each defaulting to `False` when the
  corresponding XML element is missing.
- `highest_occupied_level` / `lowest_unoccupied_level` (eV) parsed from the pw.x
  stdout. `PwStdoutParser.parse` matches both the single-value `highest occupied
  level (ev): X` line and the paired `highest occupied, lowest unoccupied levels
  (ev): X Y` line.
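
The two stdout line shapes mentioned in the last bullet can be matched roughly as follows. This is a minimal sketch, not the actual `PwStdoutParser.parse` code: the function name `parse_frontier_levels`, the regexes, and the choice to keep the last match in the file are all assumptions made for illustration.

```python
import re

# Plain or exponent-form number, e.g. 6.2345, -3.5, 1.0e-2.
_FLOAT = r"[-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?"

# Single-value line: only the highest occupied level is printed.
_SINGLE_RE = re.compile(rf"highest occupied level \(ev\):\s*({_FLOAT})")

# Paired line: highest occupied and lowest unoccupied levels together.
# "levels?" tolerates both the singular and plural spelling (assumption).
_PAIR_RE = re.compile(
    rf"highest occupied, lowest unoccupied levels? \(ev\):\s*({_FLOAT})\s+({_FLOAT})"
)


def parse_frontier_levels(stdout: str) -> dict:
    """Return the frontier levels (eV) found in a pw.x stdout string.

    Hypothetical helper: keeps the last occurrence of either line shape.
    """
    levels = {}
    pairs = _PAIR_RE.findall(stdout)
    singles = _SINGLE_RE.findall(stdout)
    if pairs:
        homo, lumo = pairs[-1]
        levels["highest_occupied_level"] = float(homo)
        levels["lowest_unoccupied_level"] = float(lumo)
    elif singles:
        levels["highest_occupied_level"] = float(singles[-1])
    return levels
```

When neither line is present the sketch returns an empty dict, leaving it to the caller to decide how a missing value should surface.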

The `bands.x` post-processor reorders the raw `pw.x` band eigenvalues into a
plottable band-structure path and (optionally) labels each `(k, band)` with its
symmetry representation. qe-tools had no parser for the `filband` family of
files, so users had to hand-roll text parsing and reshape arithmetic to lift an
`(n_kpoints, n_bands)` array out of the gnuplot-flavoured output.

Add a dedicated `BandsOutput` class plus three parsers:

- `BandsDatParser` reads the `&plot nbnd=..., nks=... /` header and reshapes the
  body into `k_points` `(n_kpoints, 3)` and `eigenvalues` `(n_kpoints, n_bands)`
  arrays (eV).
- `BandsRapParser` decodes the `*.dat.rap` symmetry-rep file when bands.x was
  run with `lsym=.true.`, exposing per-`(k, band)` representation indices and a
  `(n_kpoints,)` boolean flag for high-symmetry points.
- `BandsStdoutParser` lifts the `high-symmetry point: kx ky kz x coordinate X`
  lines out of the bands.x stdout, so plot tick labels can be placed at the
  correct cumulative path-length.
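
For illustration, the header-plus-reshape step described for `BandsDatParser` might look like the sketch below. It is not the parser's actual code: `parse_filband` is a hypothetical name, and the sketch assumes every number in the body is whitespace-separated and that each k-point contributes three coordinates followed by `nbnd` eigenvalues, possibly wrapped over several lines.

```python
import re

import numpy as np

# Matches the Fortran namelist-style header, e.g. " &plot nbnd=   4, nks=   36 /".
_HEADER_RE = re.compile(r"&plot\s+nbnd\s*=\s*(\d+)\s*,\s*nks\s*=\s*(\d+)\s*/")


def parse_filband(text: str):
    """Return (k_points, eigenvalues) arrays from the text of a filband file.

    Hypothetical helper: k_points has shape (n_kpoints, 3) and eigenvalues
    has shape (n_kpoints, n_bands).
    """
    header, _, body = text.strip().partition("\n")
    match = _HEADER_RE.search(header)
    if match is None:
        raise ValueError("missing '&plot nbnd=..., nks=... /' header")
    n_bands, n_kpoints = int(match.group(1)), int(match.group(2))

    # Line wrapping is irrelevant once the body is flattened into one token stream.
    values = np.array([float(token) for token in body.split()])
    expected = n_kpoints * (3 + n_bands)
    if values.size != expected:
        raise ValueError(f"expected {expected} numbers, found {values.size}")

    # One row per k-point: 3 coordinates, then n_bands eigenvalues.
    table = values.reshape(n_kpoints, 3 + n_bands)
    return table[:, :3], table[:, 3:]
```

Checking the token count against the header before reshaping turns a truncated or fused-column file into an explicit error instead of a silently misaligned array.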

`projwfc.x` writes projected-DOS data into one file per `(atom, wavefunction)`
pair following the `<filpdos>.pdos_atm#N(El)_wfc#M(L)[_j#J]` naming scheme, plus
a `<filpdos>.pdos_tot` summary. Decoding the filename, parsing the
column-variable header, and reshaping the per-magnetic-quantum-number columns
into a usable array is fiddly enough that callers were inevitably reinventing
it.

Add a dedicated `ProjwfcOutput` class plus two file parsers:

- `PdosAtmWfcParser` reads a single `pdos_atm#N(El)_wfc#M(L)[_j#J]` file. It
  handles the spin-unpolarised, spin-polarised LSDA, and spin-orbit cases by
  counting the `ldos`-prefixed columns and reshaping the trailing `pdos` columns
  into `(n_energies, [spin,] 2*l + 1)`.
- `PdosTotParser` reads `<filpdos>.pdos_tot` into the matching `dos_total` /
  `pdos_total` arrays.
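
The column-counting idea can be sketched as below. This is an illustration under stated assumptions rather than the `PdosAtmWfcParser` implementation: `split_pdos_columns` is a hypothetical name, the numeric table is assumed to be already loaded into a 2D array, and in the spin-polarised case the `pdos` columns are assumed to alternate up/down for each magnetic quantum number.

```python
import numpy as np


def split_pdos_columns(header: str, data: np.ndarray):
    """Split a PDOS table into (energies, ldos, pdos_m).

    Hypothetical helper. `header` is the '#'-prefixed column line and `data`
    is the (n_energies, n_columns) numeric table below it.
    """
    names = header.lstrip("#").split()
    # One ldos column means no spin axis; two means spin up / spin down.
    n_ldos = sum(name.startswith("ldos") for name in names)
    if n_ldos == 0:
        raise ValueError("no 'ldos' column found in header")

    energies = data[:, 0]
    ldos = data[:, 1 : 1 + n_ldos]
    flat = data[:, 1 + n_ldos :]
    if flat.shape[1] % n_ldos:
        raise ValueError("pdos column count is not a multiple of the ldos count")
    n_m = flat.shape[1] // n_ldos

    if n_ldos == 1:
        # Shape (n_energies, n_m): no spin axis.
        return energies, ldos[:, 0], flat

    # Assumed column order: (m1 up, m1 down, m2 up, m2 down, ...).
    # Reshape to (n_energies, n_m, spin), then move spin in front of m.
    pdos_m = flat.reshape(len(energies), n_m, n_ldos).transpose(0, 2, 1)
    return energies, ldos, pdos_m
```

Keying on the `ldos` count means the same code path serves files with and without a spin axis, at the cost of trusting the header labels.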

`ProjwfcOutput.from_dir` discovers every PDOS file in a directory via
`collect_pdos_files`, sorts the resulting records numerically by `(atom, wfc,
j)` so systems with more than nine atoms don't get scrambled by lexical order,
and skips the consumed paths when sniffing for the projwfc.x stdout. The output
exposes:

- `pdos`: a flat list of records keyed by `(atom, element, wfc, l, l_label[,
  j])` with the parsed `energies`, `ldos`, and per-magnetic-quantum-number
  `pdos_m` arrays. Filtering by element or `l` is a one-line list comprehension
  on the caller side.
- `pdos_total`: dict with `energies`, `dos_total`, `pdos_total`.
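
The numeric ordering and the caller-side filtering can be illustrated with a small standalone sketch. It does not use `collect_pdos_files` or `ProjwfcOutput`; the helper names, the regex, and the `l` lookup table are hypothetical, and the optional `j` suffix is deliberately left out of the pattern.

```python
import re

# Decodes the atm#N(El)_wfc#M(L) part of a PDOS file name (j suffix ignored).
_PDOS_NAME_RE = re.compile(r"pdos_atm#(\d+)\((\w+)\)_wfc#(\d+)\((\w)\)")

# Orbital label to angular momentum quantum number.
_L_FROM_LABEL = {"s": 0, "p": 1, "d": 2, "f": 3}


def decode_pdos_name(name: str):
    """Return a record dict for one file name, or None if it does not match."""
    match = _PDOS_NAME_RE.search(name)
    if match is None:
        return None
    atom, element, wfc, label = match.groups()
    return {
        "atom": int(atom),
        "element": element,
        "wfc": int(wfc),
        "l": _L_FROM_LABEL[label.lower()],
        "l_label": label,
    }


def sort_pdos_records(names):
    """Decode file names and sort numerically by (atom, wfc)."""
    records = [decode_pdos_name(name) for name in names]
    records = [record for record in records if record is not None]
    # Integer keys keep atom 10 after atom 2; plain string sorting would not.
    return sorted(records, key=lambda record: (record["atom"], record["wfc"]))
```

With records in this shape, selecting for example the silicon p channels is `[r for r in records if r["element"] == "Si" and r["l"] == 1]`.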
@codecov-commenter

codecov-commenter commented May 1, 2026

Codecov Report

❌ Patch coverage is 91.00719% with 25 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.08%. Comparing base (d0f8124) to head (88a86d7).
⚠️ Report is 123 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/qe_tools/outputs/projwfc.py | 72.91% | 13 Missing ⚠️ |
| src/qe_tools/outputs/parsers/bands.py | 91.30% | 4 Missing ⚠️ |
| src/qe_tools/outputs/bands.py | 95.16% | 3 Missing ⚠️ |
| src/qe_tools/outputs/parsers/projwfc.py | 95.71% | 3 Missing ⚠️ |
| src/qe_tools/outputs/parsers/pw.py | 85.71% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #148      +/-   ##
==========================================
- Coverage   89.77%   88.08%   -1.70%     
==========================================
  Files          11       23      +12     
  Lines         489     1032     +543     
==========================================
+ Hits          439      909     +470     
- Misses         50      123      +73     


@mbercx mbercx merged commit 4e4885f into aiidateam:main May 1, 2026
5 checks passed

2 participants