Skip to content

Remove LFSeis#347

Merged
lispandfound merged 26 commits intomasterfrom
lfseis_rewrite
Mar 28, 2025
Merged

Remove LFSeis#347
lispandfound merged 26 commits intomasterfrom
lfseis_rewrite

Conversation

@lispandfound
Copy link
Contributor

The LFSeis class has a number of problems, and it is not reliable enough to make debugging complicated EMOD3D outputs simple. Moreover, it makes selecting slices of LF outputs difficult, and is hard for researchers to use. This PR proposes removing the LFSeis class in favour of a new xarray backed function.

Examples

Compare the output of the new method

>>> from qcore import timeseries
>>>
>>> lf_seis_xarray = timeseries.read_lfseis_directory('/path/to/OutBin')
>>> lf_seis_xarray
<xarray.Dataset> Size: 15MB
Dimensions:    (station: 102, time: 5925, component: 3)
Coordinates:
  * station    (station) <U4 2kB '042A' 'HSES' 'WTMC' ... 'LNSD' 'TOKS' 'AKCZ'
  * component  (component) <U1 12B 'x' 'y' 'z'
    x          (station) int32 408B 61 61 165 304 304 ... 344 316 363 511 569
    y          (station) int32 408B 84 84 66 92 91 168 ... 647 708 658 650 638
    lat        (station) float32 408B -42.52 -42.52 -42.62 ... -43.82 -43.87
    lon        (station) float32 408B 172.8 172.8 173.1 ... 172.5 172.8 172.9
  * time       (time) float64 47kB 0.0 0.01 0.02 0.03 ... 59.22 59.23 59.24
Data variables:
    waveforms  (station, time, component) float64 15MB -7.699e-21 ... -0.0003152
Attributes:
    dt:          0.01
    resolution:  0.2
    rotation:    40.06291
    units:       m/s

to the old method

>>> from qcore import timeseries
>>>
>>> lf_seis_old = timeseries.LFSeis('/path/to/OutBin')
>>> lf_seis_old
<LFSeis object at 0x7fad923a86e0>

and you'll immediately appreciate the improved UX provided by the xarray method. Selecting the velocity waveforms for a single station is roughly the same as before

>>> lf_seis_xarray.sel(station='WTMC')
<xarray.Dataset> Size: 190kB
Dimensions:    (time: 5925, component: 3)
Coordinates:
    station    <U4 16B 'WTMC'
  * component  (component) <U1 12B 'x' 'y' 'z'
    x          int32 4B 165
    y          int32 4B 66
    lat        float32 4B -42.62
    lon        float32 4B 173.1
  * time       (time) float64 47kB 0.0 0.01 0.02 0.03 ... 59.22 59.23 59.24
Data variables:
    waveforms  (time, component) float64 142kB 5.866e-21 1.391e-23 ... 0.0002672
Attributes:
    dt:          0.01
    resolution:  0.2
    rotation:    40.06291
    units:       m/s
>>> # The Old Approach
>>> lf_seis_old.vel('WTMC')

Howeever, xarray datasets can select based on far more sophisticated criteria

>>> # Select two stations at once
>>> lf_seis_xarray.sel(station=['A', 'B'])
>>> # Select all stations except 'A'
>>> lf_seis_xarray.sel(station=lf_seis_xarray.station != 'A')
>>> # Select all stations within with latitude between -42.6 and -42
>>> lf_seis_xarray.where((-42.6 <= lf_seis_xarray.lat) & (lf_seis_xarray.lat <= -42), drop=True)
>>> # Select all the t=0.505s waveform values all stations, using the nearest value to 0.5s if it does not exist
>>> lf_seis_xarray.sel(time=0.505, method='nearest')
<xarray.Dataset> Size: 6kB
Dimensions:    (station: 102, component: 3)
Coordinates:
  * station    (station) <U4 2kB '042A' 'HSES' 'WTMC' ... 'LNSD' 'TOKS' 'AKCZ'
  * component  (component) <U1 12B 'x' 'y' 'z'
    x          (station) int32 408B 61 61 165 304 304 ... 344 316 363 511 569
    y          (station) int32 408B 84 84 66 92 91 168 ... 647 708 658 650 638
    lat        (station) float32 408B -42.52 -42.52 -42.62 ... -43.82 -43.87
    lon        (station) float32 408B 172.8 172.8 173.1 ... 172.5 172.8 172.9
    time       float64 8B 0.51
Data variables:
    waveforms  (station, component) float64 2kB -5.269e-21 ... 2.002e-21
Attributes:
    dt:          0.01
    resolution:  0.2
    rotation:    40.06291
    units:       m/s

None of those examples are supported by the old code, and would've required between some and a significant amount of code to do.

As a bonus, we get a merge_ts-like workflow stage that can save LF seismograph outputs as HDF5 datasets.

>>> lf_seis_xarray.to_netcdf('lf_seis_xarray.h5', engine='h5netcdf')

Testing

There is a regression test in qcore/tests/test_timeseries/test_timeseries.py. It compares the old waveform outputs of LFSeis to the ones loaded by the new code.

@sungeunbae
Copy link
Member

If this is merged into the master, BB calculation (and LF2BB - need to retain to isolate HF effect) from the old workflow will break. We could (Option 1) deprecate LFSeis in stages letting it coexist with the new code with clear comment it is only for old workflow. or (Option 2) fix relevant bits in the old workflow, such as.
https://github.com/ucgmsim/slurm_gm_workflow/blob/25ad9fc930ef15a68c6df1e1a8b4b3e806f52865/workflow/calculation/bb_sim.py#L142

@lispandfound
Copy link
Contributor Author

lispandfound commented Mar 13, 2025

@sungeunbae I like the deprecation approach but it needs a timeline. Open-ended deprecation is just asking to leave old code lying around forever.

@lispandfound
Copy link
Contributor Author

Based on the discussion with @sungeunbae we will:

  1. Add a deprecation warning to LF/BB/HFSeis and,
  2. Create an issue on the old workflow to use the new interface: LFSeis is deprecated slurm_gm_workflow#536.

The deprecation of HF/BBSeis is a bit more open ended because they need to stay in the module to read old data. But once that old data is considered irrelevant or scientifically useless these should also be removed.

Copy link
Member

@sungeunbae sungeunbae left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we deprecate BBSeis, it means all the archived BB waveforms are unreadable. I think we should create a conversion script to support old format.

@lispandfound
Copy link
Contributor Author

@sungeunbae I agree. This would be a good candidate for a script in slurm_gm_workflow. I don't think it should be a addressed here.

@sungeunbae sungeunbae self-requested a review March 19, 2025 04:54
@lispandfound lispandfound merged commit 4cfb78a into master Mar 28, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants