IM Calculation
This step computes Intensity Measures (IMs) for seismic waveform records processed through the NZGMDB pipeline. The implementation calculates various ground motion metrics including peak ground acceleration, spectral response, and frequency-domain characteristics for observed waveforms.
To calculate intensity measures for all processed waveform data, run the following Python script:
python -m nzgmdb.scripts.run_nzgmdb run-im-calculation <main_dir> <ko_directory>
- <main_dir> is the top-level output directory where NZGMDB stores its results
- <ko_directory> is the path to the directory containing Konno-Ohmachi smoothing matrices
Example:
python -m nzgmdb.scripts.run_nzgmdb run-im-calculation nzgmdb_output/ /path/to/ko_matrices/
Optional parameters include:
- --output-dir: Custom directory for IM output files (default: main_dir/IM/)
- --n-procs: Number of processes for parallel processing (default: 1)
- --checkpoint: Skip already completed files for incremental processing
- --intensity-measures: Comma-separated list of specific IMs to calculate; defaults to all IMs configured in config.yaml
The IM Calculation step requires the following inputs from previous pipeline steps:
- Process Records (generates .000, .090, and .ver processed waveform files)
The step requires processed waveform text files (.000, .090, .ver) in the waveforms/ directory.
The system reads processed ASCII waveform files for each record:
File Structure Expected:
- .000 file: North-South (000°) horizontal component
- .090 file: East-West (090°) horizontal component
- .ver file: Vertical component
Quality Control:
- Records are skipped if any component file is missing
- All three components must be available for successful IM calculation
- Sampling interval (dt) and waveform arrays are extracted from the ASCII files
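The file check described above can be sketched as follows. This is a minimal illustration, not the pipeline's own code: the function names and the assumption that the three files sit side by side as `<record_id>.000`, `<record_id>.090` and `<record_id>.ver` are hypothetical.

```python
from pathlib import Path
from typing import Dict, Optional

COMPONENT_SUFFIXES = (".000", ".090", ".ver")


def find_component_files(waveform_dir: Path, record_id: str) -> Dict[str, Path]:
    """Return {suffix: path} for the component files that exist on disk.

    Assumption: files are named <record_id><suffix> directly inside waveform_dir.
    """
    found = {}
    for suffix in COMPONENT_SUFFIXES:
        path = Path(waveform_dir) / f"{record_id}{suffix}"
        if path.is_file():
            found[suffix] = path
    return found


def skip_reason(waveform_dir: Path, record_id: str) -> Optional[str]:
    """Return a human-readable reason if the record must be skipped, else None."""
    found = find_component_files(waveform_dir, record_id)
    missing = [s for s in COMPONENT_SUFFIXES if s not in found]
    if missing:
        return "Missing component files: " + ", ".join(missing)
    return None
```

A record is only passed on to IM calculation when `skip_reason` returns `None`; otherwise the returned string could be stored as the reason for skipping it.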
The pipeline calculates the following intensity measures based on configuration settings:
- PGA - Peak Ground Acceleration
- PGV - Peak Ground Velocity
- CAV - Cumulative Absolute Velocity
- CAV5 - Cumulative Absolute Velocity (threshold: 5 cm/s²)
- AI - Arias Intensity
- Ds575 - Duration containing 5-75% of Arias Intensity
- Ds595 - Duration containing 5-95% of Arias Intensity
- pSA - Pseudo-Spectral Acceleration (111 periods)
- FAS - Fourier Amplitude Spectrum (389 frequencies)
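To make the time-domain measures concrete, here is a small NumPy sketch using their textbook definitions. It is not the NZGMDB implementation: the function names are hypothetical, acceleration is assumed to be in m/s², velocity is obtained by plain trapezoidal integration with no baseline correction, and CAV5, pSA and FAS are left out.

```python
import numpy as np

G = 9.80665  # standard gravity in m/s^2


def running_integral(y, dt):
    """Cumulative trapezoidal integral of y, starting at zero."""
    out = np.zeros(len(y))
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1]) * dt)
    return out


def time_domain_ims(acc, dt):
    """Compute PGA, PGV, CAV, AI, Ds575 and Ds595 for one component.

    Assumptions: acc is in m/s^2, dt in seconds, and acc is not all zeros.
    """
    acc = np.asarray(acc, dtype=float)
    t = np.arange(len(acc)) * dt
    vel = running_integral(acc, dt)
    cav_curve = running_integral(np.abs(acc), dt)
    ai_curve = np.pi / (2.0 * G) * running_integral(acc**2, dt)
    husid = ai_curve / ai_curve[-1]  # normalised Arias build-up, 0 to 1

    def significant_duration(lo, hi):
        # Time between the first samples reaching the lo and hi fractions
        return float(t[np.searchsorted(husid, hi)] - t[np.searchsorted(husid, lo)])

    return {
        "PGA": float(np.max(np.abs(acc))),
        "PGV": float(np.max(np.abs(vel))),
        "CAV": float(cav_curve[-1]),
        "AI": float(ai_curve[-1]),
        "Ds575": significant_duration(0.05, 0.75),
        "Ds595": significant_duration(0.05, 0.95),
    }
```

The units of the values stored by the pipeline may differ from those assumed here, so treat the sketch as a guide to the definitions only.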
The pSA calculations use 111 standardised periods (in seconds):
0.010, 0.020, 0.022, 0.025, 0.029, 0.030, 0.032, 0.035, 0.036, 0.040,
0.042, 0.044, 0.045, 0.046, 0.048, 0.050, 0.055, 0.060, 0.065, 0.067,
0.070, 0.075, 0.080, 0.085, 0.090, 0.095, 0.100, 0.110, 0.120, 0.130,
0.133, 0.140, 0.150, 0.160, 0.170, 0.180, 0.190, 0.200, 0.220, 0.240,
0.250, 0.260, 0.280, 0.290, 0.300, 0.320, 0.340, 0.350, 0.360, 0.380,
0.400, 0.420, 0.440, 0.450, 0.460, 0.480, 0.500, 0.550, 0.600, 0.650,
0.667, 0.700, 0.750, 0.800, 0.850, 0.900, 0.950, 1.000, 1.100, 1.200,
1.300, 1.400, 1.500, 1.600, 1.700, 1.800, 1.900, 2.000, 2.200, 2.400,
2.500, 2.600, 2.800, 3.000, 3.200, 3.400, 3.500, 3.600, 3.800, 4.000,
4.200, 4.400, 4.600, 4.800, 5.000, 5.500, 6.000, 6.500, 7.000, 7.500,
8.000, 8.500, 9.000, 9.500, 10.000, 11.000, 12.000, 13.000, 14.000,
15.000, 20.000
Fourier Amplitude Spectrum calculations use 389 logarithmically spaced frequencies:
- Range: 0.01318257 to 100 Hz
- Distribution: Logarithmic spacing for optimal frequency resolution
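A grid with these properties can be reproduced with NumPy as below. This is a sketch: the constant names are made up for illustration, and whether the pipeline builds its grid with this exact call is an assumption.

```python
import numpy as np

FAS_START_HZ = 0.01318257
FAS_END_HZ = 100.0
FAS_NUM = 389

# Logarithmically spaced frequencies, endpoints included
fas_frequencies = np.geomspace(FAS_START_HZ, FAS_END_HZ, FAS_NUM)

# Spacing between neighbouring frequencies, expressed in decades (log10 units)
step_in_decades = np.diff(np.log10(fas_frequencies))
```

With these endpoints the spacing works out to about 0.01 decades, that is, roughly 100 frequencies per decade over the 3.88 decades from 0.01318257 Hz (about 10^-1.88) to 100 Hz.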
IMs are calculated for multiple component orientations:
| Component | Description |
|---|---|
| 000 | North-South horizontal |
| 090 | East-West horizontal |
| ver | Vertical |
| geom | Geometric mean horizontal |
| rotd0 | Minimum rotated horizontal |
| rotd50 | Median rotated horizontal |
| rotd100 | Maximum rotated horizontal |
Note: For FAS, an additional EAS (Effective Amplitude Spectrum) component is also computed.
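The derived horizontal components can be illustrated for a peak measure such as PGA. The sketch below uses the usual definitions (geometric mean of the two horizontal values; RotD as the minimum, median and maximum of the peak over rotation angles). The function names and the 1° angle step are assumptions, and for pSA the same rotation would be applied to the oscillator response rather than the raw acceleration.

```python
import numpy as np


def geometric_mean(im_000, im_090):
    """Geometric mean of the IM values from the two horizontal components."""
    return float(np.sqrt(im_000 * im_090))


def rotd_pga(acc_000, acc_090, angles_deg=None):
    """Return rotd0, rotd50 and rotd100 of peak acceleration.

    Assumption: angles of 0..179 degrees in 1-degree steps (180 rotations).
    """
    if angles_deg is None:
        angles_deg = np.arange(180)
    acc_000 = np.asarray(acc_000, dtype=float)
    acc_090 = np.asarray(acc_090, dtype=float)
    theta = np.deg2rad(angles_deg)[:, None]
    # One rotated time series per angle: shape (n_angles, n_samples)
    rotated = acc_000 * np.cos(theta) + acc_090 * np.sin(theta)
    peaks = np.max(np.abs(rotated), axis=1)
    return {
        "rotd0": float(peaks.min()),
        "rotd50": float(np.median(peaks)),
        "rotd100": float(peaks.max()),
    }
```

For motion polarised purely along the 000 direction with unit peak, this gives rotd100 of 1, rotd0 of essentially 0, and rotd50 of about 0.707.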
The step utilises multiprocessing for computational efficiency:
- Distributes IM calculations across available CPU cores
- Processes multiple records simultaneously
- Optimised memory usage for large waveform datasets
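The general pattern of spreading records over worker processes looks roughly like this. It is a generic sketch using the standard library: `compute_ims_for_record` is a hypothetical placeholder for the real per-record calculation, and `run_all` is not a function of the NZGMDB package.

```python
from multiprocessing import Pool


def compute_ims_for_record(record_id):
    """Placeholder worker standing in for the real per-record IM calculation."""
    return record_id, "ok"


def run_all(record_ids, n_procs=1):
    """Process every record, serially or across n_procs worker processes."""
    if n_procs <= 1:
        return [compute_ims_for_record(r) for r in record_ids]
    with Pool(processes=n_procs) as pool:
        # map() preserves the input order of record_ids in the results
        return pool.map(compute_ims_for_record, record_ids)
```

When called from a script, `run_all(record_ids, n_procs=4)` should sit under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.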
Primary output consists of CSV files stored in the directory structure:
IM/
└── event_id/
    └── evid_station_channel_location_IM.csv
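Given this layout, the output path for a record, and a checkpoint-style filter that skips records whose file already exists, could be sketched as below. Both helper names are hypothetical, and taking the event ID as the first underscore-separated token of the record ID is an assumption.

```python
from pathlib import Path


def im_output_path(im_dir, record_id):
    """Build IM/<event_id>/<record_id>_IM.csv for one record.

    Assumption: the event ID is the first token of the record ID and
    contains no underscores itself.
    """
    evid = record_id.split("_")[0]
    return Path(im_dir) / evid / f"{record_id}_IM.csv"


def pending_records(im_dir, record_ids):
    """Keep only the records that do not yet have an output CSV."""
    return [r for r in record_ids if not im_output_path(im_dir, r).is_file()]
```

The record IDs passed in are expected to follow the evid_station_channel_location pattern shown in the tree above.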
Each CSV file contains all calculated IMs with the following structure:
File Format: One row per component (8 rows total), with 509 columns containing all IM data.
| Column Type | Example Columns | Description |
|---|---|---|
| Identifiers | record_id, component | Record identifier and component type |
| Time-Domain IMs | PGA, PGV, CAV, CAV5, AI, Ds575, Ds595 | Peak and duration-based measures |
| Spectral Acceleration | pSA_0.01, pSA_0.1, pSA_1.0, pSA_20.0 | Response spectra at 111 periods |
| Fourier Spectra | FAS_0.013182570000000001, FAS_1.0, FAS_100.0 | Amplitude spectra at 389 frequencies |
Component Rows Structure:
- Rows 1-8: each row represents a different component (000, 090, ver, geom, rotd0, rotd50, rotd100, EAS)
- All IM values for that component are stored in the same row
- Period-specific values follow the naming pSA_{period} (e.g., pSA_0.1 for a 0.1-second period)
- Frequency-specific values follow the naming FAS_{frequency} (e.g., FAS_1.0 for 1.0 Hz)
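The naming convention and the column count can be checked with a few lines of Python. The helper names are hypothetical; the sketch assumes the column names are produced by Python's default float formatting, which would be consistent with names such as pSA_20.0 and FAS_100.0.

```python
TIME_DOMAIN_IMS = ["PGA", "PGV", "CAV", "CAV5", "AI", "Ds575", "Ds595"]
N_PSA_PERIODS = 111
N_FAS_FREQUENCIES = 389


def psa_column(period_s: float) -> str:
    """Column name for pSA at a given period in seconds."""
    return f"pSA_{period_s}"


def fas_column(freq_hz: float) -> str:
    """Column name for FAS at a given frequency in Hz."""
    return f"FAS_{freq_hz}"


# 2 identifier columns (record_id, component) plus all IM columns
n_columns = 2 + len(TIME_DOMAIN_IMS) + N_PSA_PERIODS + N_FAS_FREQUENCIES
```

The total comes to 2 + 7 + 111 + 389 = 509, matching the column count stated for each file.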
| Intensity Measure | 000 | 090 | ver | geom | rotd0 | rotd50 | rotd100 | EAS |
|---|---|---|---|---|---|---|---|---|
| PGA | β | β | β | β | β | β | β | |
| PGV | β | β | β | β | β | β | β | |
| CAV | β | β | β | β | ||||
| CAV5 | β | β | β | β | ||||
| AI | β | β | β | β | ||||
| Ds575 | β | β | β | β | ||||
| Ds595 | β | β | β | β | ||||
| pSA | β | β | β | β | β | β | β | |
| FAS | β | β | β | β |
Additional output files are generated in the flatfiles/ directory:
- IM_calc_skipped_records.csv: Contains records that could not be processed due to missing component files
| Column | Description |
|---|---|
| record_id | Unique identifier (evid_station_channel_location) |
| reason | Detailed explanation of processing failure |
Key configuration values from config.yaml:
| Parameter | Default Value | Description |
|---|---|---|
| ims | See list above | Intensity measures to calculate |
| psa_periods | 111 periods | Spectral acceleration periods (seconds) |
| common_frequency_start | 0.01318257 | FAS start frequency (Hz) |
| common_frequency_end | 100 | FAS end frequency (Hz) |
| common_frequency_num | 389 | Number of FAS frequency points |
| ko_bandwidth | 40 | Konno-Ohmachi smoothing bandwidth |
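For readers unfamiliar with the ko_bandwidth parameter, the sketch below shows the standard Konno-Ohmachi window, W = [sin(b·log10(f/fc)) / (b·log10(f/fc))]^4, and how it smooths a spectrum. It computes the weights on the fly for illustration; the pipeline reads precomputed matrices from the Konno-Ohmachi directory instead, and the function names here are hypothetical.

```python
import numpy as np


def konno_ohmachi_weights(freqs, centre_freq, bandwidth=40.0):
    """Konno-Ohmachi window evaluated at freqs for one centre frequency.

    All frequencies must be positive (the 0 Hz bin has to be excluded).
    """
    freqs = np.asarray(freqs, dtype=float)
    x = bandwidth * np.log10(freqs / centre_freq)
    # np.sinc(z) = sin(pi z) / (pi z), so sinc(x / pi) = sin(x) / x,
    # with the x = 0 limit (weight 1 at the centre frequency) handled for us
    return np.sinc(x / np.pi) ** 4


def smooth_spectrum(freqs, amplitudes, centre_freqs, bandwidth=40.0):
    """Weighted average of amplitudes around each centre frequency."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    smoothed = np.empty(len(centre_freqs))
    for i, fc in enumerate(centre_freqs):
        w = konno_ohmachi_weights(freqs, fc, bandwidth)
        smoothed[i] = np.sum(w * amplitudes) / np.sum(w)
    return smoothed
```

A larger bandwidth gives a narrower window and therefore less smoothing. The window is symmetric on a logarithmic frequency axis, which is why it pairs naturally with logarithmically spaced output frequencies.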
For earthquake records from the NZGMDB:
- Uses processed ASCII waveform files (.000, .090, .ver)
- Applies rotational processing for RotD components
- Implements Konno-Ohmachi smoothing for FAS calculations
- Handles missing components gracefully with detailed error logging
- File Validation: Ensures all three component files exist before processing
- Error Handling: Logging of processing failures
- Checkpoint Support: Allows resumption of interrupted calculations
After individual IM calculations are complete, the results must be consolidated into a single comprehensive dataset. This subsequent step merges all individual IM CSV files and applies quality filtering.
To merge all calculated IM results, run:
python -m nzgmdb.scripts.run_nzgmdb merge-im-results <im_dir> <output_dir>
- <im_dir> is the directory containing individual IM CSV files (typically main_dir/IM/)
- <output_dir> is the output directory for merged results (typically main_dir/flatfiles/)
Example:
python -m nzgmdb.scripts.run_nzgmdb merge-im-results nzgmdb_output/IM/ nzgmdb_output/flatfiles/
Optional parameters:
- --gmc-ffp: Path to GMC predictions file for quality score integration
- --fmax-ffp: Path to Fmax results file for frequency limit integration
The merge operation performs the following consolidation steps:
1. IM File Aggregation
- Recursively finds all *_IM.csv files in the IM directory
- Concatenates all individual record IM files into a single DataFrame
- Preserves all calculated intensity measures and component orientations
2. Metadata Integration
- GMC Data: Merges quality scores, minimum frequencies (fmin), and multi-component scores
- Fmax Data: Integrates maximum usable frequencies for each component
- Record Parsing: Extracts event ID, station, channel, and location from record identifiers
3. Column Organisation
Structures the final dataset with logical column ordering:
record_id | evid | sta | loc | chan | component |
PGA | PGV | CAV | CAV5 | AI | Ds575 | Ds595 |
score_mean_X | fmin_mean_X | fmax_mean_X | multi_mean_X |
score_mean_Y | fmin_mean_Y | fmax_mean_Y | multi_mean_Y |
score_mean_Z | fmin_mean_Z | fmax_mean_Z | multi_mean_Z |
[pSA_periods...] | [FAS_frequencies...]
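The aggregation and record-parsing steps can be sketched with the standard library alone. This deliberately swaps the DataFrame for plain dictionaries to stay dependency-free, so it is not how the pipeline does it. The function names are hypothetical, and the sketch assumes each record_id has exactly four underscore-separated parts in the order evid, station, channel, location.

```python
import csv
from pathlib import Path


def parse_record_id(record_id):
    """Split evid_station_channel_location into its four fields."""
    evid, sta, chan, loc = record_id.split("_")
    return {"evid": evid, "sta": sta, "chan": chan, "loc": loc}


def merge_im_files(im_dir):
    """Read every *_IM.csv under im_dir and return one list of row dicts."""
    rows = []
    for csv_path in sorted(Path(im_dir).rglob("*_IM.csv")):
        with open(csv_path, newline="") as f:
            for row in csv.DictReader(f):
                # Add evid, sta, chan and loc alongside the IM columns
                rows.append({**row, **parse_record_id(row["record_id"])})
    return rows
```

Joining the GMC and Fmax data and reordering the columns would follow as further steps on the merged rows.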
ground_motion_im_catalogue.csv
The consolidated catalogue contains:
| Field Group | Description | Example Columns |
|---|---|---|
| Identifiers | Record and location info | record_id, evid, sta, chan, loc |
| Components | Motion component type | component (000, 090, ver, rotd50, etc.) |
| Time-Domain IMs | Peak and duration metrics | PGA, PGV, CAV, AI, Ds575, Ds595 |
| Quality Metrics | GMC and frequency data | score_mean_X, fmin_mean_X, fmax_mean_X |
| Spectral IMs | Period-dependent response | pSA_0.010, pSA_0.050, pSA_1.000 |
| Frequency IMs | Fourier amplitude data | FAS_0.100, FAS_1.000, FAS_10.000 |
- Single Source: Consolidates thousands of individual files into one analysable dataset
- Metadata Enrichment: Combines IM data with quality metrics and processing parameters
- Previous: Process Records - Generates the processed waveform files required for IM calculation
- Next: Calculate Distances - Computes rupture distance metrics for the merged IM catalogue
- Related: Merge Flatfiles - Further processes the IM catalogue into component-specific flatfiles