From 71a0bd3cf6584ad23848846755f936ed6135e7eb Mon Sep 17 00:00:00 2001 From: "Documenter.jl" Date: Mon, 17 Jul 2023 18:49:36 +0000 Subject: [PATCH] build based on d67a747 --- previews/PR76/api/index.html | 71 +++++++++++++++++++ previews/PR76/convert-to-onda/index.html | 88 ++++++++++++++++++++++++ previews/PR76/index.html | 71 +------------------ previews/PR76/search/index.html | 2 +- previews/PR76/search_index.js | 2 +- 5 files changed, 162 insertions(+), 72 deletions(-) create mode 100644 previews/PR76/api/index.html create mode 100644 previews/PR76/convert-to-onda/index.html diff --git a/previews/PR76/api/index.html b/previews/PR76/api/index.html new file mode 100644 index 0000000..338e767 --- /dev/null +++ b/previews/PR76/api/index.html @@ -0,0 +1,71 @@ + +API Documentation · OndaEDF

API Documentation

Import EDF to Onda

OndaEDF.jl prefers "self-service" import over "automagic", and provides functionality to extract Onda.Samples and EDFAnnotationV1s (which extend Onda.AnnotationV1s) from an EDF.File. These can be written to disk (with Onda.store / Legolas.write) or manipulated in memory as desired.

Import signal data as Samples

OndaEDF.edf_to_onda_samples - Function
edf_to_onda_samples(edf::EDF.File, plan_table; validate=true, dither_storage=missing)

Convert Signals found in an EDF File to Onda.Samples according to the plan specified in plan_table (e.g., as generated by plan_edf_to_onda_samples), returning an iterable of the generated Onda.Samples and the plan as actually executed.

The input plan is transformed by using merge_samples_info to combine rows with the same :onda_signal_index into a common Onda.SamplesInfo. Then OndaEDF.onda_samples_from_edf_signals is used to combine the EDF signals data into a single Onda.Samples per group.

The labels of the original EDF.Signals are preserved in the :edf_channels field of the resulting SamplesInfo for each Samples generated.

Any errors that occur are shown as Strings (with backtrace) and inserted into the :error column for the corresponding rows from the plan.

Samples are returned in the order of :onda_signal_index. Signals that could not be matched or otherwise caused an error during execution are not returned.

If validate=true (the default), the plan is validated against the FilePlanV2 schema and against the signal headers in the EDF.File.

If dither_storage=missing (the default), dither storage is allocated automatically as specified in the docstring for Onda.encode. dither_storage=nothing disables dithering.

edf_to_onda_samples(edf::EDF.File; kwargs...)

Read signals from an EDF.File into a vector of Onda.Samples. This is a convenience function that first formulates an import plan via plan_edf_to_onda_samples, and then immediately executes this plan with edf_to_onda_samples.

The samples and executed plan are returned; it is strongly advised that you review the plan for un-extracted signals (where :sensor_type or :channel is missing) and errors (non-nothing values in :error).

Collections of EDF.Signals are mapped as channels to Onda.Samples via plan_edf_to_onda_samples. The caller of this function can control the plan via the labels and units keyword arguments, all of which are forwarded to plan_edf_to_onda_samples.

EDF.Signal labels that are converted into Onda channel names undergo the following transformations:

  • the label is whitespace-stripped, parens-stripped, and lowercased
  • trailing generic EDF references (e.g. "ref", "ref2", etc.) are dropped
  • any instance of + is replaced with _plus_ and / with _over_
  • all component names are converted to their "canonical names" when possible (e.g. "m1" in an EEG-matched channel name will be converted to "a1").

See the OndaEDF README for additional details regarding EDF formatting expectations.
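As a rough illustration of the first three transformations listed above, here is a minimal sketch in plain Julia. The function name normalize_label_sketch and its regular expressions are assumptions made for this example only; OndaEDF's own logic lives in match_edf_label and additionally strips signal-name prefixes and applies canonical names, both of which this sketch leaves out.

```julia
# Hypothetical helper for illustration; not part of OndaEDF's API.
function normalize_label_sketch(label::AbstractString)
    s = lowercase(strip(label))                 # whitespace-strip and lowercase
    s = replace(s, r"[()]" => "")               # drop parentheses
    s = replace(s, r"[\s\-]+ref\d*$" => "")     # drop a trailing generic reference ("ref", "ref2", ...)
    s = replace(s, "+" => "_plus_")             # + becomes _plus_
    s = replace(s, "/" => "_over_")             # / becomes _over_
    return String(strip(s))
end

normalized = map(normalize_label_sketch, [" AVL-Ref ", "C3/M1 (Ref2)", "F3+F4"])
```

Labels that do not end in a separator followed by "ref" are left untouched by the reference-dropping step, so a label such as "pref" is not shortened by accident.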

OndaEDF.plan_edf_to_onda_samples - Function
plan_edf_to_onda_samples(header, seconds_per_record; labels=STANDARD_LABELS,
                         units=STANDARD_UNITS)
plan_edf_to_onda_samples(signal::EDF.Signal, args...; kwargs...)

Formulate a plan for converting an EDF signal into Onda format. This returns a Tables.jl row with all the columns from the signal header, plus additional columns for the Onda.SamplesInfo for this signal, and the seconds_per_record that is passed in here.

If no labels match, then the channel and sensor_type columns are missing. The behavior of the other SamplesInfo columns is undefined: they are currently set to missing, but that may change in future versions.

Any errors that are thrown in the process will be wrapped as SampleInfoErrors and then printed with backtrace to a String in the error column.

Matching EDF label to Onda labels

The labels keyword argument determines how Onda channel and signal kind are extracted from the EDF label.

Labels are specified as an iterable of signal_names => channel_names pairs. signal_names should be an iterable of signal names, the first of which is the canonical name used as the Onda kind. Each element of channel_names gives the specification for one channel, which can either be a string, or a canonical_name => alternates pair. Occurrences of alternates will be replaced with canonical_name in the generated channel label.

Matching is determined solely by the channel names. When matching, the signal names are only used to remove signal names occurring as prefixes (e.g., "[ECG] AVL") before matching channel names. See match_edf_label for details, and see OndaEDF.STANDARD_LABELS for the default labels.

As an example, here is (a subset of) the default labels for ECG signals:

["ecg", "ekg"] => ["i" => ["1"], "ii" => ["2"], "iii" => ["3"],
                   "avl" => ["ecgl", "ekgl", "ecg", "ekg", "l"],
                   "avr" => ["ekgr", "ecgr", "r"], ...]

Matching is done in the order in which labels iterates its pairs and stops at the first match, with no warning if signals are ambiguous (although this may change in a future version).
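To make the first-match behavior concrete, here is a small sketch in plain Julia. The function first_match_sketch and the toy_labels table are hypothetical, and the sketch compares already-normalized labels by exact string equality rather than using OndaEDF's real matching, so treat it as an illustration of the iteration order only.

```julia
# Toy specifications in the signal_names => channel_names shape described above.
# The alternate "l" appears under both "ecg" and "emg" to create an ambiguity.
toy_labels = [["ecg", "ekg"] => ["ii" => ["2"], "avl" => ["ecgl", "l"]],
              ["emg"] => ["leg" => ["l", "legs"]]]

# Hypothetical helper; not OndaEDF's implementation.
function first_match_sketch(label, labels)
    for (signal_names, channel_specs) in labels
        for spec in channel_specs
            # a spec is either a plain string or a canonical_name => alternates pair
            canonical, alternates = spec isa Pair ? (first(spec), last(spec)) : (spec, String[])
            if label == canonical || label in alternates
                # stop at the first match; later specifications are never consulted
                return (; sensor_type=first(signal_names), channel=canonical)
            end
        end
    end
    return nothing
end

ambiguous = first_match_sketch("l", toy_labels)
unambiguous = first_match_sketch("legs", toy_labels)
```

Because the "ecg" entry comes first, the ambiguous label "l" resolves to the ECG channel "avl" and the "emg" entry is never reached, which is why the position of a custom specification in the collection matters.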

plan_edf_to_onda_samples(edf::EDF.File;
                         labels=STANDARD_LABELS,
                         units=STANDARD_UNITS,
                         onda_signal_groupby=(:sensor_type, :sample_unit, :sample_rate))

Formulate a plan for converting an EDF.File to Onda Samples. This applies plan_edf_to_onda_samples to each individual signal contained in the file, storing edf_signal_index as an additional column.

The resulting rows are then passed to plan_edf_to_onda_samples_groups and grouped according to onda_signal_groupby (by default, the :sensor_type, :sample_unit, and :sample_rate columns), and the group index is added as an additional column in onda_signal_index.

The resulting plan is returned as a table. No signal data is actually read from the EDF file; to execute this plan and generate Onda.Samples, use edf_to_onda_samples. The index of the EDF signal (after filtering out signals that are not EDF.Signals, e.g. annotation channels) for each row is stored in the :edf_signal_index column, and the rows are sorted in order of :onda_signal_index, and then by :edf_signal_index.

OndaEDF.plan_edf_to_onda_samples_groups - Function
plan_edf_to_onda_samples_groups(plan_rows; onda_signal_groupby=(:sensor_type, :sample_unit, :sample_rate))

Group together plan_rows based on the values of the onda_signal_groupby columns, creating the :onda_signal_index column and promoting the Onda encodings for each group using OndaEDF.promote_encodings.

If the :edf_signal_index column is not present or otherwise missing, it will be filled in based on the order of the input rows.

The updated rows are returned, sorted first by the columns named in onda_signal_groupby and second by order of occurrence within the input rows.
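The grouping step can be pictured with a short sketch in plain Julia. The function group_index_sketch is hypothetical and only shows the idea of tagging rows that share the same grouping values with a common index; it does not promote encodings, and the way OndaEDF numbers and orders the groups may differ from what is shown here.

```julia
# Hypothetical helper; illustrates group-index assignment on NamedTuple rows.
function group_index_sketch(rows; by=(:sensor_type, :sample_unit, :sample_rate))
    seen = Dict{Any,Int}()
    indexed = map(rows) do row
        key = map(col -> getproperty(row, col), by)
        # reuse the index of a group seen before, otherwise assign the next one
        idx = get!(seen, key, length(seen) + 1)
        return merge(row, (; onda_signal_index=idx))
    end
    # stable sort: rows stay in input order within each group
    return sort(indexed; by=row -> row.onda_signal_index)
end

rows = [(channel="c3", sensor_type="eeg", sample_unit="microvolt", sample_rate=256.0),
        (channel="chin", sensor_type="emg", sample_unit="microvolt", sample_rate=256.0),
        (channel="c4", sensor_type="eeg", sample_unit="microvolt", sample_rate=256.0)]
grouped = group_index_sketch(rows)
```

Here the two EEG rows end up in one group and the EMG row in another, mirroring how several single-channel EDF signals become the channels of one multichannel Onda signal.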


Import annotations

OndaEDF.edf_to_onda_annotations - Function
edf_to_onda_annotations(edf::EDF.File, uuid::UUID)

Extract EDF+ annotations from an EDF.File for recording with ID uuid and return them as a vector of Onda.Annotations. Each returned annotation has a value field that contains the string value of the corresponding EDF+ annotation.

If no EDF+ annotations are found in edf, then an empty Vector{Annotation} is returned.

OndaEDFSchemas.EDFAnnotationV1 - Type
@version EDFAnnotationV1 > AnnotationV1 begin
    value::String
end

A Legolas-generated record type that represents a single annotation imported from an EDF Annotation signal. The value field contains the annotation value as a string.


Import plan table schemas

OndaEDFSchemas.PlanV2 - Type
@version PlanV2 begin
    # EDF.SignalHeader fields
    label::String
    transducer_type::String
    physical_dimension::String
    physical_minimum::Float32
    physical_maximum::Float32
    digital_minimum::Float32
    digital_maximum::Float32
    prefilter::String
    samples_per_record::Int16
    # EDF.FileHeader field
    seconds_per_record::Float64
    # Onda.SignalV2 fields (channels -> channel), may be missing
    recording::Union{UUID,Missing} = passmissing(UUID)
    sensor_type::Union{Missing,AbstractString}
    sensor_label::Union{Missing,AbstractString}
    channel::Union{Missing,AbstractString}
    sample_unit::Union{Missing,AbstractString}
    sample_resolution_in_unit::Union{Missing,Float64}
    sample_offset_in_unit::Union{Missing,Float64}
    sample_type::Union{Missing,AbstractString}
    sample_rate::Union{Missing,Float64}
    # errors, use `nothing` to indicate no error
    error::Union{Nothing,String}
end

A Legolas-generated record type describing a single EDF signal-to-Onda channel conversion. The columns are the union of

  • fields from EDF.SignalHeader (all mandatory)
  • the seconds_per_record field from EDF.FileHeader (mandatory)
  • fields from Onda.SignalV2 (optional, may be missing to indicate failed conversion), except for file_path
  • error, which is nothing for a conversion that succeeded (or is expected to succeed), and otherwise a String describing the source of the error (with backtrace) in the case of a caught error.

OndaEDFSchemas.FilePlanV2 - Type
@version FilePlanV2 > PlanV2 begin
    edf_signal_index::Int
    onda_signal_index::Int
end

A Legolas-generated record type representing one EDF signal-to-Onda channel conversion, which includes the columns of a PlanV2 and additional file-level context:

  • edf_signal_index gives the index of the signals in the source EDF.File corresponding to this row
  • onda_signal_index gives the index of the output Onda.Samples.

Note that while the EDF index does correspond to the actual index in edf.signals, some Onda indices may be skipped in the output, so onda_signal_index is only to indicate order and grouping.

OndaEDF.write_plan - Function
write_plan(io_or_path, plan_table; validate=true, kwargs...)

Write a plan table to io_or_path using Legolas.write, using the ondaedf.file-plan@1 schema.


Full-service import

For a more "full-service" experience, OndaEDF.jl also provides functionality to extract Onda.Samples and EDFAnnotationV1s and then write them to disk:

OndaEDF.store_edf_as_onda - Function
store_edf_as_onda(edf::EDF.File, onda_dir, recording_uuid::UUID=uuid4();
                  custom_extractors=STANDARD_EXTRACTORS, import_annotations::Bool=true,
                  postprocess_samples=identity,
                  signals_prefix="edf", annotations_prefix=signals_prefix)

Convert an EDF.File to Onda.Samples and Onda.Annotations, store the samples in $path/samples/, and write the Onda signals and annotations tables to $path/$(signals_prefix).onda.signals.arrow and $path/$(annotations_prefix).onda.annotations.arrow. The default prefix is "edf", and if a prefix is provided for signals but not annotations both will use the signals prefix. The prefixes cannot reference (sub)directories.

Returns (; recording_uuid, signals, annotations, signals_path, annotations_path, plan).

This is a convenience function that first formulates an import plan via plan_edf_to_onda_samples, and then immediately executes this plan with edf_to_onda_samples.

The samples and executed plan are returned; it is strongly advised that you review the plan for un-extracted signals (where :sensor_type or :channel is missing) and errors (non-nothing values in :error).

Groups of EDF.Signals are mapped as channels to Onda.Samples via plan_edf_to_onda_samples. The caller of this function can control the plan via the labels and units keyword arguments, all of which are forwarded to plan_edf_to_onda_samples.

EDF.Signal labels that are converted into Onda channel names undergo the following transformations:

  • the label is whitespace-stripped, parens-stripped, and lowercased
  • trailing generic EDF references (e.g. "ref", "ref2", etc.) are dropped
  • any instance of + is replaced with _plus_ and / with _over_
  • all component names are converted to their "canonical names" when possible (e.g. "3" in an ECG-matched channel name will be converted to "iii").

If more control (e.g. preprocessing signal labels) is required, callers should use plan_edf_to_onda_samples and edf_to_onda_samples directly, and Onda.store the resulting samples manually.

See the OndaEDF README for additional details regarding EDF formatting expectations.


Internal import utilities

OndaEDF.match_edf_label - Function
OndaEDF.match_edf_label(label, signal_names, channel_name, canonical_names)

Return a normalized label matched from an EDF label. The purpose of this function is to remove signal names from the label, and to canonicalize the channel name(s) that remain. So something like "[eCG] avl-REF" will be transformed to "avl" (given signal_names=["ecg"] and channel_name="avl").

This returns nothing if channel_name does not match after normalization.

Canonicalization

  • ensures the given label is whitespace-stripped, lowercase, and parens-free
  • strips trailing generic EDF references (e.g. "ref", "ref2", etc.)
  • replaces all references with the appropriate name as specified by canonical_names
  • replaces + with _plus_ and / with _over_
  • returns the initial reference name (w/o prefix sign, if present) and the entire label; the initial reference name should match the canonical channel name, otherwise the channel extraction will be rejected.

Examples

match_edf_label("[ekG]  avl-REF", ["ecg", "ekg"], "avl", []) == "avl"
match_edf_label("ECG 2", ["ecg", "ekg"], "ii", ["ii" => ["2", "two", "ecg2"]]) == "ii"

See the tests for more examples.

Note

This is an internal function and is not meant to be called directly.

OndaEDF.merge_samples_info - Function
OndaEDF.merge_samples_info(plan_rows)

Create a single, merged SamplesInfo from plan rows, such as generated by plan_edf_to_onda_samples. Encodings are promoted with promote_encodings.

The input rows must have the same values for :sensor_type, :sample_unit, and :sample_rate; otherwise an ArgumentError is thrown.

If any of these values is missing, or any row's :channel value is missing, this returns missing to indicate it is not possible to determine a shared SamplesInfo.

The original EDF labels are included in the output in the :edf_channels column.

Note

This is an internal function and is not meant to be called directly.

OndaEDF.onda_samples_from_edf_signals - Function
OndaEDF.onda_samples_from_edf_signals(target::Onda.SamplesInfo, edf_signals,
                                      edf_seconds_per_record; dither_storage=missing)

Generate an Onda.Samples struct from an iterable of EDF.Signals, based on the Onda.SamplesInfo in target. This checks for matching sample rates in the source signals. If the encoding of target is the same as the encoding in a signal, its encoded (usually Int16) data is copied directly into the Samples data matrix; otherwise it is re-encoded.

If dither_storage=missing (the default), dither storage is allocated automatically as specified in the docstring for Onda.encode. dither_storage=nothing disables dithering. See Onda.encode's docstring for more details.

Note

This function is not meant to be called directly, but through edf_to_onda_samples.

OndaEDF.promote_encodings - Function
promote_encodings(encodings; pick_offset=(_ -> 0.0), pick_resolution=minimum)

Return a common encoding for input encodings, as a NamedTuple with fields sample_type, sample_offset_in_unit, sample_resolution_in_unit, and sample_rate. If input encodings' sample_rates are not all equal, an error is thrown. If sample resolutions/offsets are not equal, then pick_resolution and pick_offset are used to combine them into a common resolution/offset.
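The promotion rule can be sketched in plain Julia as follows. The function promote_encodings_sketch is a hypothetical stand-in that mirrors the keyword arguments named above; it ignores sample_type, whose promotion is not described here, and it is not OndaEDF's implementation.

```julia
# Hypothetical sketch of the promotion rule; sample_type promotion is omitted.
function promote_encodings_sketch(encodings; pick_offset=(_ -> 0.0), pick_resolution=minimum)
    rates = unique([e.sample_rate for e in encodings])
    length(rates) == 1 || throw(ArgumentError("sample rates are not all equal: $rates"))

    resolutions = [e.sample_resolution_in_unit for e in encodings]
    offsets = [e.sample_offset_in_unit for e in encodings]

    # keep a shared value as-is; otherwise defer to the pick_* functions
    resolution = all(==(first(resolutions)), resolutions) ? first(resolutions) :
                 pick_resolution(resolutions)
    offset = all(==(first(offsets)), offsets) ? first(offsets) : pick_offset(offsets)

    return (; sample_offset_in_unit=offset, sample_resolution_in_unit=resolution,
            sample_rate=first(rates))
end

encodings = [(sample_resolution_in_unit=0.5, sample_offset_in_unit=0.0, sample_rate=128.0),
             (sample_resolution_in_unit=0.25, sample_offset_in_unit=1.0, sample_rate=128.0)]
common = promote_encodings_sketch(encodings)
```

With the defaults, the finest (smallest) resolution is chosen so that no channel loses precision, and the offset falls back to zero when the channels disagree.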

Note

This is an internal function and is not meant to be called directly.


Export EDF from Onda

OndaEDF.onda_to_edf - Function
onda_to_edf(samples::AbstractVector{<:Samples}, annotations=[]; kwargs...)

Return an EDF.File containing signal data converted from a collection of Onda Samples and (optionally) annotations from an annotations table.

Following the Onda v0.5 format, annotations can be any Tables.jl-compatible table (DataFrame, Arrow.Table, NamedTuple of vectors, vector of NamedTuples) which follows the annotation schema.

Each EDF.Signal in the returned EDF.File corresponds to a channel of an input Onda.Samples.

The ordering of EDF.Signals in the output will match the order of the input collection of Samples (and within each channel grouping, the order of the samples' channels).

Note

EDF signals are encoded as Int16, while Onda allows a range of different sample types, some of which provide considerably more resolution than Int16. During export, re-encoding may be necessary if the encoded Onda samples cannot be represented directly as Int16 values. In this case, a new encoding (resolution and offset) will be chosen based on the minimum and maximum values actually present in each signal in the input Onda Samples. Thus, it may not always be possible to losslessly round trip Onda-formatted datasets to EDF and back.
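To see why such a re-encoding is lossy, consider the following sketch in plain Julia. It assumes the usual linear convention decoded = encoded * resolution + offset and picks the resolution and offset so that the observed minimum and maximum span the full Int16 range. The function int16_encoding_sketch is hypothetical; the exact choice OndaEDF makes during export may differ.

```julia
# Hypothetical helper; derives an Int16 encoding from the observed value range.
function int16_encoding_sketch(x::AbstractVector{<:Real})
    lo, hi = extrema(x)
    span = Float64(typemax(Int16)) - Float64(typemin(Int16))   # 65535 steps
    resolution = hi == lo ? 1.0 : (hi - lo) / span
    offset = lo - Float64(typemin(Int16)) * resolution
    # clamp guards against floating-point error at the two ends of the range
    scaled = clamp.((x .- offset) ./ resolution, -32768.0, 32767.0)
    encoded = round.(Int16, scaled)
    return (; resolution, offset, encoded)
end

x = [-1.0, 0.0, 0.123456789, 2.5]
enc = int16_encoding_sketch(x)
decoded = enc.encoded .* enc.resolution .+ enc.offset
worst_error = maximum(abs.(decoded .- x))
```

The round trip reproduces each value only to within about half a resolution step, which is the loss referred to above.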


Deprecations

To support deserializing plan tables generated with old versions of OndaEDF + Onda, the following schemas are provided. These are deprecated and will be removed in a future release.

OndaEDFSchemas.PlanV1 - Type
@version PlanV1 begin
    # EDF.SignalHeader fields
    label::String
    transducer_type::String
    physical_dimension::String
    physical_minimum::Float32
    physical_maximum::Float32
    digital_minimum::Float32
    digital_maximum::Float32
    prefilter::String
    samples_per_record::Int16
    # EDF.FileHeader field
    seconds_per_record::Float64
    # Onda.SignalV1 fields (channels -> channel), may be missing
    recording::Union{UUID,Missing} = passmissing(UUID)
    kind::Union{Missing,AbstractString}
    channel::Union{Missing,AbstractString}
    sample_unit::Union{Missing,AbstractString}
    sample_resolution_in_unit::Union{Missing,Float64}
    sample_offset_in_unit::Union{Missing,Float64}
    sample_type::Union{Missing,AbstractString}
    sample_rate::Union{Missing,Float64}
    # errors, use `nothing` to indicate no error
    error::Union{Nothing,String}
end

A Legolas-generated record type describing a single EDF signal-to-Onda channel conversion. The columns are the union of

  • fields from EDF.SignalHeader (all mandatory)
  • the seconds_per_record field from EDF.FileHeader (mandatory)
  • fields from Onda.SignalV1 (optional, may be missing to indicate failed conversion), except for file_path
  • error, which is nothing for a conversion that succeeded (or is expected to succeed), and otherwise a String describing the source of the error (with backtrace) in the case of a caught error.

OndaEDFSchemas.FilePlanV1 - Type
@version FilePlanV1 > PlanV1 begin
    edf_signal_index::Int
    onda_signal_index::Int
end

A Legolas-generated record type representing one EDF signal-to-Onda channel conversion, which includes the columns of a PlanV1 and additional file-level context:

  • edf_signal_index gives the index of the signals in the source EDF.File corresponding to this row
  • onda_signal_index gives the index of the output Onda.Samples.

Note that while the EDF index does correspond to the actual index in edf.signals, some Onda indices may be skipped in the output, so onda_signal_index is only to indicate order and grouping.

diff --git a/previews/PR76/convert-to-onda/index.html b/previews/PR76/convert-to-onda/index.html new file mode 100644 index 0000000..5f2ccd4 --- /dev/null +++ b/previews/PR76/convert-to-onda/index.html @@ -0,0 +1,88 @@ + +Converting from EDF · OndaEDF

An Opinionated Guide to Converting EDFs to Onda

Basic workflow

At a high level, the basic workflow for EDF-to-Onda conversion is iterative:

  1. Formulate a "plan" which specifies how to convert the metadata associated with each EDF.Signal into Onda metadata (channel names, quantization/encoding parameters, etc.)
  2. Review the plan, making sure that all necessary EDF.Signals will be extracted, and that the quantization, sample rate, physical units, etc. are reasonable.
  3. Revise the plan as needed, repeating steps 1-2 until you're happy.
  4. Execute the plan, loading the EDF signal data if necessary and converting it into Onda.Samples.
  5. Review the executed plan for additional errors or issues, and iterate steps 1-4 as needed.

In the following sections, we expand on the philosophy behind OndaEDF's EDF-to-Onda design and present some detailed, opinionated workflows for converting a single EDF and multiple EDFs.

Philosophy

The motivation for separating the planning and execution is threefold.

First, while there is an EDF(+) specification, it is violated so often and in so many different ways that a conversion package requiring fully spec-compliant EDFs would be of little practical use. At the same time, anticipating and working around every possible violation is not practical either. Separating planning from execution during EDF-to-Onda conversion gives the user more control over how their particular EDF files are handled.

Second, making the plan a separate intermediate output means that not only can it be reviewed during conversion but can be persisted as a record of how any EDF-derived Onda signals were converted. This kind of provenance information is very useful when investigating issues with a dataset that may crop up long after the initial conversion.

Third, planning only requires that the headers of the EDF.Signals be read into memory, thereby separating the iterative part of the conversion process from the expensive, one-time step which requires all the signal data be read into memory. This enables workflows that would be impractical otherwise, like planning bulk conversion of thousands of EDFs at once. When dealing with large, messy datasets, we have found that metadata issues are both likely to occur and likely to be different across individual EDFs. This makes normalizing the EDF metadata one file at a time extremely tedious, since the metadata issues encountered in a single file may not be representative of the rest of the dataset. Thus, in practice it's better to deal with EDF metadata conversion in bulk, and the plan-then-execute workflow enables users to deal with these issues all at once, save out the plan, and then distribute the actual conversion work to as many workers as necessary to execute it in a reasonable timeframe.

Converting a single EDF to Onda

The following steps assume you have read an EDF file into memory with EDF.read or otherwise created an EDF.File. After the detailed workflow for converting a single EDF file to Onda format, we'll discuss how to handle batches of EDF files.

Generate a plan

This is straightforward, using plan_edf_to_onda_samples. As outlined in the documentation for plan_edf_to_onda_samples, a "plan" is a table with one row per EDF.Signal, which contains all the fields from the signal's header as well as the fields of the Onda.SamplesInfoV2 that will be generated when the plan is executed (with the caveat that the :channels field is called :channel to indicate that it corresponds to a single channel in the output). It also contains a few additional fields for defining the mapping between EDF and Onda signal indices, as well as a field to capture any errors thrown during planning (or, more likely, during execution of the plan):

  • :edf_signal_index, the 1-based numerical index of the source signal in edf.signals
  • :onda_signal_index, the ordinal index of the resulting samples (not necessarily the index into samples, since some groups might be skipped)
  • :error, any errors that were caught during planning and/or execution.

Review the plan

Check for EDF signals whose label or physical_dimension could not be matched using the standard OndaEDF labels and units, as indicated by missing values in the channel/sensor_type (for un-matched label) or sample_unit (for un-matched physical_dimension). It's also a good idea at this point to review the other EDF signal header fields, and how they will be converted to Onda (especially the sample unit, resolution and offset, which correspond to the physical/digital minimum/maximum from the EDF signal header.) It's harder to fix these issues with the numerical signal header fields as they usually point to issues with how the data was encoded into an EDF initially. However, it's still better to detect and document any issues with the underlying EDF data at this stage to prevent nasty surprises down the road.
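A review pass along these lines can be sketched in plain Julia. The plan below is a hand-written stand-in with only a few of the real plan columns, and the label values are made up for the example; with a real plan you would iterate over Tables.rows(plan) instead.

```julia
# Hand-written stand-in for a plan; a real plan has many more columns.
plan = [(label="EEG C3-M2", sensor_type="eeg", channel="c3-m2", sample_unit="microvolt", error=nothing),
        (label="Mystery 7", sensor_type=missing, channel=missing, sample_unit="microvolt", error=nothing),
        (label="EMG Chin", sensor_type="emg", channel="chin", sample_unit=missing, error=nothing),
        (label="ECG II", sensor_type="ecg", channel="ii", sample_unit="millivolt", error="SomeError: ...")]

# rows whose label could not be matched
unmatched_label = filter(r -> ismissing(r.sensor_type) || ismissing(r.channel), plan)
# rows whose physical_dimension could not be matched
unmatched_unit = filter(r -> ismissing(r.sample_unit), plan)
# rows that recorded an error during planning or execution
errored = filter(r -> r.error !== nothing, plan)

labels_to_review = unique([r.label for r in vcat(unmatched_label, unmatched_unit, errored)])
```

Collecting the offending labels in one place makes it easier to decide, signal by signal, which of the options in the next section applies.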

Revise the plan

If there are EDF signals with un-matched label or physical_dimension, you have a few options. We recommend you consider them in roughly this order.

Skip them

The first option to consider is to simply ignore these signals; not all signals are necessarily required for downstream use, and converting each and every signal in an EDF may be more work than is justified!

Provide custom labels and units

The second option is to provide custom labels= and units= keyword arguments to plan_edf_to_onda_samples. For unambiguous, spec-compliant labels and physical_dimensions, it's generally possible to create custom labels= or units= specifications to match them.

Note

Custom labels should be specified as lowercase, without reference, and without the sensor type prefix. So to match a label like "EEG R1-Ref", use a label like ["eeg"] => ["r1"], and not ["EEG"] => ["R1"] or ["eeg"] => ["r1-ref"]. See the documentation for plan_edf_to_onda_samples for more details, and the internal OndaEDF.match_edf_label for low-level details of how labels are matched.

Warning

Sometimes EDF labels are ambiguous and can be matched by multiple different OndaEDF labels= specifications. Matching is greedy, in that the first label specification that matches is used regardless of any other possible matches, so you should add your custom labels to the end of an existing set, as in:

my_labels = collect(pairs(OndaEDF.STANDARD_LABELS))
# alternates are given as a collection of strings, as in the default labels
push!(my_labels, ["eog"] => ["left" => ["eyeleft"], "right" => ["eyeright"]])

When using custom labels, make sure that they haven't accidentally changed how other labels are matched by reviewing the plan for any unintended changes.

Preprocess signal headers

The third option, for signals that must be converted and cannot be handled with custom labels (without undue hassle) is to pre-process the signal headers before generating the plan. While the canonical input to plan_edf_to_onda_samples is an EDF.File, the header-matching logic operates fundamentally one signal header at a time. Moreover, it does not actually require that the input be an EDF.SignalHeader, only that it have the same fields as an EDF.SignalHeader. This design decision is meant to support workflows where the signal headers cannot for some reason be processed as-is due to corrupt/malformed strings, labels that cannot be matched using the OndaEDF matching algorithm, or any other reason.

For example, we've encountered EDFs in the wild where the transducer_type and label fields are switched, and must be switched back before planning:

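using EDF, OndaEDF, Tables

# my_edf_file_path is a placeholder for the path to your EDF file
edf = EDF.read(my_edf_file_path)

# swap the label and transducer_type fields back
function corrected_header(signal::EDF.Signal)
    header = signal.header
    return Tables.rowmerge(header;
                           label=header.transducer_type,
                           transducer_type=header.label)
end

# plan only the EDF.Signals (this skips, e.g., annotation channels) and pass
# seconds_per_record explicitly, as the per-header method expects
signals = filter(s -> s isa EDF.Signal, edf.signals)
plans = map(signals) do signal
    return plan_edf_to_onda_samples(corrected_header(signal),
                                    edf.header.seconds_per_record)
end
new_plan = plan_edf_to_onda_samples_groups(plans)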

Note that an additional step of plan_edf_to_onda_samples_groups is required after planning the individual signals. This is due to the fact that EDF is a "single channel" format, where each signal is only a single channel, while Onda is a "multichannel" format where a signal can have multiple channels as long as the sampling rate, quantization, and other metadata are consistent. Normally, calling plan_edf_to_onda_samples with an EDF.File will do this grouping for you, but when planning individually pre-processed signal headers, we have to do it ourselves at the end.

Modify the generated plan

The fourth and final option is to modify the generated plan itself. This is the least preferred method because it removes a number of safeguards that OndaEDF provides as part of the planning process, but it's also the most flexible in that it enables completely hand-crafted conversion. Here are a few examples, motivated by EDFs we have seen in the wild.

Some EEG signals have the physical units set to millivolts, but biologically generated EEG signals are generally on the order of microvolts. During import, you want to correct this by adjusting the encoding settings used by Onda to store samples, by scaling the sample offset and resolution by 1000 and setting the physical units. This can be accomplished by modifying the rows of the plan like so:

edf = EDF.read(my_edf_file_path)
plans = plan_edf_to_onda_samples(edf; labels=my_labels)

function fix_millivolts(plan)
    # isequal (rather than ==) treats missing values in un-matched rows as "no match"
    if isequal(plan.sample_unit, "millivolt") && isequal(plan.sensor_type, "eeg")
        sample_resolution_in_unit = plan.sample_resolution_in_unit * 1000
        sample_offset_in_unit = plan.sample_offset_in_unit * 1000
        return Tables.rowmerge(plan; sample_unit="microvolt",
                               sample_resolution_in_unit,
                               sample_offset_in_unit)
    else
        return plan
    end
end

new_plan = map(fix_millivolts, Tables.rows(plans))

As another, similar example, sometimes EMG channels get recorded with different physical units. In such a case, OndaEDF cannot merge these channels and will create multiple separate Samples objects which each have sensor_type = "emg". This can be corrected in a similar way, for example by converting millivolts to microvolts (adjusting of course depending on the nature of your dataset) and re-grouping into Onda samples:

edf = EDF.read(my_edf_file_path)
plans = plan_edf_to_onda_samples(edf; labels=my_labels)

function fix_emg(plan)
    # isequal (rather than ==) treats missing values in un-matched rows as "no match"
    if isequal(plan.sensor_type, "emg") && isequal(plan.sample_unit, "millivolt")
        sample_resolution_in_unit = plan.sample_resolution_in_unit * 1000
        sample_offset_in_unit = plan.sample_offset_in_unit * 1000
        return Tables.rowmerge(plan; sample_unit="microvolt",
                               sample_resolution_in_unit,
                               sample_offset_in_unit)
    end
    return plan
end

new_plan = map(fix_emg, Tables.rows(plans))
# re-compute the grouping of EDF signals into Onda signals:
new_plan = plan_edf_to_onda_samples_groups(new_plan)

Execute the plan

Once the plan has been reviewed and deemed satisfactory, execute the plan to generate Onda.Samples and an "executed plan" record. This is accomplished with the edf_to_onda_samples function, which takes an EDF.File and a plan as input, and returns a vector of Onda.Samples and the plan as executed. The executed plan may differ from the input plan. Most notably, if any errors were encountered during execution, they will be caught and the error and stacktrace will be stored as strings in the error field. It is important to review the executed plan a final time to ensure everything was converted as expected and no unexpected errors were encountered. If any errors were encountered, you may need to iterate further.

Store the output

The final step is to store both the Onda.Samples and the executed plan in some persistent storage. For storing Onda.Samples, see Onda.store, which supports serializing LPCM-encoded samples to any "path-like" type (i.e., anything that provides a method for write). For storing the plan, use OndaEDF.write_plan (or Legolas.write(file_path, plan, FilePlanV2SchemaVersion()); see the documentation for Legolas.write and FilePlanV2).

Batch conversion of many EDFs

The workflow for bulk conversion of multiple EDFs is similar to the workflow for converting a single EDF. The major difference is that the "planning" steps can be conducted in bulk, while the "execution" steps (generally) need to be conducted one at a time, either serially or distributed across multiple workers. As discussed above, the planning stage requires only a few KB from the EDF file/signal headers, facilitating rapid plan-review-revise iteration of even fairly large collections of EDFs (10,000+).

Planning multiple EDFs

The main factor to consider when planning conversion of a large batch of EDF files is that planning requires only a small number of header bytes, even for very large EDF files. Thus, the first step is to read the file headers into memory without reading the signal data itself (which, for more than a few EDF files, will usually not fit into memory).

Reading headers from local filesystem

For EDF files stored on a normal filesystem, the EDF.File constructor will by default create a "header-only" EDF.File, so multiple files' headers can be read like this:

files = map(edf_paths) do path
+    open(EDF.File, path, "r")
+end

Reading headers from S3

Note

This section may become obsolete in a future version of EDF.jl which uses the conditional dependency functionality available from Julia 1.9+ to provide tighter integration with AWSS3.jl.

Unfortunately, open(path::S3Path) will fetch the entire contents of the object stored at path, so we need to be a bit clever to read only header bytes from an S3 file, especially given that the number of bytes we need to read depends on the number of signals. The following is an example of one technique for reading EDF file and signal headers from S3:

function EDF.read_file_header(path::S3Path)
+    bytes = s3_get(path.bucket, path.key; byte_range=1:256)
+    buffer = IOBuffer(bytes)
+    return EDF.read_file_header(buffer)
+end
+
+function EDF.File(path::S3Path)
+    _, n_signals = EDF.read_file_header(path)
+    bytes = s3_get(path.bucket, path.key; byte_range=1:(256 * (n_signals + 1)))
+    return EDF.File(IOBuffer(bytes))
+end
+
+# use asyncmap because this is mostly bound by request roundtrip latency
+files = asyncmap(EDF.File, edf_paths)
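
The byte ranges requested above follow from the EDF header layout: the file header occupies 256 bytes, and each signal header occupies a further 256 bytes. A small sketch of that arithmetic, where edf_header_byte_range is a hypothetical helper name and not part of EDF.jl:

```julia
# Hypothetical helper (not part of EDF.jl): the 1-based, inclusive byte range
# covering the 256-byte file header plus one 256-byte header per signal.
edf_header_byte_range(n_signals::Integer) = 1:(256 * (n_signals + 1))

edf_header_byte_range(0)   # 1:256, the file header alone
edf_header_byte_range(32)  # 1:8448, file header plus 32 signal headers
```

Even a file with dozens of signals therefore needs only a few kilobytes per request, which is why the two-step fetch (file header first, then all headers) stays cheap.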

Concatenating plans into one big table

When doing bulk review of plans, it's generally helpful to have the individual files' plans concatenated into a single large table. It's important to keep track of which plan rows correspond to which input file, which can be accomplished via something like this:

# create a UUID namespace to make recording ID generation idempotent
+const NAMESPACE = UUID(...)
+function plan_all(edf_paths, files; kwargs...)
+    plans = mapreduce(vcat, edf_paths, files) do origin_uri, edf
+        plan = plan_edf_to_onda_samples(edf; kwargs...)
+        plan = DataFrame(plan)
+        # make sure this is the same every time this function is re-run!
+        recording = uuid5(NAMESPACE, string(origin_uri))
+        return insertcols!(plan, 
+                           :origin_uri => origin_uri,
+                           :recording => recording)
+    end
+end
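
To illustrate why a fixed namespace makes ID generation idempotent, here is a small sketch using only the UUIDs standard library. The namespace value and the URIs are made up for illustration; in practice, generate a namespace once and hard-code it:

```julia
using UUIDs

# Made-up namespace for illustration only: any fixed UUID works, as long as it
# never changes between runs.
example_namespace = UUID("3f0c9a7e-5b1d-4c2a-9e6f-1a2b3c4d5e6f")

# Hypothetical helper: derive a recording ID deterministically from the origin URI.
recording_id(namespace::UUID, origin_uri) = uuid5(namespace, string(origin_uri))

id_first_run = recording_id(example_namespace, "s3://example-bucket/a.edf")
id_second_run = recording_id(example_namespace, "s3://example-bucket/a.edf")
id_other_file = recording_id(example_namespace, "s3://example-bucket/b.edf")

id_first_run == id_second_run  # true: same namespace and URI give the same ID
id_first_run == id_other_file  # false: a different URI gives a different ID
```

Because the ID depends only on the namespace and the URI string, re-running the planning step after a crash or a label revision yields the same recording IDs, so previously stored outputs still line up.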

Review and revise the plans

This "bulk plan" table can then be reviewed in bulk, looking for patterns in which labels are not matched, physical units associated with each sensor_type, etc. At a minimum, we find it useful to print some basic counts:

plans = plan_all(...)
+# helper function to tally rows per group
+tally(df, g, agg...=nrow => :count) = combine(groupby(df, g), agg...)
+unmatched_labels = filter(:channel => ismissing, plans)
+@info "unmatched labels:" tally(unmatched_labels, :label)
+
+unmatched_units = filter(:sample_unit => ismissing, plans)
+@info "unmatched units:" tally(unmatched_units, :physical_dimension)
+
+matched = subset(plans, :channel => ByRow(!ismissing), :sample_unit => ByRow(!ismissing))
+@info "matched sensor types/channels:" tally(matched, [:sensor_type, :channel, :sample_unit])

Reviewing these summaries is a good first step when revising the plans. The revision process is basically the same as with a single EDF: update the labels= and units= as needed to capture any un-matched EDF signals, and failing that, preprocess the headers/postprocess the plan. Note that if it is necessary to run plan_edf_to_onda_samples_groups, this must be done one file at a time, using something like this to preserve the recording-level keys created above:

new_plans = combine(groupby(plans, [:recording, :origin_uri])) do plan
+    new_plan = plan_edf_to_onda_samples_groups(Tables.rows(plan))
+    return DataFrame(new_plan)
+end

Executing bulk plans and storing generated samples

The last step, as with single EDF conversion, is to execute the plans. Given that this requires loading signal data into memory, it's generally necessary to do this one recording at a time, either serially on a single process or using multiprocessing to distribute work over different processes or even machines. A complete introduction to multiprocessing in Julia is outside the scope of this guide, but we offer a few pointers in the hope that we can help avoid common pitfalls.

First, it's generally a good idea to create a function that accepts one recording's plan, EDF file path, and recording ID (or generally any additional metadata that is required to create a persistent record), and that executes the plan and persistently stores the resulting samples and executed plan. This function may then return either the generated Onda.SignalV2 and OndaEDF.FilePlanV2 tables for the completed recording, or pointers to where these are stored. This way, the memory pressure involved in loading an entire EDF's signal data is confined to function scope, which makes it somewhat easier for Julia's garbage collector to reclaim that memory.

Second, a separate function should handle coordinating these individual jobs and then collecting these results into the ultimate aggregate signal and plan tables, and then persistently storing those to a final destination.
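
The two-function structure described above might be sketched as follows. Everything here is hypothetical scaffolding: convert_one is a stand-in that does no real conversion, the job fields are made up, and pmap from the Distributed standard library is only one of several ways to schedule the work:

```julia
using Distributed

# Hypothetical stand-in for the per-recording function. A real version would
# read the EDF at `job.origin_uri`, execute `job.plan`, store the samples and
# executed plan, and return the stored tables or pointers to them.
function convert_one(job)
    isempty(job.plan) && error("empty plan for recording $(job.recording)")
    return (recording=job.recording, origin_uri=job.origin_uri, n_rows=length(job.plan))
end

# Wrap each job so that one failing recording does not abort the whole batch.
function try_convert(job)
    try
        return (recording=job.recording, result=convert_one(job), error=nothing)
    catch e
        return (recording=job.recording, result=nothing, error=sprint(showerror, e))
    end
end

# `pmap` spreads jobs over worker processes when some have been added with
# `addprocs` (the functions above must then be defined with `@everywhere`);
# with no workers it simply runs the jobs on the current process.
convert_all(jobs) = pmap(try_convert, jobs)

# Illustrative jobs; the second one is deliberately broken.
jobs = [
    (recording="rec-1", origin_uri="a.edf", plan=[1, 2, 3]),
    (recording="rec-2", origin_uri="b.edf", plan=Int[]),
]
results = convert_all(jobs)
failures = [r for r in results if !isnothing(r.error)]
```

Returning errors as data, in the same spirit as the :error column of the executed plan, lets the coordinator finish the batch and report every failed recording at once.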

diff --git a/previews/PR76/index.html b/previews/PR76/index.html index 62690e8..12ec96f 100644 --- a/previews/PR76/index.html +++ b/previews/PR76/index.html @@ -1,71 +1,2 @@ -API Documentation · OndaEDF

API Documentation

Import EDF to Onda

OndaEDF.jl prefers "self-service" import over "automagic", and provides functionality to extract Onda.Samples and EDFAnnotationV1s (which extend Onda.AnnotationV1s) from an EDF.File. These can be written to disk (with Onda.store / Legolas.write) or manipulated in memory as desired.

Import signal data as Samples

OndaEDF.edf_to_onda_samplesFunction
edf_to_onda_samples(edf::EDF.File, plan_table; validate=true, dither_storage=missing)

Convert Signals found in an EDF File to Onda.Samples according to the plan specified in plan_table (e.g., as generated by plan_edf_to_onda_samples), returning an iterable of the generated Onda.Samples and the plan as actually executed.

The input plan is transformed by using merge_samples_info to combine rows with the same :onda_signal_index into a common Onda.SamplesInfo. Then OndaEDF.onda_samples_from_edf_signals is used to combine the EDF signals data into a single Onda.Samples per group.

The labels of the original EDF.Signals are preserved in the :edf_channels field of the resulting SamplesInfos for each Samples generated.

Any errors that occur are shown as Strings (with backtrace) and inserted into the :error column for the corresponding rows from the plan.

Samples are returned in the order of :onda_signal_index. Signals that could not be matched or otherwise caused an error during execution are not returned.

If validate=true (the default), the plan is validated against the FilePlanV2 schema, and the signal headers in the EDF.File.

If dither_storage=missing (the default), dither storage is allocated automatically as specified in the docstring for Onda.encode. dither_storage=nothing disables dithering.

source
edf_to_onda_samples(edf::EDF.File; kwargs...)

Read signals from an EDF.File into a vector of Onda.Samples. This is a convenience function that first formulates an import plan via plan_edf_to_onda_samples, and then immediately executes this plan with edf_to_onda_samples.

The samples and executed plan are returned; it is strongly advised that you review the plan for un-extracted signals (where :sensor_type or :channel is missing) and errors (non-nothing values in :error).

Collections of EDF.Signals are mapped as channels to Onda.Samples via plan_edf_to_onda_samples. The caller of this function can control the plan via the labels and units keyword arguments, all of which are forwarded to plan_edf_to_onda_samples.

EDF.Signal labels that are converted into Onda channel names undergo the following transformations:

  • the label is whitespace-stripped, parens-stripped, and lowercased
  • trailing generic EDF references (e.g. "ref", "ref2", etc.) are dropped
  • any instance of + is replaced with _plus_ and / with _over_
  • all component names are converted to their "canonical names" when possible (e.g. "m1" in an EEG-matched channel name will be converted to "a1").

See the OndaEDF README for additional details regarding EDF formatting expectations.

source
OndaEDF.plan_edf_to_onda_samplesFunction
plan_edf_to_onda_samples(header, seconds_per_record; labels=STANDARD_LABELS,
-                         units=STANDARD_UNITS)
-plan_edf_to_onda_samples(signal::EDF.Signal, args...; kwargs...)

Formulate a plan for converting an EDF signal into Onda format. This returns a Tables.jl row with all the columns from the signal header, plus additional columns for the Onda.SamplesInfo for this signal, and the seconds_per_record that is passed in here.

If no labels match, then the channel and kind columns are missing; the behavior of other SamplesInfo columns is undefined; they are currently set to missing but that may change in future versions.

Any errors that are thrown in the process will be wrapped as SampleInfoErrors and then printed with backtrace to a String in the error column.

Matching EDF label to Onda labels

The labels keyword argument determines how Onda channel and signal kind are extracted from the EDF label.

Labels are specified as an iterable of signal_names => channel_names pairs. signal_names should be an iterable of signal names, the first of which is the canonical name used as the Onda kind. Each element of channel_names gives the specification for one channel, which can either be a string, or a canonical_name => alternates pair. Occurrences of alternates will be replaced with canonical_name in the generated channel label.

Matching is determined solely by the channel names. When matching, the signal names are only used to remove signal names occurring as prefixes (e.g., "[ECG] AVL") before matching channel names. See match_edf_label for details, and see OndaEDF.STANDARD_LABELS for the default labels.

As an example, here is (a subset of) the default labels for ECG signals:

["ecg", "ekg"] => ["i" => ["1"], "ii" => ["2"], "iii" => ["3"],
-                   "avl"=> ["ecgl", "ekgl", "ecg", "ekg", "l"], 
-                   "avr"=> ["ekgr", "ecgr", "r"], ...]

Matching is done in the order that labels iterates pairs, and will stop at the first match, with no warning if signals are ambiguous (although this may change in a future version).

source
plan_edf_to_onda_samples(edf::EDF.File;
-                         labels=STANDARD_LABELS,
-                         units=STANDARD_UNITS,
-                         onda_signal_groupby=(:sensor_type, :sample_unit, :sample_rate))

Formulate a plan for converting an EDF.File to Onda Samples. This applies plan_edf_to_onda_samples to each individual signal contained in the file, storing edf_signal_index as an additional column.

The resulting rows are then passed to plan_edf_to_onda_samples_groups and grouped according to onda_signal_groupby (by default, the :sensor_type, :sample_unit, and :sample_rate columns), and the group index is added as an additional column in onda_signal_index.

The resulting plan is returned as a table. No signal data is actually read from the EDF file; to execute this plan and generate Onda.Samples, use edf_to_onda_samples. The index of the EDF signal (after filtering out signals that are not EDF.Signals, e.g. annotation channels) for each row is stored in the :edf_signal_index column, and the rows are sorted in order of :onda_signal_index, and then by :edf_signal_index.

source
OndaEDF.plan_edf_to_onda_samples_groupsFunction
plan_edf_to_onda_samples_groups(plan_rows; onda_signal_groupby=(:sensor_type, :sample_unit, :sample_rate))

Group together plan_rows based on the values of the onda_signal_groupby columns, creating the :onda_signal_index column and promoting the Onda encodings for each group using OndaEDF.promote_encodings.

If the :edf_signal_index column is not present or otherwise missing, it will be filled in based on the order of the input rows.

The updated rows are returned, sorted first by the columns named in onda_signal_groupby and second by order of occurrence within the input rows.

source

Import annotations

OndaEDF.edf_to_onda_annotationsFunction
edf_to_onda_annotations(edf::EDF.File, uuid::UUID)

Extract EDF+ annotations from an EDF.File for recording with ID uuid and return them as a vector of Onda.Annotations. Each returned annotation has a value field that contains the string value of the corresponding EDF+ annotation.

If no EDF+ annotations are found in edf, then an empty Vector{Annotation} is returned.

source
OndaEDFSchemas.EDFAnnotationV1Type
@version EDFAnnotationV1 > AnnotationV1 begin
-    value::String
-end

A Legolas-generated record type that represents a single annotation imported from an EDF Annotation signal. The value field contains the annotation value as a string.

source

Import plan table schemas

OndaEDFSchemas.PlanV2Type
@version PlanV2 begin
-    # EDF.SignalHeader fields
-    label::String
-    transducer_type::String
-    physical_dimension::String
-    physical_minimum::Float32
-    physical_maximum::Float32
-    digital_minimum::Float32
-    digital_maximum::Float32
-    prefilter::String
-    samples_per_record::Int16
-    # EDF.FileHeader field
-    seconds_per_record::Float64
-    # Onda.SignalV2 fields (channels -> channel), may be missing
-    recording::Union{UUID,Missing} = passmissing(UUID)
-    sensor_type::Union{Missing,AbstractString}
-    sensor_label::Union{Missing,AbstractString}
-    channel::Union{Missing,AbstractString}
-    sample_unit::Union{Missing,AbstractString}
-    sample_resolution_in_unit::Union{Missing,Float64}
-    sample_offset_in_unit::Union{Missing,Float64}
-    sample_type::Union{Missing,AbstractString}
-    sample_rate::Union{Missing,Float64}
-    # errors, use `nothing` to indicate no error
-    error::Union{Nothing,String}
-end

A Legolas-generated record type describing a single EDF signal-to-Onda channel conversion. The columns are the union of

  • fields from EDF.SignalHeader (all mandatory)
  • the seconds_per_record field from EDF.FileHeader (mandatory)
  • fields from Onda.SignalV2 (optional, may be missing to indicate failed conversion), except for file_path
  • error, which is nothing for a conversion that is or is expected to be successful, and a String describing the source of the error (with backtrace) in the case of a caught error.
source
OndaEDFSchemas.FilePlanV2Type
@version FilePlanV2 > PlanV2 begin
-    edf_signal_index::Int
-    onda_signal_index::Int
-end

A Legolas-generated record type representing one EDF signal-to-Onda channel conversion, which includes the columns of a PlanV2 and additional file-level context:

  • edf_signal_index gives the index of the signals in the source EDF.File corresponding to this row
  • onda_signal_index gives the index of the output Onda.Samples.

Note that while the EDF index does correspond to the actual index in edf.signals, some Onda indices may be skipped in the output, so onda_signal_index is only to indicate order and grouping.

source
OndaEDF.write_planFunction
write_plan(io_or_path, plan_table; validate=true, kwargs...)

Write a plan table to io_or_path using Legolas.write, using the ondaedf.file-plan@1 schema.

source

Full-service import

For a more "full-service" experience, OndaEDF.jl also provides functionality to extract Onda.Samples and EDFAnnotationV1s and then write them to disk:

OndaEDF.store_edf_as_ondaFunction
store_edf_as_onda(edf::EDF.File, onda_dir, recording_uuid::UUID=uuid4();
-                  custom_extractors=STANDARD_EXTRACTORS, import_annotations::Bool=true,
-                  postprocess_samples=identity,
-                  signals_prefix="edf", annotations_prefix=signals_prefix)

Convert an EDF.File to Onda.Samples and Onda.Annotations, store the samples in $path/samples/, and write the Onda signals and annotations tables to $path/$(signals_prefix).onda.signals.arrow and $path/$(annotations_prefix).onda.annotations.arrow. The default prefix is "edf", and if a prefix is provided for signals but not annotations both will use the signals prefix. The prefixes cannot reference (sub)directories.

Returns (; recording_uuid, signals, annotations, signals_path, annotations_path, plan).

This is a convenience function that first formulates an import plan via plan_edf_to_onda_samples, and then immediately executes this plan with edf_to_onda_samples.

The samples and executed plan are returned; it is strongly advised that you review the plan for un-extracted signals (where :sensor_type or :channel is missing) and errors (non-nothing values in :error).

Groups of EDF.Signals are mapped as channels to Onda.Samples via plan_edf_to_onda_samples. The caller of this function can control the plan via the labels and units keyword arguments, all of which are forwarded to plan_edf_to_onda_samples.

EDF.Signal labels that are converted into Onda channel names undergo the following transformations:

  • the label is whitespace-stripped, parens-stripped, and lowercased
  • trailing generic EDF references (e.g. "ref", "ref2", etc.) are dropped
  • any instance of + is replaced with _plus_ and / with _over_
  • all component names are converted to their "canonical names" when possible (e.g. "3" in an ECG-matched channel name will be converted to "iii").

If more control (e.g. preprocessing signal labels) is required, callers should use plan_edf_to_onda_samples and edf_to_onda_samples directly, and Onda.store the resulting samples manually.

See the OndaEDF README for additional details regarding EDF formatting expectations.

source

Internal import utilities

OndaEDF.match_edf_labelFunction
OndaEDF.match_edf_label(label, signal_names, channel_name, canonical_names)

Return a normalized label matched from an EDF label. The purpose of this function is to remove signal names from the label, and to canonicalize the channel name(s) that remain. So something like "[eCG] avl-REF" will be transformed to "avl" (given signal_names=["ecg"] and channel_name="avl").

This returns nothing if channel_name does not match after normalization.

Canonicalization

  • ensures the given label is whitespace-stripped, lowercase, and parens-free
  • strips trailing generic EDF references (e.g. "ref", "ref2", etc.)
  • replaces all references with the appropriate name as specified by canonical_names
  • replaces + with _plus_ and / with _over_
  • returns the initial reference name (w/o prefix sign, if present) and the entire label; the initial reference name should match the canonical channel name, otherwise the channel extraction will be rejected.

Examples

match_edf_label("[ekG]  avl-REF", ["ecg", "ekg"], "avl", []) == "avl"
-match_edf_label("ECG 2", ["ecg", "ekg"], "ii", ["ii" => ["2", "two", "ecg2"]]) == "ii"

See the tests for more examples

Note

This is an internal function and is not meant to be called directly.

source
OndaEDF.merge_samples_infoFunction
OndaEDF.merge_samples_info(plan_rows)

Create a single, merged SamplesInfo from plan rows, such as generated by plan_edf_to_onda_samples. Encodings are promoted with promote_encodings.

The input rows must have the same values for :sensor_type, :sample_unit, and :sample_rate; otherwise an ArgumentError is thrown.

If any of these values is missing, or any row's :channel value is missing, this returns missing to indicate it is not possible to determine a shared SamplesInfo.

The original EDF labels are included in the output in the :edf_channels column.

Note

This is an internal function and is not meant to be called directly.

source
OndaEDF.onda_samples_from_edf_signalsFunction
OndaEDF.onda_samples_from_edf_signals(target::Onda.SamplesInfo, edf_signals,
-                                      edf_seconds_per_record; dither_storage=missing)

Generate an Onda.Samples struct from an iterable of EDF.Signals, based on the Onda.SamplesInfo in target. This checks for matching sample rates in the source signals. If the encoding of target is the same as the encoding in a signal, its encoded (usually Int16) data is copied directly into the Samples data matrix; otherwise it is re-encoded.

If dither_storage=missing (the default), dither storage is allocated automatically as specified in the docstring for Onda.encode. dither_storage=nothing disables dithering. See Onda.encode's docstring for more details.

Note

This function is not meant to be called directly, but through edf_to_onda_samples

source
OndaEDF.promote_encodingsFunction
promote_encodings(encodings; pick_offset=(_ -> 0.0), pick_resolution=minimum)

Return a common encoding for input encodings, as a NamedTuple with fields sample_type, sample_offset_in_unit, sample_resolution_in_unit, and sample_rate. If input encodings' sample_rates are not all equal, an error is thrown. If sample resolutions/offsets are not equal, then pick_offset and pick_resolution are used to combine them into a common offset/resolution.

Note

This is an internal function and is not meant to be called directly.

source

Export EDF from Onda

OndaEDF.onda_to_edfFunction
onda_to_edf(samples::AbstractVector{<:Samples}, annotations=[]; kwargs...)

Return an EDF.File containing signal data converted from a collection of Onda Samples and (optionally) annotations from an annotations table.

Following the Onda v0.5 format, annotations can be any Tables.jl-compatible table (DataFrame, Arrow.Table, NamedTuple of vectors, vector of NamedTuples) which follows the annotation schema.

Each EDF.Signal in the returned EDF.File corresponds to a channel of an input Onda.Samples.

The ordering of EDF.Signals in the output will match the order of the input collection of Samples (and within each channel grouping, the order of the samples' channels).

Note

EDF signals are encoded as Int16, while Onda allows a range of different sample types, some of which provide considerably more resolution than Int16. During export, re-encoding may be necessary if the encoded Onda samples cannot be represented directly as Int16 values. In this case, new encoding (resolution and offset) will be chosen based on the minimum and maximum values actually present in each signal in the input Onda Samples. Thus, it may not always be possible to losslessly round trip Onda-formatted datasets to EDF and back.

source

Deprecations

To support deserializing plan tables generated with old versions of OndaEDF + Onda, the following schemas are provided. These are deprecated and will be removed in a future release.

OndaEDFSchemas.PlanV1Type
@version PlanV1 begin
-    # EDF.SignalHeader fields
-    label::String
-    transducer_type::String
-    physical_dimension::String
-    physical_minimum::Float32
-    physical_maximum::Float32
-    digital_minimum::Float32
-    digital_maximum::Float32
-    prefilter::String
-    samples_per_record::Int16
-    # EDF.FileHeader field
-    seconds_per_record::Float64
-    # Onda.SignalV1 fields (channels -> channel), may be missing
-    recording::Union{UUID,Missing} = passmissing(UUID)
-    kind::Union{Missing,AbstractString}
-    channel::Union{Missing,AbstractString}
-    sample_unit::Union{Missing,AbstractString}
-    sample_resolution_in_unit::Union{Missing,Float64}
-    sample_offset_in_unit::Union{Missing,Float64}
-    sample_type::Union{Missing,AbstractString}
-    sample_rate::Union{Missing,Float64}
-    # errors, use `nothing` to indicate no error
-    error::Union{Nothing,String}
-end

A Legolas-generated record type describing a single EDF signal-to-Onda channel conversion. The columns are the union of

  • fields from EDF.SignalHeader (all mandatory)
  • the seconds_per_record field from EDF.FileHeader (mandatory)
  • fields from Onda.SignalV1 (optional, may be missing to indicate failed conversion), except for file_path
  • error, which is nothing for a conversion that is or is expected to be successful, and a String describing the source of the error (with backtrace) in the case of a caught error.
source
OndaEDFSchemas.FilePlanV1Type
@version FilePlanV1 > PlanV1 begin
-    edf_signal_index::Int
-    onda_signal_index::Int
-end

A Legolas-generated record type representing one EDF signal-to-Onda channel conversion, which includes the columns of a PlanV1 and additional file-level context:

  • edf_signal_index gives the index of the signals in the source EDF.File corresponding to this row
  • onda_signal_index gives the index of the output Onda.Samples.

Note that while the EDF index does correspond to the actual index in edf.signals, some Onda indices may be skipped in the output, so onda_signal_index is only to indicate order and grouping.

source
+OndaEDF · OndaEDF
diff --git a/previews/PR76/search/index.html b/previews/PR76/search/index.html index aabb912..76cbd0d 100644 --- a/previews/PR76/search/index.html +++ b/previews/PR76/search/index.html @@ -1,2 +1,2 @@ -Search · OndaEDF

Loading search...

    +Search · OndaEDF

    Loading search...

      diff --git a/previews/PR76/search_index.js b/previews/PR76/search_index.js index 14a4084..c21d592 100644 --- a/previews/PR76/search_index.js +++ b/previews/PR76/search_index.js @@ -1,3 +1,3 @@ var documenterSearchIndex = {"docs": -[{"location":"#API-Documentation","page":"API Documentation","title":"API Documentation","text":"","category":"section"},{"location":"","page":"API Documentation","title":"API Documentation","text":"CurrentModule = OndaEDF","category":"page"},{"location":"#Import-EDF-to-Onda","page":"API Documentation","title":"Import EDF to Onda","text":"","category":"section"},{"location":"","page":"API Documentation","title":"API Documentation","text":"OndaEDF.jl prefers \"self-service\" import over \"automagic\", and provides functionality to extract Onda.Samples and EDFAnnotationV1s (which extend Onda.AnnotationV1s) from an EDF.File. These can be written to disk (with Onda.store / Legolas.write or manipulated in memory as desired.","category":"page"},{"location":"#Import-signal-data-as-Samples","page":"API Documentation","title":"Import signal data as Samples","text":"","category":"section"},{"location":"","page":"API Documentation","title":"API Documentation","text":"edf_to_onda_samples\nplan_edf_to_onda_samples\nplan_edf_to_onda_samples_groups","category":"page"},{"location":"#OndaEDF.edf_to_onda_samples","page":"API Documentation","title":"OndaEDF.edf_to_onda_samples","text":"edf_to_onda_samples(edf::EDF.File, plan_table; validate=true, dither_storage=missing)\n\nConvert Signals found in an EDF File to Onda.Samples according to the plan specified in plan_table (e.g., as generated by plan_edf_to_onda_samples), returning an iterable of the generated Onda.Samples and the plan as actually executed.\n\nThe input plan is transformed by using merge_samples_info to combine rows with the same :onda_signal_index into a common Onda.SamplesInfo. 
Then OndaEDF.onda_samples_from_edf_signals is used to combine the EDF signals data into a single Onda.Samples per group.\n\nThe label of the original EDF.Signals are preserved in the :edf_channels field of the resulting SamplesInfos for each Samples generated.\n\nAny errors that occur are shown as Strings (with backtrace) and inserted into the :error column for the corresponding rows from the plan.\n\nSamples are returned in the order of :onda_signal_index. Signals that could not be matched or otherwise caused an error during execution are not returned.\n\nIf validate=true (the default), the plan is validated against the FilePlanV2 schema, and the signal headers in the EDF.File.\n\nIf dither_storage=missing (the default), dither storage is allocated automatically as specified in the docstring for Onda.encode. dither_storage=nothing disables dithering.\n\n\n\n\n\nedf_to_onda_samples(edf::EDF.File; kwargs...)\n\nRead signals from an EDF.File into a vector of Onda.Samples. This is a convenience function that first formulates an import plan via plan_edf_to_onda_samples, and then immediately executes this plan with edf_to_onda_samples.\n\nThe samples and executed plan are returned; it is strongly advised that you review the plan for un-extracted signals (where :sensor_type or :channel is missing) and errors (non-nothing values in :error).\n\nCollections of EDF.Signals are mapped as channels to Onda.Samples via plan_edf_to_onda_samples. The caller of this function can control the plan via the labels and units keyword arguments, all of which are forwarded to plan_edf_to_onda_samples.\n\nEDF.Signal labels that are converted into Onda channel names undergo the following transformations:\n\nthe label is whitespace-stripped, parens-stripped, and lowercased\ntrailing generic EDF references (e.g. \"ref\", \"ref2\", etc.) are dropped\nany instance of + is replaced with _plus_ and / with _over_\nall component names are converted to their \"canonical names\" when possible (e.g. 
\"m1\" in an EEG-matched channel name will be converted to \"a1\").\n\nSee the OndaEDF README for additional details regarding EDF formatting expectations.\n\n\n\n\n\n","category":"function"},{"location":"#OndaEDF.plan_edf_to_onda_samples","page":"API Documentation","title":"OndaEDF.plan_edf_to_onda_samples","text":"plan_edf_to_onda_samples(header, seconds_per_record; labels=STANDARD_LABELS,\n units=STANDARD_UNITS)\nplan_edf_to_onda_samples(signal::EDF.Signal, args...; kwargs...)\n\nFormulate a plan for converting an EDF signal into Onda format. This returns a Tables.jl row with all the columns from the signal header, plus additional columns for the Onda.SamplesInfo for this signal, and the seconds_per_record that is passed in here.\n\nIf no labels match, then the channel and kind columns are missing; the behavior of other SamplesInfo columns is undefined; they are currently set to missing but that may change in future versions.\n\nAny errors that are thrown in the process will be wrapped as SampleInfoErrors and then printed with backtrace to a String in the error column.\n\nMatching EDF label to Onda labels\n\nThe labels keyword argument determines how Onda channel and signal kind are extracted from the EDF label.\n\nLabels are specified as an iterable of signal_names => channel_names pairs. signal_names should be an iterable of signal names, the first of which is the canonical name used as the Onda kind. Each element of channel_names gives the specification for one channel, which can either be a string, or a canonical_name => alternates pair. Occurences of alternates will be replaces with canonical_name in the generated channel label.\n\nMatching is determined solely by the channel names. When matching, the signal names are only used to remove signal names occuring as prefixes (e.g., \"[ECG] AVL\") before matching channel names. 
See match_edf_label for details, and see OndaEDF.STANDARD_LABELS for the default labels.\n\nAs an example, here is (a subset of) the default labels for ECG signals:\n\n[\"ecg\", \"ekg\"] => [\"i\" => [\"1\"], \"ii\" => [\"2\"], \"iii\" => [\"3\"],\n \"avl\"=> [\"ecgl\", \"ekgl\", \"ecg\", \"ekg\", \"l\"], \n \"avr\"=> [\"ekgr\", \"ecgr\", \"r\"], ...]\n\nMatching is done in the order that labels iterates pairs, and will stop at the first match, with no warning if signals are ambiguous (although this may change in a future version)\n\n\n\n\n\nplan_edf_to_onda_samples(edf::EDF.File;\n labels=STANDARD_LABELS,\n units=STANDARD_UNITS,\n onda_signal_groupby=(:sensor_type, :sample_unit, :sample_rate))\n\nFormulate a plan for converting an EDF.File to Onda Samples. This applies plan_edf_to_onda_samples to each individual signal contained in the file, storing edf_signal_index as an additional column. \n\nThe resulting rows are then passed to plan_edf_to_onda_samples_groups and grouped according to onda_signal_groupby (by default, the :sensor_type, :sample_unit, and :sample_rate columns), and the group index is added as an additional column in onda_signal_index.\n\nThe resulting plan is returned as a table. No signal data is actually read from the EDF file; to execute this plan and generate Onda.Samples, use edf_to_onda_samples. The index of the EDF signal (after filtering out signals that are not EDF.Signals, e.g. 
annotation channels) for each row is stored in the :edf_signal_index column, and the rows are sorted in order of :onda_signal_index, and then by :edf_signal_index.\n\n\n\n\n\n","category":"function"},{"location":"#OndaEDF.plan_edf_to_onda_samples_groups","page":"API Documentation","title":"OndaEDF.plan_edf_to_onda_samples_groups","text":"plan_edf_to_onda_samples_groups(plan_rows; onda_signal_groupby=(:sensor_type, :sample_unit, :sample_rate))\n\nGroup together plan_rows based on the values of the onda_signal_groupby columns, creating the :onda_signal_index column and promoting the Onda encodings for each group using OndaEDF.promote_encodings.\n\nIf the :edf_signal_index column is not present or otherwise missing, it will be filled in based on the order of the input rows.\n\nThe updated rows are returned, sorted first by the columns named in onda_signal_groupby and second by order of occurrence within the input rows.\n\n\n\n\n\n","category":"function"},{"location":"#Import-annotations","page":"API Documentation","title":"Import annotations","text":"","category":"section"},{"location":"","page":"API Documentation","title":"API Documentation","text":"edf_to_onda_annotations\nEDFAnnotationV1","category":"page"},{"location":"#OndaEDF.edf_to_onda_annotations","page":"API Documentation","title":"OndaEDF.edf_to_onda_annotations","text":"edf_to_onda_annotations(edf::EDF.File, uuid::UUID)\n\nExtract EDF+ annotations from an EDF.File for recording with ID uuid and return them as a vector of Onda.Annotations. 
Each returned annotation has a value field that contains the string value of the corresponding EDF+ annotation.\n\nIf no EDF+ annotations are found in edf, then an empty Vector{Annotation} is returned.\n\n\n\n\n\n","category":"function"},{"location":"#OndaEDFSchemas.EDFAnnotationV1","page":"API Documentation","title":"OndaEDFSchemas.EDFAnnotationV1","text":"@version EDFAnnotationV1 > AnnotationV1 begin\n value::String\nend\n\nA Legolas-generated record type that represents a single annotation imported from an EDF Annotation signal. The value field contains the annotation value as a string.\n\n\n\n\n\n","category":"type"},{"location":"#Import-plan-table-schemas","page":"API Documentation","title":"Import plan table schemas","text":"","category":"section"},{"location":"","page":"API Documentation","title":"API Documentation","text":"PlanV2\nFilePlanV2\nwrite_plan","category":"page"},{"location":"#OndaEDFSchemas.PlanV2","page":"API Documentation","title":"OndaEDFSchemas.PlanV2","text":"@version PlanV2 begin\n # EDF.SignalHeader fields\n label::String\n transducer_type::String\n physical_dimension::String\n physical_minimum::Float32\n physical_maximum::Float32\n digital_minimum::Float32\n digital_maximum::Float32\n prefilter::String\n samples_per_record::Int16\n # EDF.FileHeader field\n seconds_per_record::Float64\n # Onda.SignalV2 fields (channels -> channel), may be missing\n recording::Union{UUID,Missing} = passmissing(UUID)\n sensor_type::Union{Missing,AbstractString}\n sensor_label::Union{Missing,AbstractString}\n channel::Union{Missing,AbstractString}\n sample_unit::Union{Missing,AbstractString}\n sample_resolution_in_unit::Union{Missing,Float64}\n sample_offset_in_unit::Union{Missing,Float64}\n sample_type::Union{Missing,AbstractString}\n sample_rate::Union{Missing,Float64}\n # errors, use `nothing` to indicate no error\n error::Union{Nothing,String}\nend\n\nA Legolas-generated record type describing a single EDF signal-to-Onda channel conversion. 
The columns are the union of\n\nfields from EDF.SignalHeader (all mandatory)\nthe seconds_per_record field from EDF.FileHeader (mandatory)\nfields from Onda.SignalV2 (optional, may be missing to indicate failed conversion), except for file_path\nerror, which is nothing for a conversion that is or is expected to be successful, and a String describing the source of the error (with backtrace) in the case of a caught error.\n\n\n\n\n\n","category":"type"},{"location":"#OndaEDFSchemas.FilePlanV2","page":"API Documentation","title":"OndaEDFSchemas.FilePlanV2","text":"@version FilePlanV2 > PlanV2 begin\n edf_signal_index::Int\n onda_signal_index::Int\nend\n\nA Legolas-generated record type representing one EDF signal-to-Onda channel conversion, which includes the columns of a PlanV2 and additional file-level context:\n\nedf_signal_index gives the index of the signals in the source EDF.File corresponding to this row\nonda_signal_index gives the index of the output Onda.Samples.\n\nNote that while the EDF index does correspond to the actual index in edf.signals, some Onda indices may be skipped in the output, so onda_signal_index is only to indicate order and grouping.\n\n\n\n\n\n","category":"type"},{"location":"#OndaEDF.write_plan","page":"API Documentation","title":"OndaEDF.write_plan","text":"write_plan(io_or_path, plan_table; validate=true, kwargs...)\n\nWrite a plan table to io_or_path using Legolas.write, using the ondaedf.file-plan@1 schema.\n\n\n\n\n\n","category":"function"},{"location":"#Full-service-import","page":"API Documentation","title":"Full-service import","text":"","category":"section"},{"location":"","page":"API Documentation","title":"API Documentation","text":"For a more \"full-service\" experience, OndaEDF.jl also provides functionality to extract Onda.Samples and EDFAnnotationV1s and then write them to disk:","category":"page"},{"location":"","page":"API Documentation","title":"API 
Documentation","text":"store_edf_as_onda","category":"page"},{"location":"#OndaEDF.store_edf_as_onda","page":"API Documentation","title":"OndaEDF.store_edf_as_onda","text":"store_edf_as_onda(edf::EDF.File, onda_dir, recording_uuid::UUID=uuid4();\n custom_extractors=STANDARD_EXTRACTORS, import_annotations::Bool=true,\n postprocess_samples=identity,\n signals_prefix=\"edf\", annotations_prefix=signals_prefix)\n\nConvert an EDF.File to Onda.Samples and Onda.Annotations, store the samples in $path/samples/, and write the Onda signals and annotations tables to $path/$(signals_prefix).onda.signals.arrow and $path/$(annotations_prefix).onda.annotations.arrow. The default prefix is \"edf\", and if a prefix is provided for signals but not annotations, both will use the signals prefix. The prefixes cannot reference (sub)directories.\n\nReturns (; recording_uuid, signals, annotations, signals_path, annotations_path, plan).\n\nThis is a convenience function that first formulates an import plan via plan_edf_to_onda_samples, and then immediately executes this plan with edf_to_onda_samples.\n\nThe samples and executed plan are returned; it is strongly advised that you review the plan for un-extracted signals (where :sensor_type or :channel is missing) and errors (non-nothing values in :error).\n\nGroups of EDF.Signals are mapped as channels to Onda.Samples via plan_edf_to_onda_samples. The caller of this function can control the plan via the labels and units keyword arguments, all of which are forwarded to plan_edf_to_onda_samples.\n\nEDF.Signal labels that are converted into Onda channel names undergo the following transformations:\n\nthe label is whitespace-stripped, parens-stripped, and lowercased\ntrailing generic EDF references (e.g. \"ref\", \"ref2\", etc.) are dropped\nany instance of + is replaced with _plus_ and / with _over_\nall component names are converted to their \"canonical names\" when possible (e.g. 
\"3\" in an ECG-matched channel name will be converted to \"iii\").\n\nIf more control (e.g. preprocessing signal labels) is required, callers should use plan_edf_to_onda_samples and edf_to_onda_samples directly, and Onda.store the resulting samples manually.\n\nSee the OndaEDF README for additional details regarding EDF formatting expectations.\n\n\n\n\n\n","category":"function"},{"location":"#Internal-import-utilities","page":"API Documentation","title":"Internal import utilities","text":"","category":"section"},{"location":"","page":"API Documentation","title":"API Documentation","text":"OndaEDF.match_edf_label\nOndaEDF.merge_samples_info\nOndaEDF.onda_samples_from_edf_signals\nOndaEDF.promote_encodings","category":"page"},{"location":"#OndaEDF.match_edf_label","page":"API Documentation","title":"OndaEDF.match_edf_label","text":"OndaEDF.match_edf_label(label, signal_names, channel_name, canonical_names)\n\nReturn a normalized label matched from an EDF label. The purpose of this function is to remove signal names from the label, and to canonicalize the channel name(s) that remain. So something like \"[eCG] avl-REF\" will be transformed to \"avl\" (given signal_names=[\"ecg\"], and channel_name=\"avl\")\n\nThis returns nothing if channel_name does not match after normalization.\n\nCanonicalization\n\nensures the given label is whitespace-stripped, lowercase, and parens-free\nstrips trailing generic EDF references (e.g. 
\"ref\", \"ref2\", etc.)\nreplaces all references with the appropriate name as specified by canonical_names\nreplaces + with _plus_ and / with _over_\nreturns the initial reference name (w/o prefix sign, if present) and the entire label; the initial reference name should match the canonical channel name, otherwise the channel extraction will be rejected.\n\nExamples\n\nmatch_edf_label(\"[ekG] avl-REF\", [\"ecg\", \"ekg\"], \"avl\", []) == \"avl\"\nmatch_edf_label(\"ECG 2\", [\"ecg\", \"ekg\"], \"ii\", [\"ii\" => [\"2\", \"two\", \"ecg2\"]]) == \"ii\"\n\nSee the tests for more examples\n\nnote: Note\nThis is an internal function and is not meant to be called directly.\n\n\n\n\n\n","category":"function"},{"location":"#OndaEDF.merge_samples_info","page":"API Documentation","title":"OndaEDF.merge_samples_info","text":"OndaEDF.merge_samples_info(plan_rows)\n\nCreate a single, merged SamplesInfo from plan rows, such as generated by plan_edf_to_onda_samples. Encodings are promoted with promote_encodings.\n\nThe input rows must have the same values for :sensor_type, :sample_unit, and :sample_rate; otherwise an ArgumentError is thrown.\n\nIf any of these values is missing, or any row's :channel value is missing, this returns missing to indicate it is not possible to determine a shared SamplesInfo.\n\nThe original EDF labels are included in the output in the :edf_channels column.\n\nnote: Note\nThis is an internal function and is not meant to be called directly.\n\n\n\n\n\n","category":"function"},{"location":"#OndaEDF.onda_samples_from_edf_signals","page":"API Documentation","title":"OndaEDF.onda_samples_from_edf_signals","text":"OndaEDF.onda_samples_from_edf_signals(target::Onda.SamplesInfo, edf_signals,\n edf_seconds_per_record; dither_storage=missing)\n\nGenerate an Onda.Samples struct from an iterable of EDF.Signals, based on the Onda.SamplesInfo in target. This checks for matching sample rates in the source signals. 
If the encoding of target is the same as the encoding in a signal, its encoded (usually Int16) data is copied directly into the Samples data matrix; otherwise it is re-encoded.\n\nIf dither_storage=missing (the default), dither storage is allocated automatically as specified in the docstring for Onda.encode. dither_storage=nothing disables dithering. See Onda.encode's docstring for more details.\n\nnote: Note\nThis function is not meant to be called directly, but through edf_to_onda_samples\n\n\n\n\n\n","category":"function"},{"location":"#OndaEDF.promote_encodings","page":"API Documentation","title":"OndaEDF.promote_encodings","text":"promote_encodings(encodings; pick_offset=(_ -> 0.0), pick_resolution=minimum)\n\nReturn a common encoding for input encodings, as a NamedTuple with fields sample_type, sample_offset_in_unit, sample_resolution_in_unit, and sample_rate. If input encodings' sample_rates are not all equal, an error is thrown. If sample offsets/resolutions are not equal, then pick_offset and pick_resolution are used to combine them into a common offset/resolution.\n\nnote: Note\nThis is an internal function and is not meant to be called directly.\n\n\n\n\n\n","category":"function"},{"location":"#Export-EDF-from-Onda","page":"API Documentation","title":"Export EDF from Onda","text":"","category":"section"},{"location":"","page":"API Documentation","title":"API Documentation","text":"onda_to_edf","category":"page"},{"location":"#OndaEDF.onda_to_edf","page":"API Documentation","title":"OndaEDF.onda_to_edf","text":"onda_to_edf(samples::AbstractVector{<:Samples}, annotations=[]; kwargs...)\n\nReturn an EDF.File containing signal data converted from a collection of Onda Samples and (optionally) annotations from an annotations table.\n\nFollowing the Onda v0.5 format, annotations can be any Tables.jl-compatible table (DataFrame, Arrow.Table, NamedTuple of vectors, vector of NamedTuples) which follows the annotation schema.\n\nEach EDF.Signal in the returned EDF.File 
corresponds to a channel of an input Onda.Samples.\n\nThe ordering of EDF.Signals in the output will match the order of the input collection of Samples (and within each channel grouping, the order of the samples' channels).\n\nnote: Note\nEDF signals are encoded as Int16, while Onda allows a range of different sample types, some of which provide considerably more resolution than Int16. During export, re-encoding may be necessary if the encoded Onda samples cannot be represented directly as Int16 values. In this case, new encoding (resolution and offset) will be chosen based on the minimum and maximum values actually present in each signal in the input Onda Samples. Thus, it may not always be possible to losslessly round trip Onda-formatted datasets to EDF and back.\n\n\n\n\n\n","category":"function"},{"location":"#Deprecations","page":"API Documentation","title":"Deprecations","text":"","category":"section"},{"location":"","page":"API Documentation","title":"API Documentation","text":"To support deserializing plan tables generated with old versions of OndaEDF + Onda, the following schemas are provided. 
These are deprecated and will be removed in a future release.","category":"page"},{"location":"","page":"API Documentation","title":"API Documentation","text":"PlanV1\nFilePlanV1","category":"page"},{"location":"#OndaEDFSchemas.PlanV1","page":"API Documentation","title":"OndaEDFSchemas.PlanV1","text":"@version PlanV1 begin\n # EDF.SignalHeader fields\n label::String\n transducer_type::String\n physical_dimension::String\n physical_minimum::Float32\n physical_maximum::Float32\n digital_minimum::Float32\n digital_maximum::Float32\n prefilter::String\n samples_per_record::Int16\n # EDF.FileHeader field\n seconds_per_record::Float64\n # Onda.SignalV1 fields (channels -> channel), may be missing\n recording::Union{UUID,Missing} = passmissing(UUID)\n kind::Union{Missing,AbstractString}\n channel::Union{Missing,AbstractString}\n sample_unit::Union{Missing,AbstractString}\n sample_resolution_in_unit::Union{Missing,Float64}\n sample_offset_in_unit::Union{Missing,Float64}\n sample_type::Union{Missing,AbstractString}\n sample_rate::Union{Missing,Float64}\n # errors, use `nothing` to indicate no error\n error::Union{Nothing,String}\nend\n\nA Legolas-generated record type describing a single EDF signal-to-Onda channel conversion. 
The columns are the union of\n\nfields from EDF.SignalHeader (all mandatory)\nthe seconds_per_record field from EDF.FileHeader (mandatory)\nfields from Onda.SignalV1 (optional, may be missing to indicate failed conversion), except for file_path\nerror, which is nothing for a conversion that is or is expected to be successful, and a String describing the source of the error (with backtrace) in the case of a caught error.\n\n\n\n\n\n","category":"type"},{"location":"#OndaEDFSchemas.FilePlanV1","page":"API Documentation","title":"OndaEDFSchemas.FilePlanV1","text":"@version FilePlanV1 > PlanV1 begin\n edf_signal_index::Int\n onda_signal_index::Int\nend\n\nA Legolas-generated record type representing one EDF signal-to-Onda channel conversion, which includes the columns of a PlanV1 and additional file-level context:\n\nedf_signal_index gives the index of the signals in the source EDF.File corresponding to this row\nonda_signal_index gives the index of the output Onda.Samples.\n\nNote that while the EDF index does correspond to the actual index in edf.signals, some Onda indices may be skipped in the output, so onda_signal_index is only to indicate order and grouping.\n\n\n\n\n\n","category":"type"}] +[{"location":"convert-to-onda/#An-Opinionated-Guide-to-Converting-EDFs-to-Onda","page":"Converting from EDF","title":"An Opinionated Guide to Converting EDFs to Onda","text":"","category":"section"},{"location":"convert-to-onda/#Basic-workflow","page":"Converting from EDF","title":"Basic workflow","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"At a high level, the basic workflow for EDF-to-Onda conversion is iterative:","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"Formulate a \"plan\" which specifies how to convert the metadata associated with each EDF.Signal into Onda metadata (channel names, quantization/encoding 
parameters, etc.)\nReview the plan, making sure that all necessary EDF.Signals will be extracted, and that the quantization, sample rate, physical units, etc. are reasonable.\nRevise the plan as needed, repeating steps 1-2 until you're happy.\nExecute the plan, loading all EDF signal data if necessary and converting into Onda.Samples\nReview the executed plan for additional errors or issues, and iterate steps 1-4 as needed.","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"In the following sections, we expand on the philosophy behind OndaEDF's EDF-to-Onda design and present some detailed, opinionated workflows for converting a single EDF and multiple EDFs.","category":"page"},{"location":"convert-to-onda/#Philosophy","page":"Converting from EDF","title":"Philosophy","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"The motivation for separating the planning and execution is threefold.","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"First, while there is an EDF(+) specification, it's so commonly violated in so many various ways that an EDF conversion package that requires fully spec-compliant EDFs is of little practical use, but at the same time, anticipating and working around all these possible violations is not practical. Separating planning and execution during EDF-to-Onda conversion reverts more control to the user for how their particular EDF files are handled.","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"Second, making the plan a separate intermediate output means that not only can it be reviewed during conversion but can be persisted as a record of how any EDF-derived Onda signals were converted. 
This kind of provenance information is very useful when investigating issues with a dataset that may crop up long after the initial conversion.","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"Third, planning only requires that the headers of the EDF.Signals be read into memory, thereby separating the iterative part of the conversion process from the expensive, one-time step which requires all the signal data be read into memory. This enables workflows that would be impractical otherwise, like planning bulk conversion of thousands of EDFs at once. When dealing with large, messy datasets, we have found that metadata issues are both likely to occur and likely to be different across individual EDFs. This makes normalizing the EDF metadata one file at a time extremely tedious, since the metadata issues encountered in a single file may not be representative of the rest of the dataset. Thus, in practice it's better to deal with EDF metadata conversion in bulk, and the plan-then-execute workflow enables users to deal with these issues all at once, save out the plan, and then distribute the actual conversion work to as many workers as necessary to execute it in a reasonable timeframe.","category":"page"},{"location":"convert-to-onda/#Converting-a-single-EDF-to-Onda","page":"Converting from EDF","title":"Converting a single EDF to Onda","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"The following steps assume you have read an EDF file into memory with EDF.read or otherwise created an EDF.File. 
After the detailed workflow for converting a single EDF file to Onda format, we'll discuss how to handle batches of EDF files.","category":"page"},{"location":"convert-to-onda/#Generate-a-plan","page":"Converting from EDF","title":"Generate a plan","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"This is straightforward, using plan_edf_to_onda_samples. As outlined in the documentation for plan_edf_to_onda_samples, a \"plan\" is a table with one row per EDF.Signal, which contains all the fields from the signal's header as well as the fields of the Onda.SamplesInfoV2 that will be generated when the plan is executed (with the caveat that the :channels field is called :channel to indicate that it corresponds to a single channel in the output). It also contains a few additional fields for defining the mapping between EDF and Onda signal indices, as well as a field to capture any errors thrown during planning (or, more likely, during execution of the plan):","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":":edf_signal_index, the 1-based numerical index of the source signal in edf.signals\n:onda_signal_index, the ordinal index of the resulting samples (not necessarily the index into samples, since some groups might be skipped)\n:error, any errors that were caught during planning and/or execution.","category":"page"},{"location":"convert-to-onda/#Review-the-plan.","page":"Converting from EDF","title":"Review the plan.","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"Check for EDF signals whose label or physical_dimension could not be matched using the standard OndaEDF labels and units, as indicated by missing values in the channel/sensor_type (for un-matched label) or sample_unit (for un-matched physical_dimension). 
It's also a good idea at this point to review the other EDF signal header fields, and how they will be converted to Onda (especially the sample unit, resolution and offset, which correspond to the physical/digital minimum/maximum from the EDF signal header.) It's harder to fix these issues with the numerical signal header fields as they usually point to issues with how the data was encoded into an EDF initially. However, it's still better to detect and document any issues with the underlying EDF data at this stage to prevent nasty surprises down the road.","category":"page"},{"location":"convert-to-onda/#Revise-the-plan","page":"Converting from EDF","title":"Revise the plan","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"If there are EDF signals with un-matched label or physical_dimension, you have a few options. We recommend you consider them in roughly this order.","category":"page"},{"location":"convert-to-onda/#Skip-them","page":"Converting from EDF","title":"Skip them","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"The first option to consider is to simply ignore these signals; not all signals are necessarily required for downstream use, and converting each and every signal in an EDF may be more work than is justified!","category":"page"},{"location":"convert-to-onda/#Provide-custom-labels-and-units","page":"Converting from EDF","title":"Provide custom labels and units","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"The second option you have is to provide custom labels= and units= keyword arguments to plan_edf_to_onda_samples. 
For unambiguous, spec-compliant labels and physical_dimensions, it's generally possible to create custom labels= or units= specifications to match them.","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"note: Note\nCustom labels should be specified as lowercase, without reference, and without the sensor type prefix. So to match a label like \"EEG R1-Ref\", use a label like \"eeg\" => [\"r1\"], and not \"EEG\" => [\"R1\"] or \"eeg\" => [\"r1-ref\"]. See the documentation for plan_edf_to_onda_samples for more details, and the internal OndaEDF.match_edf_label for low-level details of how labels are matched.","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"warning: Warning\nSometimes EDF labels are ambiguous and can be matched by multiple different OndaEDF labels= specifications. Matching is greedy, in that the first label specification that matches is used regardless of any other possible matches, so you should add your custom labels to the end of an existing set, as in\n\nmy_labels = collect(pairs(OndaEDF.STANDARD_LABELS))\npush!(my_labels, [\"eog\"] => [\"left\" => \"eyeleft\", \"right\" => \"eyeright\"])\n\nWhen using custom labels, make sure that they haven't accidentally changed how other labels are matched by reviewing the plan for any unintended changes.","category":"page"},{"location":"convert-to-onda/#Preprocess-signal-headers","page":"Converting from EDF","title":"Preprocess signal headers","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"The third option, for signals that must be converted and cannot be handled with custom labels (without undue hassle), is to pre-process the signal headers before generating the plan. 
While the canonical input to plan_edf_to_onda_samples is an EDF.File, the header-matching logic operates fundamentally one signal header at a time. Moreover, it does not actually require that the input be an EDF.SignalHeader, only that it have the same fields as an EDF.SignalHeader. This design decision is meant to support workflows where the signal headers cannot for some reason be processed as-is due to corrupt/malformed strings, labels that cannot be matched using the OndaEDF matching algorithm, or any other reason.","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"For example, we've encountered EDFs in the wild where the transducer_type and label fields are switched, and must be switched back before planning:","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"edf = EDF.File(my_edf_file_path)\n\nfunction corrected_header(signal::EDF.Signal)\n header = signal.header\n return Tables.rowmerge(header; \n label=header.transducer_type, \n transducer_type=header.label)\nend\n\nplans = map(plan_edf_to_onda_samples ∘ corrected_header, edf.signals)\nnew_plan = plan_edf_to_onda_samples_groups(plans)","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"Note that an additional step of plan_edf_to_onda_samples_groups is required after planning the individual signals. This is due to the fact that EDF is a \"single channel\" format, where each signal is only a single channel, while Onda is a \"multichannel\" format where a signal can have multiple channels as long as the sampling rate, quantization, and other metadata are consistent. 
Normally, calling plan_edf_to_onda_samples with an EDF.File will do this grouping for you, but when planning individually pre-processed signal headers, we have to do it ourselves at the end.","category":"page"},{"location":"convert-to-onda/#Modify-the-generated-plan","page":"Converting from EDF","title":"Modify the generated plan","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"The fourth and final option is to modify the generated plan itself. This is the least preferred method because it removes a number of safeguards that OndaEDF provides as part of the planning process, but it's also the most flexible in that it enables completely hand-crafted conversion. Here are a few examples, motivated by EDFs we have seen in the wild.","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"Some EEG signals have the physical units set to millivolts, but biologically generated EEG signals are generally on the order of microvolts. During import, you want to correct this by adjusting the encoding settings used by Onda to store samples, by scaling the sample offset and resolution by 1000 and setting the physical units. 
This can be accomplished by modifying the rows of the plan like so:","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"edf = EDF.File(my_edf_file_path)\nplans = plan_edf_to_onda_samples(edf; labels=my_labels)\n\nfunction fix_millivolts(plan)\n if plan.sample_unit == \"millivolt\" && plan.sensor_type == \"eeg\"\n sample_resolution_in_unit = plan.sample_resolution_in_unit * 1000\n sample_offset_in_unit = plan.sample_offset_in_unit * 1000\n return Tables.rowmerge(plan; sample_unit=\"microvolt\",\n sample_resolution_in_unit,\n sample_offset_in_unit)\n else\n return plan\n end\nend\n\nnew_plan = map(fix_millivolts, Tables.rows(plans))","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"As another, similar example, sometimes EMG channels get recorded with different physical units. In such a case, OndaEDF cannot merge these channels and will create multiple separate Samples objects which each have sensor_type = \"emg\". 
This can be corrected in a similar way, for example by converting millivolts to microvolts (adjusting of course depending on the nature of your dataset) and re-grouping into Onda samples:","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"edf = EDF.File(my_edf_file_path)\nplans = plan_edf_to_onda_samples(edf; labels=my_labels)\n\nfunction fix_emg(plan)\n if plan.sensor_type == \"emg\"\n if plan.sample_unit == \"millivolt\"\n sample_resolution_in_unit = plan.sample_resolution_in_unit * 1000\n sample_offset_in_unit = plan.sample_offset_in_unit * 1000\n plan = Tables.rowmerge(plan; sample_unit=\"microvolt\",\n sample_resolution_in_unit,\n sample_offset_in_unit)\n end\n return plan\n else\n return plan\n end\nend\n\nnew_plan = map(fix_emg, Tables.rows(plans))\n# re-compute the grouping of EDF signals into Onda signals:\nnew_plan = plan_edf_to_onda_samples_groups(new_plan)","category":"page"},{"location":"convert-to-onda/#Execute-the-plan","page":"Converting from EDF","title":"Execute the plan","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"Once the plan has been reviewed and deemed satisfactory, execute the plan to generate Onda.Samples and an \"executed plan\" record. This is accomplished with the edf_to_onda_samples function, which takes an EDF.File and a plan as input, and returns a vector of Onda.Samples and the plan as executed. The executed plan may differ from the input plan. Most notably, if any errors were encountered during execution, they will be caught and the error and stacktrace will be stored as strings in the error field. It is important to review the executed plan a final time to ensure everything was converted as expected and no unexpected errors were encountered. 
If any errors were encountered, you may need to iterate further.","category":"page"},{"location":"convert-to-onda/#Store-the-output","page":"Converting from EDF","title":"Store the output","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"The final step is to store both the Onda.Samples and the executed plan in some persistent storage. For storing Onda.Samples, see Onda.store, which supports serializing LPCM-encoded samples to any \"path-like\" type (i.e., anything that provides a method for write). For storing the plan, use OndaEDF.write_plan (or Legolas.write(file_path, plan, FilePlanV2SchemaVersion())); see the documentation for Legolas.write and FilePlanV2.","category":"page"},{"location":"convert-to-onda/#Batch-conversion-of-many-EDFs","page":"Converting from EDF","title":"Batch conversion of many EDFs","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"The workflow for bulk conversion of multiple EDFs is similar to the workflow for converting a single EDF. The major difference is that the \"planning\" steps can be conducted in bulk, while the \"execution\" steps (generally) need to be conducted one at a time, either serially or distributed across multiple workers. As discussed above, the planning stage requires only a few KB from the EDF file/signal headers, facilitating rapid plan-review-revise iteration of even fairly large collections of EDFs (10,000+).","category":"page"},{"location":"convert-to-onda/#Planning-multiple-EDFs","page":"Converting from EDF","title":"Planning multiple EDFs","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"The main factor to consider when planning conversion of a large batch of EDF files is that planning requires only the (small number of) header bytes, even for very large EDF files. 
Thus, the first step is to read the file headers into memory without reading the signal data itself (which for more than a few EDF files will not usually fit into memory due to the large amount of signal data found in EDF files).","category":"page"},{"location":"convert-to-onda/#Reading-headers-from-local-filesystem","page":"Converting from EDF","title":"Reading headers from local filesystem","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"For EDF files stored on a normal filesystem, the EDF.File constructor will by default create a \"header-only\" EDF.File, so multiple files' headers can be read like","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"files = map(edf_paths) do path\n open(EDF.File, path, \"r\")\nend","category":"page"},{"location":"convert-to-onda/#Reading-headers-from-S3","page":"Converting from EDF","title":"Reading headers from S3","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"note: Note\nThis section may become obsolete in a future version of EDF.jl which uses the conditional dependency functionality available from Julia 1.9+ to provide tighter integration with AWSS3.jl.","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"Unfortunately, open(path::S3Path) will fetch the entire contents of the object stored at path, so we need to be a bit clever to read only header bytes from an S3 file, especially given that the number of bytes we need to read depends on the number of signals. 
The following is an example of one technique for reading EDF file and signal headers from S3:","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"function EDF.read_file_header(path::S3Path)\n bytes = s3_get(path.bucket, path.key; byte_range=1:256)\n buffer = IOBuffer(bytes)\n return EDF.read_file_header(buffer)\nend\n\nfunction EDF.File(path::S3Path)\n _, n_signals = EDF.read_file_header(path)\n bytes = s3_get(path.bucket, path.key; byte_range=1:(256 * (n_signals + 1)))\n return EDF.File(IOBuffer(bytes))\nend\n\n# use asyncmap because this is mostly bound by request roundtrip latency\nfiles = asyncmap(EDF.File, edf_paths)","category":"page"},{"location":"convert-to-onda/#Concatenating-plans-into-one-big-table","page":"Converting from EDF","title":"Concatenating plans into one big table","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"When doing bulk review of plans, it's generally helpful to have the individual files' plans concatenated into a single large table. 
It's important to keep track of which plan rows correspond to which input file, which can be accomplished via something like this:","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"# create a UUID namespace to make recording ID generation idempotent\nconst NAMESPACE = UUID(...)\nfunction plan_all(edf_paths, files; kwargs...)\n plans = mapreduce(vcat, edf_paths, files) do origin_uri, edf\n plan = plan_edf_to_onda_samples(edf; kwargs...)\n plan = DataFrame(plan)\n # make sure this is the same every time this function is re-run!\n recording = uuid5(NAMESPACE, string(origin_uri))\n return insertcols!(plan, \n :origin_uri => origin_uri,\n :recording => recording)\n end\nend","category":"page"},{"location":"convert-to-onda/#Review-and-revise-the-plans","page":"Converting from EDF","title":"Review and revise the plans","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"This \"bulk plan\" table can then be reviewed in bulk, looking for patterns in which labels are not matched, physical units associated with each sensor_type, etc. 
At a minimum, we find it useful to print some basic counts:","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"plans = plan_all(...)\n# helper function to tally rows per group\ntally(df, g, agg...=nrow => :count) = combine(groupby(df, g), agg...)\nunmatched_labels = filter(:channel => ismissing, plans)\n@info \"unmatched labels:\" tally(unmatched_labels, :label)\n\nunmatched_units = filter(:sample_unit => ismissing, plans)\n@info \"unmatched units:\" tally(unmatched_units, :physical_dimension)\n\nmatched = subset(plans, :channel => ByRow(!ismissing), :sample_unit => ByRow(!ismissing))\n@info \"matched sensor types/channels:\" tally(matched, [:sensor_type, :channel, :sample_unit])","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"Reviewing these summaries is a good first step when revising the plans. The revision process is basically the same as with a single EDF: update the labels= and units= as needed to capture any un-matched EDF signals, and failing that, preprocess the headers/postprocess the plan. 
Note that if it is necessary to run plan_edf_to_onda_samples_groups, this must be done one file at a time, using something like this to preserve the recording-level keys created above:","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"new_plans = combine(groupby(plans, [:recording, :origin_uri])) do plan\n new_plan = plan_edf_to_onda_samples_groups(Tables.rows(plan))\n return DataFrame(new_plan)\nend","category":"page"},{"location":"convert-to-onda/#Executing-bulk-plans-and-storing-generated-samples","page":"Converting from EDF","title":"Executing bulk plans and storing generated samples","text":"","category":"section"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"The last step, as with single EDF conversion, is to execute the plans. Given that this requires loading signal data into memory, it's generally necessary to do this one recording at a time, either serially on a single process or using multiprocessing to distribute work over different processes or even machines. A complete introduction to multiprocessing in Julia is outside the scope of this guide, but we offer a few pointers in the hope that we can help avoid common pitfalls.","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"First, it's generally a good idea to create a function that accepts one recording's plan, EDF file path, and recording ID (or generally any additional metadata that is required to create a persistent record), which will execute the plan and persistently store the resulting samples and executed plan. This function then may return either the generated Onda.SignalV2 and OndaEDF.FilePlanV2 tables for the completed recording, or pointers to where these are stored. 
This way, the memory pressure involved in loading an entire EDF's signal data is confined to function scope which makes it slightly easier for Julia's garbage collector.","category":"page"},{"location":"convert-to-onda/","page":"Converting from EDF","title":"Converting from EDF","text":"Second, a separate function should handle coordinating these individual jobs and then collecting these results into the ultimate aggregate signal and plan tables, and then persistently storing those to a final destination.","category":"page"},{"location":"api/#API-Documentation","page":"API Documentation","title":"API Documentation","text":"","category":"section"},{"location":"api/","page":"API Documentation","title":"API Documentation","text":"CurrentModule = OndaEDF","category":"page"},{"location":"api/#Import-EDF-to-Onda","page":"API Documentation","title":"Import EDF to Onda","text":"","category":"section"},{"location":"api/","page":"API Documentation","title":"API Documentation","text":"OndaEDF.jl prefers \"self-service\" import over \"automagic\", and provides functionality to extract Onda.Samples and EDFAnnotationV1s (which extend Onda.AnnotationV1s) from an EDF.File. 
These can be written to disk (with Onda.store / Legolas.write) or manipulated in memory as desired.","category":"page"},{"location":"api/#Import-signal-data-as-Samples","page":"API Documentation","title":"Import signal data as Samples","text":"","category":"section"},{"location":"api/","page":"API Documentation","title":"API Documentation","text":"edf_to_onda_samples\nplan_edf_to_onda_samples\nplan_edf_to_onda_samples_groups","category":"page"},{"location":"api/#OndaEDF.edf_to_onda_samples","page":"API Documentation","title":"OndaEDF.edf_to_onda_samples","text":"edf_to_onda_samples(edf::EDF.File, plan_table; validate=true, dither_storage=missing)\n\nConvert Signals found in an EDF File to Onda.Samples according to the plan specified in plan_table (e.g., as generated by plan_edf_to_onda_samples), returning an iterable of the generated Onda.Samples and the plan as actually executed.\n\nThe input plan is transformed by using merge_samples_info to combine rows with the same :onda_signal_index into a common Onda.SamplesInfo. Then OndaEDF.onda_samples_from_edf_signals is used to combine the EDF signals data into a single Onda.Samples per group.\n\nThe labels of the original EDF.Signals are preserved in the :edf_channels field of the resulting SamplesInfos for each Samples generated.\n\nAny errors that occur are shown as Strings (with backtrace) and inserted into the :error column for the corresponding rows from the plan.\n\nSamples are returned in the order of :onda_signal_index. Signals that could not be matched or otherwise caused an error during execution are not returned.\n\nIf validate=true (the default), the plan is validated against the FilePlanV2 schema, and the signal headers in the EDF.File.\n\nIf dither_storage=missing (the default), dither storage is allocated automatically as specified in the docstring for Onda.encode. 
dither_storage=nothing disables dithering.\n\n\n\n\n\nedf_to_onda_samples(edf::EDF.File; kwargs...)\n\nRead signals from an EDF.File into a vector of Onda.Samples. This is a convenience function that first formulates an import plan via plan_edf_to_onda_samples, and then immediately executes this plan with edf_to_onda_samples.\n\nThe samples and executed plan are returned; it is strongly advised that you review the plan for un-extracted signals (where :sensor_type or :channel is missing) and errors (non-nothing values in :error).\n\nCollections of EDF.Signals are mapped as channels to Onda.Samples via plan_edf_to_onda_samples. The caller of this function can control the plan via the labels and units keyword arguments, all of which are forwarded to plan_edf_to_onda_samples.\n\nEDF.Signal labels that are converted into Onda channel names undergo the following transformations:\n\nthe label is whitespace-stripped, parens-stripped, and lowercased\ntrailing generic EDF references (e.g. \"ref\", \"ref2\", etc.) are dropped\nany instance of + is replaced with _plus_ and / with _over_\nall component names are converted to their \"canonical names\" when possible (e.g. \"m1\" in an EEG-matched channel name will be converted to \"a1\").\n\nSee the OndaEDF README for additional details regarding EDF formatting expectations.\n\n\n\n\n\n","category":"function"},{"location":"api/#OndaEDF.plan_edf_to_onda_samples","page":"API Documentation","title":"OndaEDF.plan_edf_to_onda_samples","text":"plan_edf_to_onda_samples(header, seconds_per_record; labels=STANDARD_LABELS,\n units=STANDARD_UNITS)\nplan_edf_to_onda_samples(signal::EDF.Signal, args...; kwargs...)\n\nFormulate a plan for converting an EDF signal into Onda format. 
This returns a Tables.jl row with all the columns from the signal header, plus additional columns for the Onda.SamplesInfo for this signal, and the seconds_per_record that is passed in here.\n\nIf no labels match, then the channel and kind columns are missing; the behavior of other SamplesInfo columns is undefined; they are currently set to missing but that may change in future versions.\n\nAny errors that are thrown in the process will be wrapped as SampleInfoErrors and then printed with backtrace to a String in the error column.\n\nMatching EDF label to Onda labels\n\nThe labels keyword argument determines how Onda channel and signal kind are extracted from the EDF label.\n\nLabels are specified as an iterable of signal_names => channel_names pairs. signal_names should be an iterable of signal names, the first of which is the canonical name used as the Onda kind. Each element of channel_names gives the specification for one channel, which can either be a string, or a canonical_name => alternates pair. Occurrences of alternates will be replaced with canonical_name in the generated channel label.\n\nMatching is determined solely by the channel names. When matching, the signal names are only used to remove signal names occurring as prefixes (e.g., \"[ECG] AVL\") before matching channel names. 
See match_edf_label for details, and see OndaEDF.STANDARD_LABELS for the default labels.\n\nAs an example, here is (a subset of) the default labels for ECG signals:\n\n[\"ecg\", \"ekg\"] => [\"i\" => [\"1\"], \"ii\" => [\"2\"], \"iii\" => [\"3\"],\n \"avl\"=> [\"ecgl\", \"ekgl\", \"ecg\", \"ekg\", \"l\"], \n \"avr\"=> [\"ekgr\", \"ecgr\", \"r\"], ...]\n\nMatching is done in the order that labels iterates pairs, and will stop at the first match, with no warning if signals are ambiguous (although this may change in a future version)\n\n\n\n\n\nplan_edf_to_onda_samples(edf::EDF.File;\n labels=STANDARD_LABELS,\n units=STANDARD_UNITS,\n onda_signal_groupby=(:sensor_type, :sample_unit, :sample_rate))\n\nFormulate a plan for converting an EDF.File to Onda Samples. This applies plan_edf_to_onda_samples to each individual signal contained in the file, storing edf_signal_index as an additional column. \n\nThe resulting rows are then passed to plan_edf_to_onda_samples_groups and grouped according to onda_signal_groupby (by default, the :sensor_type, :sample_unit, and :sample_rate columns), and the group index is added as an additional column in onda_signal_index.\n\nThe resulting plan is returned as a table. No signal data is actually read from the EDF file; to execute this plan and generate Onda.Samples, use edf_to_onda_samples. The index of the EDF signal (after filtering out signals that are not EDF.Signals, e.g. 
annotation channels) for each row is stored in the :edf_signal_index column, and the rows are sorted in order of :onda_signal_index, and then by :edf_signal_index.\n\n\n\n\n\n","category":"function"},{"location":"api/#OndaEDF.plan_edf_to_onda_samples_groups","page":"API Documentation","title":"OndaEDF.plan_edf_to_onda_samples_groups","text":"plan_edf_to_onda_samples_groups(plan_rows; onda_signal_groupby=(:sensor_type, :sample_unit, :sample_rate))\n\nGroup together plan_rows based on the values of the onda_signal_groupby columns, creating the :onda_signal_index column and promoting the Onda encodings for each group using OndaEDF.promote_encodings.\n\nIf the :edf_signal_index column is not present or otherwise missing, it will be filled in based on the order of the input rows.\n\nThe updated rows are returned, sorted first by the columns named in onda_signal_groupby and second by order of occurrence within the input rows.\n\n\n\n\n\n","category":"function"},{"location":"api/#Import-annotations","page":"API Documentation","title":"Import annotations","text":"","category":"section"},{"location":"api/","page":"API Documentation","title":"API Documentation","text":"edf_to_onda_annotations\nEDFAnnotationV1","category":"page"},{"location":"api/#OndaEDF.edf_to_onda_annotations","page":"API Documentation","title":"OndaEDF.edf_to_onda_annotations","text":"edf_to_onda_annotations(edf::EDF.File, uuid::UUID)\n\nExtract EDF+ annotations from an EDF.File for recording with ID uuid and return them as a vector of Onda.Annotations. 
Each returned annotation has a value field that contains the string value of the corresponding EDF+ annotation.\n\nIf no EDF+ annotations are found in edf, then an empty Vector{Annotation} is returned.\n\n\n\n\n\n","category":"function"},{"location":"api/#OndaEDFSchemas.EDFAnnotationV1","page":"API Documentation","title":"OndaEDFSchemas.EDFAnnotationV1","text":"@version EDFAnnotationV1 > AnnotationV1 begin\n value::String\nend\n\nA Legolas-generated record type that represents a single annotation imported from an EDF Annotation signal. The value field contains the annotation value as a string.\n\n\n\n\n\n","category":"type"},{"location":"api/#Import-plan-table-schemas","page":"API Documentation","title":"Import plan table schemas","text":"","category":"section"},{"location":"api/","page":"API Documentation","title":"API Documentation","text":"PlanV2\nFilePlanV2\nwrite_plan","category":"page"},{"location":"api/#OndaEDFSchemas.PlanV2","page":"API Documentation","title":"OndaEDFSchemas.PlanV2","text":"@version PlanV2 begin\n # EDF.SignalHeader fields\n label::String\n transducer_type::String\n physical_dimension::String\n physical_minimum::Float32\n physical_maximum::Float32\n digital_minimum::Float32\n digital_maximum::Float32\n prefilter::String\n samples_per_record::Int16\n # EDF.FileHeader field\n seconds_per_record::Float64\n # Onda.SignalV2 fields (channels -> channel), may be missing\n recording::Union{UUID,Missing} = passmissing(UUID)\n sensor_type::Union{Missing,AbstractString}\n sensor_label::Union{Missing,AbstractString}\n channel::Union{Missing,AbstractString}\n sample_unit::Union{Missing,AbstractString}\n sample_resolution_in_unit::Union{Missing,Float64}\n sample_offset_in_unit::Union{Missing,Float64}\n sample_type::Union{Missing,AbstractString}\n sample_rate::Union{Missing,Float64}\n # errors, use `nothing` to indicate no error\n error::Union{Nothing,String}\nend\n\nA Legolas-generated record type describing a single EDF signal-to-Onda channel 
conversion. The columns are the union of\n\nfields from EDF.SignalHeader (all mandatory)\nthe seconds_per_record field from EDF.FileHeader (mandatory)\nfields from Onda.SignalV2 (optional, may be missing to indicate failed conversion), except for file_path\nerror, which is nothing for a conversion that is or is expected to be successful, and a String describing the source of the error (with backtrace) in the case of a caught error.\n\n\n\n\n\n","category":"type"},{"location":"api/#OndaEDFSchemas.FilePlanV2","page":"API Documentation","title":"OndaEDFSchemas.FilePlanV2","text":"@version FilePlanV2 > PlanV2 begin\n edf_signal_index::Int\n onda_signal_index::Int\nend\n\nA Legolas-generated record type representing one EDF signal-to-Onda channel conversion, which includes the columns of a PlanV2 and additional file-level context:\n\nedf_signal_index gives the index of the signals in the source EDF.File corresponding to this row\nonda_signal_index gives the index of the output Onda.Samples.\n\nNote that while the EDF index does correspond to the actual index in edf.signals, some Onda indices may be skipped in the output, so onda_signal_index is only to indicate order and grouping.\n\n\n\n\n\n","category":"type"},{"location":"api/#OndaEDF.write_plan","page":"API Documentation","title":"OndaEDF.write_plan","text":"write_plan(io_or_path, plan_table; validate=true, kwargs...)\n\nWrite a plan table to io_or_path using Legolas.write, using the ondaedf.file-plan@1 schema.\n\n\n\n\n\n","category":"function"},{"location":"api/#Full-service-import","page":"API Documentation","title":"Full-service import","text":"","category":"section"},{"location":"api/","page":"API Documentation","title":"API Documentation","text":"For a more \"full-service\" experience, OndaEDF.jl also provides functionality to extract Onda.Samples and EDFAnnotationV1s and then write them to disk:","category":"page"},{"location":"api/","page":"API Documentation","title":"API 
Documentation","text":"store_edf_as_onda","category":"page"},{"location":"api/#OndaEDF.store_edf_as_onda","page":"API Documentation","title":"OndaEDF.store_edf_as_onda","text":"store_edf_as_onda(edf::EDF.File, onda_dir, recording_uuid::UUID=uuid4();\n custom_extractors=STANDARD_EXTRACTORS, import_annotations::Bool=true,\n postprocess_samples=identity,\n signals_prefix=\"edf\", annotations_prefix=signals_prefix)\n\nConvert an EDF.File to Onda.Samples and Onda.Annotations, store the samples in $path/samples/, and write the Onda signals and annotations tables to $path/$(signals_prefix).onda.signals.arrow and $path/$(annotations_prefix).onda.annotations.arrow. The default prefix is \"edf\", and if a prefix is provided for signals but not annotations both will use the signals prefix. The prefixes cannot reference (sub)directories.\n\nReturns (; recording_uuid, signals, annotations, signals_path, annotations_path, plan).\n\nThis is a convenience function that first formulates an import plan via plan_edf_to_onda_samples, and then immediately executes this plan with edf_to_onda_samples.\n\nThe samples and executed plan are returned; it is strongly advised that you review the plan for un-extracted signals (where :sensor_type or :channel is missing) and errors (non-nothing values in :error).\n\nGroups of EDF.Signals are mapped as channels to Onda.Samples via plan_edf_to_onda_samples. The caller of this function can control the plan via the labels and units keyword arguments, all of which are forwarded to plan_edf_to_onda_samples.\n\nEDF.Signal labels that are converted into Onda channel names undergo the following transformations:\n\nthe label is whitespace-stripped, parens-stripped, and lowercased\ntrailing generic EDF references (e.g. \"ref\", \"ref2\", etc.) are dropped\nany instance of + is replaced with _plus_ and / with _over_\nall component names are converted to their \"canonical names\" when possible (e.g. 
\"3\" in an ECG-matched channel name will be converted to \"iii\").\n\nIf more control (e.g. preprocessing signal labels) is required, callers should use plan_edf_to_onda_samples and edf_to_onda_samples directly, and Onda.store the resulting samples manually.\n\nSee the OndaEDF README for additional details regarding EDF formatting expectations.\n\n\n\n\n\n","category":"function"},{"location":"api/#Internal-import-utilities","page":"API Documentation","title":"Internal import utilities","text":"","category":"section"},{"location":"api/","page":"API Documentation","title":"API Documentation","text":"OndaEDF.match_edf_label\nOndaEDF.merge_samples_info\nOndaEDF.onda_samples_from_edf_signals\nOndaEDF.promote_encodings","category":"page"},{"location":"api/#OndaEDF.match_edf_label","page":"API Documentation","title":"OndaEDF.match_edf_label","text":"OndaEDF.match_edf_label(label, signal_names, channel_name, canonical_names)\n\nReturn a normalized label matched from an EDF label. The purpose of this function is to remove signal names from the label, and to canonicalize the channel name(s) that remain. So something like \"[eCG] avl-REF\" will be transformed to \"avl\" (given signal_names=[\"ecg\"], and channel_name=\"avl\")\n\nThis returns nothing if channel_name does not match after normalization.\n\nCanonicalization\n\nensures the given label is whitespace-stripped, lowercase, and parens-free\nstrips trailing generic EDF references (e.g. 
\"ref\", \"ref2\", etc.)\nreplaces all references with the appropriate name as specified by canonical_names\nreplaces + with _plus_ and / with _over_\nreturns the initial reference name (w/o prefix sign, if present) and the entire label; the initial reference name should match the canonical channel name, otherwise the channel extraction will be rejected.\n\nExamples\n\nmatch_edf_label(\"[ekG] avl-REF\", [\"ecg\", \"ekg\"], \"avl\", []) == \"avl\"\nmatch_edf_label(\"ECG 2\", [\"ecg\", \"ekg\"], \"ii\", [\"ii\" => [\"2\", \"two\", \"ecg2\"]]) == \"ii\"\n\nSee the tests for more examples\n\nnote: Note\nThis is an internal function and is not meant to be called directly.\n\n\n\n\n\n","category":"function"},{"location":"api/#OndaEDF.merge_samples_info","page":"API Documentation","title":"OndaEDF.merge_samples_info","text":"OndaEDF.merge_samples_info(plan_rows)\n\nCreate a single, merged SamplesInfo from plan rows, such as generated by plan_edf_to_onda_samples. Encodings are promoted with promote_encodings.\n\nThe input rows must have the same values for :sensor_type, :sample_unit, and :sample_rate; otherwise an ArgumentError is thrown.\n\nIf any of these values is missing, or any row's :channel value is missing, this returns missing to indicate it is not possible to determine a shared SamplesInfo.\n\nThe original EDF labels are included in the output in the :edf_channels column.\n\nnote: Note\nThis is an internal function and is not meant to be called direclty.\n\n\n\n\n\n","category":"function"},{"location":"api/#OndaEDF.onda_samples_from_edf_signals","page":"API Documentation","title":"OndaEDF.onda_samples_from_edf_signals","text":"OndaEDF.onda_samples_from_edf_signals(target::Onda.SamplesInfo, edf_signals,\n edf_seconds_per_record; dither_storage=missing)\n\nGenerate an Onda.Samples struct from an iterable of EDF.Signals, based on the Onda.SamplesInfo in target. This checks for matching sample rates in the source signals. 
If the encoding of target is the same as the encoding in a signal, its encoded (usually Int16) data is copied directly into the Samples data matrix; otherwise it is re-encoded.\n\nIf dither_storage=missing (the default), dither storage is allocated automatically as specified in the docstring for Onda.encode. dither_storage=nothing disables dithering. See Onda.encode's docstring for more details.\n\nnote: Note\nThis function is not meant to be called directly, but through edf_to_onda_samples\n\n\n\n\n\n","category":"function"},{"location":"api/#OndaEDF.promote_encodings","page":"API Documentation","title":"OndaEDF.promote_encodings","text":"promote_encodings(encodings; pick_offset=(_ -> 0.0), pick_resolution=minimum)\n\nReturn a common encoding for input encodings, as a NamedTuple with fields sample_type, sample_offset_in_unit, sample_resolution_in_unit, and sample_rate. If input encodings' sample_rates are not all equal, an error is thrown. If sample offsets/resolutions are not equal, then pick_offset and pick_resolution are used to combine them into a common offset/resolution.\n\nnote: Note\nThis is an internal function and is not meant to be called directly.\n\n\n\n\n\n","category":"function"},{"location":"api/#Export-EDF-from-Onda","page":"API Documentation","title":"Export EDF from Onda","text":"","category":"section"},{"location":"api/","page":"API Documentation","title":"API Documentation","text":"onda_to_edf","category":"page"},{"location":"api/#OndaEDF.onda_to_edf","page":"API Documentation","title":"OndaEDF.onda_to_edf","text":"onda_to_edf(samples::AbstractVector{<:Samples}, annotations=[]; kwargs...)\n\nReturn an EDF.File containing signal data converted from a collection of Onda Samples and (optionally) annotations from an annotations table.\n\nFollowing the Onda v0.5 format, annotations can be any Tables.jl-compatible table (DataFrame, Arrow.Table, NamedTuple of vectors, vector of NamedTuples) which follows the annotation schema.\n\nEach EDF.Signal in the 
returned EDF.File corresponds to a channel of an input Onda.Samples.\n\nThe ordering of EDF.Signals in the output will match the order of the input collection of Samples (and within each channel grouping, the order of the samples' channels).\n\nnote: Note\nEDF signals are encoded as Int16, while Onda allows a range of different sample types, some of which provide considerably more resolution than Int16. During export, re-encoding may be necessary if the encoded Onda samples cannot be represented directly as Int16 values. In this case, new encoding (resolution and offset) will be chosen based on the minimum and maximum values actually present in each signal in the input Onda Samples. Thus, it may not always be possible to losslessly round trip Onda-formatted datasets to EDF and back.\n\n\n\n\n\n","category":"function"},{"location":"api/#Deprecations","page":"API Documentation","title":"Deprecations","text":"","category":"section"},{"location":"api/","page":"API Documentation","title":"API Documentation","text":"To support deserializing plan tables generated with old versions of OndaEDF + Onda, the following schemas are provided. 
These are deprecated and will be removed in a future release.","category":"page"},{"location":"api/","page":"API Documentation","title":"API Documentation","text":"PlanV1\nFilePlanV1","category":"page"},{"location":"api/#OndaEDFSchemas.PlanV1","page":"API Documentation","title":"OndaEDFSchemas.PlanV1","text":"@version PlanV1 begin\n # EDF.SignalHeader fields\n label::String\n transducer_type::String\n physical_dimension::String\n physical_minimum::Float32\n physical_maximum::Float32\n digital_minimum::Float32\n digital_maximum::Float32\n prefilter::String\n samples_per_record::Int16\n # EDF.FileHeader field\n seconds_per_record::Float64\n # Onda.SignalV1 fields (channels -> channel), may be missing\n recording::Union{UUID,Missing} = passmissing(UUID)\n kind::Union{Missing,AbstractString}\n channel::Union{Missing,AbstractString}\n sample_unit::Union{Missing,AbstractString}\n sample_resolution_in_unit::Union{Missing,Float64}\n sample_offset_in_unit::Union{Missing,Float64}\n sample_type::Union{Missing,AbstractString}\n sample_rate::Union{Missing,Float64}\n # errors, use `nothing` to indicate no error\n error::Union{Nothing,String}\nend\n\nA Legolas-generated record type describing a single EDF signal-to-Onda channel conversion. 
The columns are the union of\n\nfields from EDF.SignalHeader (all mandatory)\nthe seconds_per_record field from EDF.FileHeader (mandatory)\nfields from Onda.SignalV1 (optional, may be missing to indicate failed conversion), except for file_path\nerror, which is nothing for a conversion that is or is expected to be successful, and a String describing the source of the error (with backtrace) in the case of a caught error.\n\n\n\n\n\n","category":"type"},{"location":"api/#OndaEDFSchemas.FilePlanV1","page":"API Documentation","title":"OndaEDFSchemas.FilePlanV1","text":"@version FilePlanV1 > PlanV1 begin\n edf_signal_index::Int\n onda_signal_index::Int\nend\n\nA Legolas-generated record type representing one EDF signal-to-Onda channel conversion, which includes the columns of a PlanV1 and additional file-level context:\n\nedf_signal_index gives the index of the signals in the source EDF.File corresponding to this row\nonda_signal_index gives the index of the output Onda.Samples.\n\nNote that while the EDF index does correspond to the actual index in edf.signals, some Onda indices may be skipped in the output, so onda_signal_index is only to indicate order and grouping.\n\n\n\n\n\n","category":"type"},{"location":"#OndaEDF","page":"OndaEDF","title":"OndaEDF","text":"","category":"section"},{"location":"","page":"OndaEDF","title":"OndaEDF","text":"","category":"page"}] }