Specificity and Improvements to `captures` array #124

jacobagilbert · 2021-05-25T20:09:30Z

It is unclear to me whether the captures array elements must cover the entire file duration. Currently it is allowed to define a nonzero sample_start for the first captures element, but it is not possible to define an end point for this capture other than to create another element. It is also not explicit within the spec if optional fields in captures[0] apply until captures[1] is defined (or EOF), or if they persist until explicitly redefined.

I would also propose that a mechanism to exclude portions of data from the captures array should exist but is missing. For example one use of the captures array to provide frequency information for a segment of data; if I have a device that does not retune instantly, this would look like some duration of data at frequency 1, some duration of invalid data, then data at frequency 2, etc. One could argue that annotations could be used to flag this, however it seems awkward to use annotations to convey information that is otherwise conveniently included in the captures array. The same can be true for gain or saturation metadata.

Two suggestions for which I do not have a strong opinion on the best approach:

Create a 'data_valid' boolean entry in the captures array
Create a 'comment' section that can be used for this purpose
Create a sample_end entry that can be used to specify discontinuous chunks of the file.

I also propose the addition of an optional 'gain' field into the core captures definition for the same reasons frequency is included.

These are very simple PRs (some language on capture element scope, addition of a few fields) but I would like some opinions before I do that.

The text was updated successfully, but these errors were encountered:

bhilburn · 2021-05-26T16:06:13Z

Responding in bullet form to organize what I think are related but separate lines of discussion:

Regarding discontinuous chunks of the file, that's the intention of the global_index field. Does that not provide the relevant functionality for this item?
Excellent point re: optional fields carrying over into the next captures element. I consider this a bug and it should be addressed explicitly by adding text to the spec that says they do NOT carry over, IMO.

Given the two points above, it seems like the additional ask, here, would be to add an optional "num samps" sort of field to captures, is that right? I think this is a topic we've discussed at-length in some of the GRCon SigMF breakouts, and each time have arrived at the same conclusion of not doing it. What are your thoughts?

jacobagilbert · 2021-05-26T16:17:37Z

I think discontinuities are handled effectively with the global_index provided they are manifested as dropped samples (i.e.: not in the sigmf-data). The issue here is that currently there is no way to specify where a segment stops other than to specify a new segment (for which 'bad' data included in the sigmf-data for whatever reason is one example of why you may want to do this), and there is no way to categorize segments other than by frequency or datetime.

Adding a sample_count field (as there is for annotations) is one solution, though it sounds like the drawbacks have been discussed at length. I think my best suggestion instead is to add an optional description field to captures as there is in global, a simple, non-breaking change.

I agree about language specifying that captures fields apply from the explicitly identified sample_start until the sample_start of the next capture array item or EOF. This inherently prevents capture overlap which is a major reason not to include a sample_count field.

bhilburn · 2021-05-26T17:50:54Z

Ahh, I see your point, @jacobagilbert. So, it's not that there are "missing samples", but rather samples that you want to ignore because they are bad for whatever reason.

Setting aside the other arguments against a sample_count sort of field, I actually don't think that would be a good solution, here, anyway -- those samples are "real", and genuinely part of the capture, but rather you are trying to indicate something about the utility of them.

I'm spinning on your proposal to add some sort of description field (optional) to each captures element. My knee-jerk is that on the surface that seems like the right approach, but looking deeper my concern is that it means everyone will have their own "standard" for how to describe common things they want do in it. I think your use case, for example, is probably not uncommon.

So, new question: is it worth creating a canonical extension that adds a field to captures for this purpose and specifies labels that can be used for common things to be indicated?

jacobagilbert · 2021-05-26T18:24:53Z

Samples to ignore, or maybe just flag as not useful, or something that does not fall into "freq W.XYZ" which is the only non-temporal mechanism currently included.

I definitely see your point about description being a free for all; its sort of intended to be non-critical IMO, and maybe it should be there but not be abused for this. If this is really common enough we could just have a 'valid' bool where 0 would allow a machine-readable indication not to use this capture segment and the reason (mid-tune, saturation, or otherwise not useful data) could live in the description in a human readable format. Not trying to pollute this, but i feel fairly strongly that 'description' is a good thing to have as optional for general human readable information.

bhilburn · 2021-05-26T18:33:57Z

I think the last piece of what you wrote is what I'm questioning, @jacobagilbert --

for general human readable information.

I think if we are going to introduce this feature, there's a lot of benefit to making it machine-readable, which means high specificity. For example, there may be readers that want to automatically "shade" samples that are marked as "not useful" but still display that period of time in a visualizer.

jacobagilbert · 2021-05-26T18:51:18Z

Yes that is a perfect example of how one might use this to focus on important portions of large datasets. Just to summarize (make sure we are on same page) where I think we are on this whole issue:

[AGREE] Add text specifying scope of captures metadata fields explicitly
[UNSURE] Address important/not-important captures segments by either:
a. Add an optional contents (or some other name) field which is an enum of {'normal', 'invalid', 'saturation', 'unknown', ...?}
b. Add an optional valid boolean field to indicated whether or not this capture segment should be considered valid, and an optional description field similar to that in the core global object.
[UNSURE] Add optional core:gain captures object.

bhilburn · 2021-05-26T19:09:54Z

Agreed on the first one. That should be a quick bug & PR, which I think can be filed outside of this ticket and pursued.

On the second, I agree it's a useful feature, but I think we have to work out the best way to go about it (I like some aspects of your (a) and (b), but not others. I think this topic ought to be the focus of the discussion in this ticket now.

I'm for sure not sold on a core:gain field, actually. My issue is that's really more of a description of the radio system that made the capture than the capture itself, and is not actually needed to "understand" the data. I understand how this is related, but I think it's a different topic. I think this discussion ought to get broken out into a separate ticket to think through.

jacobagilbert · 2021-05-26T19:12:22Z

OK, will PR 1 and move 3 to a different issue; definitely not urgent and I don't have a pressing use case for that.

Thanks.

Per discussion in sigmf#124 this makes the scope of captures fields explicitly defined. Signed-off-by: Jacob Gilbert <[email protected]>

jacobagilbert · 2021-05-27T17:03:13Z

Done, and while resolution of the second item would be nice for v1.0 it is not necessary so im going to push this out of the critical path while we mull on it.

gmabey · 2021-06-14T20:30:39Z

Does merge #129 effectively close this issue then?

gmabey · 2021-06-14T20:31:33Z

I ask because I still see it in the list of v1.0.0 issues.

bhilburn · 2021-06-14T20:38:16Z

You're right, this can be closed, and remaining discussion can be moved to #130

bhilburn added this to the Release v1.0.0 milestone May 26, 2021

bhilburn added bug question suggestion labels May 26, 2021

bhilburn assigned jacobagilbert May 26, 2021

jacobagilbert pushed a commit to jacobagilbert/SigMF that referenced this issue May 27, 2021

Explicit Scope for Captures Fields

8eeff93

Per discussion in sigmf#124 this makes the scope of captures fields explicitly defined. Signed-off-by: Jacob Gilbert <[email protected]>

This was referenced May 27, 2021

Explicit Scope for Captures Fields #129

Merged

Consider additional Captures fields #130

Open

jacobagilbert modified the milestones: Release v1.0.0, Future (v2.x) May 27, 2021

bhilburn closed this as completed Jun 14, 2021

jacobagilbert mentioned this issue Jul 30, 2021

add 'capture_tags' field to captures metadata #146

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Specificity and Improvements to `captures` array #124

Specificity and Improvements to `captures` array #124

jacobagilbert commented May 25, 2021 •

edited

Loading

bhilburn commented May 26, 2021

jacobagilbert commented May 26, 2021

bhilburn commented May 26, 2021

jacobagilbert commented May 26, 2021

bhilburn commented May 26, 2021

jacobagilbert commented May 26, 2021 •

edited

Loading

bhilburn commented May 26, 2021

jacobagilbert commented May 26, 2021

jacobagilbert commented May 27, 2021

gmabey commented Jun 14, 2021

gmabey commented Jun 14, 2021

bhilburn commented Jun 14, 2021

Specificity and Improvements to captures array #124

Specificity and Improvements to captures array #124

Comments

jacobagilbert commented May 25, 2021 • edited Loading

bhilburn commented May 26, 2021

jacobagilbert commented May 26, 2021

bhilburn commented May 26, 2021

jacobagilbert commented May 26, 2021

bhilburn commented May 26, 2021

jacobagilbert commented May 26, 2021 • edited Loading

bhilburn commented May 26, 2021

jacobagilbert commented May 26, 2021

jacobagilbert commented May 27, 2021

gmabey commented Jun 14, 2021

gmabey commented Jun 14, 2021

bhilburn commented Jun 14, 2021

Specificity and Improvements to `captures` array #124

Specificity and Improvements to `captures` array #124

jacobagilbert commented May 25, 2021 •

edited

Loading

jacobagilbert commented May 26, 2021 •

edited

Loading