First schema suggestions for specific data types to be added in core #523

Open

These data type schemas were started in the openMINDS_neuroimaging extension, but @lzehl and I would like to suggest moving them to core. The main reason is that any research product could profit from using them as (sort of) extensions to its associated files to provide more details about the data. It would be cumbersome to have to search multiple openMINDS extensions to find suitable data type schemas. The list of data types is also not as extensive as, e.g., the list of devices, where we concluded that we collect them in the most suitable openMINDS extension repos instead. Finally, having them in core would allow us to link to these specific data types as the data for specific schemas, e.g., coordinate frameworks in the upcoming SANDS update. This would be important in cases where the specific files may not be easy to share, but some metadata should still be provided. This logic may apply to other extensions as well, but that should be discussed in a developers meeting.

I have made a draft for some data types, based on the drafts that @Peyman-N has prepared in the neuroimaging extension. It includes the types needed to cover any kind of file used as reference data for atlases.

RasterGraphic

property	required?	count	value	comment
additionalRemarks		1	string
capturedWith		n	CAT/DeviceUsage
colorDepth		1	[emb] QuantitativeValue
compressionType	X	1	ControlledTerms terminology or ContentType or (Analysis)Technique?	generally lossy or lossless, but each can be achieved through different techniques, e.g., lossless: RLE vs. LZW
compressionRatio		1	[emb] QuantitativeValue	typically expressed as a percentage of the original image size, with 100% = not resized
coordinateSpace	X	1	CustomCoordinateFramework or CommonCoordinateSpace	values don't exist yet and are still in discussion
dataLocation	X	1	File
dimension	X	2	integer
isPartOf		n	Image or Volume	'shortcut' to describe the process of creating an image/volume from multiple images, e.g., tile scans or 3D reconstructions
lookupLabel		1	string
resolution	X	2	[emb] QuantitativeValue
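
To make the required and count columns easier to discuss, the table can be written down as a small validation sketch. This is only an illustration in Python, not an openMINDS schema: the class name `RasterGraphicDraft`, the `validate` helper, and the plain-Python stand-ins for File, DeviceUsage and QuantitativeValue are assumptions. Only the property names and counts are taken from the table above, and the properties whose value types are still undecided are kept as opaque strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class QuantitativeValue:
    """Stand-in for an embedded QuantitativeValue (a value plus a unit name)."""
    value: float
    unit: Optional[str] = None


@dataclass
class RasterGraphicDraft:
    # Required properties (marked X in the table).
    compressionType: str                        # value type undecided, kept as a label
    coordinateSpace: str                        # reference kept as a plain identifier
    dataLocation: str                           # reference to a File, kept as an identifier
    dimension: Tuple[int, ...]                  # count 2
    resolution: Tuple[QuantitativeValue, ...]   # count 2
    # Optional properties.
    additionalRemarks: Optional[str] = None
    capturedWith: List[str] = field(default_factory=list)
    colorDepth: Optional[QuantitativeValue] = None
    compressionRatio: Optional[QuantitativeValue] = None
    isPartOf: List[str] = field(default_factory=list)
    lookupLabel: Optional[str] = None

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the draft rules pass."""
        problems = []
        for name in ("compressionType", "coordinateSpace", "dataLocation"):
            if not getattr(self, name):
                problems.append(f"{name} is required")
        if len(self.dimension) != 2:
            problems.append("dimension needs exactly 2 integers")
        if len(self.resolution) != 2:
            problems.append("resolution needs exactly 2 values")
        return problems


# Example with made-up identifiers and values.
pixel = QuantitativeValue(0.5, "micrometer")
draft = RasterGraphicDraft("lossless", "space-1", "file-1", (512, 512), (pixel, pixel))
print(draft.validate())  # []
```

The same pattern would carry over to the other types below by changing the expected counts (for example 3 for a Volume).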

VectorGraphic

property	required?	count	value	comment
additionalRemarks		1	string
coordinateSpace	X	1	CustomCoordinateFramework or CommonCoordinateSpace	values don't exist yet and are still in discussion
dataLocation	X	1	File
dimension	X	2	integer
lookupLabel		1	string
software		1	SoftwareVersion	existing property name used in computation extension, should be discussed
???				anything more? ChatGPT suggests detailed content descriptions like layer information (number, names, order, visibility), color information (color model/palette used or specific color codes), object properties (stroke width, opacity, etc.) and font/text information; this seems too extensive

ImageStack

NOTE: This is defined as 'A group of images with a similar frame of reference.' and includes captured stacks (like z-stacks), stacks reconstructed from single image captures, and images that could be reconstructed into a stack because they have a similar frame of reference (like 2D atlas reference data, where the data share the same coordinate space and are typically single captures from the same brain at different spatial locations).
QUESTION: When does an ImageStack become a Volume?

property	required?	count	value	comment
additionalRemarks		1	string
capturedWith		n	CAT/DeviceUsage
colorDepth		1	[emb] QuantitativeValue
compressionType	X	1	ControlledTerms terminology or ContentType or (Analysis)Technique?	generally lossy or lossless, but each can be achieved through different techniques, e.g., lossless: RLE vs. LZW
compressionRatio		1	[emb] QuantitativeValue	typically expressed as a percentage of the original image size, with 100% = not resized
coordinateSpace	X	1	CustomCoordinateFramework or CommonCoordinateSpace	values don't exist yet and are still in discussion
dataLocation	X	n	File or Image or Volume?	not sure how to solve this
dimension	X	2 or 3?	integer	if each image under 'dataLocation' has a dimension, should this be removed or reflect that information (in 2D) or reflect that information combined with the number of images (aka depth, in 3D)?
isPartOf		n	Volume	any point in having this here?
lookupLabel		1	string
numberOfImages	X	2-n	integer
resolution	X	2 or 3?	[emb] QuantitativeValue	same questions as for dimensions
sliceThickness or scanningDepth or StackDepth or depth		1	[emb] QuantitativeValue	not sure if 'sliceThickness' is the best name; e.g., a z-stack can be captured from a 30 µm thick tissue slice but the total scanning depth might be 10 µm somewhere within that slice
imageSpacing or spacing		1	[emb] QuantitativeValue	with negative values indicating an overlap
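
The open question about `dimension` (2 or 3 values) can be made concrete with a small sketch. The Python below is illustrative only: `stack_dimension` is a hypothetical helper, and it assumes that every image in the stack shares the same 2D dimension and that the '2-n' entry for numberOfImages means at least two images.

```python
from typing import Sequence, Tuple


def stack_dimension(image_dimensions: Sequence[Tuple[int, int]], as_3d: bool) -> Tuple[int, ...]:
    """Derive a stack-level 'dimension' from the per-image dimensions.

    as_3d=False mirrors the 2D option (repeat the shared image dimension);
    as_3d=True mirrors the 3D option (append the number of images as depth).
    """
    if len(image_dimensions) < 2:
        raise ValueError("an ImageStack is assumed to need at least 2 images")
    first = tuple(image_dimensions[0])
    if any(tuple(dim) != first for dim in image_dimensions):
        raise ValueError("images in a stack are assumed to share one 2D dimension")
    if as_3d:
        return first + (len(image_dimensions),)
    return first


images = [(1024, 768)] * 40
print(stack_dimension(images, as_3d=False))  # (1024, 768)
print(stack_dimension(images, as_3d=True))   # (1024, 768, 40)
```

With the 3D option, the third entry of `dimension` duplicates `numberOfImages`, which is worth keeping in mind when deciding between the two.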

Volume

property	required?	count	value	comment
additionalRemarks		1	string
capturedWith		n	CAT/DeviceUsage
colorDepth		1	[emb] QuantitativeValue
compressionType	X	1	ControlledTerms terminology or ContentType or (Analysis)Technique?	generally lossy or lossless, but each can be achieved through different techniques, e.g., lossless: RLE vs. LZW
compressionRatio		1	[emb] QuantitativeValue	typically expressed as a percentage of the original image size, with 100% = not resized
coordinateSpace	X	1	CustomCoordinateFramework or CommonCoordinateSpace	values don't exist yet and are still in discussion
dataLocation	X	1	File
dimension	X	3	integer
isPartOf		n	Volume	similar 'shortcut' as in RasterGraphic, needed?
lookupLabel		1	string
resolution	X	3	[emb] QuantitativeValue
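
The compressionRatio comment (a percentage of the original size, with 100% meaning no reduction) corresponds to a very simple calculation. The sketch below is an assumption about how such a value could be derived, with sizes measured in bytes; `compression_ratio_percent` is a hypothetical helper, not part of openMINDS.

```python
def compression_ratio_percent(original_size: int, stored_size: int) -> float:
    """Stored size as a percentage of the original size (100.0 = not reduced)."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    if stored_size < 0:
        raise ValueError("stored_size must not be negative")
    return 100.0 * stored_size / original_size


print(compression_ratio_percent(2_000_000, 500_000))    # 25.0
print(compression_ratio_percent(2_000_000, 2_000_000))  # 100.0
```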

3DMesh (one surface type)

NOTE: I know we need this somehow, but I'm not sure how to solve it since I don't really know this data type.

property	required?	count	value	comment
additionalRemarks		1	string
compressionType	X	1	ControlledTerms terminology or ContentType or (Analysis)Technique?	generally lossy or lossless, but each can be achieved through different techniques, e.g., lossless: RLE vs. LZW
compressionRatio		1	[emb] QuantitativeValue	typically expressed as a percentage of the original image size, with 100% = not resized
coordinateSpace	X	1	CustomCoordinateFramework or CommonCoordinateSpace	values don't exist yet and are still in discussion
dataLocation	X	1	File
dimension	X	3	integer
isPartOf		n	Volume	similar 'shortcut' as in RasterGraphic, needed?
lookupLabel		1	string
resolution	X	1?	[emb] QuantitativeValue	as far as I can tell, this reflects the density of vertices which would be a single value?
software		1	SoftwareVersion	existing property name used in computation extension, should be discussed
vertexCount	X	1	integer
faceCount	X	1	integer
???				anything more? It seems there are also texture coordinates and normals, which may or may not be important. Also, a face is made up of a specific number of vertices, but as far as I understand this does not need to be uniform within a mesh (e.g., not always all triangles), although it often is. Should we have a property like '(mesh)FaceType' with a ControlledTerms terminology (e.g., triangular/triangle, quadrilateral/quad), and either allow n or have a term 'mixed' in this terminology and restrict it to exactly one? Alternatively, it could be called 'faceDimension' with a single integer or a list of integers (e.g., 3 would represent triangles and [3, 4] would represent a mix of triangles and quads)
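
The two options for describing faces can be compared with a short sketch. It assumes faces are given as lists of vertex indices, which is a common mesh representation but an assumption here. The helpers `face_dimension` and `face_type` and the term names in `FACE_TERMS` are hypothetical, since the terminology does not exist yet.

```python
from typing import List, Sequence

# Hypothetical term names; no ControlledTerms terminology exists for this yet.
FACE_TERMS = {3: "triangle", 4: "quadrilateral"}


def face_dimension(faces: Sequence[Sequence[int]]) -> List[int]:
    """'faceDimension' option: sorted distinct vertex counts, e.g. [3] or [3, 4]."""
    return sorted({len(face) for face in faces})


def face_type(faces: Sequence[Sequence[int]]) -> str:
    """'(mesh)FaceType' option with exactly one term, using 'mixed' if faces differ."""
    sizes = face_dimension(faces)
    if not sizes:
        raise ValueError("mesh has no faces")
    if len(sizes) > 1:
        return "mixed"
    return FACE_TERMS.get(sizes[0], f"{sizes[0]}-gon")


# Two triangles and one quad, each face listing its vertex indices.
faces = [[0, 1, 2], [0, 2, 3], [4, 5, 6, 7]]
print(face_dimension(faces))  # [3, 4]
print(face_type(faces))       # mixed
print(len(faces))             # 3, the value faceCount would hold
```

The integer list keeps more information than the single 'mixed' term, while the terminology option is easier to read and to search for.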

This is a rather long issue. @lzehl, if this seems like a reasonable starting point, I can turn it into a PR for easier review of each property. @openMetadataInitiative/openminds-admins and @openMetadataInitiative/openminds-developers, any feedback?

Metadata

Assignees: UlrikeS91

Labels: major update, request, v5.0 candidate
