Commits
162 commits
4b7382b
wip: zarr 3 and sharding support
calvinchai Apr 1, 2025
fd64096
wip: zarr 3 and sharding, add auto shard size
calvinchai Apr 3, 2025
0713701
wip: cleanup
calvinchai Apr 3, 2025
5b331d6
wip
calvinchai Apr 3, 2025
450bb1e
wip
calvinchai Apr 3, 2025
da01c5f
wip
calvinchai Apr 3, 2025
c8480b7
wip: added a trivial pipeline for testing
calvinchai Apr 7, 2025
4336097
wip: added a trivial pipeline for testing
calvinchai Apr 7, 2025
3fa46ff
Merge branch 'refs/heads/zarr3-temp' into zarr3
calvinchai Apr 7, 2025
6ba6374
wip: add logging
calvinchai Apr 7, 2025
d4ec4ad
wip: try catch for wk modality
calvinchai Apr 7, 2025
3e9239a
wip
calvinchai Apr 7, 2025
2f9d868
wip: refactor generate_pyramid.py
calvinchai Apr 8, 2025
4d14b81
wip: psoct
calvinchai Apr 8, 2025
577d434
wip: add a trivial pipeline for psoct for testing purpose
calvinchai Apr 8, 2025
b525f6c
wip
calvinchai Apr 9, 2025
1c7c1c2
wip
calvinchai Apr 10, 2025
caca57f
wip
calvinchai Apr 10, 2025
843a698
wip
calvinchai Apr 17, 2025
5ccaad5
wip
calvinchai Apr 21, 2025
61d00b9
wip: oct data conversion
calvinchai Apr 22, 2025
581166c
wip: psoct
calvinchai Apr 23, 2025
8ce8336
wip: oct data conversion, optimization
calvinchai Apr 24, 2025
1a3c473
Merge branch 'main' into zarr3
calvinchai Apr 24, 2025
3c94fe6
wip: oct data conversion, optimization
calvinchai Apr 24, 2025
b9d0a44
merge fix
calvinchai Apr 24, 2025
d2c00e8
feat: parallelization for pyramid generation
calvinchai May 1, 2025
5589d90
wip: psoct
calvinchai May 15, 2025
382181f
wip: psoct
calvinchai May 21, 2025
5778ce6
wip: psoct
calvinchai May 22, 2025
2b4dc36
wip: psoct, sum in loop
calvinchai May 29, 2025
e66b7e4
wip: psoct
calvinchai Jun 5, 2025
c84b16d
wip: psoct
calvinchai Jun 5, 2025
b0052e6
Merge branch 'zarr3' of https://github.com/lincbrain/linc-convert int…
kabilar Jun 13, 2025
fbb89d2
Add attribution for ps-oct code
kabilar Jun 13, 2025
dfcc911
Merge pull request #55 from kabilar/zarr3-psoct
kabilar Jun 17, 2025
78931ce
wip: psoct
calvinchai Jun 11, 2025
8d55c7b
wip: psoct, attempts optimize dask pipeline
calvinchai Jun 12, 2025
99cb6f6
wip: psoct, finalize optimization
calvinchai Jun 17, 2025
cc80221
stage: remove testing modalities
calvinchai Jun 20, 2025
39b0e65
stage: remove mosaic pipelines for review
calvinchai Jun 20, 2025
516e93c
refactor: change tests file structure
calvinchai Jun 20, 2025
21ff852
fix: zarr config replace
calvinchai Jun 20, 2025
95fddfc
tests: psoct pipeline
calvinchai Jun 20, 2025
536bf23
refactor
calvinchai Jun 23, 2025
21bb3e1
fix: add exception
calvinchai Jun 23, 2025
bcc119c
feat: use tensorstore in pyramid generation
calvinchai Jun 23, 2025
6b38bae
tests: add unit test for psoct pipelines
calvinchai Jun 23, 2025
ac83ab0
chore: add backward func
calvinchai Jun 23, 2025
21b937d
fix: generate pyramid use correct zarr version
calvinchai Jun 23, 2025
d75a4a9
feat: use abstract classes for zarrio
calvinchai Jun 23, 2025
5a28dc8
fix: add dependency, dask
calvinchai Jun 23, 2025
22adc21
tests: refactor test helper function
calvinchai Jun 24, 2025
28b3126
tests: add sample data
calvinchai Jun 24, 2025
81cdef9
tests: shadowed old tests for pipelines that needs revise for zarr in…
calvinchai Jun 24, 2025
3d52f2a
tests: updated oct pipeline unit tests
calvinchai Jun 24, 2025
883edcc
fix: zarr 3 requires python >=3.11, update project requirements
calvinchai Jun 24, 2025
1b6373e
chore: disable file logger for now,
calvinchai Jun 24, 2025
fcab95c
fix: dependency add tensorstore
calvinchai Jun 24, 2025
fdd5fcf
feat: update multi_slice.py to use unified zarrio interface
calvinchai Jun 24, 2025
78f2428
chore: remove unused code
calvinchai Jun 24, 2025
0c8f3e0
feat(wip): zarrio for tensorstore
calvinchai Jun 24, 2025
995c059
feat: zarrio implementation for tensorstore
calvinchai Jun 25, 2025
026d3c9
feat: zarrio implementation for tensorstore, test update
calvinchai Jun 25, 2025
20c16cd
feat: pyramid generation level data persist
calvinchai Jun 25, 2025
3719aec
refactor: split zarrio into files
calvinchai Jun 26, 2025
8c35154
update dependencies
calvinchai Jun 26, 2025
fc05df0
test: update testing util
calvinchai Jun 26, 2025
1f03f24
refactor
calvinchai Jun 26, 2025
78930a9
fix: tensorstore fix chunk size
calvinchai Jun 26, 2025
c598f7c
refactor
calvinchai Jun 26, 2025
82fc791
tests: recover lsm test
calvinchai Jun 26, 2025
7438b3d
style: cleanup
calvinchai Jun 26, 2025
299b985
fix: use zarr-python for metadata
calvinchai Jun 26, 2025
f6edb8c
fix: improve tensorstore import error handling
calvinchai Jun 26, 2025
6846335
chore: cleanup
calvinchai Jul 1, 2025
3f65c55
chore: improve type hint
calvinchai Jul 1, 2025
e944d39
cleanup
calvinchai Jul 1, 2025
997f37c
test: add linc test data dir
calvinchai Jul 1, 2025
76fa219
test: ignore real test data files
calvinchai Jul 1, 2025
29fb116
chore: improve type hint
calvinchai Jul 1, 2025
ccf4249
Merge branch 'main' into zarr3
calvinchai Jul 1, 2025
809b8b2
fix: add dependencies
calvinchai Jul 1, 2025
59f68d2
tests: move dir
calvinchai Jul 1, 2025
eb5153d
tests: update .gitignore
calvinchai Jul 1, 2025
5daaad1
tests: optimize oct tests
calvinchai Jul 1, 2025
fabd7af
tests: refactor
calvinchai Jul 1, 2025
b301477
tests: update oct test data
calvinchai Jul 1, 2025
16398a8
tests: add heavy tests data from dandi
calvinchai Jul 1, 2025
14c06e0
tests: cleanup
calvinchai Jul 3, 2025
ec871b0
wip: refresh df modality
calvinchai Jul 3, 2025
9e3a223
feat: add optional argument to skip compare nii header for fast debug…
calvinchai Jul 8, 2025
237bdf2
tests: add pytests config
calvinchai Jul 10, 2025
3e3e608
docs
calvinchai Jul 10, 2025
9bb8245
tests: refresh regression test for df pipelines
calvinchai Jul 10, 2025
92cfefd
tests: refresh regression test for df pipelines
calvinchai Jul 10, 2025
3e0b0be
fix: update single_slice.py for df pipeline with new apis
calvinchai Jul 15, 2025
170ce6b
test: update df single_slice tests
calvinchai Jul 15, 2025
8e177f4
feat: unified ome_metadata and nifti_header write interface
calvinchai Jul 15, 2025
9ad0f04
feat: update using new interface
calvinchai Jul 15, 2025
43acb8b
tests: multi driver tests
calvinchai Jul 15, 2025
44a9c78
chore
calvinchai Jul 15, 2025
972b184
tests: use multi-driver tests
calvinchai Jul 15, 2025
af0ee98
style
calvinchai Jul 15, 2025
ae73445
test: improve testing logic and fix typo
calvinchai Jul 15, 2025
4db69b3
feat: lsm mosaic use new api
calvinchai Jul 15, 2025
bb3fe91
refactor
calvinchai Jul 17, 2025
3fe8c69
feat: update lsm pipelines to new api
calvinchai Jul 17, 2025
171f02e
tests
calvinchai Jul 17, 2025
45a7c8d
tests
calvinchai Jul 17, 2025
f93b95b
tests
calvinchai Jul 17, 2025
0851a63
tests
calvinchai Jul 17, 2025
22e2f1b
style
calvinchai Jul 17, 2025
e069e87
style
calvinchai Jul 17, 2025
aa26601
cleanup
calvinchai Jul 17, 2025
90215b4
style
calvinchai Jul 18, 2025
5888310
docs
calvinchai Jul 21, 2025
74054a0
fix: skip wkw unit test
calvinchai Jul 21, 2025
7fd73fd
fix: update chunk correctly
calvinchai Jul 21, 2025
b4b508a
docs
calvinchai Jul 21, 2025
03bb132
fix: api update
calvinchai Jul 21, 2025
4c0da18
fix: api update
calvinchai Jul 21, 2025
cc43bbe
docs
calvinchai Jul 21, 2025
ab24a58
fix: update api
calvinchai Jul 21, 2025
eaa8d6c
fix: update api
calvinchai Jul 21, 2025
80b9eb0
fix: update api
calvinchai Jul 21, 2025
89c4d6a
docs and style
calvinchai Jul 22, 2025
5692afa
docs
calvinchai Jul 24, 2025
d9635b4
style: fix indent
calvinchai Aug 1, 2025
c384fa4
style: fix indent
calvinchai Aug 1, 2025
044b6ba
style: fix indent
calvinchai Aug 1, 2025
9d826da
fix: remove duplicated unused function
calvinchai Aug 1, 2025
d3c70b4
fix: add to __all__ only when import succeeded
calvinchai Aug 1, 2025
3ce5ff9
fix: remove redundant import
calvinchai Aug 1, 2025
bbf81a1
chore: remove unused code
calvinchai Aug 4, 2025
204ffba
style
calvinchai Aug 4, 2025
54cb047
fix: wrong comment
calvinchai Aug 4, 2025
26f3c84
feat: add setitem handling for zarrio
calvinchai Aug 4, 2025
0575741
style
calvinchai Aug 4, 2025
768b26a
fix: ensure string
calvinchai Aug 4, 2025
8963632
fix: make sure out is string
calvinchai Aug 4, 2025
44b8cee
feat: fsspec support
calvinchai Aug 4, 2025
4d5748d
docs
calvinchai Aug 4, 2025
3a53075
feat: use mean for default pyramid generation. support passing function
calvinchai Aug 4, 2025
0718426
fix: use mean for pyramid generation
calvinchai Aug 4, 2025
2b0ac6b
fix: wrong element used
calvinchai Aug 5, 2025
c58116c
BREAKING CHANGE: use zarr-python to manage zarr group in ZarrTSGroup
calvinchai Aug 7, 2025
cbf5601
chore: remove unused func
calvinchai Aug 7, 2025
46a759a
refactor
calvinchai Aug 7, 2025
34fbff7
docs
calvinchai Aug 7, 2025
83649ab
cleanup
calvinchai Aug 7, 2025
b3502d2
feat: factory implemented
calvinchai Aug 7, 2025
4427642
feat: read array attributes with tsarray
calvinchai Aug 7, 2025
a0de94d
cleanup
calvinchai Aug 7, 2025
79e0eee
feat: read array attributes with tsarray
calvinchai Aug 7, 2025
fd88fe1
revert: stop using zarr-python for managing groups.
calvinchai Aug 11, 2025
6338e9f
feat: support attributes natively
calvinchai Aug 12, 2025
2c9c02c
feat: support metadata natively
calvinchai Aug 12, 2025
517a6a6
cleanup: remove unused file
calvinchai Aug 12, 2025
de8440d
docs, style: fix ruff errors
calvinchai Aug 12, 2025
69ce85c
feat: get rid of default zarr-python usage
calvinchai Aug 12, 2025
aefbb34
feat: get rid of default zarr-python usage
calvinchai Aug 12, 2025
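Most of these commits replace direct zarr-python calls with a backend-agnostic `zarrio` layer: abstract group/array classes, a factory driven by `ZarrConfig`, and a tensorstore-backed implementation (`ZarrTSGroup`). A minimal sketch of that pattern follows, assuming names and signatures pieced together from the commit messages and the diff below; the actual interface in the package may differ.

```python
# Illustrative sketch only -- the real linc-convert zarrio API may differ.
from abc import ABC, abstractmethod
from typing import Any


class ZarrGroupBase(ABC):
    """Backend-agnostic handle to an OME-Zarr group (hypothetical base class)."""

    def __init__(self, config: Any) -> None:
        self.config = config

    @abstractmethod
    def create_array(self, name: str, shape: tuple, dtype: str, zarr_config: Any) -> None:
        """Create one pyramid-level array, chunked/sharded per the config."""

    @abstractmethod
    def write_ome_metadata(self, axes: list, space_scale: list, **kwargs: Any) -> None:
        """Write OME-Zarr multiscale metadata covering the levels created so far."""

    @abstractmethod
    def write_nifti_header(self, header: Any) -> None:
        """Store a serialized NIfTI header next to the pyramid (nifti-zarr)."""


class ZarrTSGroup(ZarrGroupBase):
    """Stand-in for the tensorstore-backed implementation (stubbed here)."""

    def create_array(self, name, shape, dtype, zarr_config):
        print(f"tensorstore backend: create {name} shape={shape} dtype={dtype}")

    def write_ome_metadata(self, axes, space_scale, **kwargs):
        print(f"tensorstore backend: multiscales axes={axes} scale={space_scale}")

    def write_nifti_header(self, header):
        print("tensorstore backend: write nifti header")


def from_config(zarr_config: Any) -> ZarrGroupBase:
    """Factory: choose a backend from the config (only the stub backend shown)."""
    return ZarrTSGroup(zarr_config)
```

In the `multi_slice.py` diff below, the converter only goes through this surface (`from_config(zarr_config)`, `create_array`, `write_ome_metadata`, `write_nifti_header`), which is presumably what lets the same pipeline target either zarr-python or tensorstore.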
4 changes: 3 additions & 1 deletion .gitignore
@@ -157,4 +157,6 @@ cython_debug/
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
.idea/

tests/data/000051/
4 changes: 1 addition & 3 deletions conda.yaml
@@ -2,16 +2,14 @@ name: linc-convert
channels:
- conda-forge
dependencies:
- python=3.10
- python
- ipython
- typer
- numpy
- glymur
- zarr
- nibabel
- tifffile
- wkw
- tensorstore
- pytest
- ruff
- tifffile
1 change: 1 addition & 0 deletions linc_convert/__init__.py
@@ -1,4 +1,5 @@
"""Data conversion tools for the LINC project."""

__all__ = ["modalities", "utils"]

from . import modalities, utils
1 change: 1 addition & 0 deletions linc_convert/modalities/__init__.py
@@ -1,4 +1,5 @@
"""Converters for all imaging modalities."""

__all__ = ["df", "lsm", "wk", "psoct"]

from . import df, lsm, psoct, wk
1 change: 1 addition & 0 deletions linc_convert/modalities/df/__init__.py
@@ -4,6 +4,7 @@
import glymur as _ # noqa: F401

__all__ = ["cli", "multi_slice", "single_slice"]

from . import cli, multi_slice, single_slice
except ImportError:
pass
189 changes: 53 additions & 136 deletions linc_convert/modalities/df/multi_slice.py
@@ -6,50 +6,44 @@
"""

# stdlib
import ast
import json
import os
from typing import Unpack

# externals
import glymur
import nibabel as nib
import numpy as np
import zarr
from cyclopts import App


# internals
from linc_convert import utils
from linc_convert.modalities.df.cli import df
from linc_convert.utils.j2k import WrappedJ2K, get_pixelsize
from linc_convert.utils.io.j2k import WrappedJ2K, get_pixelsize
from linc_convert.utils.io.zarr import from_config
from linc_convert.utils.math import ceildiv, floordiv
from linc_convert.utils.orientation import center_affine, orientation_to_affine
from linc_convert.utils.zarr.compressor import make_compressor
from linc_convert.utils.zarr.zarr_config import ZarrConfig
from linc_convert.utils.zarr_config import ZarrConfig, update_default_config

HOME = "/space/aspasia/2/users/linc/000003"

# Path to LincBrain dataset
LINCSET = os.path.join(HOME, "sourcedata")
LINCOUT = os.path.join(HOME, "rawdata")


ms = App(name="multislice", help_format="markdown")
df.command(ms)


@ms.default
def convert(
inp: list[str],
*,
out: str,
zarr_config: ZarrConfig = None,
max_load: int = 16384,
orientation: str = "coronal",
center: bool = True,
thickness: float | None = None,
**kwargs
) -> None:
inp: list[str],
*,
zarr_config: ZarrConfig = None,
orientation: str = "coronal",
center: bool = True,
thickness: float | None = None,
**kwargs: Unpack[ZarrConfig],
) -> None:
"""
Convert JPEG2000 files generated by MBF-Neurolucida into a Zarr pyramid.

@@ -71,7 +65,7 @@ def convert(
* the second letter corresponds to the vertical dimension and
indicates the anatomical meaning of the _bottom_ of the jp2 image,
* the third letter corresponds to the slice dimension and
indicates the anatomical meaninff of the _end_ of the stack.
indicates the anatomical meaning of the _end_ of the stack.

We also provide the aliases

@@ -85,35 +79,17 @@
----------
inp
Path to the input slices
out
Path to the output Zarr directory [<INP>.ome.zarr]
max_load
Maximum input chunk size
orientation
Orientation of the slice
center
Set RAS[0, 0, 0] at FOV center
thickness
Slice thickness
"""
zarr_config = utils.zarr.zarr_config.update(zarr_config, **kwargs)
chunk: int = zarr_config.chunk[0]
compressor: str = zarr_config.compressor
compressor_opt: str = zarr_config.compressor_opt
nii: bool = zarr_config.nii

# Default output path
if not out:
out = os.path.splitext(inp[0])[0]
out += ".nii.zarr" if nii else ".ome.zarr"
nii = nii or out.endswith(".nii.zarr")

if isinstance(compressor_opt, str):
compressor_opt = ast.literal_eval(compressor_opt)

# Prepare Zarr group
omz = zarr.storage.DirectoryStore(out)
omz = zarr.group(store=omz, overwrite=True)
zarr_config = update_default_config(zarr_config, **kwargs)
zarr_config.set_default_name(os.path.splitext(inp[0])[0])
max_load = zarr_config.max_load
omz = from_config(zarr_config)

nblevel, has_channel, dtype_jp2 = float("inf"), float("inf"), ""

@@ -132,24 +108,16 @@
if has_channel:
new_size += (3,)
print(len(inp), new_size, nblevel, has_channel)

# Prepare chunking options
opt = {
"chunks": list(new_size[2:]) + [1] + [chunk, chunk],
"dimension_separator": r"/",
"order": "F",
"dtype": dtype_jp2,
"fill_value": 0,
"compressor": make_compressor(compressor, **compressor_opt),
}
print(opt)
chunks = list(new_size[2:]) + [1] + list(zarr_config.chunk[-2:])
zarr_config.chunk = tuple(chunks)
print(new_size)
# Write each level
for level in range(nblevel):
shape = [ceildiv(s, 2**level) for s in new_size[:2]]
shape = [ceildiv(s, 2 ** level) for s in new_size[:2]]
shape = [new_size[2]] + [len(inp)] + shape

omz.create_dataset(f"{level}", shape=shape, **opt)
# omz.create_dataset(f"{level}", shape=shape, **opt)
omz.create_array(str(level), shape, dtype=dtype_jp2, zarr_config=zarr_config)
array = omz[f"{level}"]

# Write each slice
@@ -159,109 +127,67 @@
subdat = WrappedJ2K(j2k, level=level)
subdat_size = subdat.shape
print(
"Convert level",
level,
"with shape",
shape,
"for slice",
idx,
"with size",
subdat_size,
)
"Convert level",
level,
"with shape",
shape,
"for slice",
idx,
"with size",
subdat_size,
)

# offset while attaching
x = floordiv(shape[-2] - subdat_size[-2], 2)
y = floordiv(shape[-1] - subdat_size[-1], 2)

for channel in range(3):
if max_load is None or (
subdat_size[-2] < max_load and subdat_size[-1] < max_load
subdat_size[-2] < max_load and subdat_size[-1] < max_load
):
array[
channel, idx, x : x + subdat_size[-2], y : y + subdat_size[-1]
] = subdat[channel : channel + 1, ...][0]
channel, idx, x: x + subdat_size[-2], y: y + subdat_size[-1]
] = subdat[channel: channel + 1, ...][0]
else:
ni = ceildiv(subdat_size[-2], max_load)
nj = ceildiv(subdat_size[-1], max_load)

for i in range(ni):
for j in range(nj):
print(f"\r{i+1}/{ni}, {j+1}/{nj}", end=" ")
print(f"\r{i + 1}/{ni}, {j + 1}/{nj}", end=" ")
start_x, end_x = (
i * max_load,
min((i + 1) * max_load, subdat_size[-2]),
)
)
start_y, end_y = (
j * max_load,
min((j + 1) * max_load, subdat_size[-1]),
)
)

array[
channel,
idx,
x + start_x : x + end_x,
y + start_y : y + end_y,
channel,
idx,
x + start_x: x + end_x,
y + start_y: y + end_y,
] = subdat[
channel : channel + 1,
channel: channel + 1,
start_x:end_x,
start_y:end_y,
][0]
][0]

print("")

# Write OME-Zarr multiscale metadata
print("Write metadata")
multiscales = [
{
"version": "0.4",
"axes": [
{"name": "z", "type": "space", "unit": "micrometer"},
{"name": "y", "type": "distance", "unit": "micrometer"},
{"name": "x", "type": "space", "unit": "micrometer"},
],
"datasets": [],
"type": "jpeg2000",
"name": "",
}
]
axes = ["z", "y", "x"]
if has_channel:
multiscales[0]["axes"].insert(0, {"name": "c", "type": "channel"})

for n in range(nblevel):
shape0 = omz["0"].shape[-2:]
shape = omz[str(n)].shape[-2:]
multiscales[0]["datasets"].append({})
level = multiscales[0]["datasets"][-1]
level["path"] = str(n)

# I assume that wavelet transforms end up aligning voxel edges
# across levels, so the effective scaling is the shape ratio,
# and there is a half voxel shift wrt to the "center of first voxel"
# frame
level["coordinateTransformations"] = [
{
"type": "scale",
"scale": [1.0] * has_channel
+ [
1.0,
(shape0[0] / shape[0]) * vxh,
(shape0[1] / shape[1]) * vxw,
],
},
{
"type": "translation",
"translation": [0.0] * has_channel
+ [
0.0,
(shape0[0] / shape[0] - 1) * vxh * 0.5,
(shape0[1] / shape[1] - 1) * vxw * 0.5,
],
},
]
multiscales[0]["coordinateTransformations"] = [
{"scale": [1.0] * (3 + has_channel), "type": "scale"}
]
omz.attrs["multiscales"] = multiscales
axes.insert(0, "c")
omz.write_ome_metadata(
axes=axes,
space_scale=[1.0] + list(get_pixelsize(j2k)),
multiscales_type="jpeg2000",
no_pool=0,
)

# Write NIfTI-Zarr header
# NOTE: we use nifti2 because dimensions typically do not fit in a short
@@ -280,19 +206,10 @@ def convert(
header.set_sform(affine)
header.set_xyzt_units(nib.nifti1.unit_codes.code["micron"])
header.structarr["magic"] = b"n+2\0"
header = np.frombuffer(header.structarr.tobytes(), dtype="u1")
opt = {
"chunks": [len(header)],
"dimension_separator": r"/",
"order": "F",
"dtype": "|u1",
"fill_value": None,
"compressor": None,
}
omz.create_dataset("nifti", data=header, shape=shape, **opt)
omz.write_nifti_header(header)

# Write sidecar .json file
json_name = os.path.splitext(out)[0]
json_name = os.path.splitext(zarr_config.out)[0]
json_name += ".json"
dic = {}
dic["PixelSize"] = json.dumps([vxw, vxh])
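For review, a worked example of the chunk layout computed above (`chunks = list(new_size[2:]) + [1] + list(zarr_config.chunk[-2:])`), assuming an RGB multi-slice volume and a 1024-pixel in-plane tile in the config; the actual default tile size may differ.

```python
# Worked example of the chunk computation in the diff above (illustrative values).
new_size = (38_000, 29_000, 3)   # (height, width, channels) at the finest level
config_chunk = (1024, 1024)      # in-plane tile size from ZarrConfig (assumed)

chunks = list(new_size[2:]) + [1] + list(config_chunk[-2:])
print(chunks)  # [3, 1, 1024, 1024]
# i.e. full channel axis, one jp2 slice per chunk, 1024 x 1024 tiles in-plane,
# matching the per-level array shape order [channel, slice, y, x].
```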
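The removed multiscale-metadata block also encoded reasoning worth keeping in view while checking `write_ome_metadata`: wavelet levels are assumed to align voxel edges, so each level's scale is its shape ratio to level 0 and its origin picks up a half-voxel shift in the "center of first voxel" frame. Restated as a standalone sketch (same formulas as the deleted code, channel axis omitted; `vxh`/`vxw` are the level-0 voxel sizes):

```python
# Per-level coordinate transform from the removed metadata block, restated for review.
def level_transform(shape0, shape_level, vxh, vxw):
    """Return (scale, translation) for one pyramid level with edge-aligned levels."""
    sy = shape0[0] / shape_level[0]   # downsampling factor along y
    sx = shape0[1] / shape_level[1]   # downsampling factor along x
    scale = [1.0, sy * vxh, sx * vxw]                                # z, y, x
    translation = [0.0, (sy - 1) * vxh * 0.5, (sx - 1) * vxw * 0.5]  # half-voxel shift
    return scale, translation


# Illustrative numbers: level 2 of a 4096 x 4096 slice with 1 micron pixels.
print(level_transform((4096, 4096), (1024, 1024), 1.0, 1.0))
# ([1.0, 4.0, 4.0], [0.0, 1.5, 1.5])
```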