> library(zaro)
> store <- zaro("virtualizarr://https://raw.githubusercontent.com/mdsumner/virtualized/refs/heads/main/remote/ocean_temp_2023.parq")
[zaro] opening VirtualiZarr Parquet reference store: https://raw.githubusercontent.com/mdsumner/virtualized/refs/heads/main/remote/ocean_temp_2023.parq
> meta <- zaro_meta(store)
[zaro] found .zmetadata (Zarr V2 consolidated)
[zaro] 11 arrays: Time, Time_bnds, average_DT, average_T1, average_T2, nv, st_edges_ocean, st_ocean, temp, xt_ocean, yt_ocean
> data <- zaro_read(store, "temp", start = c(0, 0, 0, 0), count = c(1, 1, 1500, 3600), meta = meta)
[zaro] reading 60 chunk(s) for path 'temp' (V2)
Error in dim(values) <- actual_chunk_shape :
dims [product 90000] do not match the length of object [37093]
In addition: Warning messages:
1: unknown codec 'zlib', passing through unchanged
2: unknown codec 'shuffle', passing through unchanged
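The zlib and shuffle codecs are not recognized: the V2 filters use the names "zlib" and "shuffle", but the codec pipeline probably only knows "gzip", "zstd", etc. Because both filters are passed through unchanged, the chunk bytes reach the reshape step still encoded, which is consistent with the length mismatch above (37093 vs. the expected 90000). zlib maps to gzip in Arrow (arrow::Codec$create("gzip")), and shuffle is a byte-reordering filter that needs its own implementation. On read the filters are undone in reverse order, so the bytes are unshuffled by element size after decompression, not before.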
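The decode order described above can be sketched in base R. This is only an illustration under assumptions, not zaro's actual codec pipeline: `unshuffle`, `shuffle`, and `decode_chunk` are hypothetical names, the filter list mimics the Zarr V2 `{"id": ..., "elementsize": ...}` metadata layout, and base R's `memDecompress(type = "gzip")` stands in for the Arrow codec.

```r
# Reverse a byte-shuffle filter. Shuffled input is grouped by byte position
# (all first bytes, then all second bytes, ...); output is element-major.
unshuffle <- function(bytes, elementsize) {
  n <- length(bytes) %/% elementsize            # number of whole elements
  if (n == 0L || elementsize <= 1L) return(bytes)
  body <- bytes[seq_len(n * elementsize)]
  # Fill an n x elementsize matrix column-wise, then read it row-wise.
  out <- as.vector(t(matrix(body, nrow = n, ncol = elementsize)))
  c(out, bytes[-seq_len(n * elementsize)])      # leftover bytes pass through
}

# Forward shuffle, used here only to build a test chunk.
shuffle <- function(bytes, elementsize) {
  n <- length(bytes) %/% elementsize
  as.vector(t(matrix(bytes, nrow = elementsize, ncol = n)))
}

# Undo V2 filters in reverse of the order they were applied on write.
decode_chunk <- function(bytes, filters) {
  for (f in rev(filters)) {
    bytes <- switch(f$id,
      zlib    = memDecompress(bytes, type = "gzip"),  # zlib-wrapped deflate
      shuffle = unshuffle(bytes, f$elementsize),
      stop("unsupported filter: ", f$id))
  }
  bytes
}

# Round trip: three float32 values, shuffled then compressed, then decoded.
vals    <- c(1.5, -2.25, 1000)
raw_le  <- writeBin(vals, raw(), size = 4L, endian = "little")
stored  <- memCompress(shuffle(raw_le, 4L), type = "gzip")
filters <- list(list(id = "shuffle", elementsize = 4L),
                list(id = "zlib", level = 1L))
decoded <- decode_chunk(stored, filters)
```

After decoding, `readBin(decoded, "numeric", n = 3L, size = 4L, endian = "little")` recovers the original values, and only then does the reshape to the chunk dimensions make sense.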
The byte_range_read VSI path needs updating from the procedural vsi_open/vsi_seek/vsi_read calls to the VSIFile class. Better yet, add a curl fallback for HTTP so this path stays GDAL-free:
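if (grepl("^https?://", url) && requireNamespace("curl", quietly = TRUE)) {
  # Format the bounds without scientific notation (paste0(1e5) gives "1e+05").
  rng <- sprintf("%.0f-%.0f", as.numeric(offset), as.numeric(offset) + length - 1)
  resp <- curl::curl_fetch_memory(url, handle = curl::new_handle(range = rng))
  # 206 Partial Content means the server honoured the Range request.
  if (resp$status_code == 206L) return(resp$content)
}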
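One detail in the range request is easy to get wrong: the end of an HTTP byte range is inclusive, and large offsets held as doubles can be rendered in scientific notation when pasted into a string. A small sketch of the arithmetic, where `range_spec` is a hypothetical helper and not part of zaro or curl:

```r
# Build the "first-last" byte range string for a chunk reference.
# `offset` is the zero-based start byte, `length` the number of bytes.
range_spec <- function(offset, length) {
  stopifnot(offset >= 0, length >= 1)
  first <- as.numeric(offset)
  last  <- first + as.numeric(length) - 1       # inclusive end byte
  sprintf("%.0f-%.0f", first, last)             # never scientific notation
}

spec_small <- range_spec(0, 100)
spec_large <- range_spec(100000, 50000)
```

The resulting string can be passed as the `range` option of a curl handle, as in the fallback above.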
The curl fallback, together with the zlib/shuffle codec mapping, should be what this read needs.