Merge pull request #8 from JuliaGeo/abstractds
Harmonize API with NCDatasets.jl
tcarion authored Apr 7, 2023
2 parents 3f47a85 + a877bdf commit a27ea01
Showing 15 changed files with 406 additions and 117 deletions.
6 changes: 4 additions & 2 deletions Project.toml
@@ -4,15 +4,17 @@ authors = ["tcarion <[email protected]> and contributors"]
version = "0.1.1"

[deps]
CommonDataModel = "1fbeeb36-5f17-413c-809b-666fb144f157"
DataStructures = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"
Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
DiskArrays = "3c3547ce-8d99-4f5e-a174-61eb10b00ae3"
GRIB = "b16dfd50-4035-11e9-28d4-9dfe17e6779b"

[compat]
CommonDataModel = "^0.2.1"
DataStructures = "0.18"
DiskArrays = "0.3.0"
GRIB = "0.3.0"
DiskArrays = "0.3"
GRIB = "0.3"
julia = "1"

[extras]
104 changes: 81 additions & 23 deletions README.md
@@ -7,36 +7,94 @@

## Description

GRIBDatasets.jl uses [GRIB.jl](https://weech.github.io/GRIB.jl) to provide a higher level interface for reading GRIB file. It tries to follow the same approach as [NCDatasets.jl](https://github.com/JuliaGeo/NetCDF.jl).
GRIBDatasets.jl uses [GRIB.jl](https://weech.github.io/GRIB.jl) to provide a higher-level interface for reading GRIB files. This package implements the [CommonDataModel.jl](https://github.com/JuliaGeo/CommonDataModel.jl) interface, which means that the datasets can be accessed in the same way as netCDF files opened with [NCDatasets.jl](https://github.com/Alexander-Barth/NCDatasets.jl).

To read a GRIB file, just type:

```julia
using GRIBDatasets

ds = GRIBDataset("example.grib")
Dataset from file: example.grib
Dimensions:
longitude = 120
latitude = 61
number = 10
valid_time = 4
level = 2
Layers:
z, t
with attributes:
Dict{String, Any} with 5 entries:
"edition" => "1"
"centreDescription" => "European Centre for Medium-Range Weather Forecasts"
"centre" => "ecmf"
"subCentre" => "0"
"Conventions" => "CF-1.7"
julia> using GRIBDatasets

julia> ds = GRIBDataset("example.grib")
Dataset: example.grib
Group: /

Dimensions
lon = 120
lat = 61
valid_time = 4

Variables
(120)
Datatype: Float64 (Float64)
Dimensions: lon
Attributes:
units = degrees_east
long_name = longitude
standard_name = longitude

(61)
Datatype: Float64 (Float64)
Dimensions: lat
Attributes:
units = degrees_north
long_name = latitude
standard_name = latitude

(4)
Datatype: Dates.DateTime (Int64)
Dimensions: valid_time
Attributes:
units = seconds since 1970-01-01T00:00:00
calendar = proleptic_gregorian
long_name = time
standard_name = time

(120 × 61 × 4)
Datatype: Union{Missing, Float64} (Float64)
Dimensions: lon × lat × valid_time
Attributes:
units = K
long_name = Temperature
standard_name = air_temperature

Global attributes
edition = 1
source = /home/tcarion/.julia/dev/GRIBDatasets/test/sample-data/era5-levels-members.grib
centreDescription = European Centre for Medium-Range Weather Forecasts
centre = ecmf
subCentre = 0
Conventions = CF-1.7
```

You can then access a variable with `z = ds["z"]`, and slice according to the variable dimensions:
Indexing on the `GRIBDataset` object gives you the variable, which is an `AbstractArray` that can be sliced according to the required dimensions:

```julia
julia> t = ds["t"];
julia> t[1:3,1:5,1]
3×5 Matrix{Union{Missing, Float64}}:
233.31 231.276 230.121 229.144 229.072
233.31 231.229 230.053 229.212 228.893
233.31 231.174 229.942 229.064 228.84

julia> ds["valid_time"][:]
4-element Vector{Dates.DateTime}:
2017-01-01T00:00:00
2017-01-01T12:00:00
2017-01-02T00:00:00
2017-01-02T12:00:00
```

The attributes of any variable can be accessed this way:
```julia
z[:,:, 2, 1:2, 1]
julia> ds["z"].attrib
Dict{String, Any} with 3 entries:
"units" => "m**2 s**-2"
"long_name" => "Geopotential"
"standard_name" => "geopotential"
```

This package is similar to [CfGRIB.jl](https://github.com/ecmwf/cfgrib.jl), but some part of the code has been rewritten so it can be easily integrated to [Rasters.jl](https://github.com/rafaqz/Rasters.jl). It is recommended to use directly Rasters.jl, so the user can benefit from its nice features.
This package is similar to [CfGRIB.jl](https://github.com/ecmwf/cfgrib.jl), but the code has been adapted to be more Julian and to follow the `CommonDataModel` interface.

## Caveats:
- Only reading GRIB files is currently supported by this package. It should nevertheless be straightforward to write a `GRIBDataset` to netCDF with `NCDatasets`.
- GRIB files come in a (very) large variety of shapes, so `GRIBDatasets` might not work for your specific edge case. If this happens, do not hesitate to open an issue.
8 changes: 5 additions & 3 deletions src/GRIBDatasets.jl
@@ -3,9 +3,10 @@ module GRIBDatasets
using Dates
using GRIB
using DataStructures
using DiskArrays

const DA = DiskArrays
import DiskArrays as DA
using CommonDataModel: AbstractDataset, AbstractVariable, show_dim, CFVariable
import CommonDataModel: path, name, dimnames, isopen, attribnames, attrib
import CommonDataModel as CDM

const DEFAULT_EPOCH = DateTime(1970, 1, 1, 0, 0)

@@ -16,6 +17,7 @@ include("index.jl")
include("dimensions.jl")
include("dataset.jl")
include("variables.jl")
include("cfvariables.jl")

export GRIBDataset, FileIndex

69 changes: 69 additions & 0 deletions src/cfvariables.jl
@@ -0,0 +1,69 @@

# struct CFVariable{T, N, AT, TSA} <: AbstractGRIBVariable{T,N}
# var::Variable{T, N, AT}
# attrib::Dict{String, Any}
# _storage_attrib::TSA
# end

# function CFVariable(ds, varname; _v = Variable(ds, varname))
# v = _v
# missing_val = missing_value(v)
# T = eltype(v)
# N = ndims(v)

# storage_attrib = (
# missing_value = missing_val,
# )

# attribs = cflayer_attributes(v)

# CFVariable{T, N, typeof(parent(v)), typeof(storage_attrib)}(_v, attribs, storage_attrib)
# end

# Base.parent(cfvar::CFVariable) = parent(cfvar.var)
# Base.size(cfvar::CFVariable) = size(cfvar.var)
# Base.getindex(cfvar::CFVariable, I...) = getindex(parent(cfvar), I...)

# function Base.getindex(cfvar::CFVariable{T, N, TV}, I...) where {T,N,TV <: DA.AbstractDiskArray{T, N}}
# A = getindex(parent(cfvar), I...)
# misval = cfvar._storage_attrib.missing_value

# isnothing(misval) && (return A)

# # return any(x -> x == misval, A) ? replace(A, misval => missing) : A
# return any(A .== misval) ? replace(A, misval => missing) : A

# end

# varname(cfvar::CFVariable) = varname(cfvar.var)
# dims(cfvar::CFVariable) = dims(cfvar.var)

_get_dim(cfvar::CFVariable, dimname) = _get_dim(cfvar.var, dimname)
function cfvariable(ds, varname)
    v = Variable(ds, string(varname))
    misval = missing_value(v)
    CDM.cfvariable(
        ds, varname;
        _v = v,
        missing_value = isnothing(misval) ? eltype(v)[] : [misval],
        attrib = cflayer_attributes(v),
    )
end
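
Note on the `missing_value` keyword above: the masking formerly done by hand in the commented-out `getindex` is now delegated to `CDM.cfvariable`. As a rough illustration of what that masking amounts to, here is a minimal standalone sketch. `mask_missing` is a hypothetical helper written for this note; it is not part of GRIBDatasets or CommonDataModel, and the actual CommonDataModel implementation may differ.

```julia
# Hypothetical helper (not part of GRIBDatasets or CommonDataModel):
# replace every entry equal to one of `missing_values` with `missing`.
function mask_missing(A::AbstractArray, missing_values::AbstractVector)
    # An empty vector plays the role of "no missing value defined",
    # mirroring `eltype(v)[]` in the diff: the data is returned untouched.
    isempty(missing_values) && return A
    return map(x -> x in missing_values ? missing : x, A)
end

# Made-up raw values; 9999.0 stands in for a GRIB missing-value flag.
raw = [233.31 9999.0; 231.276 230.121]
masked = mask_missing(raw, [9999.0])
```

Passing a (possibly empty) vector rather than `nothing` keeps the keyword type-stable and leaves room for several sentinel values.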

# In case of layer variable
cflayer_attributes(var::Variable{T, N, <: DA.AbstractDiskArray{T, N}}) where {T, N} = cflayer_attributes(parent(var).layer_index)

# In case of a coordinate variable
cflayer_attributes(var::Variable) = var.attrib

function cflayer_attributes(index::FileIndex)
    attributes = Dict{String, Any}()

    for (gribkey, cfkey) in CF_MAP_ATTRIBUTES
        if haskey(index, gribkey) && !occursin("unknown", getone(index, gribkey))
            attributes[cfkey] = join(index[gribkey], ", ")
        end
    end

    return attributes
end
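
Note on `cflayer_attributes(index::FileIndex)`: the GRIB-to-CF attribute mapping can be tried without a GRIB file. The sketch below swaps the `FileIndex` for a plain `Dict` of key to values, and assumes that `getone` returns the single value stored for a key (approximated here with `first`). `cf_attributes` and the sample data are hypothetical and exist only for this illustration.

```julia
# Same mapping as CF_MAP_ATTRIBUTES in src/constants.jl.
const CF_MAP_ATTRIBUTES = Dict(
    "cfName" => "standard_name",
    "name" => "long_name",
    "units" => "units",
)

# Hypothetical stand-in for cflayer_attributes(::FileIndex):
# `index` maps each GRIB key to the values found for it.
function cf_attributes(index::AbstractDict{String, Vector{String}})
    attributes = Dict{String, Any}()
    for (gribkey, cfkey) in CF_MAP_ATTRIBUTES
        haskey(index, gribkey) || continue
        vals = index[gribkey]
        # Assumption: `getone` yields the single value of a key; use `first` here.
        occursin("unknown", first(vals)) && continue
        attributes[cfkey] = join(vals, ", ")
    end
    return attributes
end

# Made-up index content for a temperature layer.
index = Dict(
    "cfName" => ["air_temperature"],
    "name" => ["Temperature"],
    "units" => ["K"],
    "shortName" => ["t"],  # not in CF_MAP_ATTRIBUTES, so it is ignored
)
attrs = cf_attributes(index)

# A key whose value contains "unknown" is skipped entirely.
partial = cf_attributes(Dict("cfName" => ["unknown"], "units" => ["K"]))
```

Only keys listed in `CF_MAP_ATTRIBUTES` reach the result, which is why the README output shows just `units`, `long_name` and `standard_name` on each variable.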
7 changes: 7 additions & 0 deletions src/constants.jl
@@ -254,6 +254,13 @@ const GRID_TYPES_2D_NON_DIMENSION_COORDS = [

const COORDINATE_VARIABLES_KEYS = vcat(keys(COORD_ATTRS) |> collect)

const CF_MAP_ATTRIBUTES = Dict(
"cfName" => "standard_name",
"name" => "long_name",
"units" => "units"
)

const KEYS_TO_SQUEEZE = ["number"]
# """
# GRIB_KEY_TO_DIMNAMES_MAP
# Maps the GRIB keys to the name the variable will have in the GRIBDataset.
30 changes: 20 additions & 10 deletions src/dataset.jl
@@ -64,7 +64,7 @@ julia> z[1:4, 3:6, 1, 1:2, 1]
51133.3 50806.3 50351.3 50399.3
```
"""
struct GRIBDataset{T, N}
struct GRIBDataset{T, N} <: AbstractDataset
index::FileIndex{T}
dims::NTuple{N, AbstractDim}
attrib::Dict{String, Any}
@@ -80,25 +80,35 @@ GRIBDataset(filepath::AbstractString; filter_by_values = Dict()) = GRIBDataset(F

Base.keys(ds::Dataset) = getvars(ds)
Base.haskey(ds::Dataset, key) = key in keys(ds)
Base.getindex(ds::Dataset, key) = Variable(ds, string(key))
Base.getindex(ds::Dataset, key) = cfvariable(ds, string(key))

getlayersid(ds::GRIBDataset) = ds.index["paramId"]
getlayersname(ds::GRIBDataset) = string.(ds.index["cfVarName"])

getvars(ds::GRIBDataset) = vcat(keys(ds.dims), getlayersname(ds))

_dim_values(ds::GRIBDataset, dim) = _dim_values(ds.index, dim)
_get_dim(ds::GRIBDataset, key) = _get_dim(ds.dims, key)

### Implementation of CommonDataModel
path(ds::GRIBDataset) = ds.index.grib_path
CDM.dim(ds::GRIBDataset, dimname::String) = dimlength(_get_dim(ds.dims, dimname))
dimnames(ds::GRIBDataset) = keys(ds.dims)

attribnames(ds::GRIBDataset) = keys(ds.attrib)
attrib(ds::GRIBDataset, attribname::String) = ds.attrib[attribname]

# _dim_values(ds::GRIBDataset, dim::Dimension{Horizontal}) = _dim_values(ds.index, dim)


function Base.show(io::IO, mime::MIME"text/plain", ds::Dataset)
println(io, "Dataset from file: $(ds.index.grib_path)")
show(io, mime, ds.dims)
println(io, "Layers:")
println(io, join(getlayersname(ds), ", "))
println(io, "with attributes:")
show(io, mime, ds.attrib)
end
# function Base.show(io::IO, mime::MIME"text/plain", ds::Dataset)
# println(io, "Dataset from file: $(ds.index.grib_path)")
# show(io, mime, ds.dim)
# println(io, "Layers:")
# println(io, join(getlayersname(ds), ", "))
# println(io, "with attributes:")
# show(io, mime, ds.attrib)
# end

function dataset_attributes(index::FileIndex)
attributes = Dict{String, Any}()