Performance tests on CESNET infra #16

In #15, first tests of Xarray on CESNET with Swift were proposed. They show that, thanks to @sebastian-luna-valero, writing to the Swift object storage with Zarr and Dask is fairly easy, even though some manual steps are still required.

One thing we might want to do now is run some light but comprehensive benchmarks to identify what performance we can get on this infrastructure.

A classical benchmark could be:
- Define some example datasets: small, medium, large (up to some TiB?).
  - 10GiB
  - 100GiB
  - 1TiB
- Write these datasets on Dask clusters of varying size:
  - 5 workers
  - 10 workers
  - 20 workers
- Read the datasets back on Dask clusters of varying size
- Compute things like throughput or other stats (maybe by analyzing the Dask task report)
- Analyze the results.
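
To make the plan above a bit more concrete, here is a minimal sketch of the benchmark matrix and the throughput computation, using only the Python standard library. All names (`DATASET_SIZES`, `Result`, `benchmark_matrix`, `time_run`) are hypothetical, and the actual write or read step (for instance an `xarray.Dataset.to_zarr` call against Swift) is assumed to be passed in as a callable; nothing here has been run on CESNET.

```python
import itertools
import time
from dataclasses import dataclass
from typing import Callable

GIB = 2**30
MIB = 2**20

# Dataset sizes and worker counts taken from the list above.
DATASET_SIZES = {"small": 10 * GIB, "medium": 100 * GIB, "large": 1024 * GIB}
WORKER_COUNTS = [5, 10, 20]
OPERATIONS = ["write", "read"]


@dataclass
class Result:
    """One timed run (hypothetical record layout)."""
    operation: str
    dataset: str
    n_workers: int
    nbytes: int
    seconds: float

    @property
    def throughput_mib_s(self) -> float:
        # Bytes moved per second of wall-clock time, in MiB/s.
        if self.seconds == 0:
            return float("inf")
        return self.nbytes / MIB / self.seconds


def benchmark_matrix():
    """Return every (operation, dataset name, worker count) combination."""
    return itertools.product(OPERATIONS, DATASET_SIZES, WORKER_COUNTS)


def time_run(operation: str, dataset: str, n_workers: int,
             action: Callable[[], None]) -> Result:
    """Time `action`, which is assumed to perform the real write or read."""
    start = time.perf_counter()
    action()
    elapsed = time.perf_counter() - start
    return Result(operation, dataset, n_workers, DATASET_SIZES[dataset], elapsed)


if __name__ == "__main__":
    for operation, dataset, n_workers in benchmark_matrix():
        # Placeholder action: replace with the real Dask/Zarr write or read.
        result = time_run(operation, dataset, n_workers, lambda: None)
        print(result.operation, result.dataset, result.n_workers, result.seconds)
```

With three dataset sizes, three cluster sizes and two operations this gives 18 runs; the wall-clock throughput from `Result` could then be compared against whatever the Dask report shows for the same run.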

We might want to start from something like https://github.com/pangeo-data/benchmarking, to which @tinaok contributed.

We also need to ask CESNET about any limits or constraints they may have, @sebastian-luna-valero.
