
Commit

update 20240709-pytorch-fig-benchmark.png, caption
ryan-williams committed Jul 19, 2024
1 parent 7612fb3 commit 9c9164f
Showing 2 changed files with 2 additions and 2 deletions.
Binary file modified docs/articles/2024/20240709-pytorch-fig-benchmark.png
4 changes: 2 additions & 2 deletions docs/articles/2024/20240709-pytorch.md
@@ -106,14 +106,14 @@ The balance between memory usage, efficiency, and level of randomness can be adj

We have made improvements to the loaders to reduce the amount of data transformations required from data fetching to model training. One such important change is to encode the expression data as a dense matrix immediately after the data is retrieved from disk/cloud.
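As a minimal sketch of the densify-early idea (the chunk shape, density, and variable names below are hypothetical, and we assume expression chunks are read as scipy CSR matrices):

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical stand-in for a chunk of expression data just read from
# disk/cloud: rows are cells, columns are genes, in sparse CSR format.
X_chunk = sp.random(1024, 2_000, density=0.05, format="csr", dtype=np.float32)

# Densify immediately after retrieval, so all downstream transformations
# (shuffling, batching, tensor conversion) operate on a dense np.ndarray.
X_dense = X_chunk.toarray()
```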

- In our benchmarks, we found that densifying data increases training speed ~3X while maintaining relatively constant memory usage (Figure 3). For this reason, we have disable the intermediate data processing in sparse format unless Torch Sparse Tensors are requested via the `ExperimentDataPipe` parameter `return_sparse_X`.
+ In our benchmarks, we found that densifying data increases training speed while maintaining relatively constant memory usage (Figure 3). For this reason, we have disabled the intermediate data processing in sparse format unless Torch Sparse Tensors are requested via the `ExperimentDataPipe` parameter `return_sparse_X`.
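As a sketch of how this surfaces in the loader API (per the `cellxgene_census` experimental ML module; the query filter and parameter values below are illustrative, not taken from this commit):

```python
import cellxgene_census
import cellxgene_census.experimental.ml as census_ml
import tiledbsoma as soma

census = cellxgene_census.open_soma()

datapipe = census_ml.ExperimentDataPipe(
    census["census_data"]["homo_sapiens"],
    measurement_name="RNA",
    X_name="raw",
    obs_query=soma.AxisQuery(value_filter="is_primary_data == True"),
    obs_column_names=["cell_type"],
    batch_size=128,
    shuffle=True,
    return_sparse_X=False,  # default: yield dense X batches (the faster path)
)

# Wrap the datapipe in a torch DataLoader for use in a training loop.
loader = census_ml.experiment_dataloader(datapipe)
```

With `return_sparse_X=True`, batches are instead yielded as Torch Sparse Tensors, re-enabling the intermediate sparse processing.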

```{figure} ./20240709-pytorch-fig-benchmark.png
:alt: Census PyTorch loaders benchmark
:align: center
:figwidth: 80%
- **Figure 3. Benchmark of memory usage and speed of data processing during modeling, default parameters lead to 3K+ samples/sec with 27GB of memory.** The benchmark was done processing 4M cells out of a 10M-cell Census, data was fetched from the cloud (S3). "Method" indicates the expression matrix encoding, circles are dense (np.array) and squares are sparse (scipy.csr). Size indicates the total number of cells per processing block (max cells materialized at any given time) and color is the number of individual randomly grabbed chunks composing a processing block, higher chunks per block lead to better shuffling. Data was fetched until modeling step, but no model was trained.
+ **Figure 3. Benchmark of memory usage and speed of data processing during modeling; default parameters lead to ≈2,500 samples/sec with 27GB of memory use.** The benchmark processed 4M cells out of a 10M-cell Census, with data streamed from the cloud (S3). "Method" indicates the expression matrix encoding: circles are dense ("np.array", now the default behavior) and squares are sparse ("scipy.csr"). Size indicates the total number of cells per processing block (the maximum number of cells materialized at any given time), and color is the number of randomly grabbed chunks composing each processing block; more chunks per block lead to better shuffling. Data was fetched up to the modeling step, but no model was trained.
```

We repeated the benchmark in Figure 3 under different conditions, encompassing varying numbers of total cells and multiple epochs; please [follow this link for the full benchmark report and code](https://github.com/ryan-williams/arrayloader-benchmarks).
