[misc] acceptance test GHA workflow #897

Merged
merged 42 commits into from
Dec 20, 2023
Changes from 36 commits
Commits
bc663c1
full unit test workflow for python
Dec 16, 2023
1ac94c5
temp addition of pull-request event to register workflow
Dec 16, 2023
ec3813b
add build tools install
Dec 16, 2023
fa9742f
add missing flag
Dec 16, 2023
af494ac
add git install
Dec 16, 2023
e7442c1
debugging
Dec 16, 2023
982cca7
more debugging
Dec 16, 2023
aecdd24
bump setup deps for package build
Dec 16, 2023
4139b30
even more debugging
Dec 16, 2023
e5c3aa0
more debugging
Dec 16, 2023
aaa02e7
clean up debug code
Dec 16, 2023
908a401
reorganize
Dec 16, 2023
8a9b2c4
rename
Dec 16, 2023
3638584
reorg files
Dec 16, 2023
7ed131f
first attempt at an R test
Dec 16, 2023
dec8dec
remove pull_request
Dec 16, 2023
05caaff
remove testthat pkg
Dec 16, 2023
99e21a4
iterating....
Dec 16, 2023
5a625a8
add cmake
Dec 17, 2023
7b375c4
install local package
Dec 17, 2023
033184f
refinement
Dec 17, 2023
1146398
debugging
Dec 17, 2023
4c7b0e7
refinement for OOM logging
Dec 17, 2023
35cfe78
fix some typos
Dec 17, 2023
aec8fad
comments
Dec 17, 2023
03da935
improve python logging
Dec 17, 2023
6a7a962
disable fast-fail of jobs
Dec 17, 2023
4953837
test smaller buffer for R
Dec 17, 2023
c3551ba
use smaller buffers
Dec 17, 2023
2b27bab
PR review f/b
Dec 18, 2023
60283ae
add tiledbsoma package spec
Dec 18, 2023
344b79b
debugging
Dec 18, 2023
8b0f858
Merge branch 'main' into bkmartinjr/full-unit-test
Dec 18, 2023
fda81e9
remove debugging code
Dec 18, 2023
a196062
Merge branch 'main' into bkmartinjr/full-unit-test
Dec 18, 2023
4f5f5f2
new LTS census has no unique cells in previous test range
Dec 18, 2023
5a4f561
add comments on usage
Dec 19, 2023
8f39352
link back to PR
Dec 19, 2023
71260e5
Merge branch 'main' into bkmartinjr/full-unit-test
Dec 19, 2023
c1e1479
cap memory use for R acceptance tests
Dec 19, 2023
8ea2332
Merge branch 'main' into bkmartinjr/full-unit-test
Dec 19, 2023
a2f19a6
set job timeout to 24 hours
Dec 20, 2023
108 changes: 108 additions & 0 deletions .github/workflows/full-unittests.yml
@@ -0,0 +1,108 @@
name: cellxgene_census package full unit tests

# Run all unit tests, including those that are too expensive to run frequently.
# This workflow requires a very large capacity runner, e.g., 1+TiB RAM, which is
# currently available through self-hosted runners. These runners have no swap,
# so an OOM will cause the workflow to fail with OOMKilled (exit code 137)

on:
  schedule:
    - cron: "0 1 * * 6" # every Saturday night, 1AM UTC

  workflow_dispatch: # used for debugging or manual validation of a branch
    inputs:
      tiledbsoma_python_dependency:
        # Accepts any package spec that pip understands, e.g.,
        #    tiledbsoma==1.0
        #    git+https://github.com/single-cell-data/TileDB-SOMA.git#egg=tiledbsoma&subdirectory=apis/python/
        #    git+https://github.com/single-cell-data/[email protected]#egg=tiledbsoma&subdirectory=apis/python/
        # or whatever...
        description: "tiledbsoma package specified as pip requirement"
        required: false
        default: ""
        type: string

jobs:
  py_unit_tests:
    runs-on: single-cell-1tb-runner
    strategy:
      fail-fast: false # prevent this job from killing other jobs
    steps:
      - name: log system state
        run: |
          free
          echo ---------
          df -kh
          echo ---------
          lscpu

      - name: install OS dependencies
        run: |
          sudo apt update
          sudo apt install -y build-essential git-all libxml2-dev libssl-dev libcurl4-openssl-dev cmake

      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: 3.11

      - name: install python dependencies (including experimental)
        run: |
          python -m pip install -U pip setuptools setuptools_scm wheel
          pip install --use-pep517 accumulation-tree # Geneformer dependency needs --use-pep517 for Cython
          pip install -r ./api/python/cellxgene_census/scripts/requirements-dev.txt
          pip install './api/python/cellxgene_census/[experimental]'

      - name: install tiledbsoma version override
        if: github.event_name == 'workflow_dispatch' && github.event.inputs.tiledbsoma_python_dependency != ''
        run: |
          # Due to a bug in the tiledbsoma setup, this must be installed editable, i.e., `-e`.
          # Filed as single-cell-data/TileDB-SOMA#1991
          pip install -e '${{ github.event.inputs.tiledbsoma_python_dependency }}'

      - name: pytest (--expensive --experimental)
        run: |
          echo '--------- tiledbsoma.show_package_versions():'
          python -c 'import tiledbsoma; tiledbsoma.show_package_versions()'
          echo '--------- PIP package versions:'
          pip freeze

          PYTHONPATH=. pytest -v --durations=0 -rP --experimental --expensive ./api/python/cellxgene_census/tests/

  r_unit_tests:
    runs-on: single-cell-1tb-runner

woot! just wanted to say 👋🏻 and introduce myself. I worked with @beroy getting these self-hosted runners set up. I'm watching them periodically and recording metrics on the nodes and autoscaler, but ping me if you see any unusual behavior or have other requirements you want us to add to them. Have fun!

Contributor Author

hi @jakeyheath - these are a game changer. Really happy to have them!

So far they seem to work great, other than missing some "nice-to-have" packages installed in their default image. I have given @beroy an initial list.

Easy to solve, we already chatted this morning about building a custom container for this project to use that has all the packages pre-installed.

Let's meet in the new year and see how things are going. Right now, everything is running in our czi-playground account. Might be good to put this project in a more specific location and tune the performance as needed. Here's a screenshot of the metrics I collected on the latest run. Happy to show y'all how to pull these yourselves and tune to your needs.

Screenshot 2023-12-18 at 12 15 17 PM

    strategy:
      fail-fast: false # prevent this job from killing other jobs
    steps:
      - name: log system state
        run: |
          free
          echo ---------
          df -kh
          echo ---------
          lscpu

      - name: install OS dependencies
        run: |
          sudo apt update
          sudo apt install -y build-essential git-all libxml2-dev libssl-dev libcurl4-openssl-dev cmake

      - uses: actions/checkout@v4

      - uses: r-lib/actions/setup-r@v2
        with:
          extra-repositories: https://tiledb-inc.r-universe.dev, https://cloud.r-project.org, https://chanzuckerberg.r-universe.dev

      - uses: r-lib/actions/setup-r-dependencies@v2
        with:
          working-directory: ./api/r/cellxgene.census
          extra-packages: any::rcmdcheck, any::remotes
          cache: true

      - name: testthat
        run: |
          Rscript -e 'remotes::install_local("./api/r/cellxgene.census")'
          Rscript -e 'library("tiledbsoma"); tiledbsoma::show_package_versions()'
          Rscript -e 'library("testthat"); library("cellxgene.census"); test_dir("./api/r/cellxgene.census/tests/")'
          Rscript -e 'library("cellxgene.census"); library(testthat); test_file("./api/r/cellxgene.census/tests/testthat/acceptance-tests.R")'
ebezzi marked this conversation as resolved.
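
The workflow header notes that an out-of-memory kill surfaces as exit code 137. A minimal sketch of how that number decodes, assuming the common shell convention that a process killed by signal N reports 128 + N. `describe_exit_code` is a hypothetical helper written for this illustration and is not part of the workflow:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe an exit code using the 128 + signal-number convention (assumed)."""
    if code > 128:
        signum = code - 128
        try:
            # signal.Signals maps a signal number to its symbolic name.
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            return f"killed by unknown signal {signum}"
    return "success" if code == 0 else f"exited with status {code}"


# 137 - 128 = 9, which is SIGKILL on Linux: the signal the OOM killer sends.
print(describe_exit_code(137))
```

Because the runners have no swap, a job step that ends this way most likely ran out of memory rather than failing a test assertion.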
2 changes: 1 addition & 1 deletion api/python/cellxgene_census/pyproject.toml
@@ -1,5 +1,5 @@
 [build-system]
-requires = ["setuptools>=45", "setuptools_scm[toml]>=6.2"]
+requires = ["setuptools>=64", "setuptools_scm[toml]>=8"]
Contributor Author

This is only tangentially related to the new GHA, but important, as the existing dependencies were quite obsolete and will fail in some scenarios. See https://setuptools-scm.readthedocs.io/en/latest/

build-backend = "setuptools.build_meta"

[project]
@@ -209,7 +209,7 @@ def test_hvg_vs_scanpy(
             "Homo sapiens",
             "is_primary_data == True",
             "dataset_id",
-            slice(500_000, 1_000_000),
+            slice(1_000_000, 4_000_000),
Contributor Author

the latest LTS census has no primary cells in the [500_000, 1_000_000] soma_joinid range, so this test will fail. Expanded the range to ensure we pick up some cells.
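
The failure mode in this comment can be sketched with toy data: a value filter combined with a join-id slice can legitimately select zero rows. This is a stand-alone illustration, not the Census API. `count_matching`, the sample records, and the inclusive treatment of both range bounds are assumptions made for the example:

```python
from typing import Iterable, Mapping


def count_matching(rows: Iterable[Mapping], lo: int, hi: int) -> int:
    """Count primary-data rows whose soma_joinid lies in [lo, hi] (inclusive)."""
    return sum(
        1
        for row in rows
        if row["is_primary_data"] and lo <= row["soma_joinid"] <= hi
    )


# Toy obs table (hypothetical): no primary cells fall inside the old test range.
obs = [
    {"soma_joinid": 600_000, "is_primary_data": False},
    {"soma_joinid": 900_000, "is_primary_data": False},
    {"soma_joinid": 2_500_000, "is_primary_data": True},
]

old_range = count_matching(obs, 500_000, 1_000_000)
new_range = count_matching(obs, 1_000_000, 4_000_000)
print(old_range, new_range)
```

With an empty selection the comparison against scanpy has nothing to compute on, which is why the test range had to move when the data changed.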

             marks=pytest.mark.expensive,
         ),
     ],
16 changes: 8 additions & 8 deletions api/r/cellxgene.census/tests/testthat/acceptance-tests.R
@@ -102,7 +102,7 @@ test_that("test_incremental_read_X_human", {
 })

 test_that("test_incremental_read_X_human-large-buffer-size", {
-  census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(4 * 1024**3))
+  census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(1 * 1024**3))
Contributor Author

on very high cpu count machines, this will OOM due to per-core buffer allocation. Reducing to a GiB resolves the issue
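
A back-of-the-envelope sketch of why the buffer size matters on a machine with many cores. It assumes a simplified model in which every core allocates one buffer of `soma.init_buffer_bytes`; the real allocation pattern inside TileDB-SOMA may differ, and the 256-core figure is purely illustrative rather than taken from the runner:

```python
GIB = 1024**3


def total_buffer_bytes(n_cores: int, init_buffer_bytes: int) -> int:
    """Worst-case buffer memory if every core allocates one buffer (assumed model)."""
    return n_cores * init_buffer_bytes


n_cores = 256  # hypothetical core count, not stated in the PR
for buffer_gib in (4, 1):
    total = total_buffer_bytes(n_cores, buffer_gib * GIB)
    print(f"{buffer_gib} GiB per core x {n_cores} cores = {total // GIB} GiB")
```

Under this model, 4 GiB buffers on 256 cores would claim 1 TiB before any data is held in R, while 1 GiB buffers would claim 256 GiB.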

   on.exit(census$close(), add = TRUE)

   organism <- "homo_sapiens"
@@ -126,7 +126,7 @@ test_that("test_incremental_read_X_mouse", {
 })

 test_that("test_incremental_read_X_mouse-large-buffer-size", {
-  census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(4 * 1024**3))
+  census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(1 * 1024**3))
   on.exit(census$close(), add = TRUE)

   organism <- "mus_musculus"
@@ -322,7 +322,7 @@ test_that("test_seurat_common-tissue", {
 })

 test_that("test_seurat_common-tissue-large-buffer-size", {
-  census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(4 * 1024**3))
+  census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(1 * 1024**3))
   on.exit(census$close(), add = TRUE)

   test_args <- list(
@@ -350,7 +350,7 @@ test_that("test_seurat_common-cell-type", {
 })

 test_that("test_seurat_common-cell-type-large-buffer-size", {
-  census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(4 * 1024**3))
+  census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(1 * 1024**3))
   on.exit(census$close(), add = TRUE)

   test_args <- list(
@@ -366,7 +366,7 @@ test_that("test_seurat_common-cell-type-large-buffer-size", {
 test_that("test_seurat_whole-enchilada-large-buffer-size", {
   # SKIP: R is not capable to load into memory
   if (FALSE) {
-    census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(4 * 1024**3))
+    census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(1 * 1024**3))
     on.exit(census$close(), add = TRUE)

     test_args <- list(
@@ -494,7 +494,7 @@ test_that("test_sce_common-tissue", {
 })

 test_that("test_sce_common-tissue-large-buffer-size", {
-  census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(4 * 1024**3))
+  census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(1 * 1024**3))
   on.exit(census$close(), add = TRUE)

   test_args <- list(
@@ -522,7 +522,7 @@ test_that("test_sce_common-cell-type", {
 })

 test_that("test_sce_common-cell-type-large-buffer-size", {
-  census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(4 * 1024**3))
+  census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(1 * 1024**3))
   on.exit(census$close(), add = TRUE)

   test_args <- list(
@@ -538,7 +538,7 @@ test_that("test_sce_common-cell-type-large-buffer-size", {
 test_that("test_sce_whole-enchilada-large-buffer-size", {
   # SKIP: R is not capable to load into memory
   if (FALSE) {
-    census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(4 * 1024**3))
+    census <- open_soma_latest_for_test(soma.init_buffer_bytes = paste(1 * 1024**3))
     on.exit(census$close(), add = TRUE)

     test_args <- list(