Add a check that there are visible GPUs (#722)
gregtatum authored Jul 29, 2024
1 parent 27f95c9 commit 8ba3952
Showing 2 changed files with 8 additions and 8 deletions.
4 changes: 4 additions & 0 deletions pipeline/bicleaner/bicleaner.sh
@@ -75,6 +75,10 @@ else
 biclean() {
 export CUDA_VISIBLE_ARRAY=(${CUDA_VISIBLE_DEVICES//,/ })
 export CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_ARRAY[$(($2-1))]}
+# When the GPU devices fail to be found, bicleaner AI falls back to running
+# very slowly on the CPU. To guard against this wasting expensive GPU time,
+# always check that it can find GPUs.
+python3 -c "import tensorflow; exit(0) if tensorflow.config.list_physical_devices('GPU') else exit(9001)"
 bicleaner-ai-classify ${hardrules} --scol ${scol} --tcol ${tcol} - - $1
 }
 export -f biclean
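The one-line check added in this hunk can also be written as a small standalone script, which may be easier to read and to log from. This is a minimal sketch and not the code in this commit: `NO_GPU_EXIT_STATUS`, `gpu_check_status`, and `main` are hypothetical names, and only the call to `tensorflow.config.list_physical_devices('GPU')` comes from the diff.

```python
import sys

# Same value as in the diff; the name is hypothetical.
NO_GPU_EXIT_STATUS = 9001


def gpu_check_status(devices):
    """Return 0 when at least one GPU device is listed, else the failure status."""
    return 0 if devices else NO_GPU_EXIT_STATUS


def main():
    # Imported lazily so gpu_check_status can be exercised without TensorFlow installed.
    import tensorflow

    devices = tensorflow.config.list_physical_devices("GPU")
    print(f"Visible GPUs: {len(devices)}", file=sys.stderr)
    return gpu_check_status(devices)
```

A script built on this would end with `sys.exit(main())`. The calling shell still has to act on a non-zero status, for example by running under `set -e` or by testing the status explicitly. Which of these applies to `bicleaner.sh` is not visible in this hunk.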
12 changes: 4 additions & 8 deletions taskcluster/kinds/bicleaner/kind.yml
@@ -80,7 +80,10 @@ tasks:
 COMPRESSION_CMD: zstdmt
 ARTIFACT_EXT: zst
 # 128 happens when cloning this repository fails
-retry-exit-status: [128]
+# 9001 is the code for when tensorflow fails to find GPUs on the system,
+# and bicleaner reverts to CPU time. Rather than waste time, we should
+# restart the task.
+retry-exit-status: [128,9001]
 
 # Don't run unless explicitly scheduled
 run-on-tasks-for: []
@@ -128,10 +131,3 @@ tasks:
 bicleaner-model:
 - artifact: bicleaner-ai-{src_locale}-{trg_locale}.tar.zst
 extract: true
-
-
-
-
-
-
-
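One detail worth checking about the 9001 status: on Linux and other POSIX systems, the exit status a parent process observes is truncated to its low 8 bits, so a child that calls `exit(9001)` is reported as 41 (9001 mod 256). This commit does not show whether the `retry-exit-status` list is compared against 9001 or against the truncated value, so the following is only a sketch for verifying the observed status. `observed_exit_status` is a hypothetical helper.

```python
import subprocess
import sys


def observed_exit_status(requested):
    """Run a child Python process that exits with `requested`; return what the parent sees."""
    result = subprocess.run(
        [sys.executable, "-c", f"import sys; sys.exit({requested})"]
    )
    return result.returncode


if __name__ == "__main__":
    # On POSIX systems this prints 41, because only the low 8 bits survive.
    print(observed_exit_status(9001))
```

If the worker compares against the truncated value, a status in the range 1 to 255 that is not already used for something else would sidestep the ambiguity.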