Testing on our large test nodes, the commands work quite well for a single subject, and I would like to parallelize them to process my entire study. Participants each have around 30 sessions.
Attempting to run one subject per job on our GPU cluster fails: the jobs keep getting killed for running out of memory. In addition, bidsmreye takes an extremely long time just to begin, on the order of several hours before processing starts.
#!/bin/bash -l
#SBATCH --job-name=[bidsmreye]
#SBATCH -o log/bidsmreye_%a.txt
#SBATCH -e log/bidsmreye_%a.err
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=8G
#SBATCH --account=DBIC
#SBATCH --partition=gpuq
#SBATCH --gres=gpu:2
#SBATCH --time=7-01:00:00
#SBATCH --mail-type=FAIL,END
#SBATCH --requeue
#SBATCH --array=0-11

# Output and error log directories
output_log_dir="log"
error_log_dir="log"

# Create the directories if they don't exist
mkdir -p "$output_log_dir"
mkdir -p "$error_log_dir"

# Must run on a GPU node
module load cuda
module load TensorRT
nvidia-smi
echo "$CUDA_VISIBLE_DEVICES"
hostname
# bidsmreye requires the input fMRI data (fmriprep outputs) to be at least realigned,
# with filenames and structure that conform to a BIDS derivative dataset.

# Had to add these lines to initialize conda
conda init bash
source ~/.bashrc
conda activate deepmreye
# Check if SLURM_ARRAY_TASK_ID is not set or is empty
if [ -z "$SLURM_ARRAY_TASK_ID" ]; then
# Set SLURM_ARRAY_TASK_ID to a default value, e.g., 0
SLURM_ARRAY_TASK_ID=0
fi
bids_dir="/dartfs-hpc/rc/lab/C/CANlab/labdata/data/WASABI/derivatives/fmriprep-try2"
output_dir="/dartfs-hpc/rc/lab/C/CANlab/labdata/data/WASABI/derivatives/deepmreye"
SUBJECTS=(SID000002 SID000743 SID001567 SID001651 SID001804 SID001907 SID001641 SID001684 SID001852 SID002035 SID002263 SID002328)
SUBJ=${SUBJECTS[$SLURM_ARRAY_TASK_ID]}
echo "processing bidsmreye for ${SUBJ}..."

# Prepare the data, then compute the eye movements (action prepare; action generalize).
# Prepare: registers the data to MNI if this is not the case already, registers the
# data to the deepmreye template, and extracts data from the eyes mask.
bidsmreye --action all \
${bids_dir} \
${output_dir} \
participant --participant_label ${SUBJ}

# Group-level summary
bidsmreye --action qc \
${bids_dir} \
${output_dir} \
participant --participant_label ${SUBJ}

echo "processing complete"
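One small hardening step for the script above (a sketch, not part of the original job file): derive the array upper bound from the subject list itself, so the hard-coded `--array=0-11` can never drift out of sync with `SUBJECTS`. An out-of-range `SLURM_ARRAY_TASK_ID` silently yields an empty `SUBJ`, which would make bidsmreye fail in a confusing way. The filename `bidsmreye_job.sh` below is a hypothetical placeholder for the job script.

```shell
# Hypothetical helper: size the job array from the subject list.
SUBJECTS=(SID000002 SID000743 SID001567)   # example subset of the full list
MAX_INDEX=$(( ${#SUBJECTS[@]} - 1 ))       # last valid zero-based array index
echo "submit with: sbatch --array=0-${MAX_INDEX} bidsmreye_job.sh"
```

Submitting with a computed range keeps the array size and the subject count defined in exactly one place.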
To further clarify: this occurs when using bidsmreye installed in a conda environment. The following messages appear before processing begins:
2023-09-18 12:41:18.717612: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-09-18 12:41:25.070354: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
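If part of the problem is TensorFlow pre-allocating nearly all GPU memory at startup (an assumption about the cause, not a confirmed diagnosis), one thing worth trying is TensorFlow's documented environment variable for on-demand allocation, exported in the job script before the `bidsmreye` calls:

```shell
# Ask TensorFlow to grow GPU allocations on demand instead of
# reserving most of the card's memory up front at import time.
export TF_FORCE_GPU_ALLOW_GROWTH=true
```

This does not change what bidsmreye computes; it only changes how TensorFlow claims GPU memory, which can matter when several array tasks land on the same node.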
Remi-Gau changed the title to "very slow start time when parallelizing" on Sep 18, 2023.