diff --git a/doc/web/images/speed-architecture-full.png b/doc/web/images/speed-architecture-full.png
new file mode 100644
index 0000000..8c3d24a
Binary files /dev/null and b/doc/web/images/speed-architecture-full.png differ
diff --git a/doc/web/index.html b/doc/web/index.html
index 5a2f9b8..5fe69e1 100644
--- a/doc/web/index.html
+++ b/doc/web/index.html
@@ -19,12 +19,12 @@
Speed: The GCS ENCS Cluster
-
Serguei A. Mokhov Gillian A. Roper Carlos Alarcón Meza Network, Security and HPC Group∗
- Gina Cody School of Engineering and Computer Science
- Concordia University
- Montreal, Quebec, Canada
- rt-ex-hpc~AT~encs.concordia.ca
-
Version 7.2-dev-03
+
Serguei A. Mokhov Gillian A. Roper Carlos Alarcón Meza Farah Salhany Network, Security and HPC Group∗
+ Gina Cody School of Engineering and Computer Science
+ Concordia University
+ Montreal, Quebec, Canada
+ rt-ex-hpc~AT~encs.concordia.ca
+
Version 7.2
∗The group acknowledges the initial manual version VI produced by Dr. Scott Bunnell while with
us as well as Dr. Tariq Daradkeh for his instructional support of the users and contribution of
examples.
@@ -32,131 +32,149 @@
Speed: The GCS ENCS Cluster
Abstract
-
This document presents a quick start guide to the usage of the Gina Cody School of
- Engineering and Computer Science compute server farm called “Speed” – the GCS Speed
- cluster, managed by the HPC/NAG group of the Academic Information Technology
- Services (AITS) at GCS, Concordia University, Montreal, Canada.
+
This document serves as a quick start guide to using the Gina Cody School of Engineering
+ and Computer Science (GCS ENCS) compute server farm, known as “Speed”, which is managed by
+ the HPC/NAG group of the Academic Information Technology Services (AITS) at GCS,
+ Concordia University, Montreal, Canada.
This document contains basic information required to use “Speed” as well as tips and tricks,
-examples, and references to projects and papers that have used Speed. User contributions
-of sample jobs and/ or references are welcome. Details are sent to the hpc-ml mailing
-list.
-
Note: On October 20, 2023 with workshops prior, we have completed migration to SLURM (see
-Figure 2) from Grid Engine (UGE/AGE) as our job scheduler, so this manual has been ported to use
-SLURM’s syntax and commands. If you are a long-time GE user, see Appendix A.2 key highlights of
-the move needed to translate your GE jobs to SLURM as well as environment changes. These are also
-elaborated throughout this document and our examples as well in case you desire to re-read
-it.
-
If you wish to cite this work in your acknowledgements, you can use our general DOI found on our
+
This document contains basic information required to use “Speed”, along with tips, tricks, examples,
+and references to projects and papers that have used Speed. User contributions of sample jobs and/or
+references are welcome.
+
Note: On October 20, 2023, we completed the migration to SLURM from Grid Engine (UGE/AGE)
+as our job scheduler. This manual has been updated to use SLURM’s syntax and commands. If you
+are a long-time GE user, refer to Appendix A.2 for key highlights needed to translate your GE jobs to
+SLURM as well as environment changes. These changes are also elaborated throughout this document
+and our examples.
+
+
+
1.1 Citing Us
+
If you wish to cite this work in your acknowledgements, you can use our general DOI found on our
GitHub page https://dx.doi.org/10.5281/zenodo.5683642 or a specific version of the manual and
-scripts from that link individually.
-
+scripts from that link individually. You can also use the “cite this repository” feature of
+GitHub.
+
Twenty four (24) 32-core compute nodes, each with 512 GB of memory and approximately
1 TB of local volatile-scratch disk space (pictured in Figure 1).
+
+
+
-
Twelve (12) NVIDIA Tesla P6 GPUs, with 16 GB of memory (compatible with the
+
Twelve (12) NVIDIA Tesla P6 GPUs, with 16 GB of GPU memory (compatible with the
CUDA, OpenGL, OpenCL, and Vulkan APIs).
-
4 VIDPRO nodes, with 6 P6 cards, and 6 V100 cards (32GB), and 256GB of RAM.
+
4 VIDPRO nodes (ECE, Dr. Amer), with 6 P6 cards and 6 V100 cards (32GB), and
+ 256GB of RAM.
-
7 new SPEED2 servers with 64 CPU cores each 4x A100 80 GB GPUs, partitioned into
- 4x 20GB each; larger local storage for TMPDIR.
-
-
-
+
7 new SPEED2 servers with 256 CPU cores each and 4x A100 80 GB GPUs, partitioned into
+ 4x 20GB MIGs each; larger local storage for TMPDIR (see Figure 2).
One AMD FirePro S7150 GPU, with 8 GB of memory (compatible with the DirectX,
- OpenGL, OpenCL, and Vulkan APIs).
+ OpenGL, OpenCL, and Vulkan APIs).
+
+
Salus compute node (CSSE CLAC, Drs. Bergler and Kosseim), 56 cores and 728GB of
+ RAM, see Figure 2.
+
+
Magic subcluster partition (ECE, Dr. Khendek, 11 nodes, see Figure 2).
+
+
Nebular subcluster partition (CIISE, Drs. Yan, Assi, Ghafouri, et al., Nebulae GPU
+ node with 2x RTX 6000 Ada 48GB cards, Stellar compute node, and Matrix 177TB
+ storage/compute node, see Figure 2).
+
-
1.4 What Speed Is Ideal For
+
1.5 What Speed Is Ideal For
-
To design and develop, test and run parallel, batch, and other algorithms, scripts with
- partial data sets. “Speed” has been optimised for compute jobs that are multi-core aware,
+
Design, develop, test, and run parallel, batch, and other algorithms and scripts with
+ partial data sets. “Speed” has been optimized for compute jobs that are multi-core aware,
require a large memory space, or are iteration intensive.
-
Prepare them for big clusters:
+
Prepare jobs for large clusters such as:
Digital Research Alliance of Canada (Calcul Quebec and Compute Canada)
@@ -238,20 +282,20 @@
Multi-node multi-core jobs (MPI).
-
Anything that can fit into a 500-GB memory space and a scratch space of approximately
- 10 TB.
+
Anything that can fit into a 500-GB memory space and a speed-scratch space of
+ approximately 10 TB.
CPU-based jobs.
-
CUDA GPU jobs (speed-01|-03|-05, speed-17, speed-37–speed-43).
+
CUDA GPU jobs.
-
Non-CUDA GPU jobs using OpenCL (speed-19 and -01|03|05|17|25|27|37-43).
-
+
Non-CUDA GPU jobs using OpenCL.
+
-
1.5 What Speed Is Not
+
1.6 What Speed Is Not
Speed is not a web host and does not host websites.
@@ -265,249 +309,256 @@
1.5
Speed is not for jobs executed outside of the scheduler. (Jobs running outside of the
scheduler will be killed and all data lost.)
-
-
-
1.6 Available Software
-
We have a great number of open-source software available and installed on “Speed” – various
-Python, CUDA versions, C++/Java compilers, OpenGL, OpenFOAM, OpenCV, TensorFlow,
-OpenMPI, OpenISS, MARF [26], etc. There are also a number of commercial packages, subject to
-licensing contributions, available, such as MATLAB [13, 25], Abaqus [1], Ansys, Fluent [2],
-etc.
-
To see the packages available, run ls -al /encs/pkg/ on speed.encs. In particular, there are
-over 2200 programs available in /encs/bin and /encs/pkg under Scientific Linux 7 (EL7). We are
-building an equivalent array of programs for the EL9 SPEED2 nodes.
+
+
1.7 Available Software
+
There are a wide range of open-source and commercial software available and installed on “Speed.”
+This includes Abaqus [1], AllenNLP, Anaconda, ANSYS, Bazel, COMSOL, CPLEX, CUDA, Eclipse,
+Fluent [2], Gurobi, MATLAB [15, 30], OMNeT++, OpenCV, OpenFOAM, OpenMPI, OpenPMIx,
+ParaView, PyTorch, QEMU, R, Rust, and Singularity among others. Programming environments
+include various versions of Python, C++/Java compilers, TensorFlow, OpenGL, OpenISS, and
+MARF [31].
+
In particular, there are over 2200 programs available in /encs/bin and /encs/pkg under Scientific
+Linux 7 (EL7). We are building an equivalent array of programs for the EL9 SPEED2 nodes. To see
+the packages available, run ls -al /encs/pkg/ on speed.encs. See a complete list in
+Appendix D.
+
Note: We do our best to accommodate custom software requests. Python environments can use
+user-custom installs from within the scratch directory.
+
+
+
1.8 Requesting Access
+
After reviewing the “What Speed is” (Section 1.5) and “What Speed is Not” (Section 1.6), request
+access to the “Speed” cluster by emailing: rt-ex-hpc AT encs.concordia.ca.
+
+
GCS ENCS faculty and staff may request access directly.
+
-
Popular concrete examples:
+
GCS students must include the following in their request message:
-
MATLAB (R2016b, R2018a, R2018b, ...)
+
GCS ENCS username
-
Fluent (19.2, ...)
+
Name and email (CC) of the approver – either a supervisor, course instructor, or a
+ department representative (e.g., in the case of undergraduate or M.Eng. students it
+ can be the Chair, associate chair, a technical officer, or a department administrator)
+ for approval.
-
Singularity containers (see Section 2.16) can run other operating systems and Linux
- distributions, like Ubuntu’s, as well as converted Docker containers.
+
Written request from the approver for the GCS ENCS username to be granted access
+ to “Speed.”
-
We do our best to accommodate custom software requests. Python environments can use
- user-custom installs from within the scratch directory.
+
Non-GCS students taking a GCS course will have their GCS ENCS account created
+ automatically, but still need the course instructor’s approval to use the service.
-
-
A number of specific environments are available and can be loaded using the module command:
-
-
-
Python (2.3.x - 3.11.x)
-
-
Gurobi (7.0.1, 7.5.0, 8.0.0, 8.1.0)
-
-
Ansys (16, 17, 18, 19)
-
-
OpenFOAM (2.3.1, 3.0.1, 5.0, 6.0)
-
-
Cplex 12.6.x to 12.8.x
-
-
OpenMPI 1.6.x, 1.8.x, 3.1.3
-
+
Non-GCS faculty and students need to get a “sponsor” within GCS, so that a guest GCS ENCS
+ account is created first. A sponsor can be any GCS Faculty member you collaborate with.
+ Failing that, request approval from our Dean’s Office via our Associate Deans, Drs. Eddie
+ Hoi Ng or Emad Shihab.
+
+
External entities collaborating with GCS Concordia researchers should also go through the
+ Dean’s Office for approvals.
+
+
+
2 Job Management
+
We use SLURM as the workload manager. It supports primarily two types of jobs: batch and
+interactive. Batch jobs are used to run unattended tasks, whereas interactive jobs are ideal for
+setting up virtual environments, compilation, and debugging.
+
Note: In the following instructions, anything bracketed like <> indicates a label/value to be replaced
+(the entire bracketed term needs replacement).
+
Job instructions in a script start with #SBATCH prefix, for example:
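+A minimal sketch of such a directive block (the job name and memory value here are only placeholders):
+
+ #SBATCH --job-name=my-job    ## illustrative job name
+ #SBATCH --mem=1G             ## all jobs must set --mem
+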
-
-
1.7 Requesting Access
-
After reviewing the “What Speed is” (Section 1.4) and “What Speed is Not” (Section 1.5), request
-access to the “Speed” cluster by emailing: rt-ex-hpc AT encs.concordia.ca. GCS ENCS
-faculty and staff may request access directly. Students must include the following in their
-message:
+
For complex compute steps within a script, use srun. We recommend using salloc for interactive
+jobs as it supports multiple steps. However, srun can also be used to start interactive jobs (see
+Section 2.8). Common and required job parameters include:
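+These include memory (--mem), time (-t), job name (-J), the Slurm project account (-A), partition
+(-p), mail-type, ntasks (-n), and cpus-per-task (-c). For illustration only, a single salloc line
+combining several of them might look like this (all values are placeholders):
+
+ salloc -J myjob -A speed1 -p ps -n 1 -c 4 --mem=4G -t 30
+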
+
-
GCS ENCS username
-
-
Name and email (CC) of the supervisor or instructor
-
-
Written request from the supervisor or instructor for the ENCS username to be granted
- access to “Speed”
-
Non-GCS faculty / students need to get a “sponsor” within GCS, such that your guest GCS ENCS
-account is created first. A sponsor can be any GCS Faculty member you collaborate with. Failing
-that, request the approval from our Dean’s Office; via our Associate Deans Drs. Eddie Hoi Ng or
-Emad Shihab. External entities to Concordia who collaborate with GCS Concordia researchers, should
-also go through the Dean’s office for approvals. Non-GCS students taking a GCS course do have their
-GCS ENCS account created automatically, but still need the course instructor’s approval to use the
-service.
-
-
-
2 Job Management
-
In these instructions, anything bracketed like so, <>, indicates a label/value to be replaced (the entire
-bracketed term needs replacement). We use SLURM as the Workload Manager. It supports
-primarily two types of jobs: batch and interactive. Batch jobs are used to run unattended
-tasks.
-
TL;DR: Job instructions in a script start with #SBATCH prefix, for example:
+
We use srun for every complex compute step inside the script. Use interactive jobs to set up virtual
-environments, compilation, and debugging. salloc is preferred; allows multiple steps. srun can start
-interactive jobs as well (see Section 2.8). Required and common job parameters: memory (mem),
-time (t), job-name (J), slurm project account (A), partition (p), mail-type, ntasks (n),
-cpus-per-task.
-
-
-
2.1 Getting Started
-
Before getting started, please review the “What Speed is” (Section 1.4) and “What Speed is Not”
-(Section 1.5). Once your GCS ENCS account has been granted access to “Speed”, use
+
2.1 Getting Started
+
Before getting started, please review the “What Speed is” (Section 1.5) and “What Speed is Not”
+(Section 1.6). Once your GCS ENCS account has been granted access to “Speed”, use
your GCS ENCS account credentials to create an SSH connection to speed (an alias for
-speed-submit.encs.concordia.ca). All users are expected to have a basic understanding of Linux
-and its commonly used commands (see Appendix B.1 for resources).
-
+speed-submit.encs.concordia.ca).
+
All users are expected to have a basic understanding of Linux and its commonly used commands
+(see Appendix B for resources).
+
-
2.1.1 SSH Connections
-
Requirements to create connections to Speed:
+
2.1.1 SSH Connections
+
Requirements to create connections to “Speed”:
-
An active GCS ENCS user account, which has permission to connect to Speed (see
- Section 1.7).
+
Active GCS ENCS user account: Ensure you have an active GCS ENCS user account
+ with permission to connect to Speed (see Section 1.8).
-
If you are off campus, an active connection to Concordia’s VPN. Accessing Concordia’s
- VPN requires a Concordia netname.
+
+ VPN Connection (for off-campus access): If you are off-campus, you will need to
+ establish an active connection to Concordia’s VPN, which requires a Concordia netname.
-
Windows systems require a terminal emulator such as PuTTY, Cygwin, or MobaXterm.
+
+ Terminal Emulator for Windows: Windows systems require a terminal emulator such as
+ PuTTY, Cygwin, or MobaXterm.
-
-
-
-
macOS systems do have a Terminal app for this or xterm that comes with XQuarz.
-
Open up a terminal window and type in the following SSH command being sure to replace
-<ENCSusername> with your ENCS account’s username.
+
Terminal for macOS: macOS systems have a built-in Terminal app or xterm that comes
+ with XQuartz.
+
To create an SSH connection to Speed, open a terminal window and type the following command,
+replacing <ENCSusername> with your ENCS account’s username:
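+For example (speed is the alias for speed-submit.encs.concordia.ca mentioned above):
+
+ ssh <ENCSusername>@speed.encs.concordia.ca
+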
After creating an SSH connection to Speed, you will need to make sure the srun, sbatch, and salloc
-commands are available to you. Type the command name at the command prompt and press enter.
-If the command is not available, e.g., (“command not found”) is returned, you need to
-make sure your $PATH has /local/bin in it. To view your $PATH type echo $PATH at the
-prompt.
-
The next step is to copy a job template to your home directory and to set up your cluster-specific
-storage. Execute the following command from within your home directory. (To move to your home
-directory, type cd at the Linux prompt and press Enter.)
+commands are available to you. To check this, type each command at the prompt and press Enter. If
+“command not found” is returned, you need to make sure your $PATH includes /local/bin. You can
+check your $PATH by typing:
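+
+ echo $PATH
+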
Tip: the default shell for GCS ENCS users is tcsh. If you would like to use bash, please contact
-rt-ex-hpc AT encs.concordia.ca.
-
Note: If a “command not found” error appears after you log in to speed, your user account many
-have probably have defunct Grid Engine environment commands. See Appendix A.2 to learn how to
-prevent this error on login.
-
-
-
2.2 Job Submission Basics
-
Preparing your job for submission is fairly straightforward. Start by basing your job script on one of the
-examples available in the src/ directory of our GitHub’s (https://github.com/NAG-DevOps/speed-hpc).
-Job scripts are broken into four main sections:
+
+
The next step is to set up your cluster-specific storage, “speed-scratch”. To do so, execute the following
+command from within your home directory.
+
+
+
+
+ mkdir -p /speed-scratch/$USER && cd /speed-scratch/$USER
+
+
+
Next, copy a job template to your cluster-specific storage:
-
Directives
+
From Windows drive G: to Speed: cp /winhome/<1st letter of $USER>/$USER/example.sh /speed-scratch/$USER/
-
Module Loads
-
-
User Scripting
-
You can clone the tip of our repository to get the examples to start with or download them
-individually via a browser or command line:
+
From Linux drive U: to Speed: cp ~/example.sh /speed-scratch/$USER/
+
Tip: the default shell for GCS ENCS users is tcsh. If you would like to use bash, please contact
+rt-ex-hpc AT encs.concordia.ca.
+
Note: If you encounter a “command not found” error after logging in to Speed, your user account
+may have defunct Grid Engine environment commands. See Appendix A.2 for instructions on how to
+resolve this issue.
+
+
+
2.2 Job Submission Basics
+
Preparing your job for submission is fairly straightforward. Start by basing your job script on one of
+the examples available in the src/ directory of our GitHub repository. You can clone the repository to
+get the examples to start with via the command line:
+ git clone --depth=1 https://github.com/NAG-DevOps/speed-hpc.git
+ cd speed-hpc/src
-
-
Then to quickly run some sample jobs, you can:
+
+
The job script is a shell script that contains directives, module loads, and user scripting. To quickly
+run some sample jobs, use the following commands:
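+For instance, once inside speed-hpc/src, the tcsh.sh sample discussed in Section 2.3 can be submitted
+directly (the partition and time limit here are illustrative):
+
+ sbatch -p ps -t 10 ./tcsh.sh
+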
Directives are comments included at the beginning of a job script that set the shell and the options for
the job scheduler. The shebang directive is always the first line of a script. In your job script, this
directive sets which shell your script’s commands will run in. On “Speed”, we recommend that your
-script use a shell from the /encs/bin directory.
+script use a shell from the /encs/bin directory.
To use the tcsh shell, start your script with #!/encs/bin/tcsh. For bash, start with
-#!/encs/bin/bash. Directives that start with #SBATCH, set the options for the cluster’s Slurm job
-scheduler. The script template, template.sh, provides the essentials:
+#!/encs/bin/bash.
+
Directives that start with #SBATCH set the options for the cluster’s SLURM job scheduler. The
+following provides an example of some essential directives:
-
-#SBATCH --job-name=<jobname> ## or -J. Give the job a name
-#SBATCH --mail-type=<type> ## Set type of email notifications
-#SBATCH --chdir=<directory> ## or -D, Set working directory where output files will go
-#SBATCH --nodes=1 ## or -N, Node count required for the job
-#SBATCH --ntasks=1 ## or -n, Number of tasks to be launched
-#SBATCH --cpus-per-task=<corecount> ## or -c, Core count requested, e.g. 8 cores
-#SBATCH --mem=<memory> ## Assign memory for this job, e.g., 32G memory per node
-
-
-
Replace the following to adjust the job script for your project(s)
-
-
<jobname> with a job name for the job
+
+ #SBATCH --job-name=<jobname> ## or -J. Give the job a name
+ #SBATCH --mail-type=<type> ## set type of email notifications
+ #SBATCH --chdir=<directory> ## or -D, set working directory for the job
+ #SBATCH --nodes=1 ## or -N, node count required for the job
+ #SBATCH --ntasks=1 ## or -n, number of tasks to be launched
+ #SBATCH --cpus-per-task=<corecount> ## or -c, core count requested, e.g. 8 cores
+ #SBATCH --mem=<memory> ## assign memory for this job,
+ ## e.g., 32G memory per node
+
+
+
Replace the following to adjust the job script for your project(s)
+
+
<jobname> with a job name for the job. This name will be displayed in the job queue.
-
<directory> with the fullpath to your job’s working directory, e.g., where your code,
+
<directory> with the fullpath to your job’s working directory, e.g., where your code,
source files and where the standard output files will be written to. By default, --chdir
- sets the current directory as the job’s working directory
+ sets the current directory as the job’s working directory.
-
<type> with the type of e-mail notifications you wish to receive. Valid options are: NONE,
- BEGIN, END, FAIL, REQUEUE, ALL
+
<type> with the type of e-mail notifications you wish to receive. Valid options are: NONE,
+ BEGIN, END, FAIL, REQUEUE, ALL.
-
<corecount> with the degree of multithreaded parallelism (i.e., cores) allocated to your
+
<corecount> with the degree of multithreaded parallelism (i.e., cores) allocated to your
job. Up to 32 by default.
-
<memory> with the amount of memory, in GB, that you want to be allocated per node. Up
- to 500 depending on the node. NOTE: All jobs MUST set a value for the --mem option.
-
Example with short option equivalents:
+
<memory> with the amount of memory, in GB, that you want to be allocated per node.
+ Up to 500 depending on the node. Note: All jobs MUST set a value for the --mem option.
+
Example with short option equivalents:
-
-#SBATCH -J tmpdir ## Job’s name set to ’tmpdir’
-#SBATCH --mail-type=ALL ## Receive all email type notifications
-#SBATCH -D ./ ## Use current directory as working directory
-#SBATCH -N 1 ## Node count required for the job
-#SBATCH -n 1 ## Number of tasks to be launched
-#SBATCH -c 1 ## Request 8 cores
-#SBATCH --mem=32G ## Allocate 32G memory per node
-
-
-
If you are unsure about memory footprints, err on assigning a generous memory space to
+
+ #SBATCH -J myjob ## Job’s name set to ’myjob’
+ #SBATCH --mail-type=ALL ## Receive all email type notifications
+ #SBATCH -D ./ ## Use current directory as working directory
+ #SBATCH -N 1 ## Node count required for the job
+ #SBATCH -n 1 ## Number of tasks to be launched
+ #SBATCH -c 8 ## Request 8 cores
+ #SBATCH --mem=32G ## Allocate 32G memory per node
+
+
+
Tip: If you are unsure about memory footprints, err on the side of assigning a generous memory space to
your job, so that it does not get prematurely terminated. You can refine --mem values
for future jobs by monitoring the size of a job’s active memory space on speed-submit
with:
@@ -515,28 +566,28 @@
2.2.1
-
-sacct -j <jobID>
-sstat -j <jobID>
+
+ sacct -j <jobID>
+ sstat -j <jobID>
-
-
This can be customized to show specific columns:
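+For example (JobID, JobName, MaxVMSize, Elapsed, and State are standard sacct format fields; adjust to
+taste):
+
+ sacct -j <jobID> --format=JobID,JobName,MaxVMSize,Elapsed,State
+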
+
Memory-footprint values are also provided for completed jobs in the final e-mail notification as
-“maxvmsize”. Jobs that request a low-memory footprint are more likely to load on a busy
-cluster.
-
Other essential options are --time, or -t, and --account, or -A.
Memory-footprint efficiency values (seff) are also provided for completed jobs in the final email
+notification as “maxvmsize”. Jobs that request a low-memory footprint are more likely to load on a
+busy cluster.
+
Other essential options are --time, or -t, and --account, or -A.
--time=<time> – is the estimate of wall clock time required for your job to run. As
- preiviously mentioned, the maximum is 7 days for batch and 24 hours for interactive jobs.
+ previously mentioned, the maximum is 7 days for batch and 24 hours for interactive jobs.
Jobs with a smaller time value will have a higher priority and may result in your job
being scheduled sooner.
@@ -544,150 +595,135 @@
2.2.1
resources your job uses should be attributed to. When moving from GE to SLURM,
most users were assigned to Speed’s two default accounts speed1 and speed2. However,
users that belong to a particular research group or project will have a default account
- like the following aits, vidpro, gipsy, ai2, mpackir, cmos, among others.
-
-
-
-
2.2.2 Module Loads
-
As your job will run on a compute or GPU “Speed” node, and not the submit node, any software that
-is needed must be loaded by the job script. Software is loaded within the script just as it would be
-from the command line.
-
To see a list of which modules are available, execute the following from the command line on
-speed-submit.
-
-
-
+ such as aits, vidpro, gipsy, ai2, mpackir, cmos, among others.
+
-
-module avail
-
-
-
To list for a particular program (matlab, for example):
+
2.2.2 Working with Modules
+
After setting the directives in your job script, the next section typically involves loading the necessary
+software modules. The module command is used to manage the user environment; make sure to load
+all the modules your job depends on. You can check available modules with the module avail
+command. Loading the correct modules ensures that your environment is properly set up for
+execution.
+
To list for a particular program (matlab, for example):
-module -t avail matlab
+ module avail
+ module -t avail matlab ## show the list for a particular program (e.g., matlab)
+ module -t avail m ## show the list for all programs starting with m
-
-
Which, of course, can be shortened to match all that start with a particular letter:
+
+
For example, insert the following in your script to load the matlab/R2023a module:
-module -t avail m
+ module load matlab/R2023a/default
-
-
Insert the following in your script to load the matlab/R2020a) module:
+
+
Note: you can remove a module from active use by replacing load by unload.
+
To list loaded modules:
-module load matlab/R2020a/default
+ module list
-
-
Use, unload, in place of, load, to remove a module from active use.
-
To list loaded modules:
+
+
To purge all software in your working environment:
-module list
+ module purge
-
-
To purge all software in your working environment:
-
-
-
-
-
-module purge
-
-
-
Typically, only the module load command will be used in your script.
-
-
-
2.2.3 User Scripting
-
The last part the job script is the scripting that will be executed by the job. This part of
-the job script includes all commands required to set up and execute the task your script
-has been written to do. Any Linux command can be used at this step. This section can
-be a simple call to an executable or a complex loop which iterates through a series of
-commands.
-
Any compute heavy step is preferably should be prefixed by srun as the best practice.
-
Every software program has a unique execution framework. It is the responsibility of the script’s
-author (e.g., you) to know what is required for the software used in your script by reviewing the
-software’s documentation. Regardless of which software your script calls, your script should be written
-so that the software knows the location of the input and output files as well as the degree of
-parallelism.
-
Jobs which touch data-input and data-output files more than once, should make use of TMPDIR, a
-scheduler-provided working space almost 1 TB in size. TMPDIR is created when a job starts, and
-exists on the local disk of the compute node executing your job. Using TMPDIR results
-in faster I/O operations than those to and from shared storage (which is provided over
+
+
+
+
2.2.3 User Scripting
+
The final part of the job script involves the commands that will be executed by the job. This section
+should include all necessary commands to set up and run the tasks your script is designed to perform.
+You can use any Linux command in this section, ranging from a simple executable call to a complex
+loop iterating through multiple commands.
+
Best Practice: prefix any compute-heavy step with srun. This ensures you gain proper insights on
+the execution of your job.
+
+Each software program may have its own execution framework; it is the script’s author’s (e.g., your)
+responsibility to review the software’s documentation to understand its requirements. Your script
+should be written to clearly specify the location of input and output files and the degree of parallelism
+needed.
+
+Jobs that involve multiple interactions with data input and output files should make use of TMPDIR, a
+scheduler-provided workspace nearly 1 TB in size. TMPDIR is created on the local disk of the compute
+node at the start of a job, offering faster I/O operations compared to shared storage (provided over
NFS).
-
An sample job script using TMPDIR is available at /home/n/nul-uge/templateTMPDIR.sh: the job
+
+A sample job script using TMPDIR is available at /home/n/nul-uge/templateTMPDIR.sh: the job
is instructed to change to $TMPDIR, to make the new directory input, to copy data from
$SLURM_SUBMIT_DIR/references/ to input/ ($SLURM_SUBMIT_DIR represents the current working
directory), to make the new directory results, to execute the program (which takes input from
$TMPDIR/input/ and writes output to $TMPDIR/results/), and finally to copy the total end results
-to an existing directory, processed, that is located in the current working directory. TMPDIR only
+to an existing directory, processed, that is located in the current working directory. TMPDIR only
exists for the duration of the job, though, so it is very important to copy relevant results from it at
job’s end.
-
+
-
2.3 Sample Job Script
-
Now, let’s look at a basic job script, tcsh.sh in Figure 3 (you can copy it from our GitHub page or
-from /home/n/nul-uge).
+
The first line is the shell declaration (also know as a shebang) and sets the shell to tcsh. The lines
-that begin with #SBATCH are directives for the scheduler.
-
+
+The first line is the shell declaration (also known as a shebang) and sets the shell to tcsh. The lines that
+begin with #SBATCH are directives for the scheduler.
-
-J (or --job-name) sets tcsh-test as the job name
-
-
--chdir tells the scheduler to execute the job from the current working directory
+
-J (or --job-name) sets tcsh-test as the job name.
-
--mem=1GB requests and assigns 1GB of memory to the job. Jobs require the --mem option
- to be set either in the script or on the command line; if it’s missing job submission
+
--mem=1GB requests and assigns 1GB of memory to the job. Jobs require the --mem option
+ to be set either in the script or on the command line; if it’s missing, job submission
will be rejected.
-
The script then:
-
-
-
Sleeps on a node for 30 seconds
+
The script then:
+
+
Sleeps on a node for 30 seconds.
-
Uses the module command to load the gurobi/8.1.0 environment
+
Uses the module command to load the gurobi/8.1.0 environment.
-
Prints the list of loaded modules into a file
-
The scheduler command, sbatch, is used to submit (non-interactive) jobs. From an ssh session on
-speed-submit, submit this job with sbatch ./tcsh.sh. You will see, "Submitted batch job 2653"
-where \(2653\) is a job ID assigned. The commands, squeue and sinfo can be used to look at the status of
-the cluster: squeue -l. You will see something like this:
+
Prints the list of loaded modules into a file.
+
The scheduler command, sbatch, is used to submit (non-interactive) jobs. From an ssh session on
+“speed-submit”, submit this job with
+
+
+
+
+
+ sbatch ./tcsh.sh
+
+
+
+You will see Submitted batch job 2653, where 2653 is the job ID assigned. The commands squeue and
+sinfo can be used to look at the status of the cluster:
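+
+ squeue -l
+ sinfo
+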
@@ -709,58 +745,143 @@
2.3
pt up 7-00:00:00 7 idle speed-[37-43]
pa up 7-00:00:00 4 idle speed-[01,03,25,27]
-
-
Remember that you only have 30 seconds before the job is essentially over, so if you do not see a
-similar output, either adjust the sleep time in the script, or execute the sbatch statement more
-quickly. The squeue output listed above shows you that your job is running on node speed-07, that it
-has a job number of 2654, its time limit of 7 days, etc.
-
Once the job finishes, there will be a new file in the directory that the job was started from,
-with the syntax of, slurm-"job id".out, so in this example the file is, slurm-2654.out.
+
+
Remember that you only have 30 seconds before the job is essentially over, so if you do not see a
+similar output, either adjust the sleep time in the script, or execute the squeue statement more
+quickly. The squeue output listed above shows that your job 2654 is running on node speed-07, and
+its time limit is 7 days, etc.
+
Once the job finishes, there will be a new file in the directory that the job was started from,
+with the syntax of, slurm-<job id>.out, so in this example the file is, slurm-2654.out.
This file represents the standard output (and error, if there is any) of the job in question.
If you look at the contents of your newly created file, you will see that it contains the
output of the module list command. Important information is often written to this
file.
-
2.4 Common Job Management Commands Summary
-
Here are useful job-management commands:
+
2.4 Common Job Management Commands Summary
+
Here is a summary of useful job management commands for handling various aspects of job
+submission and monitoring on the Speed cluster:
-
sbatch -A <ACCOUNT> --t
- <MINUTES> --mem=20G -p <PARTITION> ./<myscript>.sh: once that your job script is
- ready, on speed-submit you can submit it using this
-
-
squeue -u <ENCSusername>: you can check the status of your job(s)
+
+
Submitting a job:
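+As in the earlier revision of this list, once your job script is ready, you can submit it from
+speed-submit with:
+
+ sbatch -A <ACCOUNT> -t <MINUTES> --mem=20G -p <PARTITION> ./<myscript>.sh
+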
-
-
squeue: display cluster status for all users. -A shows per account (e.g., vidpro, gipsy,
- speed1, ai2, aits, etc.), -p per partition (ps, pg, pt, pa), and others. man squeue for
- details.
-
-
squeue --job [job-ID]: display job information for [job-ID] (said job may be actually
- running, or waiting in the queue).
-
-
squeue -las: displays individual job steps (for debugging easier to see which step failed
- if you used srun).
-
-
watch -n 1 "sinfo -Nel -pps,pt,pg,pa && squeue -la": view sinfo information
- and watch the queue for your job(s).
-
-
scancel [job-ID]: cancel job [job-ID].
-
-
scontrol hold [job-ID]: hold queued job, [job-ID], from running.
-
-
scontrol release [job-ID]: release held job [job-ID].
+
+
+
+
See man sacct or sacct -e for details of the available formatting options. You can define your
preferred default format in the SACCT_FORMAT environment variable in your .cshrc or .bashrc
files.
-
seff [job-ID]: reports on the efficiency of a job’s cpu and memory utilization. Don’t execute it
- on RUNNING jobs (only on completed/finished jobs), efficiency statistics may be
- misleading.
-
If you define the following directive in your batch script, your ENCS email address will receive
- an email with seff output when your job is finished.
+
Displaying job efficiency: (including CPU and memory utilization)
-
+
+ seff <job-ID>
+
+
+
+ Do not execute it on RUNNING jobs (only on completed/finished jobs), or the efficiency statistics
+ may be misleading. If you define the following directive in your batch script, your
+ GCS ENCS email address will receive an email with seff’s output when your job is
+ finished.
+
+
+
+
In addition to the basic sbatch options presented earlier, there are a few additional options that are
+
2.5 Advanced sbatch Options
+
In addition to the basic sbatch options presented earlier, there are several advanced options that are
generally useful:
-
--mail-type=TYPE: requests that the scheduler e-mail you when a job changes state.
- Where TYPE is ALL, BEGIN, END, or FAIL. Mail is sent to the default address of, "<ENCSusername>@encs.concordia.ca", which you can consult via webmail.encs via
- the VPN, on login.encs via alpine or setup forwarding to @concordia.ca address or offsite,
- unless a different address is supplied (see, --mail-user). The report sent when a job ends
- includes job runtime, as well as the maximum memory value hit (maxvmem).
-
-
--mail-user email@domain.com: requests that the scheduler use this e-mail notification
- address, rather than the default (see, --mail-type).
-
-
--export=[ALL | NONE | variables]: exports environment variable(s) that can be used
- by the script.
+
+
E-mail notifications:
-
-
-t [min] or DAYS-HH:MM:SS: sets a job runtime of min or HH:MM:SS. Note that if you
- give a single number, that represents minutes, not hours.
-
-
--depend=[state:job-ID]: run this job only when job [job-ID] finishes. Held jobs appear
- in the queue.
-
-
The many sbatch options available are read with, man sbatch. Also note that sbatch options can
-be specified during the job-submission command, and these override existing script options (if
-present). The syntax is, sbatch [options] PATHTOSCRIPT, but unlike in the script, the options are
-specified without the leading #SBATCH (e.g., sbatch -J sub-test --chdir=./ --mem=1G
-./tcsh.sh).
-
-
2.6 Array Jobs
-
Array jobs are those that start a batch job or a parallel job multiple times. Each iteration of the job
-array is called a task and receives a unique job ID. Only supported for batch jobs; submit time \(< 1\)
-second, compared to repeatedly submitting the same regular job over and over even from a
-script.
-
To submit an array job, use the --array option of the sbatch command as follows:
+
+ --mail-type=<TYPE>
+
+
Requests the scheduler to send an email when the job changes state. <TYPE> can be ALL, BEGIN,
+ END, or FAIL. Mail is sent to the default address of,
-
-sbatch --array=n-m[:s]] <batch_script>
+
+ <ENCSusername>@encs.concordia.ca
-
-
-t Option Syntax:
-
-
n: indicates the start-id.
-
-
m: indicates the max-id.
-
-
s: indicates the step size.
-
Examples:
-
-
sbatch --array=1-50000 -N1 -i my_in_%a -o my_out_%a array.sh: submits a job
- with 50000 elements, %a maps to the task-id between 1 and 50K.
-
-
sbatch --array=10 array.sh: submits a job with 1 task where the task-id is 10.
-
-
sbatch --array=1-10 array.sh: submits a job with 10 tasks numbered consecutively
- from 1 to 10.
-
-
sbatch --array=3-15:3 array.sh: submits a jobs with 5 tasks numbered consecutively
- with step size 3 (task-ids 3,6,9,12,15).
-
Output files for Array Jobs:
-
The default and output and error-files are slurm-job_id_task_id.out. This means that Speed
-creates an output and an error-file for each task generated by the array-job as well as
-one for the super-ordinate array-job. To alter this behavior use the -o and -e option of
-sbatch.
-
For more details about Array Job options, please review the manual pages for sbatch by executing
-the following at the command line on speed-submit man sbatch.
+
which you can consult via webmail.encs.concordia.ca (use VPN from off-campus) unless a
+ different address is supplied (see, --mail-user). The report sent when a job ends includes job
+ runtime, as well as the maximum memory value hit (maxvmem).
-
-#SBATCH -n 1
-#SBATCH -c [#cores for threads of a single process]
+
+ -t <MINUTES> or DAYS-HH:MM:SS
-
-
Both sbatch and salloc support -n on the command line, and it should always be used either in
-the script or on the command line as the default \(n=1\). Do not request more cores than you think
-will be useful, as larger-core jobs are more difficult to schedule. On the flip side, though, if you are
-going to be running a program that scales out to the maximum single-machine core count available,
-please (please) request 32 cores, to avoid node oversubscription (i.e., to avoid overloading the
-CPUs).
-
Important note about --ntasks or --ntasks-per-node (-n) talks about processes (usually the
-ones ran with srun). --cpus-per-task (-c) corresponds to threads per process. Some programs
-consider them equivalent, some don’t. Fluent for example uses --ntasks-per-node=8 and
---cpus-per-task=1, some just set --cpus-per-task=8 and --ntasks-per-node=1. If one of them is
-not \(1\) then some applications need to be told to use \(n*c\) total cores.
-
Core count associated with a job appears under, “AllocCPUS”, in the, qacct -j, output.
+
+ Sets a job runtime of min or HH:MM:SS. Note that if you give a single number, that represents
+ minutes, not hours. The set runtime should not exceed the default maximums of 24h for
+ interactive jobs and 7 days for batch jobs.
+
Job sessions can be interactive, instead of batch (script) based. Such sessions can be useful for testing,
-debugging, and optimising code and resource requirements, conda or python virtual environments
-setup, or any likewise preparatory work prior to batch submission.
-
-
-
2.8.1 Command Line
-
To request an interactive job session, use, salloc [options], similarly to a sbatch command-line
-job, e.g.,
+
Runs the job only when the specified job <job-ID> finishes. This is useful for creating job
+ chains where subsequent jobs depend on the completion of previous ones.
+
Note: sbatch options can be specified during the job-submission command, and these override
+existing script options (if present). The syntax is
-
-salloc -J interactive-test --mem=1G -p ps -n 8
+
+sbatch [options] PATHTOSCRIPT
-
Inside the allocated salloc session you can run shell commands as usual; it is recommended to use
-srun for the heavy compute steps inside salloc. If it is a quick a short job just to compile something,
-e.g., on a GPU node you can use an interactive srun directly (note no srun can run within srun),
-e.g., a 1 hour allocation:
-
For tcsh:
+
but unlike in the script, the options are specified without the leading #SBATCH e.g.:
Array jobs are those that start a batch job or a parallel job multiple times. Each iteration of the job
+array is called a task and receives a unique job ID. Array jobs are particularly useful for running a
+large number of similar tasks with slight variations.
+
+To submit an array job (only supported for batch jobs), use the --array option of the sbatch
+command as follows:
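+The general form (carried over from the earlier revision) is:
+
+ sbatch --array=n-m[:s] <batch_script>
+
+where n is the start-id, m is the max-id, and s is the optional step size.
+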
Output files for Array Jobs: The default output and error-files are slurm-job_id_task_id.out. This means that Speed
+creates an output and an error-file for each task generated by the array-job, as well as
+one for the super-ordinate array-job. To alter this behavior use the -o and -e options of
+sbatch.
+
For more details about Array Job options, please review the manual pages for sbatch by executing
+the following at the command line on speed-submit: man sbatch.
+
For jobs that can take advantage of multiple machine cores, you can request up to 32 cores (per job)
+in your script using the following options:
+
+
+
+
+
+#SBATCH -n <#cores for processes>
+#SBATCH -n 1
+#SBATCH -c <#cores for threads of a single process>
+
+
+
Both sbatch and salloc support -n on the command line, and it should always be used either in the
+script or on the command line as the default \(n=1\).
+
Important Considerations:
+
+
Do not request more cores than you think will be useful, as larger-core jobs are more
+ difficult to schedule.
+
+
If you are running a program that scales out to the maximum single-machine core count
+ available, please request 32 cores to avoid node oversubscription (i.e., overloading the
+ CPUs).
+
Note: --ntasks or --ntasks-per-node (-n) refers to processes (usually the ones run with srun).
+--cpus-per-task (-c) corresponds to threads per process.
+
Some programs consider them equivalent, while others do not. For example, Fluent uses
+--ntasks-per-node=8 and --cpus-per-task=1, whereas others may set --cpus-per-task=8 and
+--ntasks-per-node=1. If one of these is not 1, some applications need to be configured to use n * c
+total cores.
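+For example, the Fluent-style combination mentioned above would be expressed as:
+
+ #SBATCH --ntasks-per-node=8
+ #SBATCH --cpus-per-task=1
+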
+
+Core count associated with a job appears under “AllocCPUS” in the sacct -j <job-id>
+output.
+
+
+
+
Interactive job sessions allow you to interact with the system in real-time. These sessions are
+particularly useful for tasks such as testing, debugging, optimizing code, setting up environments, and
+other preparatory work before submitting batch jobs.
+
+
+
2.8.1 Command Line
+
To request an interactive job session, use the salloc command with appropriate options. This is
+similar to submitting a batch job but allows you to run shell commands interactively within the
+allocated resources. For example:
+
+
+
+
+
+salloc -J interactive-test --mem=1G -p ps -n 8
+
+
+
Within the allocated salloc session, you can run shell commands as usual. It is recommended to
+use srun for compute-intensive steps within salloc. If you need a quick, short job just to compile
+something on a GPU node, you can use an interactive srun directly. For example, a 1-hour
+allocation:
+
If you need to run an on-Speed graphical-based UI application (e.g., MALTLAB, Abaqus CME, etc.),
-or an IDE (PyCharm, VSCode, Eclipse) to develop and test your job’s code interactively you need to
-enable X11-forwarding from your client machine to speed then to the compute node. To do
-so:
-
To run graphical UI applications (e.g., MATLAB, Abaqus CME, IDEs like PyCharm, VSCode,
+Eclipse, etc.) on Speed, you need to enable X11 forwarding from your client machine to Speed and
+then to the compute node. To do so, follow these steps:
+
-
-
you need to run an X server on your client machine, such as,
+
+
Run an X server on your client machine:
-
on Windows: MobaXterm with X turned on, or Xming + PuTTY with X11
+
Windows: Use MobaXterm with X turned on, or Xming + PuTTY with X11
forwarding, or XOrg under Cygwin
module load the required version, then matlab, or abaqus cme, etc.
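+For illustration, a typical forwarding chain starts by connecting to Speed with X11 forwarding
+enabled; whether the --x11 option is available for the scheduler step depends on the cluster’s SLURM
+build, so treat this only as a sketch:
+
+ ssh -X <ENCSusername>@speed.encs.concordia.ca
+ srun --x11 --pty --mem=4G -t 60 /encs/bin/tcsh
+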
-
Here’s an example of starting PyCharm (see Figure 4), of which we made a sample local
-installation. You can make a similar install under your own directory. If using VSCode, it’s currently
-only supported with the --no-sandbox option.
-
Note: with X11 forwarding the graphical rendering happens on your client machine! That is,
+you are not using GPUs on Speed to render graphics; instead, all graphical information is
+forwarded from Speed to your desktop or laptop over X11, which in turn renders it using its
+own graphics card. Thus, for GPU rendering jobs either keep them non-interactive or use
+VirtualGL.
+
Here’s an example of starting PyCharm (see Figure 5). Note: If using VSCode, it’s currently only
+supported with the --no-sandbox option.
+
The scheduler presents a number of environment variables that can be used in your jobs. You can
-invoke env or printenv in your job to know what hose are (most begin with the prefix SLURM). Some
-of the more useful ones are:
+
2.9 Scheduler Environment Variables
+
The scheduler provides several environment variables that can be useful in your job scripts. These
+variables can be accessed within the job using commands like env or printenv. Many of these
+variables start with the prefix SLURM.
+
Here are some of the most useful environment variables:
-
$TMPDIR – the path to the job’s temporary space on the node. It only exists for the
- duration of the job, so if data in the temporary space are important, they absolutely need
- to be accessed before the job terminates.
+
$TMPDIR (and $SLURM_TMPDIR): This is the path to the job’s temporary space on the node.
+ It only exists for the duration of the job. If you need the data from this temporary space,
+ ensure you copy it before the job terminates.
-
$SLURM_SUBMIT_DIR – the path to the job’s working directory (likely an NFS-mounted
+
$SLURM_SUBMIT_DIR: The path to the job’s working directory (likely an NFS-mounted
path). If, --chdir, was stipulated, that path is taken; if not, the path defaults to your
home directory.
-
$SLURM_JOBID – your current jobs ID, useful for some manipulation and reporting.
+
$SLURM_JOBID: This variable holds the current job’s ID, which is useful for job
+ manipulation and reporting within the job’s process.
-
$SLURM_JOB_NODELIST=nodes participating in your job.
+
+ $SLURM_NTASKS: the number of tasks (-n) requested for the job. This variable can be used in
+ place of hardcoded thread-request declarations, e.g., for Fluent or similar.
-
$SLURM_ARRAY_TASK_ID=for array jobs (see Section 2.6).
+
$SLURM_JOB_NODELIST: This lists the nodes participating in your job.
An example script that utilizes some of these environment variables is in Figure 12.
-
+
-
#!/encs/bin/tcsh
-
-#SBATCH --job-name=tmpdir       ## Give the job a name
-#SBATCH --mail-type=ALL         ## Receive all email type notifications
-#SBATCH --chdir=./              ## Use currect directory as working directory
-#SBATCH --nodes=1
-#SBATCH --ntasks=1
-#SBATCH --cpus-per-task=8       ## Request 8 cores
-#SBATCH --mem=32G               ## Assign 32G memory per node
-
-cd $TMPDIR
-mkdir input
-rsync -av $SLURM_SUBMIT_DIR/references/ input/
-mkdir results
-srun STAR --inFiles $TMPDIR/input --parallel $SRUN_CPUS_PER_TASK --outFiles $TMPDIR/results
-rsync -av $TMPDIR/results/ $SLURM_SUBMIT_DIR/processed/
+
#!/encs/bin/tcsh
+
+#SBATCH --job-name=tmpdir       ## Give the job a name
+#SBATCH --mail-type=ALL         ## Receive all email type notifications
+#SBATCH --chdir=./              ## Use current directory as working directory
+#SBATCH --nodes=1
+#SBATCH --ntasks=1
+#SBATCH --cpus-per-task=8       ## Request 8 cores
+#SBATCH --mem=32G               ## Assign 32G memory per node
+
+cd $TMPDIR
+mkdir input
+rsync -av $SLURM_SUBMIT_DIR/references/ input/
+mkdir results
+srun STAR --inFiles $TMPDIR/input --parallel $SRUN_CPUS_PER_TASK --outFiles $TMPDIR/results
+rsync -av $TMPDIR/results/ $SLURM_SUBMIT_DIR/processed/
-Figure 9: Source code for tmpdir.sh
+Figure 12: Source code for tmpdir.sh
-
2.10 SSH Keys For MPI
-
Some programs effect their parallel processing via MPI (which is a communication protocol). An
-example of such software is Fluent. MPI needs to have ‘passwordless login’ set up, which means SSH
-keys. In your NFS-mounted home directory:
+
2.10 SSH Keys for MPI
+
Some programs, such as Fluent, utilize MPI (Message Passing Interface) for parallel processing. MPI
+requires ‘passwordless login’, which is achieved through SSH keys. Here are the steps to set up SSH
+keys for MPI:
cat id_ed25519.pub >> authorized_keys (if the authorized_keys file already exists)
- OR cat id_ed25519.pub > authorized_keys (if does not)
-
-
Set file permissions of authorized_keys to 600; of your NFS-mounted home to 700
- (note that you likely will not have to do anything here, as most people will have those
- permissions by default).
-
-
-
2.11 Creating Virtual Environments
-
The following documentation is specific to the Speed HPC Facility at the Gina Cody School of
-Engineering and Computer Science. Virtual environments typically instantiated via Conda or Python.
-Another option is Singularity detailed in Section 2.16. Usually, virtual environments are created once
-during an interactive session before submitting a batch job to the scheduler. The job script submitted
-to the scheduler is then written to (1) activate the virtual environment, (2) use it, and (3) close it at
-the end of the job.
-
+
+
Navigate to the .ssh directory
+
+
+
-
2.11.1 Anaconda
-
Request an interactive session in the queue you wish to submit your jobs to (e.g., salloc -p pg
-–gpus=1 for GPU jobs). Once your interactive has started, create an anaconda environment in your
-speed-scratch directory by using the prefix option when executing conda create. For example,
+
+ cd ~/.ssh
+
+
+
+
+
Generate a new SSH key pair (Accept the default location and leave the passphrase
+ blank)
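+
+ ssh-keygen -t ed25519
+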
-to create an anaconda environment for a_user, execute the following at the command
-line:
+
List Environments.
- To view your conda environments, type: conda info --envs
+
+ cat id_ed25519.pub > authorized_keys
+
+
+
+
+
Set permissions: ensure the correct permissions are set for the ‘authorized_keys’ file and your
+ home directory (most users will already have these permissions by default):
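+As noted in the earlier revision, authorized_keys should be set to 600 and your NFS-mounted home
+directory to 700:
+
+ chmod 600 ~/.ssh/authorized_keys
+ chmod 700 ~
+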
The following documentation is specific to Speed. Other clusters may have their own requirements.
+Virtual environments are typically created using Conda or Python. Another option is Singularity
+(detailed in Section 2.16). These environments are usually created once during an interactive session
+before submitting a batch job to the scheduler. The job script submitted to the scheduler
+should:
+
+
Activate the virtual environment.
+
+
Use the virtual environment.
+
+
Deactivate the virtual environment at the end of the job.
+
-
Activate an Environment.
- Activate the environment speedscratcha_usermyconda as follows
+
2.11.1 Anaconda
+
To create an Anaconda environment, follow these steps:
+
+
+
Request an interactive session
-
-conda activate /speed-scratch/a_user/myconda
+
+ salloc -p pg --gpus=1
-
After activating your environment, add pip to your environment by using
+
+
+
+
Load the Anaconda module and create your Anaconda environment in your speed-scratch
+ directory by using the --prefix option (without this option, the environment will be created in
+ your home directory by default).
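+A sketch of this step; the Anaconda module name/version and the Python version are assumptions, so
+check module avail anaconda for the exact module name on Speed:
+
+ module load anaconda3/2023.03/default    ## assumed module name; verify with module avail
+ conda create --prefix /speed-scratch/$USER/myconda python=3.11    ## version is a placeholder
+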
Important Note: pip (and pip3) are used to install modules from the python distribution while
-conda install installs modules from anaconda’s repository.
+
+
Add pip to your environment (this will install pip and pip’s dependencies, including python,
+ into the environment.)
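+For example, assuming the --prefix path used above:
+
+ conda activate /speed-scratch/$USER/myconda
+ conda install pip
+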
+
+
+
-
Conda Env without –prefix:
- If you don’t want to use the prefix option every time you create a new environment and you
-don’t want to use the default $HOME. Create a new directory an set the following variables to point to
-the new created directory, e.g:
+
If you want to make these changes permanent, add the variables to your .tcshrc or .bashrc
-(depending on the default shell you are using)
-
+
+
If you encounter a “no space left” error while creating Conda environments, please refer to
+Appendix B.3. Most likely you forgot --prefix or the environment variables described below.
+
Important Note: pip (and pip3) are package installers for Python. When you use pip install, it
+installs packages from the Python Package Index (PyPI), whereas, conda install installs packages
+from Anaconda’s repository.
-
2.11.2 Python
-
Setting up a Python virtual environment is fairly straightforward. The first step is to request an
-interactive session in the queue you wish to submit your jobs to.
-
We have a simple example that use a Python virtual environment:
+
2.11.1.1 Conda Env without --prefix
+ If you don’t want to use the --prefix option every time you create a new environment and do not
+want to use the default home directory, you can create a new directory and set the following variables
+to point to the newly created directory, e.g.:
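+A sketch in tcsh syntax (the directory name is a placeholder; CONDA_ENVS_PATH and CONDA_PKGS_DIRS are
+standard Conda settings):
+
+ setenv CONDA_ENVS_PATH /speed-scratch/$USER/condas
+ setenv CONDA_PKGS_DIRS /speed-scratch/$USER/condas/pkgs
+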
+
+
+
Important Note: partition ps is used for CPU jobs, partitions pg, pt are used for GPU jobs, no
-need to use --gpus= when preparing environments for CPU jobs.
-
Important Note: our partition ps is used for CPU jobs, while pg, pt, and cl are used
+for GPU jobs. You do not need to use --gpus when preparing environments for CPU
+jobs.
+
+Note: Python environments are also preferred over Conda on some clusters; see the note
+in Section 2.8.3.3.
+
2.12 Example Job Script: Fluent
@@ -1620,11 +1963,11 @@
+
-
#!/encs/bin/tcsh
+
#!/encs/bin/tcsh

#SBATCH --job-name=flu10000     ## Give the job a name
#SBATCH --mail-type=ALL         ## Receive all email type notifications
@@ -1657,126 +2000,198 @@
date
-Figure 10: Source code for fluent.sh
+Figure 13: Source code for fluent.sh
-
The job script in Figure 10 runs Fluent in parallel over 32 cores. Of note, we have requested
-e-mail notifications (--mail-type), are defining the parallel environment for, fluent, with,
--t$SLURM_NTASKS and -g-cnf=$FLUENTNODES (very important), and are setting $TMPDIR as
-the in-job location for the “moment” rfile.out file (in-job, because the last line of the
-script copies everything from $TMPDIR to a directory in the user’s NFS-mounted home).
+
The job script in Figure 13 runs Fluent in parallel over 32 cores. Notable aspects of this script
+include requesting e-mail notifications (--mail-type), defining the parallel environment
+for Fluent with -t$SLURM_NTASKS and -g-cnf=$FLUENTNODES, and setting $TMPDIR as
+the in-job location for the “moment” rfile.out file. The script also copies everything
+from $TMPDIR to a directory in the user’s NFS-mounted home after the job completes.
Job progress can be monitored by examining the standard-out file (e.g., slurm-249.out),
-and/or by examining the “moment” file in /disk/nobackup/<yourjob> (hint: it starts
-with your job-ID) on the node running the job. Caveat: take care with journal-file file
+and/or by examining the “moment” file in TMPDIR (usually /disk/nobackup/<yourjob>
+(it starts with your job-ID)) on the node running the job. Be cautious with journal-file
paths.
-
2.13 Example Job: efficientdet
-
The following steps describing how to create an efficientdet environment on Speed, were submitted by
-a member of Dr. Amer’s research group.
+
2.13 Example Job: EfficientDet
+
The following steps describe how to create an EfficientDet environment on Speed, as submitted by a
+member of Dr. Amer’s research group:
-
Enter your ENCS user account’s speed-scratch directory cd /speed-scratch/<encs_username>
-
Jobs that call java have a memory overhead, which needs to be taken into account when assigning a
-value to --mem. Even the most basic java call, java -Xmx1G -version, will need to have,
---mem=5G, with the 4-GB difference representing the memory overhead. Note that this memory
-overhead grows proportionally with the value of -Xmx. To give you an idea, when -Xmx has a
-value of 100G, --mem has to be at least 106G; for 200G, at least 211G; for 300G, at least
-314G.
-
-
-
2.15 Scheduling On The GPU Nodes
-
The primary cluster has two GPU nodes, each with six Tesla (CUDA-compatible) P6 cards: each card
-has 2048 cores and 16GB of RAM. Though note that the P6 is mainly a single-precision card, so
-unless you need the GPU double precision, double-precision calculations will be faster on a CPU
-node.
-
Job scripts for the GPU queue differ in that they need this statement, which attaches either a
-single GPU, or, two GPUs, to the job:
+
Navigate to your speed-scratch directory:
-
-#SBATCH --gpus=[1|2]
+
+ cd /speed-scratch/$USER
+
-
-
Once that your job script is ready, you can submit it to the GPU partition (queue)
-with:
+
+
+
+
Load Python module
-
-sbatch -p pg ./<myscript>.sh
+
+ module load python/3.8.3
+
-
-
And you can query nvidia-smi on the node that is running your job with:
+
Jobs that call Java have a memory overhead, which needs to be taken into account when assigning a
+value to --mem. Even the most basic Java call, such as java -Xmx1G -version, will need
+--mem=5G, with the 4 GB difference representing the memory overhead. Note that this memory
+overhead grows proportionally with the value of -Xmx. For example:
+
+
+
When -Xmx has a value of 100G, --mem has to be at least 106G.
+
+
For -Xmx of 200G, --mem has to be at least 211G.
+
+
For -Xmx of 300G, --mem has to be at least 314G.
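As a job script fragment, this might look like the following hypothetical sketch (myapp.jar is a
placeholder for your own application):

    #SBATCH --mem=106G
    java -Xmx100G -jar myapp.jar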
+
+
+
+
+
+
2.15 Scheduling on the GPU Nodes
+
Speed has various GPU types in various subclusters of its nodes.
+
+
+
speed-05 and speed-17: The primary SPEED1 cluster has two GPU nodes, each with six
+ Tesla (CUDA-compatible) P6 cards. Each card has 2048 cores and 16GB of RAM. Note
+ that the P6 is mainly a single-precision card, so unless you need GPU double precision,
+ double-precision calculations will be faster on a CPU node.
+
+
speed-01: This vidpro node (see Figure 2, contact Dr. Maria Amer) is identical to speed-05
 and speed-17 in its GPU configuration, but priority goes to the vidpro group; that is,
 a pg job scheduled there is subject to preemption.
+
+
speed-03, speed-25, speed-27: These vidpro nodes feature NVIDIA V100 cards with
 32GB of RAM. Like speed-01, priority goes to the vidpro group, who purchased the
 nodes; others’ jobs are subject to preemption within the pg, pt, and cl partitions.
+
+
speed-37 – speed-43: SPEED2 nodes, the main backbone of the teaching partition pt,
 have 4x A100 80GB GPUs each, partitioned on average into 4x 20GB MIGs each, with
 some exceptions.
+
+
nebulae: A member of the Nebular subcluster (contact Dr. Jun Yan), has 2x 48GB RTX
+ Ada 6000 cards. This node is in the pn partition.
+
+
speed-19: has an AMD Tonga GPU with 16GB of GPU RAM. This node, along with the
 majority of the NVIDIA GPU nodes, is in the cl partition (with restrictions) to run
 OpenCL, Vulkan, and HIP jobs.
+
Job scripts for the GPU queues differ in that they need these statements, which attach one or
+more GPUs to the job within the appropriate partition:
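For example, a minimal sketch based on the pre-migration example (adjust the GPU count and
partition to your case):

    #SBATCH --gpus=1
    #SBATCH -p pg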
+
+
+
+
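You can query nvidia-smi on the NVIDIA GPU node running your job, for example (substitute the
node your job was assigned to; speed-05 here is illustrative):

    ssh <ENCSusername>@speed-05 nvidia-smi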
You can query rocm-smi on the AMD GPU node running your job with:
-
-sinfo -p pg --long --Node
+
+ ssh <ENCSusername>@speed-19 rocm-smi
-
-
Very important note regarding TensorFlow and PyTorch: if you are planning to run TensorFlow
-and/or PyTorch multi-GPU jobs, do not use the tf.distribute and/or torch.nn.DataParallel functions on speed-01,05,17, as they will crash the compute node (100%
-certainty). This appears to be the current hardware’s architecture’s defect. The workaround is to
-either manually effect GPU parallelisation (TensorFlow has an example on how to do this), or to run
-on a single GPU.
-
Important
-
Users without permission to use the GPU nodes can submit jobs to the pg partition, but those
-jobs will hang and never run. Their availability is seen with:
+
+
Important note for TensorFlow and PyTorch users: if you are planning to run
+TensorFlow and/or PyTorch multi-GPU jobs, please do not use the tf.distribute and/or
+torch.nn.DataParallel functions on speed-01, speed-05, or speed-17, as they will crash the
+compute node (100% certainty). This appears to be a defect in the current hardware architecture. The
+workaround is to either manually effect GPU parallelisation (see Section 2.15.1; TensorFlow
+provides an example of how to do this), or to run on a single GPU, which is now the default for
+those nodes.
+
Important: Users without permission to use the GPU nodes can submit jobs to the various
+GPU partitions, but those jobs will hang and never run. Their availability can be seen
+with:
This status demonstrates that most are available (i.e., have not been requested as resources).
-GPU node, add, --gpus=[#GPUs], to your sbatch (statement/script) or salloc (statement) request. For example,
-sbatch -t 10 --mem=1G --gpus=1 -p pg ./tcsh.sh. You will see that this job has been assigned to one of the GPU
-nodes.
+
+
To specifically request a GPU node, add --gpus=[#GPUs] to your sbatch statement/script or
+salloc statement. For example:
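    sbatch -t 10 --mem=1G --gpus=1 -p pg ./tcsh.sh

You will see that the job is then assigned to one of the GPU nodes (./tcsh.sh here stands for
your own job script).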
+
+
+
+
As described lines above, P6 cards are not compatible with Distribute and DataParallel functions
-(Pytorch, Tensorflow) when running on Multi-GPUs. One workaround is to run the job in
-Multi-node, single GPU per node; per example:
+
As described earlier, P6 cards are not compatible with Distribute and DataParallel functions
+(PyTorch, TensorFlow) when running on multiple GPUs. One workaround is to run the
+job multi-node, with a single GPU per node (this applies to the P6 nodes: speed-01, speed-05,
+and speed-17):
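A minimal sketch of the relevant resource requests (illustrative only; see
pytorch-multinode-multigpu.sh mentioned below for a complete job script):

    #SBATCH --nodes=2
    #SBATCH --gpus-per-node=1
    #SBATCH --ntasks-per-node=1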
An example script for training on multiple nodes with multiple GPUs is provided in
+pytorch-multinode-multigpu.sh.
+
2.15.2 CUDA
-
When calling CUDA within job scripts, it is important to create a link to the desired CUDA libraries and
-set the runtime link path to the same libraries. For example, to use the cuda-11.5 libraries, specify
-the following in your Makefile.
+
When calling CUDA within job scripts, it is important to link to the desired CUDA
+libraries and set the runtime link path to the same libraries. For example, to use the cuda-11.5
+libraries, specify the following in your Makefile.
For CUDA to compile properly for the GPU partition, edit your Makefile replacing
-usrlocalcuda with one of the above.
-
+
+
For CUDA to compile properly for the GPU partition, edit your Makefile replacing /usr/local/cuda
+with one of the above.
+
2.15.4 OpenISS Examples
-
These represent more comprehensive research-like examples of jobs for computer vision and other
-tasks with a lot longer runtime (a subject to the number of epochs and other parameters) derive from
-the actual research works of students and their theses. These jobs require the use of CUDA
+
These examples represent more comprehensive research-like jobs for computer vision and other tasks
+with longer runtime (subject to the number of epochs and other parameters). They derive
+from the actual research works of students and their theses and require the use of CUDA
and GPUs. These examples are available as “native” jobs on Speed and as Singularity
containers.
+
Examples include:
-
OpenISS and REID
-
- The example openiss-reid-speed.sh illustrates a job for a computer-vision based person
-re-identification (e.g., motion capture-based tracking for stage performance) part of the OpenISS
-project by Haotao Lai [10] using TensorFlow and Keras. The fork of the original repo [12] adjusted to
-to run on Speed is here:
-
OpenISS and YOLOv3
-
- The related code using YOLOv3 framework is in the the fork of the original repo [11] adjusted to
-to run on Speed is here:
+
2.15.4.1 OpenISS and REID
 This example covers computer-vision-based person re-identification (e.g., motion capture-based tracking
+for stage performance), part of the OpenISS project by Haotao Lai [12], using TensorFlow and Keras. The
+script is available here: openiss-reid-speed.sh. The fork of the original repo [14] adjusted to run on Speed is
+available here: openiss-reid-tfk. Detailed instructions on how to run it on Speed are in the README:
+https://github.com/NAG-DevOps/speed-hpc/tree/master/src#openiss-reid-tfk
Its example job scripts can run on both CPUs and GPUs, as well as interactively using
-TensorFlow:
+
2.15.4.2 OpenISS and YOLOv3
 The related code using the YOLOv3 framework is in the fork of the original repo [13], adjusted
+to run on Speed, and is available here: openiss-yolov3.
+
Example job scripts can run on both CPUs and GPUs, as well as interactively using TensorFlow.
If the /encs software tree does not have a required software instantaneously available, another option
-is to run Singularity containers. We run EL7 flavor of Linux, and if some projects require Ubuntu or
-other distributions, there is a possibility to run that software as a container, including the ones
-translated from Docker.
-
The example lambdal-singularity.sh showcases an immediate use of a container built for the
-Ubuntu-based LambdaLabs software stack, originally built as a Docker image then pulled
-in as a Singularity container that is immediately available for use as that job example
-illustrates. The source material used for the docker image was our fork of their official repo:
-https://github.com/NAG-DevOps/lambda-stack-dockerfiles
-
-
-
-
NOTE: It is important if you make your own containers or pull from DockerHub, use your
-/speed-scratch/$USER directory as these images may easily consume gigs of space in your home
-directory and you’d run out of quota there very fast.
-
We likewise built equivalent OpenISS (Section 2.15.4) containers from their Docker
-counter parts as they were used for teaching and research [14]. The images from
+
Singularity is a container platform designed to execute applications in a portable, reproducible, and
+secure manner. Unlike Docker, Singularity does not require root privileges, making it more suitable for
+HPC environments. If the /encs software tree does not have the required software available, another
+option is to run Singularity containers. We run EL7 and EL9 flavors of Linux, and if some projects
+require Ubuntu or other distributions, it is possible to run that software as a container,
+including those converted from Docker. The currently recommended version of Singularity is
+singularity/3.10.4/default.
+
The example lambdal-singularity.sh showcases an immediate use of a container built for the
+Ubuntu-based LambdaLabs software stack, originally built as a Docker image then pulled in as a
+Singularity container. The source material used for the docker image was our fork of their official
+repository: https://github.com/NAG-DevOps/lambda-stack-dockerfiles.
+
Note: If you make your own containers or pull from DockerHub, use your /speed-scratch/$USER
+directory, as these images may easily consume gigabytes of space in your home directory, quickly
+exhausting your quota.
+
Tip: To check your quota and find big files, see Section B.3 and ENCS Data Storage.
+
We have also built equivalent OpenISS (Section 2.15.4) containers from their
+Docker counterparts for teaching and research purposes [16]. The images from
https://github.com/NAG-DevOps/openiss-dockerfiles and their DockerHub equivalents
-https://hub.docker.com/u/openiss are found in the same public directory on
-/speed-scratch/nag-public as the LambdaLabs Singularity image. They all have .sif extension.
-Some of them can be ran in both batch or interactive mode, some make more sense to
-run interactively. They cover some basics with CUDA, OpenGL rendering, and computer
-vision tasks as examples from the OpenISS library and other libraries, including the base
-images that use different distros. We also include Jupyter notebook example with Conda
+https://hub.docker.com/u/openiss can be found in /speed-scratch/nag-public with a ‘.sif’
+extension. Some can be run in both batch and interactive modes, covering basics with CUDA,
+OpenGL rendering, and computer vision tasks. Examples include Jupyter notebooks with Conda
support.
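For example, to see which images are available, you can list that directory:

    ls /speed-scratch/nag-public/*.sif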
The currently recommended version of Singularity is singularity/3.10.4/default.
-
This section comprises an introduction to working with Singularity, its containers, and what can
-and cannot be done with Singularity on the ENCS infrastructure. It is not intended to be an
-exhaustive presentation of Singularity: the program’s authors do a good job of that here:
-https://www.sylabs.io/docs/. It also assumes that you have successfully installed Singularity on a
-user-managed/personal system (see next paragraph as to why).
-
Singularity containers are essentially either built from an existing container, or are built from
-scratch. Building from scratch requires a recipe file (think of like a Dockerfile), and the operation must
-be effected as root. You will not have root on the ENCS infrastructure, so any built-from-scratch
-containers must be created on a user-managed/personal system. Root-level permissions are also
-required (in some cases, essential; in others, for proper build functionality) for building from an
-existing container. Three types of Singularity containers can be built: file-system; sandbox; squashfs.
-The first two are “writable” (meaning that changes can persist after the Singularity session ends).
-File-system containers are built around the ext3 file system, and are a read-write “file”, sandbox
-containers are essentially a directory in an existing read-write space, and squashfs containers are
-a read-only compressed “file”. Note that file-system containers cannot be resized once
-built.
-
Note that the default build is a squashfs one. Also note what Singularity’s authors have to say
-about the builds, “A common workflow is to use the “sandbox” mode for development of
-the container, and then build it as a default (squashfs) Singularity image when done.”
-File-system containers are considered to be, “legacy”, at this point in time. When built, a
-very small overhead is allotted to a file-system container (think, MB), and that cannot be
-changed.
-
Probably for the most of your workflows you might find there is a Docker container exists for your
-tasks, in this case you can use the docker pull function of Singularity as a part of you virtual
-environment setup as an interactive job allocation:
-
-
-
-
This method can be used for converting Docker containers directly on Speed. On GPU nodes make
-sure to pass on the --nv flag to Singularity, so its containers could access the GPUs. See the linked
-example.
-
This section introduces working with Singularity, its containers, and what can and cannot be done
+with Singularity on the ENCS infrastructure. For comprehensive documentation, refer to the authors’
+guide: https://www.sylabs.io/docs/.
+
Singularity containers are either built from an existing container or from scratch. Building from
+scratch requires a recipe file (similar to a Dockerfile) and must be done with root permissions,
+which are not available on the ENCS infrastructure. Therefore, built-from-scratch containers
+must be created on a user-managed/personal system. There are three types of Singularity
+containers:
+
+
+
File-system containers: built around the ext3 file system; they are a read-write “file”, but
 cannot be resized once built.
+
+
Sandbox containers: essentially a directory in an existing read-write space and are also
+ read-write.
+
+
Squashfs containers: a read-only compressed “file”. This is the default build
 type.
+
As Singularity’s authors note about builds: “A common workflow is to use the “sandbox” mode for
+container development, and then build it as a default (squashfs) Singularity image when done.”
+File-system containers are considered legacy and are not commonly used.
+
For many workflows, a Docker container might already exist. In this case, you can use
+Singularity’s docker pull function as part of your virtual environment setup in an interactive job
+allocation:
+
+
+
+
+
+ salloc --gpus=1 -n8 --mem=4Gb -t60
+ cd /speed-scratch/$USER/
+ singularity pull openiss-cuda-devicequery.sif docker://openiss/openiss-cuda-devicequery
+ INFO: Converting OCI blobs to SIF format
+ INFO: Starting build...
+
+
+
This method can be used for converting Docker containers directly on Speed. On GPU nodes, make
+sure to pass the --nv flag to Singularity so its containers can access the GPUs. See the linked
+example for more details.
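For instance, a minimal sketch reusing the image pulled above inside a GPU allocation:

    singularity run --nv /speed-scratch/$USER/openiss-cuda-devicequery.sif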
+
3 Conclusion
-
The cluster is, “first come, first served”, until it fills, and then job position in the queue is
-based upon past usage. The scheduler does attempt to fill gaps, though, so sometimes a
-single-core job of lower priority will schedule before a multi-core job of higher priority, for
-example.
-
+
The cluster operates on a “first-come, first-served” basis until it reaches full capacity. After that, job
+positions in the queue are determined based on past usage. The scheduler does attempt to fill gaps, so
+occasionally, a single-core job with lower priority may be scheduled before a multi-core job with higher
+priority.
+
3.1 Important Limitations
+
While Speed is a powerful tool, it is essential to recognize its limitations to use it effectively:
+
-
New users are restricted to a total of 32 cores: write to rt-ex-hpc@encs.concordia.ca
- if you need more temporarily (192 is the maximum, or, 6 jobs of 32 cores each).
+
New users are limited to a total of 32 cores and 4 GPUs. If you need more cores
+ temporarily, please contact rt-ex-hpc AT encs.concordia.ca.
-
Batch job sessions are a maximum of one week in length (only 24 hours, though, for
- interactive jobs, see Section 2.8).
+
Batch job sessions can run for a maximum of one week. Interactive jobs are limited to 24
 hours; see Section 2.8.
-
Scripts can live in your NFS-provided home, but any substantial data need to be in your
- cluster-specific directory (located at /speed-scratch/<ENCSusername>/).
-
NFS is great for acute activity, but is not ideal for chronic activity. Any data that a job will
- read more than once should be copied at the start to the scratch disk of a compute node
- using $TMPDIR (and, perhaps, $SLURM_SUBMIT_DIR), any intermediary job data should be
- produced in $TMPDIR, and once a job is near to finishing, those data should be copied
+
Scripts can live in your NFS-provided home directory, but substantial data should be
+ stored in your cluster-specific directory (located at /speed-scratch/<ENCSusername>/).
- to your NFS-mounted home (or other NFS-mounted space) from $TMPDIR (to, perhaps,
- $SLURM_SUBMIT_DIR). In other words, IO-intensive operations should be effected locally
- whenever possible, saving network activity for the start and end of jobs.
+
NFS is suitable for short-term activities but not for long-term operations. Data that a
+ job will read multiple times should be copied at the start to the scratch disk of a
+ compute node using $TMPDIR (and possibly $SLURM_SUBMIT_DIR). Intermediate job data
+ should be produced in $TMPDIR, and once a job is near completion, these data should be
+ copied to your NFS-mounted home directory (or other NFS-mounted space). In other
+ words, IO-intensive operations should be performed locally whenever possible,
+ reserving network activity for the start and end of jobs.
-
Your current resource allocation is based upon past usage, which is an amalgamation of
- approximately one week’s worth of past wallclock (i.e., time spent on the node(s)) and
- compute activity (on the node(s)).
+
Your current resource allocation is based on past usage, which considers approximately
+ one week’s worth of past wall clock time (time spent on the node(s)) and compute activity
+ (on the node(s)).
-
Jobs should NEVER be run outside of the province of the scheduler. Repeat offenders
- risk loss of cluster access.
-
-
+
Jobs must always be run within the scheduler’s system. Repeat offenders who run jobs
+ outside the scheduler risk losing cluster access.
+
3.2 Tips/Tricks
-
Files/scripts must have Linux line breaks in them (not Windows ones). Use file command
- to verify; and dos2unix command to convert.
+
Ensure that files and scripts have Linux line breaks. Use the file command to verify and
+ dos2unix to convert if necessary.
-
Use rsync, not scp, when copying or moving large amounts of data.
+
Use rsync (preferred over scp) for copying or moving large amounts of data.
-
Before moving a large amount of files between NFS-mounted storage and the cluster, tar
- up the files you plan to move first.
+
Before transferring a large number of files between NFS-mounted storage and the cluster,
+ compress the files into a tar archive.
-
If you intend to use a different shell (e.g., bash[22]), you will need to change the shell
- declaration in your script(s).
+
If you plan to use a different shell (e.g., bash[27]), change the shell declaration at the
+ beginning of your script(s).
-
Try to request resources that closely match what your job will use: requesting
- many more cores or much more memory than will be needed makes a job
- more difficult to schedule when resources are scarce.
-
-
-
+
Request resources (cores, memory, GPUs) that closely match the actual needs of your job.
 Requesting significantly more than necessary can make your job harder to schedule when
 resources are limited. Always check the efficiency of your job with seff and/or the
 --mail-type=ALL output, and adjust your job parameters accordingly.
-
E-mail, rt-ex-hpc AT encs.concordia.ca, with any concerns/questions.
-
+
For any concerns or questions, email rt-ex-hpc AT encs.concordia.ca.
+
3.3 Use Cases
-
HPC Committee’s initial batch about 6 students (end of 2019):
+
HPC Committee’s initial batch of about 6 students (end of 2019):
A 10000-iteration job in Fluent finished in under 26 hours vs. 46 hours in Calcul Quebec
compilation of forensic computing reasoning cases about false or true positives of
hardware address spoofing in the labs
-
S4 LAB/GIPSY R&D Group’s:
+
S4 LAB/GIPSY R&D Group’s:
MARFCAT and MARFPCAT (OSS signal processing and machine learning tools for
- vulnerable and weak code analysis and network packet capture analysis) [20, 15, 6]
+ vulnerable and weak code analysis and network packet capture analysis) [22, 17, 6]
Web service data conversion and analysis
-
Forensic Lucid encoders (translation of large log data into Forensic Lucid [16] for
+
Forensic Lucid encoders (translation of large log data into Forensic Lucid [18] for
forensic analysis)
-
-
Genomic alignment exercises
+
+
Genomic alignment exercises
+
+
Best Paper award, Tariq Daradkeh, Gillian Roper, Carlos Alarcon Meza, and Serguei
+ Mokhov. HPC jobs classification and resource prediction to minimize job failures. In
+ International Conference on Computer Systems and Technologies 2024 (CompSysTech ’24), New
+ York, NY, USA, June 2024. ACM
+
+
Newton F. Ouedraogo and Ebenezer E. Essel. Unsteady wake interference of unequal-height
+ tandem cylinders mounted in a turbulent boundary layer. Journal of Fluid Mechanics, 977:A52,
+ 2023. https://doi.org/10.1017/jfm.2023.952
+
+
Newton F. Ouedraogo and Ebenezer E. Essel. Effects of Reynolds number on the wake
+ characteristics of a Notchback Ahmed body. Journal of Fluids Engineering, 146(11):111302, 05
+ 2024
+
+
L. Drummond, H. Banh, N. Ouedraogo, H. Ho, and E. Essel. Effects of nozzle convergence
+ angle on the flow characteristics of a synthetic circular jet in a crossflow. In Bulletin of the
+ American Physical Society, editor, 76th Annual Meeting of the Division of Fluid Dynamics,
+ November 2023
+
+
N. Ouedraogo, A. Cyrus, and E. Essel. Effects of Reynolds number on the wake characteristics
+ of a Notchback Ahmed body. In Bulletin of the American Physical Society, editor, 76th Annual
+ Meeting of the Division of Fluid Dynamics, November 2023
Serguei Mokhov, Jonathan Llewellyn, Carlos Alarcon Meza, Tariq Daradkeh, and Gillian Roper.
The use of containers in OpenGL, ML and HPC for teaching and research support. In
@@ -2100,13 +2529,14 @@
Farshad Rezaei and Marius Paraschivoiu. Computational challenges of simulating vertical axis
wind turbine on the roof-top corner of a building. Progress in Canadian Mechanical
Engineering, 6, 1–6 2023. http://hdl.handle.net/11143/20861
@@ -2122,9 +2552,6 @@
Belkacem Belabes and Marius Paraschivoiu. CFD study of the aerodynamic performance of a
vertical axis wind turbine in the wake of another turbine. In Proceedings of the CSME
International Congress, 2022. https://doi.org/10.7939/r3-rker-1746
-
-
-
Belkacem Belabes and Marius Paraschivoiu. Numerical study of the effect of turbulence intensity on
VAWT performance. Energy, 233:121139, 2021. https://doi.org/10.1016/j.energy.2021.121139
@@ -2134,131 +2561,137 @@
The work “Haotao Lai. An OpenISS framework specialization for deep learning-based
+
The work “Haotao Lai. An OpenISS framework specialization for deep learning-based
person re-identification. Master’s thesis, Department of Computer Science and
Software Engineering, Concordia University, Montreal, Canada, August 2019.
https://spectrum.library.concordia.ca/id/eprint/985788/” using TensorFlow and Keras
on OpenISS adjusted to run on Speed based on the repositories:
The first 6 (to 6.5) versions of this manual and early UGE job script samples, Singularity
+
The first 6 to 6.5 versions of this manual and early UGE job script samples, Singularity
testing and user support were produced/done by Dr. Scott Bunnell during his time at
Concordia as a part of the NAG/HPC group. We thank him for his contributions.
-
-
-
The HTML version with devcontainer support was contributed by Anh H Nguyen.
-
Dr. Tariq Daradkeh, was our IT Instructional Specialist August 2022 to September
+
Dr. Tariq Daradkeh was our IT Instructional Specialist from August 2022 to September
 2023, working on the scheduler, scheduling research, end user support, and integration
- of examples, such as YOLOv3 in Section 2.15.4.0 other tasks. We have a continued
- collaboration on HPC/scheduling research.
-
+ of examples, such as YOLOv3 in Section 2.15.4.2 and other tasks. We have a continued
+ collaboration on HPC/scheduling research (see [8]).
+
A.2 Migration from UGE to SLURM
-
For long term users who started off with Grid Engine here are some resources to make a transition
+
For long-term users who started with Grid Engine, here are some resources to help transition
and map your job submission process to SLURM.
+
+
+
-
Queues are called “partitions” in SLURM. Our mapping from the GE queues to SLURM
+
Queues are called “partitions” in SLURM. Our mapping from the GE queues to SLURM
partitions is as follows:
-
+
GE => SLURM
s.q ps
g.q pg
a.q pa
-
We also have a new partition pt that covers SPEED2 nodes, which previously did not
+
We also have a new partition pt that covers SPEED2 nodes, which previously did not
exist.
-
-Figure 11: Rosetta Mappings of Scheduler Commands from SchedMD
+
+Figure 14: Rosetta Mappings of Scheduler Commands from SchedMD
-
NOTE: If you have used UGE commands in the past you probably still have these lines there;
+
NOTE: If you have used UGE commands in the past, you probably still have the following lines
in your .tcshrc or .bashrc; they should now be removed, as they have no use in SLURM and will
start giving “command not found” errors on login when the software is removed:
-
# Speed environment set up
if [ $HOSTNAME = "speed-submit.encs.concordia.ca" ]; then
    . /local/pkg/uge-8.6.3/root/default/common/settings.sh
    printenv ORGANIZATION | grep -qw ENCS || . /encs/Share/bash/profile
fi
-
-
Note that you will need to either log out and back in, or execute a new shell, for the
- environment changes in the updated .tcshrc or .bashrc file to be applied (important).
+
+
IMPORTANT NOTE: you will need to either log out and back in, or execute a new
+ shell, for the environment changes in the updated .tcshrc or .bashrc file to be
+ applied.
-
+
A.3 Phases
-
Brief summary of Speed evolution phases.
-
+
Brief summary of Speed evolution phases.
+
-
A.3.1 Phase 4
-
Phase 4 had 7 SuperMicro servers with 4x A100 80GB GPUs each added, dubbed as “SPEED2”. We
+
A.3.1 Phase 5
+
Phase 5 saw incorporation of the Salus, Magic, and Nebular subclusters (see Figure 2).
+
+
+
A.3.2 Phase 4
+
Phase 4 had 7 SuperMicro servers with 4x A100 80GB GPUs each added, dubbed as “SPEED2”. We
also moved from Grid Engine to SLURM.
-
+
-
A.3.2 Phase 3
-
Phase 3 had 4 vidpro nodes added from Dr. Amer totalling 6x P6 and 6x V100 GPUs
+
A.3.3 Phase 3
+
Phase 3 had 4 vidpro nodes added from Dr. Amer totalling 6x P6 and 6x V100 GPUs
added.
-
-
-
A.3.3 Phase 2
-
Phase 2 saw 6x NVIDIA Tesla P6 added and 8x more compute nodes. The P6s replaced 4x of FirePro
-S7150.
-
+
+
+
A.3.4 Phase 2
+
Phase 2 saw 6x NVIDIA Tesla P6 added and 8x more compute nodes. The P6s replaced 4x of FirePro
+S7150.
+
-
A.3.4 Phase 1
-
Phase 1 of Speed was of the following configuration:
+
A.3.5 Phase 1
+
Phase 1 of Speed was of the following configuration:
Sixteen 32-core nodes, each with 512 GB of memory and approximately 1 TB of
@@ -2266,228 +2699,247 @@
Five AMD FirePro S7150 GPUs, with 8 GB of memory (compatible with the Direct X,
OpenGL, OpenCL, and Vulkan APIs).
-
-
-
B Frequently Asked Questions
-
+
-
B.1 Where do I learn about Linux?
-
All Speed users are expected to have a basic understanding of Linux and its commonly used
-commands.
-
All Speed users are expected to have a basic understanding of Linux and its commonly used
+commands. Here are some recommended resources:
-
Udemy
-
There are a number of Udemy courses, including free ones, that will assist you in learning Linux.
-Active Concordia faculty, staff and students have access to Udemy courses. The course Linux
-Mastery: Master the Linux Command Line in 11.5 Hours is a good starting point for
-beginners. Visit https://www.concordia.ca/it/services/udemy.html to learn how Concordians
-may access Udemy.
+
Software Carpentry
+ Software Carpentry provides free resources to learn software, including a workshop on the Unix
+shell. Visit Software Carpentry Lessons to learn more.
-
-
B.2 How to use the “bash shell” on Speed?
-
This section describes how to use the “bash shell” on Speed. Review Section 2.1.2 to ensure that your
-bash environment is set up.
-
+
Udemy
+ There are numerous Udemy courses, including free ones, that will help you learn Linux.
+Active Concordia faculty, staff and students have access to Udemy courses. A recommended
+starting point for beginners is the course “Linux Mastery: Master the Linux Command
+Line in 11.5 Hours”. Visit Concordia’s Udemy page to learn how Concordians can access
+Udemy.
+
-
B.2.1 How do I set bash as my login shell?
-
In order to set your default login shell to bash on Speed, your login shell on all GCS servers must be
-changed to bash. To make this change, create a ticket with the Service Desk (or email help at
-concordia.ca) to request that bash become your default login shell for your ENCS user account on
-all GCS servers.
-
+
B.2 How to use bash shell on Speed?
+
This section provides comprehensive instructions on how to utilize the bash shell on the Speed
+cluster.
+
-
B.2.2 How do I move into a bash shell on Speed?
-
To move to the bash shell, type bash at the command prompt. For example:
+
B.2.1 How do I set bash as my login shell?
+
To set your default login shell to bash on Speed, your login shell on all GCS servers must be changed
+to bash. To make this change, create a ticket with the Service Desk (or email help at concordia.ca)
+to request that bash become your default login shell for your ENCS user account on all GCS
+servers.
+
+
+
B.2.2 How do I move into a bash shell on Speed?
+
To move to the bash shell, type bash at the command prompt:
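For example:

    bash
    echo $0    # should print: bash, confirming you are now in a bash shell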
The “Disk quota exceeded” Error occurs when your application has run out of disk space to write
-to. On Speed this error can be returned when:
-
+
B.3.1 Probable Cause
+
The “Disk quota exceeded” error occurs when your application has run out of disk space to write
+to. On Speed, this error can be returned when:
-
Your NFS-provided home is full and cannot be written to. You can verify this using quota
- and bigfiles commands.
+
The NFS-provided home is full and cannot be written to. You can verify this using the
+ quota and bigfiles commands.
-
The /tmp directory on the speed node your application is running on is full and cannot
- be written to.
-
+
The “/tmp” directory on the speed node where your application is running is full and
+ cannot be written to.
+
-
B.3.2 Possible Solutions
-
+
B.3.2 Possible Solutions
+
-
Use the --chdir job script option to set the directory that the job script is submitted
- from the job working directory. The job working directory is the directory that
- the job will write output files in.
-
-
-
The use local disk space is generally recommended for IO intensive operations. However, as the
- size of /tmp on speed nodes is 1TB it can be necessary for scripts to store temporary data
- elsewhere. Review the documentation for each module called within your script to determine
- how to set working directories for that application. The basic steps for this solution are:
-
-
-
-
+
Use the --chdir job script option to set the job working directory. This is the directory
+ where the job will write output files.
+
+
+
Although local disk space is recommended for IO-intensive operations, the ‘/tmp’ directory on
+ Speed nodes is limited to 1TB, so it may be necessary to store temporary data elsewhere.
+ Review the documentation for each module used in your script to determine how to set working
+ directories. The basic steps are:
-
Review the documentation on how to set working directories for each module called
- by the job script.
+
Determine how to set working directories for each module used in your job script.
-
Create a working directory in speed-scratch for output files. For example, this
- command will create a subdirectory called output in your speed-scratch
- directory:
+
Create a working directory in speed-scratch for output files:
-
+
mkdir -m 750 /speed-scratch/$USER/output
-
-
+
-
To create a subdirectory for recovery files:
+
Create a subdirectory for recovery files:
-
+
mkdir -m 750 /speed-scratch/$USER/recovery
-
+
-
Update the job script to write output to the subdirectories you created in your
- speed-scratch directory, e.g., /speed-scratch/$USER/output.
+
Update the job script to write output to the directories created in your speed-scratch
+ directory, e.g., /speed-scratch/$USER/output.
-
In the above example, $USER is an environment variable containing your ENCS username.
+
In the above example, $USER is an environment variable containing your ENCS username.
-
B.3.3 Example of setting working directories for COMSOL
+
B.3.3 Example of setting working directories for COMSOL
-
Create directories for recovery, temporary, and configuration files. For example, to create these
- directories for your GCS ENCS user account:
+
Create directories for recovery, temporary, and configuration files.
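A sketch of such commands (the comsol subdirectory names are illustrative; the -m 750 mode
matches the other examples in this section):

    mkdir -m 750 -p /speed-scratch/$USER/comsol/recovery
    mkdir -m 750 -p /speed-scratch/$USER/comsol/tmp
    mkdir -m 750 -p /speed-scratch/$USER/comsol/config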
In the above example, $USER is an environment variable containing your ENCS username.
-
-
-
B.3.4 Example of setting working directories for Python Modules
-
By default when adding a python module the /tmp directory is set as the temporary repository for
-files downloads. The size of the /tmp directory on speed-submit is too small for pytorch. To add a
-python module
+
+
In the above example, $USER is an environment variable containing your ENCS username.
+
+
+
B.3.4 Example of setting working directories for Python Modules
+
By default when adding a Python module, the /tmp directory is set as the temporary repository for
+file downloads. The size of the /tmp directory on speed-submit is too small for PyTorch. To add a
+Python module:
-
Create your own tmp directory in your speed-scratch directory
+
Create your own tmp directory in your speed-scratch directory:
-
- mkdir /speed-scratch/$USER/tmp
+
+ mkdir /speed-scratch/$USER/tmp
+
+
+
+
+
Use the temporary directory you created:
+
+
+
+
+
+ setenv TMPDIR /speed-scratch/$USER/tmp
+
Attempt the installation of PyTorch:
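For example (an illustrative invocation; with TMPDIR set as above, the temporary build files go
to your speed-scratch tmp directory instead of /tmp):

    pip install torch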
+
In the above example, $USER is an environment variable containing your ENCS username.
+
+
+
B.4 How do I check my job’s status?
+
When a job with a job ID of 1234 is running or terminated, you can track its status using the
+following commands:
+
-
Use the tmp directory you created
+
Use the “sacct” command to view the status of a job:
-
- setenv TMPDIR /speed-scratch/$USER/tmp
+
+ sacct -j 1234
+
+
+
+
+
Use the “squeue” command to see if the job is sitting in the queue:
+
+
+
+
+
+ squeue -j 1234
-
+
-
Attempt the installation of pytorch
-
In the above example, $USER is an environment variable containing your ENCS username.
-
+
+
Use the “sstat” command to find long-term statistics on the job after it has terminated and the
+ slurmctld has purged it from its tracking state into the database:
+
+
+
-
B.4 How do I check my job’s status?
-
When a job with a job id of 1234 is running or terminated, the status of that job can be tracked using
-‘sacct -j 1234’. squeue -j 1234 can show while the job is sitting in the queue as well. Long term
-statistics on the job after its terminated can be found using sstat -j 1234 after slurmctld purges it
-its tracking state into the database.
-
+
+ sstat -j 1234
+
+
+
-
B.5 Why is my job pending when nodes are empty?
-
+
B.5 Why is my job pending when nodes are empty?
+
-
B.5.1 Disabled nodes
-
It is possible that one or a number of the Speed nodes are disabled. Nodes are disabled if they require
-maintenance. To verify if Speed nodes are disabled, see if they are in a draining or drained
-state:
+
B.5.1 Disabled nodes
+
It is possible that one or more of the Speed nodes are disabled for maintenance. To verify if Speed
+nodes are disabled, check if they are in a draining or drained state:
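One way to check this (using the same sinfo node listing as elsewhere in this manual):

    sinfo --long --Node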
Note which nodes are in the state of drained. Why the state is drained can be found in the reason
-column.
-
Your job will run once an occupied node becomes availble or the maintenance has been completed
-and the disabled nodes have a state of idle.
-
-
-
B.5.2 Error in job submit request.
-
It is possible that your job is pending, because the job requested resources that are not available
-within Speed. To verify why job id 1234 is not running, execute ‘sacct -j 1234’. A summary of the
-reasons is available via the squeue command.
-
+
+
Note which nodes are in the state of drained. The reason for the drained state can be found in the
+reason column.
+
Your job will run once an occupied node becomes available or the maintenance is completed, and the
+disabled nodes have a state of idle.
+
+
+
+
+
B.5.2 Error in job submit request.
+
It is possible that your job is pending because it requested resources that are not available within
+Speed. To verify why job ID 1234 is not running, execute:
-
C Sister Facilities
-
Below is a list of resources and facilities similar to Speed at various capacities. Depending on your
+
+
+sacct -j 1234
+
+
+
A summary of the reasons can be obtained via the squeue command.
+
+
+
C Sister Facilities
+
Below is a list of resources and facilities similar to Speed at various capacities. Depending on your
research group and needs, they might be available to you. They are not managed by HPC/NAG of
AITS, so contact their respective representatives.
-
computation.encs CPU only 3-machine cluster running longer jobs without a scheduler
- at the moment
+
computation.encs
+ is a CPU-only 3-machine cluster running longer jobs without a scheduler at the moment.
+ Shares the same EL7 software tree as Speed’s EL7 nodes as well as lab desktops. See
+ https://www.concordia.ca/ginacody/aits/public-servers.html.
apini.encs cluster for teaching and MPI programming (see the corresponding course in
- CSSE)
+ CSSE), managed by CSSE
Computer Science and Software Engineering (CSSE) Virya GPU Cluster. For CSSE
members only. The cluster has 4 nodes with total of 32 NVIDIA GPUs (a mix of V100s
- and A100s). To request access send email to virya.help AT concordia.ca.
+ and A100s). To request access send email to virya.help AT concordia.ca. This includes
+ an Atlas Analytics partition of Dr. Mahdi Husseini.
+
+
Dr. Eugene Belilovsky’s hightower Exxact and megatower Graphcore clusters.
-
Dr. Maria Amer’s VidPro group’s nodes in Speed (-01, -03, -25, -27) with additional V100
+
Dr. Maria Amer’s VidPro group’s nodes in Speed (-01, -03, -25, -27) with additional V100
and P6 GPUs.
+
+
+
-
There are various Lambda Labs other GPU servers and like computers acquired by individual
+
There are various Lambda Labs and other GPU servers and similar computers acquired by individual
researchers; if you are a member of their research group, contact them directly. These resources
are not managed by us.
-
Dr. Amin Hammad’s construction.encs Lambda Labs station
+
Dr. Amin Hammad’s construction.encs Lambda Labs station
-
Dr. Hassan Rivaz’s impactlab.encs Lambda Labs station
+
Dr. Hassan Rivaz’s impactlab.encs Lambda Labs station
-
Dr. Nizar Bouguila’s xailab.encs Lambda Labs station
+
Dr. Nizar Bouguila’s xailab.encs Lambda Labs station
-
Dr. Roch Glitho’s femto.encs server
-
-
-
+
Dr. Roch Glitho’s femto.encs server
-
Dr. Maria Amer’s venom.encs Lambda Labs station
+
Dr. Maria Amer’s venom.encs Lambda Labs station
-
Dr. Leon Wang’s guerrera.encs DGX station
+
Dr. Leon Wang’s guerrera.encs DGX station
-
Dr. Ivan Contreras’ servers (managed by AITS)
+
Dr. Ivan Contreras’ 4 Operations Research group servers (managed by AITS).
Digital Research Alliance Canada (Compute Canada / Calcul Quebec), https://alliancecan.ca/. Follow this link for information on how to obtain access (students
need to be sponsored by their supervising faculty members, who should create accounts first).
Their SLURM examples are here: https://docs.alliancecan.ca/wiki/Running_jobs
+
+
+
+
+
+
D Software Installed On Speed
+
This section is generated by a script; last updated on Tue Jul 23 10:48:52 PM EDT 2024. We have
+two major software trees: Scientific Linux 7 (EL7), which is outgoing, and AlmaLinux 9 (EL9). After
+major synchronization of software packages is complete, we will stop maintaining the EL7 tree and
+will migrate the remaining nodes to EL9.
+
Use --constraint=el7 to select EL7-only installed nodes for their software packages. Conversely,
+use --constraint=el9 for the EL9-only software. These options would be used as a part of your job
parameters, either in #SBATCH directives or on the command line.
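For example (myjob.sh is a placeholder for your own job script):

    #SBATCH --constraint=el9

or, on the command line:

    sbatch --constraint=el9 ./myjob.sh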
+
NOTE: this list does not include packages installed directly on the OS (yet).
+
+
+
D.1 EL7
+
Not all packages are intended for HPC, but the common tree is available on Speed as well as teaching
+labs’ desktops.
+
- [3]Belkacem Belabes and Marius Paraschivoiu.
- Numerical study of the effect of turbulence intensity on VAWT performance. Energy, 233:121139,
- 2021. https://doi.org/10.1016/j.energy.2021.121139.
-
-
- [4]Belkacem Belabes and Marius Paraschivoiu. CFD study of the aerodynamic performance of a
- vertical axis wind turbine in the wake of another turbine. In Proceedings of the CSME International
- Congress, 2022. https://doi.org/10.7939/r3-rker-1746.
-
-
- [5]Belkacem Belabes and Marius Paraschivoiu. CFD modeling of vertical-axis wind turbine wake
- interaction. Transactions of the Canadian Society for Mechanical Engineering, pages 1–10, 2023.
- https://doi.org/10.1139/tcsme-2022-0149.
-
-
- [6]Amine Boukhtouta, Nour-Eddine Lakhdari, Serguei A. Mokhov, and Mourad Debbabi.
- Towards fingerprinting malicious traffic. In Proceedings of ANT’13, volume 19, pages 548–555.
- Elsevier, June 2013.
-
-
- [7]Amy Brown and Greg Wilson, editors. The Architecture of Open Source Applications:
- Elegance, Evolution, and a Few Fearless Hacks, volume I. aosabook.org, March 2012. Online at
- http://aosabook.org.
-
- [9]Goutam Yelluru Gopal and Maria Amer. Separable self and mixed attention transformers
- for efficient object tracking. In IEEE/CVF Winter Conference on Applications of Computer
- Vision (WACV), Waikoloa, Hawaii, January 2024. https://arxiv.org/abs/2309.03979and
- https://github.com/goutamyg/SMAT.
+
+
acl-10.1.express
+
+
acroread-9.5.5
+
+
ADS-2016.01
+
+
ADS-2017.01
+
+
ADS-2019
+
+
ADS-2020u1
+
+
adt_bundle-20140702
+
+
alpine-2.00
+
+
alpine-2.25
+
+
anaconda-1.7.0
+
+
anaconda2-2019.07
+
+
anaconda2-5.1.0
+
+
anaconda3-2019.07
+
+
anaconda3-2019.10
+
+
anaconda3-2021.05
+
+
anaconda3-2023.03
-
-
- [10]Haotao Lai. An OpenISS framework
- specialization for deep learning-based person re-identification. Master’s thesis, Department of
- Computer Science and Software Engineering, Concordia University, Montreal, Canada, August
- 2019. https://spectrum.library.concordia.ca/id/eprint/985788/.
-
- [14]Serguei Mokhov, Jonathan Llewellyn, Carlos Alarcon Meza, Tariq Daradkeh, and Gillian
- Roper. The use of containers in OpenGL, ML and HPC for teaching and research support.
- In ACM SIGGRAPH 2023 Posters, SIGGRAPH ’23, New York, NY, USA, 2023. ACM.
- https://doi.org/10.1145/3588028.3603676.
-
-
- [15]Serguei A. Mokhov. The use of machine learning with signal- and NLP processing
- of source code to fingerprint, detect, and classify vulnerabilities and weaknesses with
- MARFCAT. Technical Report NIST SP 500-283, NIST, October 2011. Report:
- http://www.nist.gov/manuscript-publication-search.cfm?pub_id=909407, online e-print at
+
+ [3]Belkacem Belabes and Marius Paraschivoiu.
+ Numerical study of the effect of turbulence intensity on VAWT performance. Energy, 233:121139,
+ 2021. https://doi.org/10.1016/j.energy.2021.121139.
+
+
+ [4]Belkacem Belabes and Marius Paraschivoiu. CFD study of the aerodynamic performance of a
+ vertical axis wind turbine in the wake of another turbine. In Proceedings of the CSME International
+ Congress, 2022. https://doi.org/10.7939/r3-rker-1746.
+
+
+ [5]Belkacem Belabes and Marius Paraschivoiu. CFD modeling of vertical-axis wind turbine wake
+ interaction. Transactions of the Canadian Society for Mechanical Engineering, pages 1–10, 2023.
+ https://doi.org/10.1139/tcsme-2022-0149.
+
+
+ [6]Amine Boukhtouta, Nour-Eddine Lakhdari, Serguei A. Mokhov, and Mourad Debbabi.
+ Towards fingerprinting malicious traffic. In Proceedings of ANT’13, volume 19, pages 548–555.
+ Elsevier, June 2013.
+
+
+ [7]Amy Brown and Greg Wilson, editors. The Architecture of Open Source Applications:
+ Elegance, Evolution, and a Few Fearless Hacks, volume I. aosabook.org, March 2012. Online at
+ http://aosabook.org.
+
+
+ [8]Tariq Daradkeh, Gillian Roper, Carlos Alarcon Meza, and Serguei Mokhov. HPC jobs
+ classification and resource prediction to minimize job failures. In International Conference on
+ Computer Systems and Technologies 2024 (CompSysTech ’24), New York, NY, USA, June 2024.
+ ACM.
+
+
+ [9]L. Drummond, H. Banh, N. Ouedraogo, H. Ho, and E. Essel. Effects of nozzle convergence
+ angle on the flow characteristics of a synthetic circular jet in a crossflow. In Bulletin of the
+ American Physical Society, editor, 76th Annual Meeting of the Division of Fluid Dynamics,
+ November 2023.
+
+
+ [10]Goutam Yelluru Gopal and Maria Amer. Mobile vision transformer-based visual object
+ tracking. In 34th British Machine Vision Conference (BMVC), Aberdeen, UK, November 2023.
+ https://arxiv.org/abs/2309.05829 and https://github.com/goutamyg/MVT.
+
+
+ [11]Goutam Yelluru Gopal and Maria Amer. Separable self and mixed attention transformers
+ for efficient object tracking. In IEEE/CVF Winter Conference on Applications of Computer
+ Vision (WACV), Waikoloa, Hawaii, January 2024. https://arxiv.org/abs/2309.03979 and
+ https://github.com/goutamyg/SMAT.
+
+
+ [12]Haotao Lai. An OpenISS framework
+ specialization for deep learning-based person re-identification. Master’s thesis, Department of
+ Computer Science and Software Engineering, Concordia University, Montreal, Canada, August
+ 2019. https://spectrum.library.concordia.ca/id/eprint/985788/.
+
+ [16]Serguei Mokhov, Jonathan Llewellyn, Carlos Alarcon Meza, Tariq Daradkeh, and Gillian
+ Roper. The use of containers in OpenGL, ML and HPC for teaching and research support.
+ In ACM SIGGRAPH 2023 Posters, SIGGRAPH ’23, New York, NY, USA, 2023. ACM.
+ https://doi.org/10.1145/3588028.3603676.
+
- [16]Serguei A. Mokhov. Intensional Cyberforensics. PhD thesis, Department of Computer Science
+ [18]Serguei A. Mokhov. Intensional Cyberforensics. PhD thesis, Department of Computer Science
and Software Engineering, Concordia University, Montreal, Canada, September 2013. Online at http://arxiv.org/abs/1312.0466.
+
+
+
- [17]Serguei A. Mokhov, Michael J. Assels, Joey Paquet, and Mourad Debbabi. Automating MAC
+ [19]Serguei A. Mokhov, Michael J. Assels, Joey Paquet, and Mourad Debbabi. Automating MAC
spoofer evidence gathering and encoding for investigations. In Frederic Cuppens et al., editors,
Proceedings of The 7th International Symposium on Foundations & Practice of Security (FPS’14),
LNCS 8930, pages 168–183. Springer, November 2014. Full paper.
- [18]Serguei A. Mokhov, Michael J. Assels, Joey Paquet, and Mourad Debbabi. Toward automated
+ [20]Serguei A. Mokhov, Michael J. Assels, Joey Paquet, and Mourad Debbabi. Toward automated
MAC spoofer investigations. In Proceedings of C3S2E’14, pages 179–184. ACM, August 2014.
Short paper.
- [20]Serguei A. Mokhov, Joey Paquet, and Mourad Debbabi. The use of NLP techniques in static
+ [22]Serguei A. Mokhov, Joey Paquet, and Mourad Debbabi. The use of NLP techniques in static
code analysis to detect weaknesses and vulnerabilities. In Maria Sokolova and Peter van Beek,
editors, Proceedings of Canadian Conference on AI’14, volume 8436 of LNAI, pages 326–332.
Springer, May 2014. Short paper.
- [21]Parna Niksirat, Adriana Daca, and Krzysztof Skonieczny. The effects of reduced-gravity
+ [23]Parna Niksirat, Adriana Daca, and Krzysztof Skonieczny. The effects of reduced-gravity
on planetary rover mobility. International Journal of Robotics Research, 39(7):797–811, 2020. https://doi.org/10.1177/0278364920913945.
- [22]Chet Ramey. The Bourne-Again Shell. In Brown and Wilson [7].
+ [24]N. Ouedraogo, A. Cyrus, and E. Essel. Effects of Reynolds number on the wake characteristics
+ of a Notchback Ahmed body. In Bulletin of the American Physical Society, editor, 76th Annual
+ Meeting of the Division of Fluid Dynamics, November 2023.
+
+
+ [25]Newton F. Ouedraogo and Ebenezer E. Essel. Unsteady wake interference of unequal-height
+ tandem cylinders mounted in a turbulent boundary layer. Journal of Fluid Mechanics, 977:A52,
+ 2023. https://doi.org/10.1017/jfm.2023.952.
+
+
+ [26]Newton F. Ouedraogo and Ebenezer E. Essel. Effects of Reynolds number on the wake
+ characteristics of a Notchback Ahmed body. Journal of Fluids Engineering, 146(11):111302, 05
+ 2024.
+
- [23]Farshad Rezaei and Marius Paraschivoiu. Placing a small-scale vertical axis wind turbine on
+ [28]Farshad Rezaei and Marius Paraschivoiu. Placing a small-scale vertical axis wind turbine on
 roof-top corner of a building. In Proceedings of the CSME International Congress, June 2022.
 https://doi.org/10.7939/r3-j7v7-m909.
- [24]Farshad Rezaei and Marius Paraschivoiu. Computational challenges of simulating vertical axis
+ [29]Farshad Rezaei and Marius Paraschivoiu. Computational challenges of simulating vertical axis
wind turbine on the roof-top corner of a building. Progress in Canadian Mechanical Engineering,
6, 1–6 2023. http://hdl.handle.net/11143/20861.
- [26]The MARF Research and Development Group. The Modular Audio Recognition
+ [31]The MARF Research and Development Group. The Modular Audio Recognition
 Framework and its Applications. [online], 2002–2014. http://marf.sf.net and http://arxiv.org/abs/0905.1235, last viewed May 2015.