<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<title>4.5 Rancher Condor | BisQue_Documentation.utf8.md</title>
<meta name="description" content="Bisque (Bio-Image Semantic Query User Environment) : Store, visualize, organize and analyze images in the cloud." />
<meta name="generator" content="bookdown 0.13 and GitBook 2.6.7" />
<meta property="og:title" content="4.5 Rancher Condor | BisQue_Documentation.utf8.md" />
<meta property="og:type" content="book" />
<meta property="og:description" content="Bisque (Bio-Image Semantic Query User Environment) : Store, visualize, organize and analyze images in the cloud." />
<meta name="twitter:card" content="summary" />
<meta name="twitter:title" content="4.5 Rancher Condor | BisQue_Documentation.utf8.md" />
<meta name="twitter:description" content="Bisque (Bio-Image Semantic Query User Environment) : Store, visualize, organize and analyze images in the cloud." />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black" />
<link rel="prev" href="connoisseurgpu-workload-provisioning.html"/>
<script src="libs/jquery-2.2.3/jquery.min.js"></script>
<link href="libs/gitbook-2.6.7/css/style.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-table.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-bookdown.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-highlight.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-search.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-fontsettings.css" rel="stylesheet" />
<style type="text/css">
a.sourceLine { display: inline-block; line-height: 1.25; }
a.sourceLine { pointer-events: none; color: inherit; text-decoration: inherit; }
a.sourceLine:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode { white-space: pre; position: relative; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
code.sourceCode { white-space: pre-wrap; }
a.sourceLine { text-indent: -1em; padding-left: 1em; }
}
pre.numberSource a.sourceLine
{ position: relative; left: -4em; }
pre.numberSource a.sourceLine::before
{ content: attr(title);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; pointer-events: all; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
a.sourceLine::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
<link rel="stylesheet" href="style.css" type="text/css" />
</head>
<body>
<div class="book without-animation with-summary font-size-2 font-family-1" data-basepath=".">
<div class="book-summary">
<nav role="navigation">
<ul class="summary">
<li><a href="./">Official UCSB BisQue Documentation</a></li>
<li class="divider"></li>
<li class="chapter" data-level="" data-path="index.html"><a href="index.html"><i class="fa fa-check"></i>Getting Started</a><ul>
<li class="chapter" data-level="" data-path="bisque-docker-installation.html"><a href="bisque-docker-installation.html"><i class="fa fa-check"></i>BisQue Docker Installation</a></li>
</ul></li>
<li class="chapter" data-level="1" data-path="intro.html"><a href="intro.html"><i class="fa fa-check"></i><b>1</b> BisQue Cloud Instance</a><ul>
<li class="chapter" data-level="1.1" data-path="how-to-login-or-create-a-new-user.html"><a href="how-to-login-or-create-a-new-user.html"><i class="fa fa-check"></i><b>1.1</b> How to Login or Create a New User</a></li>
<li class="chapter" data-level="1.2" data-path="how-to-upload-data.html"><a href="how-to-upload-data.html"><i class="fa fa-check"></i><b>1.2</b> How to Upload Data</a><ul>
<li class="chapter" data-level="" data-path="how-to-upload-data.html"><a href="how-to-upload-data.html#step-1.-login"><i class="fa fa-check"></i>Step 1. Login</a></li>
<li class="chapter" data-level="" data-path="how-to-upload-data.html"><a href="how-to-upload-data.html#step-2.-upload-file-or-folder"><i class="fa fa-check"></i>Step 2. Upload File or Folder</a></li>
<li class="chapter" data-level="" data-path="how-to-upload-data.html"><a href="how-to-upload-data.html#step-3.-build-a-dataset"><i class="fa fa-check"></i>Step 3. Build a Dataset</a></li>
<li class="chapter" data-level="" data-path="how-to-upload-data.html"><a href="how-to-upload-data.html#step-4.-setting-permissions"><i class="fa fa-check"></i>Step 4. Setting Permissions</a></li>
</ul></li>
<li class="chapter" data-level="1.3" data-path="how-to-run-a-module.html"><a href="how-to-run-a-module.html"><i class="fa fa-check"></i><b>1.3</b> How to Run a Module</a></li>
<li class="chapter" data-level="1.4" data-path="imagej-module.html"><a href="imagej-module.html"><i class="fa fa-check"></i><b>1.4</b> ImageJ Module</a><ul>
<li class="chapter" data-level="" data-path="imagej-module.html"><a href="imagej-module.html#overview"><i class="fa fa-check"></i>Overview</a></li>
<li class="chapter" data-level="" data-path="imagej-module.html"><a href="imagej-module.html#uploading-of-imagej-pipelines"><i class="fa fa-check"></i>Uploading of ImageJ Pipelines</a></li>
<li class="chapter" data-level="" data-path="imagej-module.html"><a href="imagej-module.html#running-an-imagej-macro"><i class="fa fa-check"></i>Running an ImageJ Macro</a></li>
<li class="chapter" data-level="" data-path="imagej-module.html"><a href="imagej-module.html#special-bisque-extensions-for-imagej"><i class="fa fa-check"></i>Special BisQue extensions for ImageJ</a></li>
</ul></li>
<li class="chapter" data-level="1.5" data-path="cellprofiler-module.html"><a href="cellprofiler-module.html"><i class="fa fa-check"></i><b>1.5</b> CellProfiler Module</a><ul>
<li class="chapter" data-level="" data-path="cellprofiler-module.html"><a href="cellprofiler-module.html#overview-1"><i class="fa fa-check"></i>Overview</a></li>
<li class="chapter" data-level="" data-path="cellprofiler-module.html"><a href="cellprofiler-module.html#uploading-of-cellprofiler-pipelines"><i class="fa fa-check"></i>Uploading of CellProfiler Pipelines</a></li>
<li class="chapter" data-level="" data-path="cellprofiler-module.html"><a href="cellprofiler-module.html#running-a-cellprofiler-pipeline"><i class="fa fa-check"></i>Running a CellProfiler Pipeline</a></li>
<li class="chapter" data-level="" data-path="cellprofiler-module.html"><a href="cellprofiler-module.html#results"><i class="fa fa-check"></i>Results</a></li>
</ul></li>
</ul></li>
<li class="chapter" data-level="2" data-path="module-development.html"><a href="module-development.html"><i class="fa fa-check"></i><b>2</b> Module Development</a><ul>
<li class="chapter" data-level="2.1" data-path="dockerfile.html"><a href="dockerfile.html"><i class="fa fa-check"></i><b>2.1</b> Dockerfile</a></li>
<li class="chapter" data-level="2.2" data-path="module-xml.html"><a href="module-xml.html"><i class="fa fa-check"></i><b>2.2</b> Module XML</a><ul>
<li class="chapter" data-level="" data-path="module-xml.html"><a href="module-xml.html#module-description"><i class="fa fa-check"></i>Module Description</a></li>
<li class="chapter" data-level="" data-path="module-xml.html"><a href="module-xml.html#configurations-for-images-datasets-and-resources"><i class="fa fa-check"></i>Configurations for Images, Datasets, and Resources</a></li>
<li class="chapter" data-level="" data-path="module-xml.html"><a href="module-xml.html#data-parallel-execution"><i class="fa fa-check"></i>Data-Parallel Execution</a></li>
</ul></li>
<li class="chapter" data-level="2.3" data-path="python-script-wrapper.html"><a href="python-script-wrapper.html"><i class="fa fa-check"></i><b>2.3</b> Python Script Wrapper</a><ul>
<li><a href="python-script-wrapper.html#example.-python-script-wrapper"><em>Example.</em> Python Script Wrapper</a></li>
</ul></li>
<li class="chapter" data-level="2.4" data-path="source-code.html"><a href="source-code.html"><i class="fa fa-check"></i><b>2.4</b> Source Code</a><ul>
<li><a href="source-code.html#example.-composite-strength-module"><em>Example.</em> Composite Strength Module</a></li>
</ul></li>
<li class="chapter" data-level="2.5" data-path="runtime-module-configuration.html"><a href="runtime-module-configuration.html"><i class="fa fa-check"></i><b>2.5</b> Runtime Module Configuration</a></li>
<li class="chapter" data-level="2.6" data-path="python-setup.html"><a href="python-setup.html"><i class="fa fa-check"></i><b>2.6</b> Python Setup</a></li>
</ul></li>
<li class="chapter" data-level="3" data-path="bqapi.html"><a href="bqapi.html"><i class="fa fa-check"></i><b>3</b> BQAPI</a></li>
<li class="chapter" data-level="4" data-path="bisque-development.html"><a href="bisque-development.html"><i class="fa fa-check"></i><b>4</b> BisQue Development</a><ul>
<li class="chapter" data-level="4.1" data-path="source-code-installation.html"><a href="source-code-installation.html"><i class="fa fa-check"></i><b>4.1</b> Source Code Installation</a><ul>
<li class="chapter" data-level="" data-path="source-code-installation.html"><a href="source-code-installation.html#clone-the-repository-and-prepare-virtual-environment"><i class="fa fa-check"></i>Clone the repository and Prepare Virtual Environment</a></li>
<li class="chapter" data-level="" data-path="source-code-installation.html"><a href="source-code-installation.html#configure-bisque-environment"><i class="fa fa-check"></i>Configure Bisque Environment</a></li>
<li class="chapter" data-level="" data-path="source-code-installation.html"><a href="source-code-installation.html#run-bisque-server"><i class="fa fa-check"></i>Run Bisque Server</a></li>
<li class="chapter" data-level="" data-path="source-code-installation.html"><a href="source-code-installation.html#upload-dataset"><i class="fa fa-check"></i>Upload Dataset</a></li>
<li class="chapter" data-level="" data-path="source-code-installation.html"><a href="source-code-installation.html#module"><i class="fa fa-check"></i>Module</a></li>
<li class="chapter" data-level="" data-path="source-code-installation.html"><a href="source-code-installation.html#debug-module"><i class="fa fa-check"></i>Debug Module</a></li>
<li class="chapter" data-level="" data-path="source-code-installation.html"><a href="source-code-installation.html#g.-module-mexui"><i class="fa fa-check"></i>G. Module MEX/UI</a></li>
</ul></li>
<li class="chapter" data-level="4.2" data-path="planteome-deep-segment.html"><a href="planteome-deep-segment.html"><i class="fa fa-check"></i><b>4.2</b> Planteome Deep Segment</a></li>
<li class="chapter" data-level="4.3" data-path="rancher-2-0-setup-wkubernetes-engine.html"><a href="rancher-2-0-setup-wkubernetes-engine.html"><i class="fa fa-check"></i><b>4.3</b> Rancher 2.0 Setup (w/Kubernetes engine)</a><ul>
<li class="chapter" data-level="4.3.1" data-path="rancher-2-0-setup-wkubernetes-engine.html"><a href="rancher-2-0-setup-wkubernetes-engine.html#pre-requisite"><i class="fa fa-check"></i><b>4.3.1</b> Pre-requisite</a></li>
<li class="chapter" data-level="4.3.2" data-path="rancher-2-0-setup-wkubernetes-engine.html"><a href="rancher-2-0-setup-wkubernetes-engine.html#setup-workload-on-the-bq-cluster"><i class="fa fa-check"></i><b>4.3.2</b> Setup Workload on the bq-cluster</a></li>
</ul></li>
<li class="chapter" data-level="4.4" data-path="connoisseurgpu-workload-provisioning.html"><a href="connoisseurgpu-workload-provisioning.html"><i class="fa fa-check"></i><b>4.4</b> Connoisseur/GPU Workload provisioning</a></li>
<li class="chapter" data-level="4.5" data-path="rancher-condor.html"><a href="rancher-condor.html"><i class="fa fa-check"></i><b>4.5</b> Rancher Condor</a><ul>
<li class="chapter" data-level="4.5.1" data-path="rancher-condor.html"><a href="rancher-condor.html#topology"><i class="fa fa-check"></i><b>4.5.1</b> Topology</a></li>
<li class="chapter" data-level="4.5.2" data-path="rancher-condor.html"><a href="rancher-condor.html#masterworker-condor-config"><i class="fa fa-check"></i><b>4.5.2</b> Master/Worker Condor Config</a></li>
<li class="chapter" data-level="4.5.3" data-path="rancher-condor.html"><a href="rancher-condor.html#bisquesvc-submit-node-condor-config"><i class="fa fa-check"></i><b>4.5.3</b> BisqueSvc Submit Node Condor Config</a></li>
<li class="chapter" data-level="4.5.4" data-path="rancher-condor.html"><a href="rancher-condor.html#state-logs"><i class="fa fa-check"></i><b>4.5.4</b> State & logs</a></li>
<li class="chapter" data-level="4.5.5" data-path="rancher-condor.html"><a href="rancher-condor.html#test-condor"><i class="fa fa-check"></i><b>4.5.5</b> Test Condor</a></li>
</ul></li>
</ul></li>
<li class="divider"></li>
<li><a href="https://bisque.ece.ucsb.edu/" target="blank">UCSB BisQue Cloud Platform</a></li>
</ul>
</nav>
</div>
<div class="book-body">
<div class="body-inner">
<div class="book-header" role="navigation">
<h1>
<i class="fa fa-circle-o-notch fa-spin"></i><a href="./"></a>
</h1>
</div>
<div class="page-wrapper" tabindex="-1" role="main">
<div class="page-inner">
<section class="normal" id="section-">
<div id="rancher-condor" class="section level2">
<h2><span class="header-section-number">4.5</span> Rancher Condor</h2>
<p>These Condor instructions are based on a custom image (<code>biodev.ece.ucsb.edu:5000/condor</code>).</p>
<p>Official Docs: <a href="https://research.cs.wisc.edu/htcondor/manual/v8.8" class="uri">https://research.cs.wisc.edu/htcondor/manual/v8.8</a></p>
<div id="topology" class="section level3">
<h3><span class="header-section-number">4.5.1</span> Topology</h3>
<p>The condor image available at the registry has <code>htcondor==8.4.2~dfsg.1-1build1</code> pre-installed.</p>
<ul>
<li>Submit Node (host = bisquesvc.prod)</li>
<li>Master node (host = master.condor)</li>
<li>Worker Nodes (host = worker*.condor)
<ul>
<li>Docker typically runs on the worker nodes, and also on the BisQue submit node for cases where we don't use Condor at all</li>
<li>Test this using <code>docker ps</code> on the worker nodes</li>
</ul></li>
</ul>
<blockquote>
<p>On each node, start Condor and make sure that the hostname “master.condor” is reachable from every node in the pool, including the BisQue submit node.</p>
</blockquote>
<pre><code>$ ping master.condor
PING master.condor.svc.cluster.local (10.43.55.33) 56(84) bytes of data.
$ service condor start</code></pre>
</div>
<div id="masterworker-condor-config" class="section level3">
<h3><span class="header-section-number">4.5.2</span> Master/Worker Condor Config</h3>
<div id="initiate-startd-at-master-and-workers" class="section level4">
<h4><span class="header-section-number">4.5.2.1</span> Initiate startd at master and workers</h4>
<pre><code>condor_startd </code></pre>
</div>
<div id="condor-config-vim-etccondorcondor_config.local" class="section level4">
<h4><span class="header-section-number">4.5.2.2</span> Condor config <code>vim /etc/condor/condor_config.local</code></h4>
<p>Add the following in the section where you see ALLOW_READ/WRITE keys</p>
<pre><code>ALLOW_ADMINISTRATOR = $(CONDOR_HOST)
ALLOW_OWNER = $(FULL_HOSTNAME), $(ALLOW_ADMINISTRATOR)
ALLOW_READ = *
ALLOW_WRITE = *
ALLOW_NEGOTIATOR = *
ALLOW_NEGOTIATOR_SCHEDD = *
ALLOW_WRITE_COLLECTOR = $(ALLOW_WRITE), $(FLOCK_FROM)
ALLOW_WRITE_STARTD = $(ALLOW_WRITE), $(FLOCK_FROM)
ALLOW_READ_COLLECTOR = $(ALLOW_READ), $(FLOCK_FROM)
ALLOW_READ_STARTD = $(ALLOW_READ), $(FLOCK_FROM)
ALLOW_CLIENT = *
ALLOW_ADVERTISE_STARTD = *
SEC_DEFAULT_NEGOTIATION = NEVER
SEC_DEFAULT_AUTHENTICATION = NEVER
</code></pre>
</div>
<div id="reconfig-restart-condor" class="section level4">
<h4><span class="header-section-number">4.5.2.3</span> Reconfig & Restart Condor</h4>
<pre><code>condor_reconfig
service condor restart</code></pre>
</div>
</div>
<div id="bisquesvc-submit-node-condor-config" class="section level3">
<h3><span class="header-section-number">4.5.3</span> BisqueSvc Submit Node Condor Config</h3>
<p>Here are the contents of the production configuration file <code>/etc/condor/condor_config.local</code>:</p>
<pre><code>CONDOR_HOST = master.condor
COLLECTOR_NAME = CBIUCSB
DAEMON_LIST = MASTER,SCHEDD,SHARED_PORT
CONDOR_ADMIN = [email protected]
## Do you want to use NFS for file access instead of remote system calls
ALLOW_READ = $(ALLOW_READ), 172.*, 10.*, 128.111.*, *.ece.ucsb.edu, *.cs.ucsb.edu
ALLOW_WRITE = $(ALLOW_WRITE), 172.*, 10.*, 128.111.*, *.ece.ucsb.edu, *.cs.ucsb.edu
ALLOW_NEGOTIATOR = 172.*, 10.*, 128.111.*
#https://lists.cs.wisc.edu/archive/htcondor-users/2016-December/msg00046.shtml
DISCARD_SESSION_KEYRING_ON_STARTUP = false
# Use CCB with shared port so outside units can talk to
USE_SHARED_PORT = TRUE
SHARED_PORT_ARGS = -p 9886
UPDATE_COLLECTOR_WITH_TCP = TRUE
BIND_ALL_INTERFACES = TRUE
# Slots for multi-cpu machines
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = 100%
SLOT_TYPE_1_PARTITIONABLE = true
START = True
PREEMPT = False
SUSPEND = False
KILL = False
WANT_SUSPEND = False
WANT_VACATE = False
CONTINUE = True</code></pre>
</div>
<div id="state-logs" class="section level3">
<h3><span class="header-section-number">4.5.4</span> State & logs</h3>
<ul>
<li><code>condor_status</code> shows the state of the pool.</li>
</ul>
<pre><code>root@bisquesvc:/source# condor_status
Name OpSys Arch State Activity LoadAv Mem ActvtyTime
slot1@master-7b988 LINUX X86_64 Unclaimed Idle -1.000 64423 0+13:07:28
slot1@worker-6c6cc LINUX X86_64 Unclaimed Idle -1.000 64423 0+13:04:24
Total Owner Claimed Unclaimed Matched Preempting Backfill
X86_64/LINUX 2 0 0 2 0 0 0
Total 2 0 0 2 0 0 0</code></pre>
<ul>
<li><code>condor_q</code> shows the state of the job queue. You can use <code>-analyze</code> to get additional details on the jobs.</li>
</ul>
<pre><code>root@bisquesvc:/source# condor_q
-- Schedd: bisquesvc : <10.42.0.15:40007>
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended</code></pre>
</div>
<div id="test-condor" class="section level3">
<h3><span class="header-section-number">4.5.5</span> Test Condor</h3>
<p>Condor is configured to run jobs as a regular user, not as root. Switch to the <code>bisque</code> user (<code>su bisque</code>) and submit jobs with regular user-level privileges.</p>
<div id="create-a-file-dock.sh-with-the-following-commands" class="section level5">
<h5><span class="header-section-number">4.5.5.0.1</span> Create a file dock.sh with the following commands</h5>
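<pre><code>#!/bin/bash
# $1 is the process number passed in by the submit file
echo "Hello HTCondor from Job $1 running on `whoami`@`hostname`"
# Confirm that Docker is available on the execute node
docker --version
sleep 10s
echo "Completed my first job"</code></pre>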
</div>
<div id="create-a-dock.submit-file-with-the-following-paramaters" class="section level5">
<h5><span class="header-section-number">4.5.5.0.2</span> Create a dock.submit file with the following parameters</h5>
<pre><code>executable = dock.sh
arguments = $(Process)
universe = vanilla
output = dock_$(Cluster)_$(Process).out
error = dock_$(Cluster)_$(Process).error
log = dock_$(Cluster)_$(Process).log
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
queue 2</code></pre>
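<p>As a rough illustration of how the <code>$(Cluster)</code> and <code>$(Process)</code> macros expand, the sketch below prints the output file names that the submit file above would be expected to produce. The function name <code>expected_outputs</code> is hypothetical and not part of HTCondor. It only mirrors the naming pattern, assuming processes are numbered from 0 as in the listings on this page.</p>

```shell
#!/bin/sh
# Hypothetical helper (not an HTCondor command): list the .out files that
# "output = dock_$(Cluster)_$(Process).out" combined with "queue N"
# should yield for a given cluster ID.
expected_outputs() {
    cluster="$1"
    count="$2"
    proc=0
    while [ "$proc" -lt "$count" ]; do
        echo "dock_${cluster}_${proc}.out"
        proc=$((proc + 1))
    done
}

# Example: cluster 11 submitted with "queue 2"
expected_outputs 11 2
```

<p>Comparing this list against the files actually present in the submit directory is one simple way to spot jobs that never wrote their output.</p>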
</div>
<div id="now-execute" class="section level5">
<h5><span class="header-section-number">4.5.5.0.3</span> Now execute</h5>
<pre><code>bisque@bisquesvc:~/condor_dock_test$ condor_submit dock.submit
Submitting job(s)..
2 job(s) submitted to cluster 10.</code></pre>
<ul>
<li>Check the Condor queue, which should now list the submitted jobs (they stay idle until they are matched to a slot)</li>
</ul>
<pre><code>bisque@bisquesvc:~/condor_dock_test$ condor_q
-- Schedd: bisquesvc : <10.42.0.15:9886?...
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
11.0 bisque 3/5 22:37 0+00:00:00 I 0 0.0 dock.sh 0
11.1 bisque 3/5 22:37 0+00:00:00 I 0 0.0 dock.sh 1
2 jobs; 0 completed, 0 removed, 2 idle, 0 running, 0 held, 0 suspended</code></pre>
<ul>
<li>When the master/worker nodes execute the jobs, <code>condor_status</code> shows the following state</li>
</ul>
<pre><code>bisque@bisquesvc:~/condor_dock_test$ condor_status
Name OpSys Arch State Activity LoadAv Mem ActvtyTime
slot1@master-7b988 LINUX X86_64 Unclaimed Idle 0.000 64295 0+00:00:04
slot1_1@master-7b9 LINUX X86_64 Claimed Busy -1.000 128 0+00:00:04
slot1@worker-6c6cc LINUX X86_64 Unclaimed Idle 0.000 64295 0+00:00:04
slot1_1@worker-6c6 LINUX X86_64 Claimed Busy -1.000 128 0+00:00:04
Total Owner Claimed Unclaimed Matched Preempting Backfill
X86_64/LINUX 4 0 2 2 0 0 0
Total 4 0 2 2 0 0 0</code></pre>
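<p>For scripted checks, the slot states can be counted from the <code>condor_status</code> text. The following is a minimal sketch, assuming the default column layout shown above (the fourth whitespace-separated field is the State column). <code>count_slots</code> is a hypothetical helper name, not an HTCondor command.</p>

```shell
#!/bin/sh
# Hypothetical helper: count slots in a given state from condor_status
# text read on stdin. Only rows whose first field starts with "slot" are
# considered, so the header and the summary rows are ignored.
count_slots() {
    awk -v state="$1" '$1 ~ /^slot/ && $4 == state { n++ } END { print n + 0 }'
}

# Demo on two rows copied from the listing above; prints 1
count_slots Claimed <<'EOF'
slot1@master-7b988 LINUX X86_64 Unclaimed Idle 0.000 64295 0+00:00:04
slot1_1@master-7b9 LINUX X86_64 Claimed Busy -1.000 128 0+00:00:04
EOF

# Typical use on a node with HTCondor installed:
#   condor_status | count_slots Claimed
```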
<ul>
<li>After about 10 seconds the jobs terminate and write their results to the corresponding output and log files</li>
</ul>
<pre><code>bisque@bisquesvc:~/condor_dock_test$ condor_status
Name OpSys Arch State Activity LoadAv Mem ActvtyTime
slot1@master-7b988 LINUX X86_64 Unclaimed Idle 0.000 64295 0+00:00:04
slot1@worker-6c6cc LINUX X86_64 Unclaimed Idle 0.000 64295 0+00:00:04
Total Owner Claimed Unclaimed Matched Preempting Backfill
X86_64/LINUX 2 0 0 2 0 0 0
Total 2 0 0 2 0 0 0
bisque@bisquesvc:~/condor_dock_test$ condor_q
-- Schedd: bisquesvc : <10.42.0.15:9886?...
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended</code></pre>
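<p>Instead of re-running <code>condor_q</code> by hand, a script can wait for the queue to drain. The sketch below assumes the summary line format shown above (a line beginning with <code>0 jobs;</code>). Other HTCondor versions may print the summary differently, so treat the pattern as an assumption. <code>queue_is_empty</code> is a hypothetical helper name.</p>

```shell
#!/bin/sh
# Hypothetical helper: succeed if the condor_q text on stdin reports an
# empty queue. Assumes the summary line starts with "0 jobs;" as in the
# listings on this page.
queue_is_empty() {
    grep -q '^0 jobs;'
}

# Demo on the summary line from the listing above
if echo "0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended" | queue_is_empty; then
    echo "queue is empty"
fi

# Typical use on the submit node:
#   until condor_q | queue_is_empty; do sleep 5; done
```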
<ul>
<li>To verify the execution on the worker/master nodes, look at <code>/var/log/condor/StartLog</code></li>
</ul>
<pre><code># Logs at the master condor node
03/05/19 22:37:54 slot1_1: Request accepted.
03/05/19 22:37:54 slot1_1: Remote owner is bisque@bisquesvc
03/05/19 22:37:54 slot1_1: State change: claiming protocol successful
03/05/19 22:37:54 slot1_1: Changing state: Owner -> Claimed
03/05/19 22:37:54 slot1_1: Got activate_claim request from shadow (10.42.0.15)
03/05/19 22:37:54 /proc format unknown for kernel version 4.15.0
03/05/19 22:37:54 slot1_1: Remote job ID is 11.0
03/05/19 22:37:54 slot1_1: Got universe "VANILLA" (5) from request classad
03/05/19 22:37:54 slot1_1: State change: claim-activation protocol successful
03/05/19 22:37:54 slot1_1: Changing activity: Idle -> Busy
03/05/19 22:37:59 /proc format unknown for kernel version 4.15.0
03/05/19 22:38:04 /proc format unknown for kernel version 4.15.0
03/05/19 22:38:09 /proc format unknown for kernel version 4.15.0
03/05/19 22:38:09 slot1_1: Called deactivate_claim_forcibly()
03/05/19 22:38:09 Starter pid 1141 exited with status 0
03/05/19 22:38:09 slot1_1: State change: starter exited
03/05/19 22:38:09 slot1_1: Changing activity: Busy -> Idle
03/05/19 22:38:09 slot1_1: State change: received RELEASE_CLAIM command
03/05/19 22:38:09 slot1_1: Changing state and activity: Claimed/Idle -> Preempting/Vacating
03/05/19 22:38:09 slot1_1: State change: No preempting claim, returning to owner
03/05/19 22:38:09 slot1_1: Changing state and activity: Preempting/Vacating -> Owner/Idle
03/05/19 22:38:09 slot1_1: State change: IS_OWNER is false
03/05/19 22:38:09 slot1_1: Changing state: Owner -> Unclaimed
03/05/19 22:38:09 slot1_1: Changing state: Unclaimed -> Delete
03/05/19 22:38:09 slot1_1: Resource no longer needed, deleting</code></pre>
<ul>
<li>The final output is in the <code>dock_11_*.out</code> files, where 11 is the job (cluster) ID</li>
</ul>
<pre><code>bisque@bisquesvc:~/condor_dock_test$ ll
total 36
drwxrwxr-x 2 bisque bisque 4096 Mar 5 22:37 ./
drwxr-xr-x 1 bisque bisque 4096 Mar 5 22:34 ../
-rw-r--r-- 1 bisque bisque 0 Mar 5 22:37 dock_11_0.error
-rw-rw-r-- 1 bisque bisque 1021 Mar 5 22:38 dock_11_0.log
-rw-r--r-- 1 bisque bisque 132 Mar 5 22:38 dock_11_0.out
-rw-r--r-- 1 bisque bisque 0 Mar 5 22:37 dock_11_1.error
-rw-rw-r-- 1 bisque bisque 1021 Mar 5 22:38 dock_11_1.log
-rw-r--r-- 1 bisque bisque 132 Mar 5 22:38 dock_11_1.out
-rw-r--r-- 1 bisque bisque 163 Mar 5 22:34 dock.sh
-rw-r--r-- 1 bisque bisque 253 Mar 5 09:07 dock.submit</code></pre>
<ul>
<li>Output Result</li>
</ul>
<pre><code>bisque@bisquesvc:~/condor_dock_test$ cat dock_11_1.out
Hello HTCondor from Job 1 running on nobody@master-7b988ddb7d-hnspj
Docker version 17.03.0-ce, build 60ccb22
Completed my first job</code></pre>
<ul>
<li>Output Log</li>
</ul>
<pre><code>bisque@bisquesvc:~/condor_dock_test$ cat dock_11_1.log
000 (011.001.000) 03/05 22:37:44 Job submitted from host: <10.42.0.15:9886?addrs=10.42.0.15-9886&noUDP&sock=6207_60d7_3>
...
001 (011.001.000) 03/05 22:37:57 Job executing on host: <10.42.0.8:9886?sock=30363_868c>
...
006 (011.001.000) 03/05 22:38:07 Image size of job updated: 1
8 - MemoryUsage of job (MB)
7560 - ResidentSetSize of job (KB)
...
005 (011.001.000) 03/05 22:38:09 Job terminated.
(1) Normal termination (return value 0)
Usr 0 00:00:00, Sys 0 00:00:00 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
132 - Run Bytes Sent By Job
163 - Run Bytes Received By Job
132 - Total Bytes Sent By Job
163 - Total Bytes Received By Job
Partitionable Resources : Usage Request Allocated
Cpus : 1 1
Disk (KB) : 9 1 893677
Memory (MB) : 8 1 1
...
</code></pre>
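<p>The job log can also be checked from a script. The following is a minimal sketch that looks for the "Normal termination (return value 0)" line seen in the log excerpt above. <code>job_succeeded</code> is a hypothetical helper name, and the <code>dock_*.log</code> pattern assumes the log naming from the test submit file.</p>

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given HTCondor job log records
# a normal termination with return value 0.
job_succeeded() {
    grep -q 'Normal termination (return value 0)' "$1"
}

# Example: check every per-job log written by the test submission
for log in dock_*.log; do
    [ -e "$log" ] || continue   # skip if the glob matched nothing
    if job_succeeded "$log"; then
        echo "$log: OK"
    else
        echo "$log: no normal termination recorded"
    fi
done
```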
<div id="references" class="section level6">
<h6><span class="header-section-number">4.5.5.0.3.1</span> References:</h6>
<ul>
<li>Condor <a href="https://research.cs.wisc.edu/htcondor/manual/v8.2/2_5Submitting_Job.html">Submit Jobs (Official Docs)</a>
<ul>
<li>Multi Node <a href="https://spinningmatt.wordpress.com/2011/06/12/getting-started-creating-a-multiple-node-condor-pool/">Condor Pool (Blog)</a></li>
<li>Simple <a href="https://www.linux.com/news/setting-condor-cluster">Condor Cluster (Blog)</a></li>
</ul></li>
<li>Condor <a href="https://research.cs.wisc.edu/htcondor/manual/v8.8/Security.html">Examples of Security Configurations (Official Docs)</a></li>
<li>Install Docker Community Edition <a href="https://docs.docker.com/install/linux/docker-ce/ubuntu/" class="uri">https://docs.docker.com/install/linux/docker-ce/ubuntu/</a></li>
</ul>
</div>
</div>
</div>
</div>
<!-- </div> -->
</section>
</div>
</div>
</div>
<a href="connoisseurgpu-workload-provisioning.html" class="navigation navigation-prev navigation-unique" aria-label="Previous page"><i class="fa fa-angle-left"></i></a>
</div>
</div>
<script src="libs/gitbook-2.6.7/js/app.min.js"></script>
<script src="libs/gitbook-2.6.7/js/lunr.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-search.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-sharing.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-fontsettings.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-bookdown.js"></script>
<script src="libs/gitbook-2.6.7/js/jquery.highlight.js"></script>
<script>
gitbook.require(["gitbook"], function(gitbook) {
gitbook.start({
"sharing": {
"github": false,
"facebook": true,
"twitter": true,
"google": false,
"linkedin": false,
"weibo": false,
"instapaper": false,
"vk": false,
"all": ["facebook", "google", "twitter", "linkedin", "weibo", "instapaper"]
},
"fontsettings": {
"theme": "white",
"family": "sans",
"size": 2
},
"edit": {
"link": null,
"text": null
},
"history": {
"link": null,
"text": null
},
"download": ["BisQue_Documentation.pdf", "BisQue_Documentation.epub"],
"toc": {
"collapse": "subsection"
}
});
});
</script>
</body>
</html>