This repository was archived by the owner on May 19, 2023. It is now read-only.

Commit dcd7486

Merge pull request #229 from brhodes10/fix/cybert-multi-gpu
Fix cybert streamz dask startup
2 parents 7601736 + 7ccc29e

File tree: 4 files changed (+31 -9 lines)


CHANGELOG.md (+1)

```diff
@@ -8,6 +8,7 @@
 
 ## Bug Fixes
 - PR #231 Fix segmentation fault in cybert notebook
+- PR #229 Fix cybert streamz script
 
 # clx 0.15.0 (26 Aug 2020)
```

examples/streamz/README.md (+19 -5)

````diff
@@ -30,13 +30,13 @@ Create a new container using the image above. When running this container, it wi
 
 ##### Preferred - Docker CE v19+ and nvidia-container-toolkit
 ```
-docker run -it --gpus '"device=0,1,2"' -p 8787:8787 -v /path/to/dataset:/path/to/dataset -v /path/to/model.pth:/path/to/model.pth -v /path/to/label.txt:/path/to/label.txt --name cybert-streamz -d cybert-streamz:latest \
+docker run -it --gpus '"device=0,1,2"' -p 8787:8787 -v /path/to/dataset:/path/to/dataset -v /path/to/model.pth:/path/to/model.pth -v /path/to/label.txt:/path/to/label.json --name cybert-streamz -d cybert-streamz:latest \
 --broker localhost:9092 \
 --group_id streamz \
 --input_topic input \
 --output_topic output \
 --model_file /path/to/model.pth \
---label_file /path/to/label.txt \
+--label_file /path/to/label.json \
 --cuda_visible_devices 0,1,2 \
 --poll_interval 1s \
 --max_batch_size 1000 \
@@ -45,19 +45,33 @@ docker run -it --gpus '"device=0,1,2"' -p 8787:8787 -v /path/to/dataset:/path/to
 
 ##### Legacy - Docker CE v18 and nvidia-docker2
 ```
-docker run -it --runtime=nvidia -p 8787:8787 -v /path/to/dataset:/path/to/dataset -v /path/to/model.pth:/path/to/model.pth -v /path/to/label.txt:/path/to/label.txt --name cybert-streamz -d cybert-streamz:latest \
+docker run -it --runtime=nvidia -p 8787:8787 -v /path/to/dataset:/path/to/dataset -v /path/to/model.pth:/path/to/model.pth -v /path/to/label.txt:/path/to/label.json --name cybert-streamz -d cybert-streamz:latest \
 --broker localhost:9092 \
 --group_id streamz \
 --input_topic input \
 --output_topic output \
 --model_file /path/to/model.pth \
---label_file /path/to/label.yaml \
+--label_file /path/to/label.json \
 --cuda_visible_devices 0,1,2 \
 --poll_interval 1s \
 --max_batch_size 1000 \
 --data /path/to/dataset
 ```
 
+Parameters
+- `broker`* - Host and port where the Kafka broker is running
+- `group_id`* - Kafka [group id](https://docs.confluent.io/current/installation/configuration/consumer-configs.html#group.id) that uniquely identifies the streamz data consumer.
+- `input_topic` - The name of the input topic to which the input dataset is sent. Any name can be used here.
+- `output_topic` - The name of the output topic to which the output data is sent. Any name can be used here.
+- `model_file` - The path to your model file
+- `label_file` - The path to your label file
+- `cuda_visible_devices` - List of GPUs used to run streamz with Dask. The GPUs should be equal to, or a subset of, the devices indicated in the docker run command (in the example above the device list is set to `'"device=0,1,2"'`)
+- `poll_interval`* - Interval (in seconds) to poll the Kafka input topic for data
+- `max_batch_size`* - Max batch size of data (max number of logs) to ingest into streamz with each `poll_interval`
+- `data` - The input dataset to use for this streamz example
+
+``*`` = More information on these parameters can be found in the streamz [documentation](https://streamz.readthedocs.io/en/latest/api.html#streamz.from_kafka_batched).
+
 View the data processing activity on the dask dashboard by visiting `localhost:8787` or `<host>:8787`
 
 View the cyBERT script output in the container logs
@@ -68,7 +82,7 @@ docker logs cybert-streamz
 
 Processed data will be pushed to the kafka topic named `output`. To view all processed output run:
 ```
-docker exec cybert-streamz bash -c 'source activate clx && $KAFKA_HOME/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic output --from-beginning'
+docker exec cybert-streamz bash -c 'source activate rapids && $KAFKA_HOME/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic output --from-beginning'
 ```
 
 ##### Benchmark
````
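Per the parameter notes above, `cuda_visible_devices` must be equal to, or a subset of, the devices exposed by the `docker run --gpus` flag. A minimal Python sketch of that subset check (the helper name and the assumption that the flag arrives in `device=0,1,2` form are hypothetical, not part of this repo):

```python
def is_valid_device_list(gpus_flag: str, cuda_visible_devices: str) -> bool:
    """Check that every GPU requested for the streamz workers is also
    exposed to the container, e.g. gpus_flag="device=0,1,2"."""
    exposed = set(gpus_flag.removeprefix("device=").split(","))
    requested = set(cuda_visible_devices.split(","))
    return requested <= exposed
```

For example, `is_valid_device_list("device=0,1,2", "0,1")` passes, while requesting a device the container was never given fails.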

examples/streamz/python/cybert.py (+4 -2)

```diff
@@ -127,7 +127,9 @@ def parse_arguments():
         help="Dask scheduler address. If not provided a new dask cluster will be created",
     )
     parser.add_argument(
-        "--cuda_visible_devices", type=str, help="Cuda visible devices (ex: '0,1,2')",
+        "--cuda_visible_devices",
+        type=str,
+        help="Cuda visible devices (ex: '0,1,2')",
     )
     parser.add_argument(
         "--max_batch_size",
@@ -180,7 +182,7 @@ def create_dask_client(dask_scheduler, cuda_visible_devices=[0]):
         "group.id": args.group_id,
         "session.timeout.ms": 60000,
         "enable.partition.eof": "true",
-        "auto.offset.reset": "latest",
+        "auto.offset.reset": "earliest",
     }
     print("Consumer conf:", consumer_conf)
     source = Stream.from_kafka_batched(
```
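The switch from `latest` to `earliest` changes Kafka's behavior for a consumer group with no committed offset: the streamz consumer now replays the input topic from the first message instead of only seeing messages produced after it subscribes. A sketch of the resulting consumer config (the literal `bootstrap.servers`/`group.id` values here are illustrative stand-ins for the `--broker` and `--group_id` arguments):

```python
consumer_conf = {
    "bootstrap.servers": "localhost:9092",  # illustrative; from --broker
    "group.id": "streamz",                  # illustrative; from --group_id
    "session.timeout.ms": 60000,
    "enable.partition.eof": "true",
    # "earliest": a consumer group with no committed offset starts from the
    # beginning of the topic, so data published before the pipeline came up
    # is still processed rather than silently skipped.
    "auto.offset.reset": "earliest",
}
```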

examples/streamz/scripts/entry.sh (+7 -2)

```diff
@@ -131,8 +131,13 @@ log "INFO" "Dask scheduler running"
 #**********************************
 # Start Dask CUDA Worker
 #**********************************
-nohup dask-cuda-worker localhost:8786 2>&1 &
-
+IFS=',' read -ra devices <<< "$cuda_visible_devices"
+for i in "${devices[@]}"
+do
+  echo "CUDA_VISIBLE_DEVICES=$i nohup dask-cuda-worker localhost:8786 2>&1 &"
+  CUDA_VISIBLE_DEVICES=$i nohup dask-cuda-worker localhost:8786 2>&1 &
+done
+sleep 3
 #**********************************
 # Start Jupyter Notebook
 #**********************************
```
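The entry-script fix above replaces a single `dask-cuda-worker` launch with one worker per GPU, each pinned to its device via `CUDA_VISIBLE_DEVICES`. A Python sketch of the same expansion (a hypothetical helper mirroring the bash `IFS=',' read` loop; it only builds the command strings, it does not run them):

```python
def worker_launch_commands(cuda_visible_devices: str,
                           scheduler: str = "localhost:8786") -> list[str]:
    """Expand a comma-separated device list ("0,1,2") into one
    dask-cuda-worker launch command per GPU, each pinned to its device."""
    return [
        f"CUDA_VISIBLE_DEVICES={dev} nohup dask-cuda-worker {scheduler} 2>&1 &"
        for dev in cuda_visible_devices.split(",")
    ]
```

Pinning each worker to a single GPU is what makes the multi-GPU startup work: an unpinned worker would leave the remaining devices idle.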

0 commit comments