[DOC]: What is the labels file for in 2_real_world_phishing? #1208

Closed
nyck33 opened this issue Sep 21, 2023 · 3 comments · Fixed by #1215
Labels: doc (Improvements or additions to documentation), external (This issue was filed by someone outside of the Morpheus team), Needs Triage (Need team to review and classify)



nyck33 commented Sep 21, 2023

How would you describe the priority of this documentation request

Medium

Describe the future/missing documentation

I looked at the labels file and it only contains the following:

(morpheus) root@4c26e05a1dc0:/workspace/morpheus/models/data# ls
bert-base-cased-hash.txt    bert-base-uncased-vocab.txt  columns_ae_cloudtrail.txt  labels_ae.txt        public_suffix_list.dat
bert-base-cased-vocab.txt   columns_ae.txt               columns_ae_duo.txt         labels_nlp.txt       splunk_notable_regex.yaml
bert-base-uncased-hash.txt  columns_ae_azure.txt         columns_fil.txt            labels_phishing.txt  windows_event_regex.yaml
(morpheus) root@4c26e05a1dc0:/workspace/morpheus/models/data# cat labels_phishing.txt
score
pred

Then after running the example I get output like: https://gist.github.com/nyck33/8cc7f8622088f16ebbbe4a5071632db4

But I can only see some [SEP] tokens inserted. I thought that, like the abp_nvsmi example, I would get a file with either 1 or 0 to indicate whether an email was flagged as a potential phishing attempt or not.

The output below makes it seem like the run was successful. It took the Triton inference server some time to finish: the counter on the last line below started at 0, and by the time I came back it had reached 100 (I left the screen because it was slow, but my GPU is not that great).

(morpheus) root@4c26e05a1dc0:/workspace/morpheus# morpheus --log_level=debug --plugin examples/developer_guide/2_1_real_world_phishing/recipient_features_stage.py \
  run pipeline-nlp --labels_file=data/labels_phishing.txt --model_seq_length=128 \
  from-file --filename=examples/data/email_with_addresses.jsonlines \
  recipient-features \
  deserialize \
  preprocess --vocab_hash_file=data/bert-base-uncased-hash.txt --truncation=true --do_lower_case=true --add_special_tokens=false \
  inf-triton --model_name=phishing-bert-onnx --server_url=192.168.2.101:8000 --force_convert_inputs=true \
  monitor --description="Inference Rate" --smoothing=0.001 --unit=inf \
  filter --threshold=0.9 --filter_source=TENSOR \
  serialize \
  to-file --filename=/tmp/detections.jsonlines --overwrite
Parameter, 'labels_file', with relative path, 'data/labels_phishing.txt', does not exist. Using package relative location: '/opt/conda/envs/morpheus/lib/python3.10/site-packages/morpheus/data/labels_phishing.txt'
Configuring Pipeline via CLI
Loaded labels file. Current labels: [['score', 'pred']]
Parameter, 'vocab_hash_file', with relative path, 'data/bert-base-uncased-hash.txt', does not exist. Using package relative location: '/opt/conda/envs/morpheus/lib/python3.10/site-packages/morpheus/data/bert-base-uncased-hash.txt'
Starting pipeline via CLI... Ctrl+C to Quit
Config: 
{
  "ae": null,
  "class_labels": [
    "score",
    "pred"
  ],
  "debug": false,
  "edge_buffer_size": 128,
  "feature_length": 128,
  "fil": null,
  "log_config_file": null,
  "log_level": 10,
  "mode": "NLP",
  "model_max_batch_size": 8,
  "num_threads": 12,
  "pipeline_batch_size": 256,
  "plugins": [
    "examples/developer_guide/2_1_real_world_phishing/recipient_features_stage.py"
  ]
}
CPP Enabled: True
====Registering Pipeline====
W20230921 00:38:02.017280    63 thread.cpp:137] unable to set memory policy - if using docker use: --cap-add=sys_nice to allow membind
====Building Pipeline====
Inference Rate: 0 inf [00:00, ? inf/s]====Building Pipeline Complete!====
Starting! Time: 1695256682.0191388
====Registering Pipeline Complete!====
====Starting Pipeline====
====Pipeline Started====
====Building Segment: linear_segment_0====
Added source: <from-file-0; FileSourceStage(filename=examples/data/email_with_addresses.jsonlines, iterative=False, file_type=FileTypes.Auto, repeat=1, filter_null=True, parser_kwargs=None)>
  └─> morpheus.MessageMeta
Added stage: <recipient-features-1; RecipientFeaturesStage(sep_token=[SEP])>
  └─ morpheus.MessageMeta -> morpheus.MessageMeta
Added stage: <deserialize-2; DeserializeStage(ensure_sliceable_index=True)>
  └─ morpheus.MessageMeta -> morpheus.MultiMessage
Added stage: <preprocess-nlp-3; PreprocessNLPStage(vocab_hash_file=/opt/conda/envs/morpheus/lib/python3.10/site-packages/morpheus/data/bert-base-uncased-hash.txt, truncation=True, do_lower_case=True, add_special_tokens=False, stride=-1, column=data)>
  └─ morpheus.MultiMessage -> morpheus.MultiInferenceNLPMessage
Added stage: <inference-4; TritonInferenceStage(model_name=phishing-bert-onnx, server_url=192.168.2.101:8000, force_convert_inputs=True, use_shared_memory=False)>
  └─ morpheus.MultiInferenceNLPMessage -> morpheus.MultiResponseMessage
Added stage: <monitor-5; MonitorStage(description=Inference Rate, smoothing=0.001, unit=inf, delayed_start=False, determine_count_fn=None, log_level=LogLevels.INFO)>
  └─ morpheus.MultiResponseMessage -> morpheus.MultiResponseMessage
Added stage: <filter-6; FilterDetectionsStage(threshold=0.9, copy=True, filter_source=FilterSource.TENSOR, field_name=probs)>
  └─ morpheus.MultiResponseMessage -> morpheus.MultiResponseMessage
Added stage: <serialize-7; SerializeStage(include=(), exclude=(), fixed_columns=True)>
  └─ morpheus.MultiResponseMessage -> morpheus.MessageMeta
Added stage: <to-file-8; WriteToFileStage(filename=/tmp/detections.jsonlines, overwrite=True, file_type=FileTypes.Auto, include_index_col=True, flush=False)>
  └─ morpheus.MessageMeta -> morpheus.MessageMeta
====Building Segment Complete!====
Inference Rate[Complete]: 100 inf [00:01, 77.91 inf/s]
====Pipeline Complete====

Where have you looked?

https://github.com/nv-morpheus/Morpheus/blob/branch-23.11/docs/source/developer_guide/guides/2_real_world_phishing.md

Code of Conduct

  • I agree to follow this project's Code of Conduct
  • I have searched the open documentation issues and have found no duplicates for this bug report
nyck33 added the doc (Improvements or additions to documentation) label Sep 21, 2023
jarmak-nv added the Needs Triage (Need team to review and classify) and external (This issue was filed by someone outside of the Morpheus team) labels Sep 21, 2023
jarmak-nv (Contributor) commented:

Hi @nyck33!

Thanks for submitting this issue - our team has been notified and we'll get back to you as soon as we can!
In the meantime, feel free to add any relevant information to this issue.

nyck33 (Author) commented Sep 21, 2023

I also noticed an overflow issue in the Triton server output when running this example.


=============================
== Triton Inference Server ==
=============================

NVIDIA Release 23.06 (build 62878575)
Triton Server Version 2.35.0

Copyright (c) 2018-2023, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

I0921 00:38:52.141891 1 libtorch.cc:2253] TRITONBACKEND_Initialize: pytorch
I0921 00:38:52.141927 1 libtorch.cc:2263] Triton TRITONBACKEND API version: 1.13
I0921 00:38:52.141931 1 libtorch.cc:2269] 'pytorch' TRITONBACKEND API version: 1.13
I0921 00:38:52.272158 1 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f7fd2000000' with size 268435456
I0921 00:38:52.272432 1 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0921 00:38:52.274662 1 model_lifecycle.cc:462] loading: phishing-bert-onnx:1
I0921 00:38:52.276894 1 onnxruntime.cc:2530] TRITONBACKEND_Initialize: onnxruntime
I0921 00:38:52.276905 1 onnxruntime.cc:2540] Triton TRITONBACKEND API version: 1.13
I0921 00:38:52.276909 1 onnxruntime.cc:2546] 'onnxruntime' TRITONBACKEND API version: 1.13
I0921 00:38:52.276913 1 onnxruntime.cc:2576] backend configuration:
{"cmdline":{"auto-complete-config":"false","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}}
I0921 00:38:52.287300 1 onnxruntime.cc:2641] TRITONBACKEND_ModelInitialize: phishing-bert-onnx (version 1)
I0921 00:38:52.287786 1 onnxruntime.cc:2702] TRITONBACKEND_ModelInstanceInitialize: phishing-bert-onnx (GPU device 0)
2023-09-21 00:38:57.599823815 [W:onnxruntime:log, tensorrt_execution_provider.h:75 log] [2023-09-21 00:38:57 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2023-09-21 00:39:01.684426573 [W:onnxruntime:log, tensorrt_execution_provider.h:75 log] [2023-09-21 00:39:01 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
I0921 00:39:01.724387 1 model_lifecycle.cc:815] successfully loaded 'phishing-bert-onnx'
I0921 00:39:01.724490 1 server.cc:603] 
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0921 00:39:01.724545 1 server.cc:630] 
+-------------+-------------------------------------+-------------------------------------+
| Backend     | Path                                | Config                              |
+-------------+-------------------------------------+-------------------------------------+
| pytorch     | /opt/tritonserver/backends/pytorch/ | {}                                  |
|             | libtriton_pytorch.so                |                                     |
| onnxruntime | /opt/tritonserver/backends/onnxrunt | {"cmdline":{"auto-complete-config": |
|             | ime/libtriton_onnxruntime.so        | "false","backend-directory":"/opt/t |
|             |                                     | ritonserver/backends","min-compute- |
|             |                                     | capability":"6.000000","default-max |
|             |                                     | -batch-size":"4"}}                  |
|             |                                     |                                     |
+-------------+-------------------------------------+-------------------------------------+

I0921 00:39:01.724573 1 server.cc:673] 
+--------------------+---------+--------+
| Model              | Version | Status |
+--------------------+---------+--------+
| phishing-bert-onnx | 1       | READY  |
+--------------------+---------+--------+

I0921 00:39:01.751313 1 metrics.cc:808] Collecting metrics for GPU 0: NVIDIA GeForce GTX 1650
I0921 00:39:01.751570 1 metrics.cc:701] Collecting CPU metrics
I0921 00:39:01.751753 1 tritonserver.cc:2385] 
+----------------------------------+------------------------------------------------------+
| Option                           | Value                                                |
+----------------------------------+------------------------------------------------------+
| server_id                        | triton                                               |
| server_version                   | 2.35.0                                               |
| server_extensions                | classification sequence model_repository model_repos |
|                                  | itory(unload_dependents) schedule_policy model_confi |
|                                  | guration system_shared_memory cuda_shared_memory bin |
|                                  | ary_tensor_data parameters statistics trace logging  |
| model_repository_path[0]         | /models/triton-model-repo                            |
| model_control_mode               | MODE_EXPLICIT                                        |
| startup_models_0                 | phishing-bert-onnx                                   |
| strict_model_config              | 1                                                    |
| rate_limit                       | OFF                                                  |
| pinned_memory_pool_byte_size     | 268435456                                            |
| cuda_memory_pool_byte_size{0}    | 67108864                                             |
| min_supported_compute_capability | 6.0                                                  |
| strict_readiness                 | 0                                                    |
| exit_timeout                     | 30                                                   |
| cache_enabled                    | 0                                                    |
+----------------------------------+------------------------------------------------------+

I0921 00:39:01.752647 1 grpc_server.cc:2445] Started GRPCInferenceService at 0.0.0.0:8001
I0921 00:39:01.752809 1 http_server.cc:3555] Started HTTPService at 0.0.0.0:8000
I0921 00:39:01.794237 1 http_server.cc:185] Started Metrics Service at 0.0.0.0:8002
W0921 00:39:02.753767 1 metrics.cc:573] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0921 00:39:03.754510 1 metrics.cc:573] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0921 00:39:04.756759 1 metrics.cc:573] Unable to get power limit for GPU 0. Status:Success, value:0.000000
2023-09-21 00:41:27.789821861 [W:onnxruntime:log, tensorrt_execution_provider.h:75 log] [2023-09-21 00:41:27 WARNING] TensorRT encountered issues when converting weights between types and that could affect accuracy.
2023-09-21 00:41:27.789843932 [W:onnxruntime:log, tensorrt_execution_provider.h:75 log] [2023-09-21 00:41:27 WARNING] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
2023-09-21 00:41:27.789849589 [W:onnxruntime:log, tensorrt_execution_provider.h:75 log] [2023-09-21 00:41:27 WARNING] Check verbose logs for the list of affected weights.
2023-09-21 00:41:27.789855735 [W:onnxruntime:log, tensorrt_execution_provider.h:75 log] [2023-09-21 00:41:27 WARNING] - 123 weights are affected by this issue: Detected subnormal FP16 values.
2023-09-21 00:41:27.789899946 [W:onnxruntime:log, tensorrt_execution_provider.h:75 log] [2023-09-21 00:41:27 WARNING] - 59 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
2023-09-21 00:41:27.789914263 [W:onnxruntime:log, tensorrt_execution_provider.h:75 log] [2023-09-21 00:41:27 WARNING] - 1 weights are affected by this issue: Detected finite FP32 values which would overflow in FP16 and converted them to the closest finite FP16 value.
2023-09-21 00:43:13.775554064 [W:onnxruntime:log, tensorrt_execution_provider.h:75 log] [2023-09-21 00:43:13 WARNING] TensorRT encountered issues when converting weights between types and that could affect accuracy.
2023-09-21 00:43:13.775575086 [W:onnxruntime:log, tensorrt_execution_provider.h:75 log] [2023-09-21 00:43:13 WARNING] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
2023-09-21 00:43:13.775580813 [W:onnxruntime:log, tensorrt_execution_provider.h:75 log] [2023-09-21 00:43:13 WARNING] Check verbose logs for the list of affected weights.
2023-09-21 00:43:13.775586401 [W:onnxruntime:log, tensorrt_execution_provider.h:75 log] [2023-09-21 00:43:13 WARNING] - 123 weights are affected by this issue: Detected subnormal FP16 values.
2023-09-21 00:43:13.775634243 [W:onnxruntime:log, tensorrt_execution_provider.h:75 log] [2023-09-21 00:43:13 WARNING] - 59 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
2023-09-21 00:43:13.775664764 [W:onnxruntime:log, tensorrt_execution_provider.h:75 log] [2023-09-21 00:43:13 WARNING] - 1 weights are affected by this issue: Detected finite FP32 values which would overflow in FP16 and converted them to the closest finite FP16 value.


efajardo-nv (Contributor) commented Sep 21, 2023

@nyck33 Thank you for reporting this. I was able to reproduce your results. The labels file should have not_phishing and is_phishing to correspond to the model outputs. The Morpheus command also needs an add-class stage to get the not_phishing/is_phishing boolean output you're looking for. Here's an updated command that should work for you:

morpheus --log_level=debug --plugin examples/developer_guide/2_1_real_world_phishing/recipient_features_stage.py \
      run pipeline-nlp --label=not_phishing --label=is_phishing --model_seq_length=128 \
      from-file --filename=examples/data/email_with_addresses.jsonlines \
      recipient-features \
      deserialize \
      preprocess --vocab_hash_file=data/bert-base-uncased-hash.txt --truncation=true --do_lower_case=true --add_special_tokens=false \
      inf-triton --model_name=phishing-bert-onnx --server_url=localhost:8001 --force_convert_inputs=true \
      monitor --description="Inference Rate" --smoothing=0.001 --unit=inf \
      add-class \
      serialize \
      to-file --filename=/tmp/detections.jsonlines --overwrite
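
For reference, the corrected labels file would simply list the two class labels, one per line, in the same order as the model outputs; something like:

(morpheus) root@4c26e05a1dc0:/workspace/morpheus# cat models/data/labels_phishing.txt
not_phishing
is_phishing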

I'm manually specifying the labels here. Here's an example of how to use add-scores to get the probabilities:

morpheus --log_level=debug --plugin examples/developer_guide/2_1_real_world_phishing/recipient_features_stage.py \
        run pipeline-nlp --label=not_phishing --label=is_phishing --model_seq_length=128 \
        from-file --filename=examples/data/email_with_addresses.jsonlines \
        recipient-features \
        deserialize \
        preprocess --vocab_hash_file=data/bert-base-uncased-hash.txt --truncation=true --do_lower_case=true --add_special_tokens=false \
        inf-triton --model_name=phishing-bert-onnx --server_url=localhost:8001 --force_convert_inputs=true \
        monitor --description="Inference Rate" --smoothing=0.001 --unit=inf \
        add-scores --label=is_phishing \
        serialize \
        to-file --filename=/tmp/detections.jsonlines --overwrite
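
With add-scores, each line written to /tmp/detections.jsonlines should include an is_phishing column holding the probability for that class alongside the original fields. One output line would look roughly like this (field values here are hypothetical):

{"data": "... email text with [SEP]-joined recipient features ...", "is_phishing": 0.9987}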

The overflow warnings from Triton are from automatic ONNX to TensorRT conversion we're doing here:
https://github.com/nv-morpheus/Morpheus/blob/branch-23.11/models/triton-model-repo/phishing-bert-onnx/config.pbtxt#L31-L37
This seems to be caused by using precision mode FP16. The warnings go away when I switch it to FP32, though inference will run a little slower. There are slight differences in the probability scores, but the predictions are the same between FP16 and FP32.
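
For reference, the precision mode is set in the execution-accelerator section of the model's config.pbtxt. A sketch of what the FP32 variant of the linked block would look like (the workspace size here is just a placeholder value):

optimization {
  execution_accelerators {
    gpu_execution_accelerator : [
      {
        name : "tensorrt"
        # "FP16" triggers the weight-conversion warnings; "FP32" avoids them
        parameters { key: "precision_mode" value: "FP32" }
        parameters { key: "max_workspace_size_bytes" value: "1073741824" }
      }
    ]
  }
}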

Thanks again for reporting this. I'll create a PR to make the corrections.

efajardo-nv self-assigned this Sep 21, 2023
nyck33 closed this as completed Sep 22, 2023
rapids-bot bot pushed a commit that referenced this issue Sep 25, 2023
- Update labels file to reflect actual model outputs: `not_phishing` and `is_phishing`
- Add `AddScores` stage to Python and CLI examples
- Remove `FilterDetectionsStage` because the model output contains probabilities for both `not_phishing` and `is_phishing`, one of which will always exceed the threshold (e.g. a row of [0.02, 0.98] against a threshold of 0.9). This results in nothing ever being filtered out.

Closes #1208

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/nv-morpheus/Morpheus/blob/main/docs/source/developer_guide/contributing.md).
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

Authors:
  - Eli Fajardo (https://github.com/efajardo-nv)

Approvers:
  - Michael Demoret (https://github.com/mdemoret-nv)

URL: #1215