diff --git a/.gitattributes b/.gitattributes new file mode 100644 index 0000000..2322902 --- /dev/null +++ b/.gitattributes @@ -0,0 +1,26 @@ +# Images +*.gif filter=lfs diff=lfs merge=lfs -text +*.jpg filter=lfs diff=lfs merge=lfs -text +*.png filter=lfs diff=lfs merge=lfs -text +*.psd filter=lfs diff=lfs merge=lfs -text +# Archives +*.gz filter=lfs diff=lfs merge=lfs -text +*.tar filter=lfs diff=lfs merge=lfs -text +*.zip filter=lfs diff=lfs merge=lfs -text +# Documents +*.pdf filter=lfs diff=lfs merge=lfs -text +# Numpy data +*.npy filter=lfs diff=lfs merge=lfs -text +# Debian package +*.deb filter=lfs diff=lfs merge=lfs -text +# Shared libraries +*.so filter=lfs diff=lfs merge=lfs -text +*.so.* filter=lfs diff=lfs merge=lfs -text +# PCD files +*.pcd filter=lfs diff=lfs merge=lfs -text +# Model files +*.onnx filter=lfs diff=lfs merge=lfs -text +.trt filter=lfs diff=lfs merge=lfs -text +*.trt filter=lfs diff=lfs merge=lfs -text +*.plan filter=lfs diff=lfs merge=lfs -text +*.etlt filter=lfs diff=lfs merge=lfs -text diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..f0c4405 --- /dev/null +++ b/.gitignore @@ -0,0 +1,6 @@ +# Ignore all pycache files +**/__pycache__/** +build/ +install/ +install_aarch64/ +log/ \ No newline at end of file diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..88bc900 --- /dev/null +++ b/LICENSE @@ -0,0 +1,66 @@ +NVIDIA ISAAC ROS SOFTWARE LICENSE + +This license is a legal agreement between you and NVIDIA Corporation ("NVIDIA") and governs the use of the NVIDIA Isaac ROS software and materials provided hereunder (“SOFTWARE”). + +This license can be accepted only by an adult of legal age of majority in the country in which the SOFTWARE is used. + +If you are entering into this license on behalf of a company or other legal entity, you represent that you have the legal authority to bind the entity to this license, in which case “you” will mean the entity you represent. + +If you don’t have the required age or authority to accept this license, or if you don’t accept all the terms and conditions of this license, do not download, install or use the SOFTWARE. + +You agree to use the SOFTWARE only for purposes that are permitted by (a) this license, and (b) any applicable law, regulation or generally accepted practices or guidelines in the relevant jurisdictions. + +1. LICENSE. Subject to the terms of this license, NVIDIA hereby grants you a non-exclusive, non-transferable license, without the right to sublicense (except as expressly provided in this license) to: +a. Install and use the SOFTWARE, +b. Modify and create derivative works of sample or reference source code delivered in the SOFTWARE, and +c. Distribute any part of the SOFTWARE (i) as incorporated into a software application that has material additional functionality beyond the included portions of the SOFTWARE, or (ii) unmodified in binary format, in each case subject to the distribution requirements indicated in this license. + +2. DISTRIBUTION REQUIREMENTS. These are the distribution requirements for you to exercise the distribution grant above: + a. The following notice shall be included in modifications and derivative works of source code distributed: “This software contains source code provided by NVIDIA Corporation.” + b. 
You agree to distribute the SOFTWARE subject to the terms at least as protective as the terms of this license, including (without limitation) terms relating to the license grant, license restrictions and protection of NVIDIA’s intellectual property rights. Additionally, you agree that you will protect the privacy, security and legal rights of your application users. + c. You agree to notify NVIDIA in writing of any known or suspected distribution or use of the SOFTWARE not in compliance with the requirements of this license, and to enforce the terms of your agreements with respect to the distributed portions of the SOFTWARE. +3. AUTHORIZED USERS. You may allow employees and contractors of your entity or of your subsidiary(ies) to access and use the SOFTWARE from your secure network to perform work on your behalf. If you are an academic institution you may allow users enrolled or employed by the academic institution to access and use the SOFTWARE from your secure network. You are responsible for the compliance with the terms of this license by your authorized users. + +4. LIMITATIONS. Your license to use the SOFTWARE is restricted as follows: + a. The SOFTWARE is licensed for you to develop applications only for their use in systems with NVIDIA GPUs. + b. You may not reverse engineer, decompile or disassemble, or remove copyright or other proprietary notices from any portion of the SOFTWARE or copies of the SOFTWARE. + c. Except as expressly stated above in this license, you may not sell, rent, sublicense, transfer, distribute, modify, or create derivative works of any portion of the SOFTWARE. + d. Unless you have an agreement with NVIDIA for this purpose, you may not indicate that an application created with the SOFTWARE is sponsored or endorsed by NVIDIA. + e. You may not bypass, disable, or circumvent any technical limitation, encryption, security, digital rights management or authentication mechanism in the SOFTWARE. + f. You may not use the SOFTWARE in any manner that would cause it to become subject to an open source software license. As examples, licenses that require as a condition of use, modification, and/or distribution that the SOFTWARE be: (i) disclosed or distributed in source code form; (ii) licensed for the purpose of making derivative works; or (iii) redistributable at no charge. + g. You acknowledge that the SOFTWARE as delivered is not tested or certified by NVIDIA for use in connection with the design, construction, maintenance, and/or operation of any system where the use or failure of such system could result in a situation that threatens the safety of human life or results in catastrophic damages (each, a "Critical Application"). Examples of Critical Applications include use in avionics, navigation, autonomous vehicle applications, ai solutions for automotive products, military, medical, life support or other life critical applications. NVIDIA shall not be liable to you or any third party, in whole or in part, for any claims or damages arising from such uses. You are solely responsible for ensuring that any product or service developed with the SOFTWARE as a whole includes sufficient features to comply with all applicable legal and regulatory standards and requirements. + h. 
You agree to defend, indemnify and hold harmless NVIDIA and its affiliates, and their respective employees, contractors, agents, officers and directors, from and against any and all claims, damages, obligations, losses, liabilities, costs or debt, fines, restitutions and expenses (including but not limited to attorney’s fees and costs incident to establishing the right of indemnification) arising out of or related to your use of goods and/or services that include or utilize the SOFTWARE, or for use of the SOFTWARE outside of the scope of this license or not in compliance with its terms. + +5. UPDATES. NVIDIA may, at its option, make available patches, workarounds or other updates to this SOFTWARE. Unless the updates are provided with their separate governing terms, they are deemed part of the SOFTWARE licensed to you as provided in this license. + +6. PRE-RELEASE VERSIONS. SOFTWARE versions identified as alpha, beta, preview, early access or otherwise as pre-release may not be fully functional, may contain errors or design flaws, and may have reduced or different security, privacy, availability, and reliability standards relative to commercial versions of NVIDIA software and materials. You may use a pre-release SOFTWARE version at your own risk, understanding that these versions are not intended for use in production or business-critical systems. + +7. COMPONENTS UNDER OTHER LICENSES. The SOFTWARE may include NVIDIA or third-party components with separate legal notices or terms as may be described in proprietary notices accompanying the SOFTWARE, such as components governed by open source software licenses. If and to the extent there is a conflict between the terms in this license and the license terms associated with a component, the license terms associated with the component controls only to the extent necessary to resolve the conflict. + +8. OWNERSHIP. + +8.1 NVIDIA reserves all rights, title and interest in and to the SOFTWARE not expressly granted to you under this license. NVIDIA and its suppliers hold all rights, title and interest in and to the SOFTWARE, including their respective intellectual property rights. The SOFTWARE is copyrighted and protected by the laws of the United States and other countries, and international treaty provisions. + +8.2 Subject to the rights of NVIDIA and its suppliers in the SOFTWARE, you hold all rights, title and interest in and to your applications and your derivative works of the sample or reference source code delivered in the SOFTWARE including their respective intellectual property rights. With respect to source code samples or reference source code licensed to you, NVIDIA and its affiliates are free to continue independently developing source code samples and you covenant not to sue NVIDIA, its affiliates or their licensees with respect to later versions of NVIDIA released source code. + +9. FEEDBACK. You may, but are not obligated to, provide to NVIDIA Feedback. “Feedback” means suggestions, fixes, modifications, feature requests or other feedback regarding the SOFTWARE. Feedback, even if designated as confidential by you, shall not create any confidentiality obligation for NVIDIA. 
NVIDIA and its designees have a perpetual, non-exclusive, worldwide, irrevocable license to use, reproduce, publicly display, modify, create derivative works of, license, sublicense, and otherwise distribute and exploit Feedback as NVIDIA sees fit without payment and without obligation or restriction of any kind on account of intellectual property rights or otherwise. + +10. NO WARRANTIES. THE SOFTWARE IS PROVIDED AS-IS. TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW NVIDIA AND ITS AFFILIATES EXPRESSLY DISCLAIM ALL WARRANTIES OF ANY KIND OR NATURE, WHETHER EXPRESS, IMPLIED OR STATUTORY, INCLUDING, BUT NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY, NON-INFRINGEMENT, OR FITNESS FOR A PARTICULAR PURPOSE. NVIDIA DOES NOT WARRANT THAT THE SOFTWARE WILL MEET YOUR REQUIREMENTS OR THAT THE OPERATION THEREOF WILL BE UNINTERRUPTED OR ERROR-FREE, OR THAT ALL ERRORS WILL BE CORRECTED. + +11. LIMITATIONS OF LIABILITY. TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW NVIDIA AND ITS AFFILIATES SHALL NOT BE LIABLE FOR ANY SPECIAL, INCIDENTAL, PUNITIVE OR CONSEQUENTIAL DAMAGES, OR FOR ANY LOST PROFITS, PROJECT DELAYS, LOSS OF USE, LOSS OF DATA OR LOSS OF GOODWILL, OR THE COSTS OF PROCURING SUBSTITUTE PRODUCTS, ARISING OUT OF OR IN CONNECTION WITH THIS LICENSE OR THE USE OR PERFORMANCE OF THE SOFTWARE, WHETHER SUCH LIABILITY ARISES FROM ANY CLAIM BASED UPON BREACH OF CONTRACT, BREACH OF WARRANTY, TORT (INCLUDING NEGLIGENCE), PRODUCT LIABILITY OR ANY OTHER CAUSE OF ACTION OR THEORY OF LIABILITY, EVEN IF NVIDIA HAS PREVIOUSLY BEEN ADVISED OF, OR COULD REASONABLY HAVE FORESEEN, THE POSSIBILITY OF SUCH DAMAGES. IN NO EVENT WILL NVIDIA’S AND ITS AFFILIATES TOTAL CUMULATIVE LIABILITY UNDER OR ARISING OUT OF THIS LICENSE EXCEED US$10.00. THE NATURE OF THE LIABILITY OR THE NUMBER OF CLAIMS OR SUITS SHALL NOT ENLARGE OR EXTEND THIS LIMIT. + +12. TERMINATION. Your rights under this license will terminate automatically without notice from NVIDIA if you fail to comply with any term and condition of this license or if you commence or participate in any legal proceeding against NVIDIA with respect to the SOFTWARE. NVIDIA may terminate this license with advance written notice to you, if NVIDIA decides to no longer provide the SOFTWARE in a country or, in NVIDIA’s sole discretion, the continued use of it is no longer commercially viable. Upon any termination of this license, you agree to promptly discontinue use of the SOFTWARE and destroy all copies in your possession or control. Your prior distributions in accordance with this license are not affected by the termination of this license. All provisions of this license will survive termination, except for the license granted to you. + +13. APPLICABLE LAW. This license will be governed in all respects by the laws of the United States and of the State of Delaware, without regard to the conflicts of laws principles. The United Nations Convention on Contracts for the International Sale of Goods is specifically disclaimed. You agree to all terms of this license in the English language. The state or federal courts residing in Santa Clara County, California shall have exclusive jurisdiction over any dispute or claim arising out of this license. Notwithstanding this, you agree that NVIDIA shall still be allowed to apply for injunctive remedies or urgent legal relief in any jurisdiction. + +14. NO ASSIGNMENT. This license and your rights and obligations thereunder may not be assigned by you by any means or operation of law without NVIDIA’s permission. 
Any attempted assignment not approved by NVIDIA in writing shall be void and of no effect. NVIDIA may assign, delegate or transfer this license and its rights and obligations, and if to a non-affiliate you will be notified. + +15. EXPORT. The SOFTWARE is subject to United States export laws and regulations. You agree to comply with all applicable U.S. and international export laws, including the Export Administration Regulations (EAR) administered by the U.S. Department of Commerce and economic sanctions administered by the U.S. Department of Treasury’s Office of Foreign Assets Control (OFAC). These laws include restrictions on destinations, end-users and end-use. By accepting this license, you confirm that you are not currently residing in a country or region currently embargoed by the U.S. and that you are not otherwise prohibited from receiving the SOFTWARE. + +16. GOVERNMENT USE. The SOFTWARE is, and shall be treated as being, “Commercial Items” as that term is defined at 48 CFR § 2.101, consisting of “commercial computer software” and “commercial computer software documentation”, respectively, as such terms are used in, respectively, 48 CFR § 12.212 and 48 CFR §§ 227.7202 & 252.227-7014(a)(1). Use, duplication or disclosure by the U.S. Government or a U.S. Government subcontractor is subject to the restrictions in this license pursuant to 48 CFR § 12.212 or 48 CFR § 227.7202. In no event shall the US Government user acquire rights in the SOFTWARE beyond those specified in 48 C.F.R. 52.227-19(b)(1)-(2). + +17. NOTICES. Please direct your legal notices or other correspondence to NVIDIA Corporation, 2788 San Tomas Expressway, Santa Clara, California 95051, United States of America, Attention: Legal Department. + +18. ENTIRE AGREEMENT. This license is the final, complete and exclusive agreement between the parties relating to the subject matter of this license and supersedes all prior or contemporaneous understandings and agreements relating to this subject matter, whether oral or written. If any court of competent jurisdiction determines that any provision of this license is illegal, invalid or unenforceable, the remaining provisions will remain in full force and effect. Any amendment or waiver under this license shall be in writing and signed by representatives of both parties. + +(v. November 17, 2021) diff --git a/README.md b/README.md index 8bdeb28..2acf9f4 100644 --- a/README.md +++ b/README.md @@ -1,2 +1,438 @@ -# isaac_ros_object_detection -Deep learning model support for object detection. +# Isaac ROS Object Detection + +![DetectNet output image showing 2 tennis balls correctly identified](resources/header-image.png "Tennis balls detected in image using DetectNet") + +## Overview +This repository provides a GPU-accelerated package for object detection based on [DetectNet](https://developer.nvidia.com/blog/detectnet-deep-neural-network-object-detection-digits/). Using a trained deep-learning model and a monocular camera, the `isaac_ros_detectnet` package can detect objects of interest in an image and provide bounding boxes. [DetectNet](https://catalog.ngc.nvidia.com/orgs/nvidia/models/tlt_pretrained_detectnet_v2/version) is similar to other popular object detection models such as YOLOv3, Faster R-CNN, SSD, and others, while working efficiently with multiple object classes in large images.
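The decoded detections are published as standard `vision_msgs/Detection2DArray` messages (see the package reference later in this document), so downstream code can consume them like any other ROS2 topic. The snippet below is only a rough sketch of such a consumer: the `detectnet/detections` topic name comes from the decoder described below, the node name is made up for illustration, and the `BoundingBox2D` field layout shown is the ROS2 Foxy one.

```python
import rclpy
from rclpy.node import Node
from vision_msgs.msg import Detection2DArray


class DetectionListener(Node):
    """Minimal example consumer for the DetectNet decoder output."""

    def __init__(self):
        super().__init__('detection_listener')
        # Subscribe to the decoder output (topic name as documented in the package reference)
        self._sub = self.create_subscription(
            Detection2DArray, 'detectnet/detections', self.on_detections, 10)

    def on_detections(self, msg):
        # Each detection carries a 2D bounding box (center and size, in pixels)
        # plus one or more class hypotheses with scores.
        for detection in msg.detections:
            bbox = detection.bbox
            self.get_logger().info(
                f'box center=({bbox.center.x:.1f}, {bbox.center.y:.1f}) '
                f'size={bbox.size_x:.1f}x{bbox.size_y:.1f}')


def main():
    rclpy.init()
    rclpy.spin(DetectionListener())


if __name__ == '__main__':
    main()
```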
+ +Packages in this repository rely on accelerated DNN model inference using [Triton](https://github.com/triton-inference-server/server) from [Isaac ROS DNN Inference](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_dnn_inference) and a pretrained model from [NVIDIA GPU Cloud (NGC)](https://docs.nvidia.com/ngc/) or a [custom re-trained DetectNet model](https://docs.nvidia.com/isaac/isaac/doc/tutorials/training_in_docker.html). Please note that **there is no support for the [Isaac ROS TensorRT](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_dnn_inference/tree/main/isaac_ros_tensor_rt) package at this time**. + +For solutions to known issues, please visit the [Troubleshooting](#troubleshooting) section below. + +## System Requirements +This Isaac ROS package is designed and tested to be compatible with ROS2 Foxy on Jetson hardware, in addition to x86 systems with an NVIDIA GPU. On x86 systems, packages are only supported when run within the provided Isaac ROS Dev Docker container. + +### Jetson +- [Jetson AGX Xavier or Xavier NX](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/) +- [JetPack 4.6.1](https://developer.nvidia.com/embedded/jetpack) + +### x86_64 (in Isaac ROS Dev Docker Container) +- Ubuntu 20.04+ +- CUDA 11.4+ supported discrete GPU +- VPI 1.1.11 + + +**Note**: For best performance on Jetson, ensure that the [power settings](https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/power_management_jetson_xavier.html#wwpID0EUHA) are configured appropriately. + +### Docker +You need to use the Isaac ROS development Docker image from [Isaac ROS Common](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_common), based on the version 21.08 image from [Deep Learning Frameworks Containers](https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html). + +You must first install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) to use the Docker container development/runtime environment. + +Configure `nvidia-container-runtime` as the default runtime for Docker by editing `/etc/docker/daemon.json` to include the following: +``` + "runtimes": { + "nvidia": { + "path": "nvidia-container-runtime", + "runtimeArgs": [] + } + }, + "default-runtime": "nvidia" +``` +Then restart Docker: `sudo systemctl daemon-reload && sudo systemctl restart docker` + +Run the following script in `isaac_ros_common` to build the image and launch the container on x86_64 or Jetson: + +`$ scripts/run_dev.sh` + +### Dependencies +- [isaac_ros_common](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_common/tree/main/isaac_ros_common) +- [isaac_ros_nvengine](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_common/tree/main/isaac_ros_nvengine) +- [isaac_ros_nvengine_interfaces](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_common/tree/main/isaac_ros_nvengine_interfaces) +- [isaac_ros_triton](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_dnn_inference/tree/main/isaac_ros_triton) + +## Setup +1. Create a ROS2 workspace if one is not already prepared: + ``` + mkdir -p your_ws/src + ``` + **Note**: The workspace can have any name; this guide assumes you name it `your_ws`. + +2. Clone the Isaac ROS Object Detection, Isaac ROS DNN Inference, and Isaac ROS Common package repositories to `your_ws/src`.
Check that you have [Git LFS](https://git-lfs.github.com/) installed before cloning to pull down all large files: + ``` + sudo apt-get install git-lfs + + cd your_ws/src + git clone https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_object_detection + git clone https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_dnn_inference + git clone https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_common + ``` + +3. Start the Docker interactive workspace: + ``` + isaac_ros_common/scripts/run_dev.sh your_ws + ``` + After this command, you will be inside the container at `/workspaces/isaac_ros-dev`. Running this command in different terminals will attach them to the same container. + + **Note**: The rest of this README assumes that you are inside this container. + +## Obtaining a Pre-Trained DetectNet Model + +The easiest way to obtain a DetectNet model is to download a pre-trained one from NVIDIA's [NGC repository](https://ngc.nvidia.com). This package only supports models based on the `Detectnet_v2` architecture. + +[The catalog of pre-trained models can be seen in the NGC documentation](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/containers/tao-toolkit-tf). Follow the instructions in the documentation for the latest version of the models and datasets you are interested in using. You can find `.etlt` files in the _File browser_ tab for each model's page, along with the _key_ to use when generating a machine-specific `.plan` file in the following steps. + +Some of the [supported DetectNet models](https://catalog.ngc.nvidia.com/?filters=&orderBy=scoreDESC&query=DetectNet) from NGC: + +| Model Name | Use Case | +| -------------------------------------------------------------------------------- | ------------------------------------------------------- | +| [TrafficCamNet](https://ngc.nvidia.com/catalog/models/nvidia:tao:trafficcamnet) | Detect and track cars | +| [PeopleNet](https://ngc.nvidia.com/catalog/models/nvidia:tao:peoplenet) | People counting, heatmap generation, social distancing | +| [DashCamNet](https://ngc.nvidia.com/catalog/models/nvidia:tao:dashcamnet) | Identify objects from a moving object | +| [FaceDetectIR](https://ngc.nvidia.com/catalog/models/nvidia:tao:facedetectir) | Detect faces in a dark environment with IR camera | + + + +## Training a model using simulation + +There are multiple ways to train your own `Detectnet_v2`-based model. Note that you will need to update parameters, launch files, and more to match your specific trained model. + +### Use the TAO toolkit launcher +The Train, Adapt and Optimize (TAO) Toolkit from NVIDIA has all the tools you need to prepare a dataset and re-train a detector, with an easy-to-follow Jupyter notebook tutorial. + +1. Install the `tao` command line utilities: + ```bash + pip3 install jupyterlab + pip3 install nvidia-pyindex + pip3 install nvidia-tao + ``` +2. Obtain an [NGC API key](https://ngc.nvidia.com/setup/api-key). +3. Install and configure `ngc cli` from [NVIDIA NGC CLI Setup](https://ngc.nvidia.com/setup/installers/cli). + ```bash + wget -O ngccli_linux.zip https://ngc.nvidia.com/downloads/ngccli_linux.zip && unzip -o ngccli_linux.zip && chmod u+x ngc + md5sum -c ngc.md5 + echo "export PATH=\"\$PATH:$(pwd)\"" >> ~/.bash_profile && source ~/.bash_profile + ngc config set + ``` +4. Download the TAO cv examples to a local folder: + ```bash + ngc registry resource download-version "nvidia/tao/cv_samples:v1.3.0" + ``` +5. Run the `DetectNet_v2` Jupyter notebook server.
+ ```bash + cd cv_samples_vv1.3.0 + jupyter-notebook --ip 0.0.0.0 --port 8888 --allow-root + ``` +6. Navigate to the DetectNet v2 notebook in `detectnet_v2/detectnet_v2.ipynb` or go to + ``` + http://0.0.0.0:8888/notebooks/detectnet_v2/detectnet_v2.ipynb + ``` + and follow the instructions in the tutorial. + +### Training object detection in simulation + +If you wish to generate training data from simulation using 3D models of the object classes you would like to detect, consider following the tutorial [Training Object Detection from Simulation](https://docs.nvidia.com/isaac/isaac/doc/tutorials/training_in_docker.html). + +The tutorial uses simulation to create a dataset that can then be used to train a `DetectNet_v2`-based model. It's an easy-to-use tool with full access to customize training parameters in a Jupyter notebook. + +Once you have worked through the tutorial, you should have an `ETLT` file in `~/isaac-experiments/tlt-experiments/experiment_dir_final/resnet18_detector.etlt`. + +Consult the spec file in `~/isaac-experiments/specs/isaac_detect_net_inference.json` for the values to use in the following section when preparing the model for usage with this package. + +### Using the included dummy model for testing + +In this package, you will find a pre-trained DetectNet model that was trained solely for detecting tennis balls using the described simulation method. Please use this model only for verification or exploring the pipeline. + +**Note**: Do not use this tennis ball detection model in a production environment. + +You can find the `ETLT` file in `isaac_ros_detectnet/test/dummy_model/detectnet/1/resnet18_detector.etlt` and use the ETLT key `"object-detection-from-sim-pipeline"`, including the double quotes. + +```bash +export PRETRAINED_MODEL_ETLT_KEY=\"object-detection-from-sim-pipeline\" +``` + +## Model Preparation + +In order to use a pre-trained DetectNet model, it needs to be processed for Triton. The following assumes that you have a pre-trained model in the current directory named `resnet18_detector.etlt`, that your Triton model repository is located at `/tmp/models`, and that you want to name your model `detectnet` with version `1`. + +You should obtain an `.etlt` file and the key used for training using the methods described above, along with any parameters you need for input configuration. + +The input image size is `368x640`. **This is not a standard camera resolution; it is used because both dimensions must be divisible by 16 for DetectNet to process the image.** + +Please refer to your DetectNet training or to NGC for the ETLT key and other parameters. The key will be referred to as `$PRETRAINED_MODEL_ETLT_KEY`, so please ensure you set that variable or replace it in the commands below. + +For information on the options given to the tao-converter tool, please refer to the command line help with +```bash + /opt/nvidia/tao/tao-converter -h +``` + +To prepare your model, please run the following commands: + +```bash +# Create folder for our model with version number +mkdir -p /tmp/models/detectnet/1 + +# Create a plan file for Triton +/opt/nvidia/tao/tao-converter \ + -k $PRETRAINED_MODEL_ETLT_KEY \ + -d 3,368,640 \ + -p input_1,1x3x368x640,1x3x368x640,1x3x368x640 \ + -t fp16 \ + -e detectnet.engine \ + -o output_cov/Sigmoid,output_bbox/BiasAdd \ + resnet18_detector.etlt +# Deploy converted model to Triton +cp detectnet.engine /tmp/models/detectnet/1/model.plan +# Open an editor with the model configuration file for Triton. +# Copy the following section content into this file.
+nano /tmp/models/detectnet/config.pbtxt +``` + +These commands will open a new file. The content of the file `/tmp/models/detectnet/config.pbtxt` must be as follows: + +``` +name: "detectnet" +platform: "tensorrt_plan" +max_batch_size: 16 +input [ + { + name: "input_1" + data_type: TYPE_FP32 + format: FORMAT_NCHW + dims: [ 3, 368, 640 ] + } +] +output [ + { + name: "output_bbox/BiasAdd" + data_type: TYPE_FP32 + dims: [ 8, 23, 40 ] + }, + { + name: "output_cov/Sigmoid" + data_type: TYPE_FP32 + dims: [ 2, 23, 40 ] + } +] +dynamic_batching { } +version_policy: { + specific { + versions: [ 1 ] + } +} +``` + +Please note that some of the values here will change depending on the way you trained your model. + +1. `input[0].dims` is a vector with the following values. Please note that the input image width and height should be multiples of 16, since DetectNet slices the image into a grid of 16x16 squares. If this is not the case, the output bounding boxes will be misaligned and will need to be scaled. + * The first position should be `3` for the 3 RGB channels. + * The second position is the height of the input image in pixels. + * The third position is the width of the input image in pixels. +2. `output[0].dims` is the bounding box output tensor size. + * The first position is `4*C`, where `C` is the number of classes the network was trained for. + * The second position is the number of grid rows, meaning `input image height / 16`. + * The third position is the number of grid columns, meaning `input image width / 16`. +3. `output[1].dims` is the coverage value tensor size. + * The first position is `C`, the number of classes the network was trained for. + * The second position is the number of grid rows, meaning `input image height / 16`. + * The third position is the number of grid columns, meaning `input image width / 16`. + +For example, the configuration above describes a two-class model with a `368x640` input: `output[0].dims` is `[ 8, 23, 40 ]` because `4*2 = 8`, `368/16 = 23`, and `640/16 = 40`, and `output[1].dims` is therefore `[ 2, 23, 40 ]`. + +Once you have the models folder configured, you can point the DNN inference node to load models using the `model_repository_paths` parameter, as explained in the following section. + +See the [DetectNet documentation](https://docs.nvidia.com/metropolis/TLT/tlt-user-guide/text/object_detection/detectnet_v2.html#bbox-ground-truth-generator) for more information. + +## ROS2 Graph Configuration + +To run the DetectNet object detection inference, the following ROS2 nodes should be set up and running: + +![DetectNet output image showing 2 tennis balls correctly identified](resources/ros2_detectnet_node_setup.svg "Tennis balls detected in image using DetectNet") + +1. **Isaac ROS DNN Image Encoder**: This will take an image message and convert it to a tensor ([`TensorList`](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_common/blob/main/isaac_ros_nvengine_interfaces/msg/TensorList.msg)) that can be + processed by the network. +2. **Isaac ROS DNN Inference - Triton**: This will execute the DetectNet network and take as input the tensor from the DNN Image Encoder. **Note: The [Isaac ROS TensorRT](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_dnn_inference/tree/main/isaac_ros_tensor_rt) package is not able to perform inference with DetectNet models at this time.** + The output will be a TensorList message containing the encoded detections. + * Use the parameters `model_name` and `model_repository_paths` to point to the model folder and set the model name. The `.plan` file should be located at `$model_repository_path/$model_name/1/model.plan`. +3. **Isaac ROS Detectnet Decoder** + This node will take the TensorList with encoded detections as input, and output Detection2DArray messages + for each frame.
See the following section for the parameters. + +## Package reference +### `isaac_ros_detectnet` + +#### Overview +The `isaac_ros_detectnet` package offers a decoder to interpret the inference results of a DetectNet_v2 model from the [`Triton Inference Server node`](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_dnn_inference/tree/main/isaac_ros_triton). + +#### Package Dependencies +- [isaac_ros_dnn_encoders](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_dnn_inference/tree/main/isaac_ros_dnn_encoders) +- [isaac_ros_nvengine_interfaces](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_common/tree/main/isaac_ros_nvengine_interfaces) +- [isaac_ros_triton](https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_dnn_inference/tree/main/isaac_ros_triton) + +#### Available Components +| Component | Topics Subscribed | Topics Published | Parameters | +| ---------------------- | ----------------- | ---------------- | ---------- | +| `DetectNetDecoderNode` | `tensor_sub`: The tensor that represents the inferred aligned bounding boxes | `detectnet/detections`: Aligned image bounding boxes with detection class ([vision_msgs/Detection2DArray](http://docs.ros.org/en/lunar/api/vision_msgs/html/msg/Detection2DArray.html)) | `label_names`: A list of strings with the names of the classes in the order they are used in the model. Keep the order of the list consistent with the training configuration.
`coverage_threshold`: The minimum coverage value for a box to be considered a detection. Bounding boxes with a coverage value lower than this threshold will not be included in the output.
`bounding_box_scale`: The bounding box normalization scale, which should match the training configuration. `bounding_box_offset`: The bounding box offset, which should match the training configuration.
`eps`: The epsilon value used by the clustering algorithm when grouping raw detections into boxes. Defaults to 0.01.
`min_boxes`: The minimum number of boxes to return. Defaults to 1.
`enable_athr_filter`: Enables the area-to-hit ratio (ATHR) filter. The ATHR is calculated as: __ATHR = sqrt(clusterArea) / nObjectsInCluster__. Defaults to 0.
`threshold_athr`: The area-to-hit ratio (ATHR) threshold. Defaults to 0.
`clustering_algorithm`: The clustering algorithm selection. Defaults to 1. (`1`: Enables DBScan clustering, `2`: Enables Hybrid clustering, resulting in more boxes that will need to be processed with NMS or other means of reducing overlapping detections.) | + +To see more information about how these values are used, see the [DetectNet documentation](https://docs.nvidia.com/metropolis/TLT/tlt-user-guide/text/object_detection/detectnet_v2.html#bbox-ground-truth-generator). + +### Example Code +You can use the following example Python code to set up a `detectnet_container` node in your application. Once you have this code, you can set up your message publisher to send an `Image` message to the DNN encoder and subscribe to the `detectnet/detections` publisher from the DetectNet decoder. + +This code will not run by itself; it should be included in your application and properly started or shut down by ROS2. Use this code as a reference only. + +```python + +import os + +import launch +from launch_ros.actions.composable_node_container import ComposableNodeContainer +from launch_ros.descriptions.composable_node import ComposableNode + +MODELS_PATH = '/tmp/models' + +def generate_launch_description(): + """Generate launch description for running DetectNet inference.""" + launch_dir_path = os.path.dirname(os.path.realpath(__file__)) + config = launch_dir_path + '/../config/params.yaml' + + encoder_node = ComposableNode( + name='dnn_image_encoder', + package='isaac_ros_dnn_encoders', + plugin='isaac_ros::dnn_inference::DnnImageEncoderNode', + parameters=[{ + 'network_image_width': 640, + 'network_image_height': 368, + 'network_image_encoding': 'rgb8', + 'network_normalization_type': 'positive_negative', + 'tensor_name': 'input_tensor' + }], + remappings=[('encoded_tensor', 'tensor_pub')] + ) + + triton_node = ComposableNode( + name='triton_node', + package='isaac_ros_triton', + plugin='isaac_ros::dnn_inference::TritonNode', + parameters=[{ + 'model_name': 'detectnet', + 'model_repository_paths': [MODELS_PATH], + 'input_tensor_names': ['input_tensor'], + 'input_binding_names': ['input_1'], + 'output_tensor_names': ['output_cov', 'output_bbox'], + 'output_binding_names': ['output_cov/Sigmoid', 'output_bbox/BiasAdd'] + }]) + + detectnet_decoder_node = ComposableNode( + name='detectnet_decoder_node', + package='isaac_ros_detectnet', + plugin='isaac_ros::detectnet::DetectNetDecoderNode', + parameters=[config] + ) + + container = ComposableNodeContainer( + name='detectnet_container', + namespace='detectnet_container', + package='rclcpp_components', + executable='component_container', + composable_node_descriptions=[encoder_node, triton_node, detectnet_decoder_node], + output='screen' + ) + + return launch.LaunchDescription([container]) +``` +We have provided a launch file for your convenience. This launch file will load models from `/tmp/models`, as in the instructions above. + +## Running the launch files + +Included in this repository is a small script that will load an image and generate a visualization using the output from DetectNet. + +You can find the script in `isaac_ros_detectnet/scripts/isaac_ros_detectnet_visualizer.py`. + +1. Make a models repository. + ```bash + mkdir -p /tmp/models/detectnet/1 + ``` +2. Use the included ETLT file in `isaac_ros_detectnet/test/dummy_model/detectnet/1/resnet18_detector.etlt`. + ```bash + cp src/isaac_ros_object_detection/isaac_ros_detectnet/test/dummy_model/detectnet/1/resnet18_detector.etlt \ + /tmp/models/detectnet/1 + ``` +3. Convert the ETLT file to `model.plan`.
+ ```bash + cd /tmp/models/detectnet/1 + # This is the key for the provided pretrained model + # replace with your own key when using a model trained by any other means + export PRETRAINED_MODEL_ETLT_KEY=\"object-detection-from-sim-pipeline\" + /opt/nvidia/tao/tao-converter \ + -k $PRETRAINED_MODEL_ETLT_KEY \ + -d 3,368,640 \ + -p input_1,1x3x368x640,1x3x368x640,1x3x368x640 \ + -t fp16 \ + -e model.plan \ + -o output_cov/Sigmoid,output_bbox/BiasAdd \ + resnet18_detector.etlt + ``` +4. Edit the `/tmp/models/detectnet/config.pbtxt` file and copy the contents from the [Model Preparation](#model-preparation) section. +5. Make sure that the model repository is configured correctly. You should have the following files: + ```bash + ls -R /tmp/models/ + /tmp/models/: + detectnet + + /tmp/models/detectnet: + 1 config.pbtxt + + /tmp/models/detectnet/1: + model.plan resnet18_detector.etlt + ``` +6. Build and install the package from the root of your workspace. + ```bash + cd your_ws + colcon build --symlink-install --packages-up-to isaac_ros_detectnet + ``` +7. Execute the launch file and visualizer demo node along with `rqt` to inspect the messages. Open 3 terminal windows where you will run ros commands. If you are using VSCode with the remote development plugin to connect to the development Docker container you can skip the `docker exec` command. + 1. On the first terminal run: + ```bash + cd your_ws + ./scripts/run_dev.sh + source install/setup.bash + ros2 launch isaac_ros_detectnet isaac_ros_detectnet.launch.py + ``` + 2. On the second terminal run: + ```bash + cd your_ws + ./scripts/run_dev.sh + source install/setup.bash + ros2 run isaac_ros_detectnet isaac_ros_detectnet_visualizer.py + ``` + 3. On the third terminal run: + ```bash + cd your_ws + ./scripts/run_dev.sh + source install/setup.bash + rqt + ``` +8. The `rqt` window may need to be configured to show the graph and the image. + Enable the `Plugins > Visualization > Image View` window from the main menu in `rqt` and set the image topic to `/detectnet_processed_image`. You should see something like this: + + ![DetectNet output image showing a tennis ball correctly identified](resources/rqt_visualizer.png "RQT showing detection boxes of an NVIDIA Mug and a tennis ball from simulation using DetectNet") + +9. To stop the demo, press `Ctrl-C` on each terminal. + +## Troubleshooting +### Nodes crashed on initial launch reporting shared libraries have a file format not recognized +Many dependent shared library binary files are stored in `git-lfs`. These files need to be fetched in order for Isaac ROS nodes to function correctly. + +#### Symptoms +``` +/usr/bin/ld:/workspaces/isaac_ros-dev/ros_ws/src/isaac_ros_common/isaac_ros_nvengine/gxf/lib/gxf_jetpack46/core/libgxf_core.so: file format not recognized; treating as linker script +/usr/bin/ld:/workspaces/isaac_ros-dev/ros_ws/src/isaac_ros_common/isaac_ros_nvengine/gxf/lib/gxf_jetpack46/core/libgxf_core.so:1: syntax error +collect2: error: ld returned 1 exit status +make[2]: *** [libgxe_node.so] Error 1 +make[1]: *** [CMakeFiles/gxe_node.dir/all] Error 2 +make: *** [all] Error 2 +``` +#### Solution +Run `git lfs pull` in each Isaac ROS repository you have checked out, especially `isaac_ros_common`, to ensure all of the large binary files have been downloaded. 
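As a quick sanity check (standard Git LFS and `file` commands; the library path below is just the one from the error output above, so adjust it for your checkout), you can confirm whether a given binary was actually downloaded or is still a small text pointer:

```bash
# '*' after the object ID means the content is present locally; '-' means only the LFS pointer is checked out
git lfs ls-files

# A correctly fetched shared library is reported as an ELF binary, not ASCII text
file isaac_ros_common/isaac_ros_nvengine/gxf/lib/gxf_jetpack46/core/libgxf_core.so
```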
+ +# Updates + +| Date | Changes | +| ---------- | --------------- | +| 2022-03-21 | Initial release | diff --git a/isaac_ros_detectnet/CMakeLists.txt b/isaac_ros_detectnet/CMakeLists.txt new file mode 100644 index 0000000..4056147 --- /dev/null +++ b/isaac_ros_detectnet/CMakeLists.txt @@ -0,0 +1,80 @@ +# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved. +# +# NVIDIA CORPORATION and its licensors retain all intellectual property +# and proprietary rights in and to this software, related documentation +# and any modifications thereto. Any use, reproduction, disclosure or +# distribution of this software and related documentation without an express +# license agreement from NVIDIA CORPORATION is strictly prohibited. + +cmake_minimum_required(VERSION 3.5) +project(isaac_ros_detectnet LANGUAGES C CXX) + +# Default to C++17 +if(NOT CMAKE_CXX_STANDARD) + set(CMAKE_CXX_STANDARD 17) +endif() + +if(CMAKE_COMPILER_IS_GNUCXX OR CMAKE_CXX_COMPILER_ID MATCHES "Clang") + add_compile_options(-Wall -Wextra -Wpedantic) +endif() + +find_package(ament_cmake_auto REQUIRED) +find_package(ament_cmake_python REQUIRED) + +ament_auto_find_build_dependencies() + +# NVDSINFER_DBSCAN + +execute_process(COMMAND uname -m COMMAND tr -d '\n' + OUTPUT_VARIABLE ARCHITECTURE +) +message( STATUS "Architecture: ${ARCHITECTURE}" ) + +add_library(nvdsinfer_dbscan SHARED IMPORTED) +if( ${ARCHITECTURE} STREQUAL "x86_64" ) + set_property(TARGET nvdsinfer_dbscan PROPERTY IMPORTED_LOCATION ${CMAKE_CURRENT_SOURCE_DIR}/dbscan/x86_64_cuda_11_4/libnvds_dbscan.so) +elseif( ${ARCHITECTURE} STREQUAL "aarch64" ) + set_property(TARGET nvdsinfer_dbscan PROPERTY IMPORTED_LOCATION ${CMAKE_CURRENT_SOURCE_DIR}/dbscan/jetpack46_1/libnvds_dbscan.so) +endif() + + +# Decoder node +ament_auto_add_library(detectnet_decoder_node SHARED src/detectnet_decoder_node.cpp) +target_compile_definitions(detectnet_decoder_node + PRIVATE "COMPOSITION_BUILDING_DLL" +) +target_link_libraries(detectnet_decoder_node nvdsinfer_dbscan) + +rclcpp_components_register_nodes(detectnet_decoder_node "isaac_ros::detectnet::DetectNetDecoderNode") +set(node_plugins "${node_plugins}isaac_ros::detectnet::DetectNetDecoderNode;$\n") + +install(TARGETS detectnet_decoder_node + ARCHIVE DESTINATION lib + LIBRARY DESTINATION lib + RUNTIME DESTINATION bin +) + +if(BUILD_TESTING) + find_package(ament_lint_auto REQUIRED) + + # Ignore copyright notices since we use custom NVIDIA Isaac ROS Software License + set(ament_cmake_copyright_FOUND TRUE) + + ament_lint_auto_find_test_dependencies() + + find_package(launch_testing_ament_cmake REQUIRED) + add_launch_test(test/isaac_ros_detectnet_pol_test.py TIMEOUT "300") +endif() + +# Visualizer python scripts + +ament_python_install_package(${PROJECT_NAME}) +# Install Python executables +install(PROGRAMS + scripts/isaac_ros_detectnet_visualizer.py + DESTINATION lib/${PROJECT_NAME} +) + +ament_auto_package(INSTALL_TO_SHARE launch) + +find_package(vision_msgs REQUIRED) diff --git a/isaac_ros_detectnet/config/params.yaml b/isaac_ros_detectnet/config/params.yaml new file mode 100644 index 0000000..0ba82fb --- /dev/null +++ b/isaac_ros_detectnet/config/params.yaml @@ -0,0 +1,12 @@ +model_repository_paths: + - '/tmp/models' +model_name: 'detectnet' +label_names: + - 'nvidia_mug' +coverage_threshold: 0.5 +bounding_box_scale: 35.0 +bounding_box_offset: 0.5 +eps: 0.5 +min_boxes: 5 +clustering_algorithm: 1 +verbose: False diff --git a/isaac_ros_detectnet/dbscan/jetpack46_1/libnvds_dbscan.so b/isaac_ros_detectnet/dbscan/jetpack46_1/libnvds_dbscan.so 
new file mode 100755 index 0000000..774a83c --- /dev/null +++ b/isaac_ros_detectnet/dbscan/jetpack46_1/libnvds_dbscan.so @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:4cc2e6fe6728afbad88f7ecb0c295519b31c06b3119c07c64b23d7fbe03c4fb4 +size 23656 diff --git a/isaac_ros_detectnet/dbscan/x86_64_cuda_11_4/libnvds_dbscan.so b/isaac_ros_detectnet/dbscan/x86_64_cuda_11_4/libnvds_dbscan.so new file mode 100755 index 0000000..c1e4dac --- /dev/null +++ b/isaac_ros_detectnet/dbscan/x86_64_cuda_11_4/libnvds_dbscan.so @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:90ff06d10be8f97f04d0cc45d45666d9f1f66d417842e258daaf8aa225ad384a +size 25872 diff --git a/isaac_ros_detectnet/examples/demo.png b/isaac_ros_detectnet/examples/demo.png new file mode 100644 index 0000000..294b7dc --- /dev/null +++ b/isaac_ros_detectnet/examples/demo.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:5810b529a4c64da5cec73c73172a54c2ed09a34290de57aa7c7934646283d776 +size 398496 diff --git a/isaac_ros_detectnet/include/isaac_ros_detectnet/detectnet_decoder_node.hpp b/isaac_ros_detectnet/include/isaac_ros_detectnet/detectnet_decoder_node.hpp new file mode 100644 index 0000000..e5f7fb1 --- /dev/null +++ b/isaac_ros_detectnet/include/isaac_ros_detectnet/detectnet_decoder_node.hpp @@ -0,0 +1,79 @@ +/** + * Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved. + * + * NVIDIA CORPORATION and its licensors retain all intellectual property + * and proprietary rights in and to this software, related documentation + * and any modifications thereto. Any use, reproduction, disclosure or + * distribution of this software and related documentation without an express + * license agreement from NVIDIA CORPORATION is strictly prohibited. + */ + +#ifndef ISAAC_ROS_DETECTNET__DETECTNET_DECODER_NODE_HPP_ +#define ISAAC_ROS_DETECTNET__DETECTNET_DECODER_NODE_HPP_ + +#include +#include +#include + +#include "isaac_ros_nvengine_interfaces/msg/tensor_list.hpp" +#include "vision_msgs/msg/detection2_d_array.hpp" +#include "rclcpp/rclcpp.hpp" + +namespace isaac_ros +{ +namespace detectnet +{ + +class DetectNetDecoderNode : public rclcpp::Node +{ +public: + explicit DetectNetDecoderNode(const rclcpp::NodeOptions options = rclcpp::NodeOptions()); + ~DetectNetDecoderNode(); + +private: +/** + * @brief Callback to decode a tensor list output by a DetectNet architecture + * and then publish a detection list + * + * @param tensor_list_msg The TensorList msg representing the detection list output by DetectNet + */ + void DetectNetDecoderCallback( + const isaac_ros_nvengine_interfaces::msg::TensorList::ConstSharedPtr tensor_list_msg); + + // Queue size of subscriber + int queue_size_; + + // Frame id that the message should be in + std::string header_frame_id_; + // A list of class labels in the order they are used in the model + std::vector label_names_; + // coverage threshold to discard detections. + // Detections with lower coverage than the threshold will be discarded + float coverage_threshold_; + // Bounding box normalization for both X and Y dimensions. This value is set in the DetectNetv2 + // training specification. + float bounding_box_scale_; + // Bounding box offset for both X and Y dimensions. This value is set in the DetectNetv2 + // training specification. + float bounding_box_offset_; + + // Parameters for DBscan. 
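  // Note: the DBSCAN-related members below correspond to the node parameters documented in
  // the package README (defaults as listed in that table):
  //   eps_                  - epsilon used when clustering raw detections (default 0.01)
  //   min_boxes_            - minimum number of boxes to return (default 1)
  //   enable_athr_filter_   - enables the area-to-hit-ratio (ATHR) filter, where
  //                           ATHR = sqrt(clusterArea) / nObjectsInCluster (default 0)
  //   threshold_athr_       - area-to-hit ratio threshold (default 0)
  //   clustering_algorithm_ - 1 = DBScan clustering, 2 = hybrid clustering (default 1)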
+ float eps_; + int min_boxes_; + int enable_athr_filter_; + float threshold_athr_; + int clustering_algorithm_; + + // Subscribes to a Tensor that will be converted to a detection list + rclcpp::Subscription::SharedPtr tensor_list_sub_; + + // Publishes the processed Tensor as an array of detections (Detection2DArray) + rclcpp::Publisher::SharedPtr detections_pub_; + struct DetectNetDecoderImpl; + std::unique_ptr impl_; // Pointer to implementation +}; + +} // namespace detectnet +} // namespace isaac_ros + +#endif // ISAAC_ROS_DETECTNET__DETECTNET_DECODER_NODE_HPP_ diff --git a/isaac_ros_detectnet/include/nvdsinfer/AMENT_IGNORE b/isaac_ros_detectnet/include/nvdsinfer/AMENT_IGNORE new file mode 100644 index 0000000..e69de29 diff --git a/isaac_ros_detectnet/include/nvdsinfer/nvdsinfer.h b/isaac_ros_detectnet/include/nvdsinfer/nvdsinfer.h new file mode 100644 index 0000000..e3833ea --- /dev/null +++ b/isaac_ros_detectnet/include/nvdsinfer/nvdsinfer.h @@ -0,0 +1,303 @@ +/* + * Copyright (c) 2017-2020, NVIDIA CORPORATION. All rights reserved. + * + * NVIDIA Corporation and its licensors retain all intellectual property + * and proprietary rights in and to this software, related documentation + * and any modifications thereto. Any use, reproduction, disclosure or + * distribution of this software and related documentation without an express + * license agreement from NVIDIA Corporation is strictly prohibited. + * + */ + +/** + * @file + * NVIDIA DeepStream inference specifications + * + * @b Description: This file defines common elements used in the API + * exposed by the Gst-nvinfer plugin. + */ + +/** + * @defgroup ee_nvinf Gst-infer API Common Elements + * + * Defines common elements used in the API exposed by the Gst-inference plugin. + * @ingroup NvDsInferApi + * @{ + */ + +#ifndef _NVDSINFER_H_ +#define _NVDSINFER_H_ + +#include + +#ifdef __cplusplus +extern "C" +{ +#endif + +#define NVDSINFER_MAX_DIMS 8 + +#define _DS_DEPRECATED_(STR) __attribute__ ((deprecated (STR))) + +/** + * Holds the dimensions of a layer. + */ +typedef struct +{ + /** Holds the number of dimesions in the layer.*/ + unsigned int numDims; + /** Holds the size of the layer in each dimension. */ + unsigned int d[NVDSINFER_MAX_DIMS]; + /** Holds the number of elements in the layer, including all dimensions.*/ + unsigned int numElements; +} NvDsInferDims; + +/** + * Holds the dimensions of a three-dimensional layer. + */ +typedef struct +{ + /** Holds the channel count of the layer.*/ + unsigned int c; + /** Holds the height of the layer.*/ + unsigned int h; + /** Holds the width of the layer.*/ + unsigned int w; +} NvDsInferDimsCHW; + +/** + * Specifies the data type of a layer. + */ +typedef enum +{ + /** Specifies FP32 format. */ + FLOAT = 0, + /** Specifies FP16 format. */ + HALF = 1, + /** Specifies INT8 format. */ + INT8 = 2, + /** Specifies INT32 format. */ + INT32 = 3 +} NvDsInferDataType; + +/** + * Holds information about one layer in the model. + */ +typedef struct +{ + /** Holds the data type of the layer. */ + NvDsInferDataType dataType; + /** Holds the dimensions of the layer. */ + union { + NvDsInferDims inferDims; + NvDsInferDims dims _DS_DEPRECATED_("dims is deprecated. Use inferDims instead"); + }; + /** Holds the TensorRT binding index of the layer. */ + int bindingIndex; + /** Holds the name of the layer. */ + const char* layerName; + /** Holds a pointer to the buffer for the layer data. */ + void *buffer; + /** Holds a Boolean; true if the layer is an input layer, + or false if an output layer. 
*/ + int isInput; +} NvDsInferLayerInfo; + +/** + * Holds information about the model network. + */ +typedef struct +{ + /** Holds the input width for the model. */ + unsigned int width; + /** Holds the input height for the model. */ + unsigned int height; + /** Holds the number of input channels for the model. */ + unsigned int channels; +} NvDsInferNetworkInfo; + +/** + * Sets values on a @ref NvDsInferDimsCHW structure from a @ref NvDsInferDims + * structure. + */ +#define getDimsCHWFromDims(dimsCHW,dims) \ + do { \ + (dimsCHW).c = (dims).d[0]; \ + (dimsCHW).h = (dims).d[1]; \ + (dimsCHW).w = (dims).d[2]; \ + } while (0) + +#define getDimsHWCFromDims(dimsCHW,dims) \ + do { \ + (dimsCHW).h = (dims).d[0]; \ + (dimsCHW).w = (dims).d[1]; \ + (dimsCHW).c = (dims).d[2]; \ + } while (0) + +/** + * Holds information about one parsed object from a detector's output. + */ +typedef struct +{ + /** Holds the ID of the class to which the object belongs. */ + unsigned int classId; + + /** Holds the horizontal offset of the bounding box shape for the object. */ + float left; + /** Holds the vertical offset of the object's bounding box. */ + float top; + /** Holds the width of the object's bounding box. */ + float width; + /** Holds the height of the object's bounding box. */ + float height; + + /** Holds the object detection confidence level; must in the range + [0.0,1.0]. */ + float detectionConfidence; +} NvDsInferObjectDetectionInfo; + +/** + * A typedef defined to maintain backward compatibility. + */ +typedef NvDsInferObjectDetectionInfo NvDsInferParseObjectInfo; + +/** + * Holds information about one parsed object and instance mask from a detector's output. + */ +typedef struct +{ + /** Holds the ID of the class to which the object belongs. */ + unsigned int classId; + + /** Holds the horizontal offset of the bounding box shape for the object. */ + float left; + /** Holds the vertical offset of the object's bounding box. */ + float top; + /** Holds the width of the object's bounding box. */ + float width; + /** Holds the height of the object's bounding box. */ + float height; + + /** Holds the object detection confidence level; must in the range + [0.0,1.0]. */ + float detectionConfidence; + + /** Holds object segment mask */ + float *mask; + /** Holds width of mask */ + unsigned int mask_width; + /** Holds height of mask */ + unsigned int mask_height; + /** Holds size of mask in bytes*/ + unsigned int mask_size; +} NvDsInferInstanceMaskInfo; + +/** + * Holds information about one classified attribute. + */ +typedef struct +{ + /** Holds the index of the attribute's label. This index corresponds to + the order of output layers specified in the @a outputCoverageLayerNames + vector during initialization. */ + unsigned int attributeIndex; + /** Holds the the attribute's output value. */ + unsigned int attributeValue; + /** Holds the attribute's confidence level. */ + float attributeConfidence; + /** Holds a pointer to a string containing the attribute's label. + Memory for the string must not be freed. Custom parsing functions must + allocate strings on heap using strdup or equivalent. */ + char *attributeLabel; +} NvDsInferAttribute; + +/** + * Enum for the status codes returned by NvDsInferContext. + */ +typedef enum { + /** NvDsInferContext operation succeeded. */ + NVDSINFER_SUCCESS = 0, + /** Failed to configure the NvDsInferContext instance possibly due to an + * erroneous initialization property. */ + NVDSINFER_CONFIG_FAILED, + /** Custom Library interface implementation failed. 
*/ + NVDSINFER_CUSTOM_LIB_FAILED, + /** Invalid parameters were supplied. */ + NVDSINFER_INVALID_PARAMS, + /** Output parsing failed. */ + NVDSINFER_OUTPUT_PARSING_FAILED, + /** CUDA error was encountered. */ + NVDSINFER_CUDA_ERROR, + /** TensorRT interface failed. */ + NVDSINFER_TENSORRT_ERROR, + /** Resource error was encountered. */ + NVDSINFER_RESOURCE_ERROR, + /** Triton error was encountered. Renamed TRT-IS to Triton. */ + NVDSINFER_TRITON_ERROR, + /** [deprecated]TRT-IS error was encountered */ + NVDSINFER_TRTIS_ERROR = NVDSINFER_TRITON_ERROR, + /** Unknown error was encountered. */ + NVDSINFER_UNKNOWN_ERROR +} NvDsInferStatus; + +/** + * Enum for the log levels of NvDsInferContext. + */ +typedef enum { + NVDSINFER_LOG_ERROR = 0, + NVDSINFER_LOG_WARNING, + NVDSINFER_LOG_INFO, + NVDSINFER_LOG_DEBUG, +} NvDsInferLogLevel; + +/** + * Get the string name for the status. + * + * @param[in] status An NvDsInferStatus value. + * @return String name for the status. Memory is owned by the function. Callers + * should not free the pointer. + */ +const char* NvDsInferStatus2Str(NvDsInferStatus status); + +#ifdef __cplusplus +} +#endif + +/* C++ data types */ +#ifdef __cplusplus + +/** + * Enum for selecting between minimum/optimal/maximum dimensions of a layer + * in case of dynamic shape network. + */ +typedef enum +{ + kSELECTOR_MIN = 0, + kSELECTOR_OPT, + kSELECTOR_MAX, + kSELECTOR_SIZE +} NvDsInferProfileSelector; + +/** + * Holds full dimensions (including batch size) for a layer. + */ +typedef struct +{ + int batchSize = 0; + NvDsInferDims dims = {0}; +} NvDsInferBatchDims; + +/** + * Extended structure for bound layer information which additionally includes + * min/optimal/max full dimensions of a layer in case of dynamic shape. + */ +struct NvDsInferBatchDimsLayerInfo : NvDsInferLayerInfo +{ + NvDsInferBatchDims profileDims[kSELECTOR_SIZE]; +}; + +#endif + +#endif + +/** @} */ diff --git a/isaac_ros_detectnet/include/nvdsinferutils/AMENT_IGNORE b/isaac_ros_detectnet/include/nvdsinferutils/AMENT_IGNORE new file mode 100644 index 0000000..e69de29 diff --git a/isaac_ros_detectnet/include/nvdsinferutils/dbscan/EigenDefs.hpp b/isaac_ros_detectnet/include/nvdsinferutils/dbscan/EigenDefs.hpp new file mode 100644 index 0000000..e88db1f --- /dev/null +++ b/isaac_ros_detectnet/include/nvdsinferutils/dbscan/EigenDefs.hpp @@ -0,0 +1,173 @@ +///////////////////////////////////////////////////////////////////////////////////////// +// Copyright (c) 2018-2019 NVIDIA Corporation. All rights reserved. +// +// NVIDIA Corporation and its licensors retain all intellectual property and proprietary +// rights in and to this software and related documentation and any modifications thereto. +// Any use, reproduction, disclosure or distribution of this software and related +// documentation without an express license agreement from NVIDIA Corporation is +// strictly prohibited. +// +///////////////////////////////////////////////////////////////////////////////////////// + +#ifndef DW_CORE_EIGENDEFS_HPP__ +#define DW_CORE_EIGENDEFS_HPP__ + +#ifdef Success + #undef Success +#endif + +#include + +#include + + + +///////////////////////////////////////////////////////////////////////////////////////////////// +// Alignment issues: +// There is an outstanding bug with Eigen's alignment requirement. The classes of the sfm +// module contain Eigen matrices as members. They seem to be aligned but the compiler sometimes +// returns the wrong address for the member. 
The ReconstructorTests unit tests fail when alignment +// is enabled in linux and PX. The main problem was observed with Matrix34f but it probably +// affects all Eigen types. The bug was observed in .cu files suggesting it is a problem +// between Eigen and nvcc. +// +// All alignment is now disabled pending further investigation. +///////////////////////////////////////////////////////////////////////////////////////////////// +template +using Matrix = Eigen::Matrix; + +template +using Vector = Eigen::Matrix; + +template +using RowVector = Eigen::Matrix; + +template +using UnalignedMatrix = Eigen::Matrix; + +template +using UnalignedVector = Eigen::Matrix; + +template +using UnalignedRowVector = Eigen::Matrix; + +// clang-format off +#define EIGEN_MAKE_FIXED_TYPEDEFS(Type, TypeSuffix, Size, SizeSuffix) \ + /** \ingroup matrixtypedefs */ \ + \ + typedef UnalignedMatrix UnalignedMatrix##SizeSuffix##TypeSuffix; \ + /** \ingroup matrixtypedefs */ \ + \ + typedef Matrix Matrix##SizeSuffix##TypeSuffix; \ + /** \ingroup matrixtypedefs */ \ + \ + typedef UnalignedVector UnalignedVector##SizeSuffix##TypeSuffix; \ + /** \ingroup matrixtypedefs */ \ + \ + typedef Vector Vector##SizeSuffix##TypeSuffix; \ + /** \ingroup matrixtypedefs */ \ + \ + typedef UnalignedRowVector UnalignedRowVector##SizeSuffix##TypeSuffix; \ + /** \ingroup matrixtypedefs */ \ + \ + typedef RowVector RowVector##SizeSuffix##TypeSuffix; + +#define EIGEN_MAKE_DYNAMIC_TYPEDEFS(Type, TypeSuffix, Size) \ + /** \ingroup matrixtypedefs */ \ + \ + typedef Eigen::Matrix Matrix##Size##X##TypeSuffix; \ + /** \ingroup matrixtypedefs */ \ + \ + typedef Eigen::Matrix Matrix##X##Size##TypeSuffix; + +#define EIGEN_MAKE_TYPEDEFS_ALL_SIZES(Type, TypeSuffix) \ + \ + EIGEN_MAKE_FIXED_TYPEDEFS(Type, TypeSuffix, 2, 2) \ + \ + EIGEN_MAKE_FIXED_TYPEDEFS(Type, TypeSuffix, 3, 3) \ + \ + EIGEN_MAKE_FIXED_TYPEDEFS(Type, TypeSuffix, 4, 4) \ + \ + EIGEN_MAKE_FIXED_TYPEDEFS(Type, TypeSuffix, 5, 5) \ + \ + EIGEN_MAKE_FIXED_TYPEDEFS(Type, TypeSuffix, 6, 6) \ + \ + EIGEN_MAKE_FIXED_TYPEDEFS(Type, TypeSuffix, Eigen::Dynamic, X) \ + \ + EIGEN_MAKE_DYNAMIC_TYPEDEFS(Type, TypeSuffix, 2) \ + \ + EIGEN_MAKE_DYNAMIC_TYPEDEFS(Type, TypeSuffix, 3) \ + \ + EIGEN_MAKE_DYNAMIC_TYPEDEFS(Type, TypeSuffix, 4) +// clang-format on + +EIGEN_MAKE_TYPEDEFS_ALL_SIZES(uint8_t, ub) +EIGEN_MAKE_TYPEDEFS_ALL_SIZES(int8_t, b) +EIGEN_MAKE_TYPEDEFS_ALL_SIZES(uint16_t, us) +EIGEN_MAKE_TYPEDEFS_ALL_SIZES(int16_t, s) +EIGEN_MAKE_TYPEDEFS_ALL_SIZES(uint32_t, ui) +EIGEN_MAKE_TYPEDEFS_ALL_SIZES(int32_t, i) +EIGEN_MAKE_TYPEDEFS_ALL_SIZES(float, f) +EIGEN_MAKE_TYPEDEFS_ALL_SIZES(double, d) + +#undef EIGEN_MAKE_TYPEDEFS_ALL_SIZES +#undef EIGEN_MAKE_FIXED_TYPEDEFS +#undef EIGEN_MAKE_DYNAMIC_TYPEDEFS + +#undef ALIGN + +////////////////////////////////////////////// +// Array used for images + +template +struct EigenImage +{ + typedef Eigen::Array ArrayX; + typedef Eigen::Array ArrayXr; +}; + +template<> +struct EigenImage +{ + typedef Eigen::Array ArrayX; + typedef Eigen::Array ArrayXr; +}; + + +typedef EigenImage::ArrayX ArrayXb; +typedef EigenImage::ArrayXr ArrayXbr; + + +////////////////////////////////////////////// +// saturated_cast + +template +Tout saturated_cast(Tin value) +{ + if (value < Tin(std::numeric_limits::min())) + return std::numeric_limits::min(); + else if (value >= Tin(std::numeric_limits::max())) + return std::numeric_limits::max(); + else + return static_cast(value); +} + +////////////////////////////////////////////// +// Quaternion +typedef Eigen::Quaternion Quaternionf; + 
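+//////////////////////////////////////////////
+// Usage notes (illustrative)
+//
+// The macros above expand to the usual fixed-size aliases such as Vector3f,
+// Matrix4d and RowVector2i, their Unaligned* counterparts, and the dynamic
+// variants Matrix2Xf/MatrixX2f through Matrix4Xd/MatrixX4d.
+//
+// saturated_cast clamps out-of-range values to the target type's limits rather
+// than wrapping or overflowing, e.g.:
+//   uint8_t hi = saturated_cast<uint8_t>(300.0f);  // 255
+//   uint8_t lo = saturated_cast<uint8_t>(-4.0f);   // 0
+//   int16_t ok = saturated_cast<int16_t>(1000);    // 1000 (already in range)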
+////////////////////////////////////////////// +// Geometry +template +using Hyperplane = Eigen::Hyperplane; + +typedef Hyperplane Hyperplane2f; +typedef Hyperplane Hyperplane3f; + +template +using ParametrizedLine = Eigen::ParametrizedLine; + +typedef ParametrizedLine ParametrizedLine2f; +typedef ParametrizedLine ParametrizedLine3f; + +#endif // DW_CORE_EIGENDEFS_HPP__ diff --git a/isaac_ros_detectnet/include/nvdsinferutils/dbscan/nvdsinfer_dbscan.hpp b/isaac_ros_detectnet/include/nvdsinferutils/dbscan/nvdsinfer_dbscan.hpp new file mode 100644 index 0000000..a887927 --- /dev/null +++ b/isaac_ros_detectnet/include/nvdsinferutils/dbscan/nvdsinfer_dbscan.hpp @@ -0,0 +1,101 @@ +///////////////////////////////////////////////////////////////////////////////////////// +// Copyright (c) 2018-2019 NVIDIA Corporation. All rights reserved. +// +// NVIDIA Corporation and its licensors retain all intellectual property and proprietary +// rights in and to this software and related documentation and any modifications thereto. +// Any use, reproduction, disclosure or distribution of this software and related +// documentation without an express license agreement from NVIDIA Corporation is +// strictly prohibited. +// +///////////////////////////////////////////////////////////////////////////////////////// + + +#ifndef __NVDSINFER_DBSCANCLUSTERING_HPP__ +#define __NVDSINFER_DBSCANCLUSTERING_HPP__ + +#include +#include + +/* Ignore errors from open source headers. */ +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-Wunused-local-typedefs" +#if __GNUC__ >= 6 +#pragma GCC diagnostic ignored "-Wmisleading-indentation" +#endif +#if __GNUC__ >= 7 +#pragma GCC diagnostic ignored "-Wint-in-bool-context" +#endif +#include +#pragma GCC diagnostic pop +#include "EigenDefs.hpp" + +#include "nvdsinferutils/include/nvdsinfer_dbscan.h" + +/** + * Holds the bounding box co-ordindates. + */ +typedef struct +{ + float x; + float y; + float width; + float height; +} BBox; + +typedef struct +{ + /// Bounding box of the detected object. + BBox box; + /// Variance of bounding boxes of the members of the cluster. + BBox boxVariance; + /// Total confidence of the members of the cluster. + float totalConfidence; + /// Maximum confidence of the members of the cluster. + float maximumConfidence; + /// Number of members of the cluster. 
+ uint32_t numMembers; +} ClusteredObject; + +///////////////////////////////////////////////////////////////////////////////////////////////////////////// +/** +* NvDsInferDBScan class +**/ +struct NvDsInferDBScan { +private: + typedef Eigen::Matrix RowMatrixXi; + + NvDsInferDBScan(); + ~NvDsInferDBScan(); + + void clusterObjects(NvDsInferObjectDetectionInfo *detections, size_t &numDetections, + NvDsInferDBScanClusteringParams *params, bool hybridClustering = false); + + void re_allocate (size_t num_detections); + + void buildDistanceMatrix(const NvDsInferObjectDetectionInfo input[], + int32_t inputSize); + + void findNeighbors(int32_t boxIdx, int32_t inputSize, float eps); + + void joinNeighbors(int32_t boxDstIdx, int32_t boxSrcIdx); + + std::unique_ptr m_distanceMatrix; + std::unique_ptr m_neighborListMatrix; + std::unique_ptr m_neighborshipMatrix; + std::vector m_neighborCounts; + std::vector m_labels; + std::vector m_visited; + size_t m_maxProposals; + std::vector m_clusteredObjects; + + friend NvDsInferDBScanHandle NvDsInferDBScanCreate(); + friend void NvDsInferDBScanDestroy(NvDsInferDBScanHandle handle); + friend void NvDsInferDBScanCluster(NvDsInferDBScanHandle handle, + NvDsInferDBScanClusteringParams *params, + NvDsInferObjectDetectionInfo *objects, size_t *numObjects); + friend void NvDsInferDBScanClusterHybrid(NvDsInferDBScanHandle handle, + NvDsInferDBScanClusteringParams *params, + NvDsInferObjectDetectionInfo *objects, size_t *numObjects); +}; + +#endif diff --git a/isaac_ros_detectnet/include/nvdsinferutils/include/nvdsinfer_dbscan.h b/isaac_ros_detectnet/include/nvdsinferutils/include/nvdsinfer_dbscan.h new file mode 100644 index 0000000..51e251a --- /dev/null +++ b/isaac_ros_detectnet/include/nvdsinferutils/include/nvdsinfer_dbscan.h @@ -0,0 +1,117 @@ +/* + * Copyright (c) 2018-2019, NVIDIA CORPORATION. All rights reserved. + * + * NVIDIA Corporation and its licensors retain all intellectual property + * and proprietary rights in and to this software, related documentation + * and any modifications thereto. Any use, reproduction, disclosure or + * distribution of this software and related documentation without an express + * license agreement from NVIDIA Corporation is strictly prohibited. + */ + +/** + * @file nvdsinfer_dbscan.h + * NVIDIA DeepStream DBScan based Object Clustering API + * + * @b Description: This file defines the API for the DBScan-based object + * clustering algorithm. + */ + +/** + * @defgroup ee_dbscan DBScan Based Object Clustering API + * + * Defines the API for DBScan-based object clustering. + * + * @ingroup NvDsInferApi + * @{ + */ + +#ifndef __NVDSINFER_DBSCAN_H__ +#define __NVDSINFER_DBSCAN_H__ + +#include +#include + +#include "nvdsinfer/nvdsinfer.h" + +#ifdef __cplusplus +extern "C" { +#endif + +/** Holds an opaque structure for the DBScan object clustering context. */ +struct NvDsInferDBScan; + +/** Holds an opaque DBScan clustering context handle. */ +typedef struct NvDsInferDBScan *NvDsInferDBScanHandle; + +/** Holds object clustering parameters required by DBSCAN. */ +typedef struct +{ + float eps; + uint32_t minBoxes; + /** Holds a Boolean; true enables the area-to-hit ratio (ATHR) filter. + The ATHR is calculated as: ATHR = sqrt(clusterArea) / nObjectsInCluster. */ + int enableATHRFilter; + /** Holds the area-to-hit ratio threshold. */ + float thresholdATHR; + /** Holds the sum of neighborhood confidence thresholds. */ + float minScore; +} NvDsInferDBScanClusteringParams; + +/** + * Creates a new DBScan object clustering context. 
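+ * The returned handle should be released with NvDsInferDBScanDestroy() when it
+ * is no longer needed.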
+ * + * @return A handle to the created context. + */ +NvDsInferDBScanHandle NvDsInferDBScanCreate(); + +/** + * Destroys a DBScan object clustering context. + * + * @param[in] handle The handle to the context to be destroyed. + */ +void NvDsInferDBScanDestroy(NvDsInferDBScanHandle handle); + +/** + * Clusters an array of objects in place using specified clustering parameters. + * + * @param[in] handle A handle to the context be used for clustering. + * @param[in] params A pointer to a clustering parameter structure. + * @param[in,out] objects A pointer to an array of objects to be + * clustered. The function places the clustered + * objects in the same array. + * @param[in,out] numObjects A pointer to the number of valid objects + * in the @a objects array. The function sets + * this value after clustering. + */ +void NvDsInferDBScanCluster(NvDsInferDBScanHandle handle, + NvDsInferDBScanClusteringParams *params, NvDsInferObjectDetectionInfo *objects, + size_t *numObjects); + +/** + * Clusters an array of objects in place using specified clustering parameters. + * The outputs are partially only clustered i.e to merge close neighbors of + * the same cluster together only and the mean normalization of all the + * proposals in a cluster is not performed. The outputs from this stage are + * later fed into another clustering algorithm like NMS to obtain the final + * results. + * + * @param[in] handle A handle to the context be used for clustering. + * @param[in] params A pointer to a clustering parameter structure. + * @param[in,out] objects A pointer to an array of objects to be + * clustered. The function places the clustered + * objects in the same array. + * @param[in,out] numObjects A pointer to the number of valid objects + * in the @a objects array. The function sets + * this value after clustering. + */ +void NvDsInferDBScanClusterHybrid(NvDsInferDBScanHandle handle, + NvDsInferDBScanClusteringParams *params, NvDsInferObjectDetectionInfo *objects, + size_t *numObjects); + +#ifdef __cplusplus +} +#endif + +#endif + +/** @} */ diff --git a/isaac_ros_detectnet/include/nvdsinferutils/include/nvdsinfer_utils.h b/isaac_ros_detectnet/include/nvdsinferutils/include/nvdsinfer_utils.h new file mode 100644 index 0000000..7bf53ba --- /dev/null +++ b/isaac_ros_detectnet/include/nvdsinferutils/include/nvdsinfer_utils.h @@ -0,0 +1,22 @@ +/* + * Copyright (c) 2018-2019, NVIDIA CORPORATION. All rights reserved. + * + * NVIDIA Corporation and its licensors retain all intellectual property + * and proprietary rights in and to this software, related documentation + * and any modifications thereto. Any use, reproduction, disclosure or + * distribution of this software and related documentation without an express + * license agreement from NVIDIA Corporation is strictly prohibited. + */ + +/** + * @file + * Utility functions required by DeepStream Inferance API + */ + +#ifndef __NVDSINFER_UTILS_H__ +#define __NVDSINFER_UTILS_H__ + +#include "nvdsinfer_dbscan.h" +#include "nvdsinfer_tlt.h" + +#endif diff --git a/isaac_ros_detectnet/isaac_ros_detectnet/__init__.py b/isaac_ros_detectnet/isaac_ros_detectnet/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/isaac_ros_detectnet/launch/isaac_ros_detectnet.launch.py b/isaac_ros_detectnet/launch/isaac_ros_detectnet.launch.py new file mode 100644 index 0000000..7c57ac2 --- /dev/null +++ b/isaac_ros_detectnet/launch/isaac_ros_detectnet.launch.py @@ -0,0 +1,65 @@ +# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved. 
+# +# NVIDIA CORPORATION and its licensors retain all intellectual property +# and proprietary rights in and to this software, related documentation +# and any modifications thereto. Any use, reproduction, disclosure or +# distribution of this software and related documentation without an express +# license agreement from NVIDIA CORPORATION is strictly prohibited. + +import os + +import launch +from launch_ros.actions import ComposableNodeContainer +from launch_ros.descriptions import ComposableNode + + +def generate_launch_description(): + """Generate launch description for testing relevant nodes.""" + launch_dir_path = os.path.dirname(os.path.realpath(__file__)) + config = launch_dir_path + '/../config/params.yaml' + + encoder_node = ComposableNode( + name='dnn_image_encoder', + package='isaac_ros_dnn_encoders', + plugin='isaac_ros::dnn_inference::DnnImageEncoderNode', + parameters=[{ + 'network_image_width': 640, + 'network_image_height': 368, + 'network_image_encoding': 'rgb8', + 'network_normalization_type': 'positive_negative', + 'tensor_name': 'input_tensor' + }], + remappings=[('encoded_tensor', 'tensor_pub')] + ) + + triton_node = ComposableNode( + name='triton_node', + package='isaac_ros_triton', + plugin='isaac_ros::dnn_inference::TritonNode', + parameters=[{ + 'model_name': 'detectnet', + 'model_repository_paths': ['/tmp/models'], + 'input_tensor_names': ['input_tensor'], + 'input_binding_names': ['input_1'], + 'output_tensor_names': ['output_cov', 'output_bbox'], + 'output_binding_names': ['output_cov/Sigmoid', 'output_bbox/BiasAdd'], + 'log_level': 0 + }]) + + detectnet_decoder_node = ComposableNode( + name='detectnet_decoder_node', + package='isaac_ros_detectnet', + plugin='isaac_ros::detectnet::DetectNetDecoderNode', + parameters=[config] + ) + + container = ComposableNodeContainer( + name='detectnet_container', + namespace='detectnet_container', + package='rclcpp_components', + executable='component_container', + composable_node_descriptions=[encoder_node, triton_node, detectnet_decoder_node], + output='screen' + ) + + return launch.LaunchDescription([container]) diff --git a/isaac_ros_detectnet/package.xml b/isaac_ros_detectnet/package.xml new file mode 100644 index 0000000..aff3ca5 --- /dev/null +++ b/isaac_ros_detectnet/package.xml @@ -0,0 +1,46 @@ + + + + + + + isaac_ros_detectnet + 0.9.0 + DetectNet model processing + + Hemal Shah + NVIDIA Isaac ROS Software License + https://developer.nvidia.com/isaac-ros-gems/ + Herón Ordóñez Guillén + ament_cmake + ament_cmake_python + + eigen + + rclcpp + rclpy + rclcpp_components + vision_msgs + isaac_ros_dnn_encoders + isaac_ros_nvengine_interfaces + + ament_lint_auto + ament_lint_common + isaac_ros_test + isaac_ros_nvengine + + isaac_ros_nvengine + isaac_ros_triton + + + ament_cmake + + diff --git a/isaac_ros_detectnet/scripts/isaac_ros_detectnet_visualizer.py b/isaac_ros_detectnet/scripts/isaac_ros_detectnet_visualizer.py new file mode 100755 index 0000000..e1c7e9d --- /dev/null +++ b/isaac_ros_detectnet/scripts/isaac_ros_detectnet_visualizer.py @@ -0,0 +1,81 @@ +#!/usr/bin/env python3 +# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved. +# +# NVIDIA CORPORATION and its licensors retain all intellectual property +# and proprietary rights in and to this software, related documentation +# and any modifications thereto. Any use, reproduction, disclosure or +# distribution of this software and related documentation without an express +# license agreement from NVIDIA CORPORATION is strictly prohibited. 
+ +# This script loads images from a folder and sends them to the detectnet pipeline for inference, +# then renders the output boxes on top of the image and publishes the result as an image message +# to visualize using rqt + +import os +from pprint import pformat + +import cv2 +import cv_bridge +import numpy as np +import rclpy +from rclpy.node import Node +from sensor_msgs.msg import Image +from vision_msgs.msg import Detection2DArray + + +class DetectNetVisualizer(Node): + QUEUE_SIZE = 10 + color = (0, 255, 0) + bbox_thickness = 1 + encoding = 'bgr8' + + def __init__(self): + super().__init__('detectnet_visualizer') + self._bridge = cv_bridge.CvBridge() + self._processed_image_pub = self.create_publisher( + Image, 'detectnet_processed_image', self.QUEUE_SIZE) + self._image_pub = self.create_publisher( + Image, 'image', 10) + + self._detections_subscription = self.create_subscription( + Detection2DArray, + 'detectnet/detections', + self.detections_callback, + 10) + + self.create_timer(5, self.timer_callback) + script_path = os.path.dirname(os.path.realpath(__file__)) + self.input_image_path = os.path.join(script_path, '../examples/demo.png') + + def timer_callback(self): + cv2_img = cv2.imread(os.path.join(self.input_image_path)) + img = self._bridge.cv2_to_imgmsg(np.array(cv2_img), self.encoding) + self.current_img = img + self._image_pub.publish(img) + + def detections_callback(self, detections_msg): + cv2_img = self._bridge.imgmsg_to_cv2(self.current_img) + self.get_logger().info(pformat(detections_msg)) + for detection in detections_msg.detections: + center_x = detection.bbox.center.x + center_y = detection.bbox.center.y + width = detection.bbox.size_x + height = detection.bbox.size_y + + min_pt = (round(center_x - (width / 2.0)), round(center_y - (height / 2.0))) + max_pt = (round(center_x + (width / 2.0)), round(center_y + (height / 2.0))) + + cv2.rectangle(cv2_img, min_pt, max_pt, self.color, self.bbox_thickness) + + processed_img = self._bridge.cv2_to_imgmsg(cv2_img, encoding=self.encoding) + self._processed_image_pub.publish(processed_img) + + +def main(): + rclpy.init() + rclpy.spin(DetectNetVisualizer()) + rclpy.shutdown() + + +if __name__ == '__main__': + main() diff --git a/isaac_ros_detectnet/src/detectnet_decoder_node.cpp b/isaac_ros_detectnet/src/detectnet_decoder_node.cpp new file mode 100644 index 0000000..b139ab6 --- /dev/null +++ b/isaac_ros_detectnet/src/detectnet_decoder_node.cpp @@ -0,0 +1,346 @@ +/** + * Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved. + * + * NVIDIA CORPORATION and its licensors retain all intellectual property + * and proprietary rights in and to this software, related documentation + * and any modifications thereto. Any use, reproduction, disclosure or + * distribution of this software and related documentation without an express + * license agreement from NVIDIA CORPORATION is strictly prohibited. + */ + +#include "isaac_ros_detectnet/detectnet_decoder_node.hpp" + +#include +#include +#include +#include + +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored "-Wmissing-field-initializers" +#include "nvdsinferutils/dbscan/nvdsinfer_dbscan.hpp" +#pragma GCC diagnostic pop +#include "std_msgs/msg/header.hpp" +#include "vision_msgs/msg/detection2_d_array.hpp" +#include "vision_msgs/msg/detection2_d.hpp" + +namespace +{ +const int32_t kFloat32 = 9; +const int32_t kMinBoxArea = 100; +const int kBoundingBoxParams = 4; + +// Grid box length in pixels. DetectNetv2 currently only supports square grid boxes of size 16. 
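+// The output tensors therefore form a grid of (input_height / kStride) by
+// (input_width / kStride) cells; for example, the 368-by-640 input used in
+// this package yields a 23-by-40 grid of cells.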
+const int kStride = 16; +} // namespace + +namespace isaac_ros +{ +namespace detectnet +{ +struct DetectNetDecoderNode::DetectNetDecoderImpl +{ + std::string header_frame_id_; + const int32_t kTensorHeightIdx = 2; + const int32_t kTensorWidthIdx = 3; + const int32_t kTensorClassIdx = 1; + std::vector label_names_; + float coverage_threshold_; + float bounding_box_scale_; + float bounding_box_offset_; + NvDsInferDBScanClusteringParams params_; + int clustering_algorithm_; + + DetectNetDecoderImpl( + const std::string & header_frame_id, + const std::vector & label_names, + const float & coverage_threshold, + const float & bounding_box_scale, + const float & bounding_box_offset, + const float & eps, + const int & min_boxes, + const int & enable_athr_filter, + const float & threshold_athr, + const int & clustering_algorithm) + { + header_frame_id_ = header_frame_id; + label_names_ = label_names; + coverage_threshold_ = coverage_threshold; + bounding_box_scale_ = bounding_box_scale; + bounding_box_offset_ = bounding_box_offset; + + params_.eps = eps; + params_.minBoxes = min_boxes > 0 ? min_boxes : 0; + params_.enableATHRFilter = enable_athr_filter; + params_.thresholdATHR = threshold_athr; + params_.minScore = coverage_threshold; + + clustering_algorithm_ = clustering_algorithm; + } + + void OnCallback( + vision_msgs::msg::Detection2DArray & detections_msg, + const isaac_ros_nvengine_interfaces::msg::Tensor & bbox_tensor, + const isaac_ros_nvengine_interfaces::msg::Tensor & cov_tensor, + const std_msgs::msg::Header & tensor_header, + const rclcpp::Logger & logger) + { + ConvertTensorToDetectionsArray( + detections_msg, + bbox_tensor, + cov_tensor, + tensor_header, + logger); + } + + void ConvertTensorToDetectionsArray( + vision_msgs::msg::Detection2DArray & detections_msg, + const isaac_ros_nvengine_interfaces::msg::Tensor & bbox_tensor, + const isaac_ros_nvengine_interfaces::msg::Tensor & cov_tensor, + const std_msgs::msg::Header & tensor_header, + const rclcpp::Logger & logger) + { + if (bbox_tensor.data_type == kFloat32) { + DecodeDetections(detections_msg, bbox_tensor, cov_tensor, tensor_header, logger); + } else { + throw std::runtime_error("Received invalid Tensor data! 
Expected float32!"); + } + } + + void DecodeDetections( + vision_msgs::msg::Detection2DArray & detections_msg, + const isaac_ros_nvengine_interfaces::msg::Tensor & bbox_tensor, + const isaac_ros_nvengine_interfaces::msg::Tensor & cov_tensor, + const std_msgs::msg::Header & tensor_header, + const rclcpp::Logger & logger) + { + // Reinterpret the strides (which are in bytes) + data as the relevant data type + const float * bbox_tensor_data = reinterpret_cast(bbox_tensor.data.data()); + const uint32_t bbox_tensor_height_stride = bbox_tensor.strides[kTensorHeightIdx] / + sizeof(float); + const uint32_t bbox_tensor_width_stride = bbox_tensor.strides[kTensorWidthIdx] / sizeof(float); + const uint32_t bbox_tensor_class_stride = bbox_tensor.strides[kTensorClassIdx] / sizeof(float); + + // Reinterpret the strides and data as above, this time for the coverage tensor + const float * cov_tensor_data = reinterpret_cast(cov_tensor.data.data()); + const uint32_t cov_tensor_height_stride = cov_tensor.strides[kTensorHeightIdx] / sizeof(float); + const uint32_t cov_tensor_width_stride = cov_tensor.strides[kTensorWidthIdx] / sizeof(float); + const uint32_t cov_tensor_class_stride = cov_tensor.strides[kTensorClassIdx] / sizeof(float); + + // Get grid size based on the size of bbox_tensor + const int num_classes = cov_tensor.shape.dims[kTensorClassIdx]; + const int grid_size_rows = bbox_tensor.shape.dims[kTensorHeightIdx]; + const int grid_size_cols = bbox_tensor.shape.dims[kTensorWidthIdx]; + + // Validate number of bounding box parameters + const int num_box_parameters = bbox_tensor.shape.dims[kTensorClassIdx] / num_classes; + if (num_box_parameters != kBoundingBoxParams) { + RCLCPP_WARN(logger, "Received wrong number of box parameters"); + return; + } + + std::vector detection_info_list; + // Go through every grid box and extract all bboxes for the image + for (int row = 0; row < grid_size_rows; row++) { + for (int col = 0; col < grid_size_cols; col++) { + for (int object_class = 0; object_class < num_classes; object_class++) { + // position of the current bounding box coverage value in the reinterpreted cov tensor + int cov_pos = (row * cov_tensor_height_stride) + (col * cov_tensor_width_stride) + + (object_class * cov_tensor_class_stride); + float coverage = cov_tensor_data[cov_pos]; + + // Center of the grid in pixels + float grid_center_y = (row + bounding_box_offset_ ) * kStride; + float grid_center_x = (col + bounding_box_offset_ ) * kStride; + + // Get each element of the bounding box + float bbox[kBoundingBoxParams]; + int grid_offset = (row * bbox_tensor_height_stride) + (col * bbox_tensor_width_stride); + for (int bbox_element = 0; bbox_element < num_box_parameters; bbox_element++) { + int pos = grid_offset + ((object_class + bbox_element) * bbox_tensor_class_stride); + bbox[bbox_element] = bbox_tensor_data[pos] * bounding_box_scale_; + } + + float size_x = bbox[0] + bbox[2]; + float size_y = bbox[1] + bbox[3]; + + // Filter by box area. 
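+          // Boxes whose area falls below kMinBoxArea (in square pixels) are
+          // treated as spurious responses and skipped.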
+ double bbox_area = size_x * size_y; + if (bbox_area < kMinBoxArea) { + continue; + } + + // Bounding box is in the form of (x0, y0, x1, y1) in grid coordinates + // Center relative to is grid found averaging the averaging the dimensions and offset + // by grid center position to convert to image coordinates + NvDsInferObjectDetectionInfo detection_info = GetNewDetectionInfo( + static_cast(object_class), + grid_center_x - bbox[0], grid_center_y - bbox[1], size_x, size_y, coverage); + detection_info_list.push_back(detection_info); + } + } + } + + NvDsInferObjectDetectionInfo * detetction_info_pointer = &detection_info_list[0]; + size_t num_objs = detection_info_list.size(); + NvDsInferDBScanHandle dbscan_hdl = NvDsInferDBScanCreate(); + if (clustering_algorithm_ == 1) { + NvDsInferDBScanCluster(dbscan_hdl, ¶ms_, detetction_info_pointer, &num_objs); + } else if (clustering_algorithm_ == 2) { + NvDsInferDBScanClusterHybrid(dbscan_hdl, ¶ms_, detetction_info_pointer, &num_objs); + } + NvDsInferDBScanDestroy(dbscan_hdl); + + std::vector detections_list; + for (size_t i = 0; i < num_objs; i++) { + if (detetction_info_pointer[i].detectionConfidence < coverage_threshold_) { + continue; + } + vision_msgs::msg::Detection2D bbox_detection = toDetection2DMsg( + detetction_info_pointer[i], tensor_header); + detections_list.push_back(bbox_detection); + } + detections_msg.header = tensor_header; + detections_msg.detections = detections_list; + } + + NvDsInferObjectDetectionInfo GetNewDetectionInfo( + unsigned int classId, + float left, + float top, + float width, + float height, + float detectionConfidence) + { + NvDsInferObjectDetectionInfo detection_info; + + detection_info.classId = classId; + detection_info.left = left; + detection_info.top = top; + detection_info.width = width; + detection_info.height = height; + detection_info.detectionConfidence = detectionConfidence; + + return detection_info; + } + + vision_msgs::msg::Detection2D toDetection2DMsg( + NvDsInferObjectDetectionInfo detection_info, + const std_msgs::msg::Header & tensor_header) + { + int center_x = static_cast(detection_info.left + (detection_info.width / 2)); + int center_y = static_cast(detection_info.top + (detection_info.height / 2)); + int size_x = static_cast(detection_info.width); + int size_y = static_cast(detection_info.height); + + vision_msgs::msg::Detection2D bbox_detection = GetNewDetection2DMsg( + center_x, center_y, size_x, size_y, + detection_info.classId, detection_info.detectionConfidence, + tensor_header); + + return bbox_detection; + } + + vision_msgs::msg::Detection2D GetNewDetection2DMsg( + const int center_x, + const int center_y, + const int size_x, + const int size_y, + const int class_id, + const float detection_score, + const std_msgs::msg::Header & tensor_header) + { + // Create an empty message with the correct dimensions using the tensor + vision_msgs::msg::Detection2D detection_msg; + std::vector hypothesis_list; + vision_msgs::msg::ObjectHypothesisWithPose hypothesis; + + hypothesis.id = class_id; + hypothesis.score = static_cast<_Float64>(detection_score); + hypothesis_list.push_back(hypothesis); + + detection_msg.header = tensor_header; + detection_msg.header.frame_id = header_frame_id_; + + detection_msg.bbox.center.x = center_x; + detection_msg.bbox.center.y = center_y; + detection_msg.bbox.center.theta = 0; + detection_msg.bbox.size_x = size_x; + detection_msg.bbox.size_y = size_y; + detection_msg.results = hypothesis_list; + + return detection_msg; + } +}; + 
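+// A sketch of the decode performed in DetectNetDecoderImpl::DecodeDetections
+// above, using the same symbols as the code: for a grid cell (row, col) the
+// network predicts four offsets (x0, y0, x1, y1) relative to the cell center,
+//   cell_center_x = (col + bounding_box_offset_) * kStride
+//   cell_center_y = (row + bounding_box_offset_) * kStride
+// and, after the raw offsets are multiplied by bounding_box_scale_, the box in
+// image coordinates is
+//   left  = cell_center_x - x0        top    = cell_center_y - y0
+//   width = x0 + x1                   height = y0 + y1
+// The resulting candidates are merged with DBSCAN and then filtered by
+// coverage_threshold_ before being published as Detection2D messages.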
+DetectNetDecoderNode::DetectNetDecoderNode(const rclcpp::NodeOptions options)
+: Node("detectnet_decoder_node", options),
+  // Parameters
+  queue_size_(declare_parameter("queue_size", rmw_qos_profile_default.depth)),
+  header_frame_id_(declare_parameter("frame_id", "")),
+  label_names_(declare_parameter<std::vector<std::string>>("label_names", {})),
+  coverage_threshold_(declare_parameter("coverage_threshold", 0.6)),
+  bounding_box_scale_(declare_parameter("bounding_box_scale", 35.0)),
+  bounding_box_offset_(declare_parameter("bounding_box_offset", 0.5)),
+
+  // Parameters for DBScan
+  eps_(declare_parameter("eps", 0.01)),
+  min_boxes_(declare_parameter("min_boxes", 1)),
+  enable_athr_filter_(declare_parameter("enable_athr_filter", 0)),
+  threshold_athr_(declare_parameter("threshold_athr", 0)),
+  clustering_algorithm_(declare_parameter("clustering_algorithm", 1)),
+
+  // Subscribers
+  tensor_list_sub_(create_subscription<isaac_ros_nvengine_interfaces::msg::TensorList>(
+      "tensor_sub", queue_size_,
+      std::bind(&DetectNetDecoderNode::DetectNetDecoderCallback, this, std::placeholders::_1))),
+  // Publishers
+  detections_pub_(create_publisher<vision_msgs::msg::Detection2DArray>(
+      "detectnet/detections",
+      1)),
+  // Impl initialization
+  impl_(std::make_unique<DetectNetDecoderImpl>(
+      header_frame_id_, label_names_, coverage_threshold_, bounding_box_scale_,
+      bounding_box_offset_, eps_, min_boxes_, enable_athr_filter_, threshold_athr_,
+      clustering_algorithm_))
+{
+  // Received empty header frame id
+  if (header_frame_id_.empty()) {
+    RCLCPP_WARN(get_logger(), "Received empty frame id! Header will be published without one.");
+  }
+}
+
+void DetectNetDecoderNode::DetectNetDecoderCallback(
+  const isaac_ros_nvengine_interfaces::msg::TensorList::ConstSharedPtr tensor_list_msg)
+{
+  if (tensor_list_msg->tensors.size() != 2) {
+    RCLCPP_ERROR(
+      get_logger(), "Received invalid tensor count! Expected two tensors.
Not processing."); + } + + auto cov_tensor = tensor_list_msg->tensors[0]; + auto bbox_tensor = tensor_list_msg->tensors[1]; + + vision_msgs::msg::Detection2DArray detections_msg; + + try { + impl_->OnCallback( + detections_msg, + bbox_tensor, + cov_tensor, + tensor_list_msg->header, + get_logger()); + detections_pub_->publish(detections_msg); + } catch (const std::runtime_error & e) { + RCLCPP_ERROR(get_logger(), e.what()); + return; + } +} + +DetectNetDecoderNode::~DetectNetDecoderNode() = default; + +} // namespace detectnet +} // namespace isaac_ros + +// Register as component +#include "rclcpp_components/register_node_macro.hpp" +RCLCPP_COMPONENTS_REGISTER_NODE(isaac_ros::detectnet::DetectNetDecoderNode) diff --git a/isaac_ros_detectnet/test/dummy_model/.gitignore b/isaac_ros_detectnet/test/dummy_model/.gitignore new file mode 100644 index 0000000..eb8715e --- /dev/null +++ b/isaac_ros_detectnet/test/dummy_model/.gitignore @@ -0,0 +1 @@ +**/model.plan \ No newline at end of file diff --git a/isaac_ros_detectnet/test/dummy_model/detectnet/1/resnet18_detector.etlt b/isaac_ros_detectnet/test/dummy_model/detectnet/1/resnet18_detector.etlt new file mode 100644 index 0000000..50b7689 --- /dev/null +++ b/isaac_ros_detectnet/test/dummy_model/detectnet/1/resnet18_detector.etlt @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:984c3bf856d11b3fae0b0c5f4040fe6d8709799397b52c660ca0216ac09e35ec +size 44874341 diff --git a/isaac_ros_detectnet/test/dummy_model/detectnet/config.pbtxt b/isaac_ros_detectnet/test/dummy_model/detectnet/config.pbtxt new file mode 100644 index 0000000..d24c6c0 --- /dev/null +++ b/isaac_ros_detectnet/test/dummy_model/detectnet/config.pbtxt @@ -0,0 +1,29 @@ +name: "detectnet" +platform: "tensorrt_plan" +max_batch_size: 16 +input [ + { + name: "input_1" + data_type: TYPE_FP32 + format: FORMAT_NCHW + dims: [ 3, 368, 640 ] + } +] +output [ + { + name: "output_bbox/BiasAdd" + data_type: TYPE_FP32 + dims: [ 8, 23, 40 ] + }, + { + name: "output_cov/Sigmoid" + data_type: TYPE_FP32 + dims: [ 2, 23, 40 ] + } +] +dynamic_batching { } +version_policy: { + specific { + versions: [ 1 ] + } +} diff --git a/isaac_ros_detectnet/test/dummy_model/detectnet/labels.txt b/isaac_ros_detectnet/test/dummy_model/detectnet/labels.txt new file mode 100644 index 0000000..335537e --- /dev/null +++ b/isaac_ros_detectnet/test/dummy_model/detectnet/labels.txt @@ -0,0 +1,2 @@ +nvidia_mug +tennis_ball \ No newline at end of file diff --git a/isaac_ros_detectnet/test/isaac_ros_detectnet_pol_test.py b/isaac_ros_detectnet/test/isaac_ros_detectnet_pol_test.py new file mode 100644 index 0000000..354d049 --- /dev/null +++ b/isaac_ros_detectnet/test/isaac_ros_detectnet_pol_test.py @@ -0,0 +1,204 @@ +# Copyright (c) 2022, NVIDIA CORPORATION. All rights reserved. +# +# NVIDIA CORPORATION and its licensors retain all intellectual property +# and proprietary rights in and to this software, related documentation +# and any modifications thereto. Any use, reproduction, disclosure or +# distribution of this software and related documentation without an express +# license agreement from NVIDIA CORPORATION is strictly prohibited. + +""" +Proof-Of-Life test for the Isaac ROS DetectNet package. + + 1. Sets up DnnImageEncoderNode, TensorRTNode, DetectNetDecoderNode + 2. Loads a sample image and publishes it + 3. Subscribes to the relevant topics, waiting for an output from DetectNetDecoderNode + 4. 
Verifies that the received output sizes and encodings are correct (based on dummy model) + + Note: the data is not verified because the model is initialized with random weights +""" + + +import os +import pathlib +from pprint import pprint +import subprocess +import time + +from isaac_ros_test import IsaacROSBaseTest, JSONConversion +from launch_ros.actions.composable_node_container import ComposableNodeContainer +from launch_ros.descriptions.composable_node import ComposableNode + +import pytest +import rclpy + +from sensor_msgs.msg import Image +from vision_msgs.msg import Detection2DArray + +_TEST_CASE_NAMESPACE = 'detectnet_node_test' + + +@pytest.mark.rostest +def generate_test_description(): + """Generate launch description for testing relevant nodes.""" + launch_dir_path = os.path.dirname(os.path.realpath(__file__)) + model_dir_path = launch_dir_path + '/dummy_model' + model_name = 'detectnet' + model_version = 1 + engine_file_path = f'{model_dir_path}/{model_name}/{model_version}/model.plan' + + # Read labels from text file + labels_file_path = f'{model_dir_path}/{model_name}/labels.txt' + with open(labels_file_path, 'r') as fd: + label_list = fd.read().strip().splitlines() + + # Generate engine file using tao-converter + if not os.path.isfile(engine_file_path): + tao_converter_args = [ + '-k', '"object-detection-from-sim-pipeline"', + '-d', '3,368,640', + '-t', 'fp16', + '-p', 'input_1,1x3x368x640,1x3x368x640,1x3x368x640', + '-e', engine_file_path, + '-o', 'output_cov/Sigmoid,output_bbox/BiasAdd', + f'{model_dir_path}/{model_name}/1/resnet18_detector.etlt' + ] + tao_converter_executable = '/opt/nvidia/tao/tao-converter' + print('Running command:\n' + ' '.join([tao_converter_executable] + tao_converter_args)) + + result = subprocess.run( + [tao_converter_executable] + tao_converter_args, + env=os.environ, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE + ) + if result.returncode != 0: + raise Exception( + f'Failed to convert with status: {result.returncode}.\n' + f'stderr:\n' + result.stderr.decode('utf-8') + ) + + encoder_node = ComposableNode( + package='isaac_ros_dnn_encoders', + plugin='isaac_ros::dnn_inference::DnnImageEncoderNode', + namespace=IsaacROSDetectNetPipelineTest.generate_namespace(_TEST_CASE_NAMESPACE), + parameters=[{ + 'network_image_width': 640, + 'network_image_height': 368, + 'network_image_encoding': 'rgb8', + 'network_normalization_type': 'positive_negative', + 'tensor_name': 'input_tensor' + }], + remappings=[('encoded_tensor', 'tensor_pub')] + ) + + triton_node = ComposableNode( + package='isaac_ros_triton', + namespace=IsaacROSDetectNetPipelineTest.generate_namespace(_TEST_CASE_NAMESPACE), + plugin='isaac_ros::dnn_inference::TritonNode', + parameters=[{ + 'model_name': 'detectnet', + 'model_repository_paths': [model_dir_path], + 'input_tensor_names': ['input_tensor'], + 'input_binding_names': ['input_1'], + 'output_tensor_names': ['output_cov', 'output_bbox'], + 'output_binding_names': ['output_cov/Sigmoid', 'output_bbox/BiasAdd'], + 'log_level': 0 + }]) + + detectnet_decoder_node = ComposableNode( + package='isaac_ros_detectnet', + plugin='isaac_ros::detectnet::DetectNetDecoderNode', + namespace=IsaacROSDetectNetPipelineTest.generate_namespace(_TEST_CASE_NAMESPACE), + parameters=[{ + 'frame_id': 'detectnet', + 'label_names': label_list, + 'coverage_threshold': 0.5, + 'bounding_box_scale': 35.0, + 'bounding_box_offset': 0.5, + 'eps': 0.5, + 'min_boxes': 2, + 'verbose': False, + }]) + + container = ComposableNodeContainer( + name='detectnet_container', 
+ namespace='', + package='rclcpp_components', + executable='component_container', + composable_node_descriptions=[encoder_node, triton_node, detectnet_decoder_node], + output='screen' + ) + + return IsaacROSDetectNetPipelineTest.generate_test_description([container]) + + +class IsaacROSDetectNetPipelineTest(IsaacROSBaseTest): + """Validates a DetectNet model with randomized weights with a sample output from Python.""" + + filepath = pathlib.Path(os.path.dirname(__file__)) + MODEL_GENERATION_TIMEOUT_SEC = 5 + INIT_WAIT_SEC = 1 + MODEL_PATH = filepath.joinpath('dummy_model/detectnet.engine') + + @IsaacROSBaseTest.for_each_test_case() + def test_image_detection(self, test_folder): + start_time = time.time() + + while (time.time() - start_time) < self.MODEL_GENERATION_TIMEOUT_SEC: + time.sleep(self.INIT_WAIT_SEC) + + """Expect the node to segment an image.""" + self.generate_namespace_lookup( + ['image', 'detectnet/detections'], _TEST_CASE_NAMESPACE) + image_pub = self.node.create_publisher(Image, self.namespaces['image'], self.DEFAULT_QOS) + received_messages = {} + detectnet_detections = self.create_logging_subscribers( + [('detectnet/detections', Detection2DArray)], + received_messages, accept_multiple_messages=False) + + self.generate_namespace_lookup( + ['image', 'detectnet/detections'], _TEST_CASE_NAMESPACE) + + try: + image = JSONConversion.load_image_from_json(test_folder / 'detections.json') + ground_truth = open(test_folder.joinpath('expected_detections.txt'), 'r') + expected_detections = [] + + for ground_detection in ground_truth.readlines(): + ground_detection_split = ground_detection.split() + gtd = [float(ground_detection_split[4]), float(ground_detection_split[5]), + float(ground_detection_split[6]), float(ground_detection_split[7])] + expected_detections.append( + {'width': int(gtd[2] - gtd[0]), 'height': int(gtd[3] - gtd[1]), + 'center': {'x': int((gtd[2]+gtd[0])/2), 'y': int((gtd[3]+gtd[1])/2)}} + ) + + TIMEOUT = 60 + end_time = time.time() + TIMEOUT + done = False + while time.time() < end_time: + image_pub.publish(image) + rclpy.spin_once(self.node, timeout_sec=0.1) + + if 'detectnet/detections' in received_messages: + pprint(received_messages['detectnet/detections'].detections[0]) + done = True + break + + self.assertTrue( + done, "Didn't receive output on detectnet/detections topic!") + + detection_list = received_messages['detectnet/detections'].detections + + pixel_tolerance = 2.0 + self.assertEqual(pytest.approx(detection_list[0].bbox.size_x, pixel_tolerance), + expected_detections[0]['width'], 'Received incorrect width') + self.assertEqual(pytest.approx(detection_list[0].bbox.size_y, pixel_tolerance), + expected_detections[0]['height'], 'Received incorrect height') + self.assertEqual(pytest.approx(detection_list[0].bbox.center.x, pixel_tolerance), + expected_detections[0]['center']['x'], 'Received incorrect center') + self.assertEqual(pytest.approx(detection_list[0].bbox.center.y, pixel_tolerance), + expected_detections[0]['center']['y'], 'Received incorrect center') + finally: + self.node.destroy_subscription(detectnet_detections) + self.node.destroy_publisher(image_pub) diff --git a/isaac_ros_detectnet/test/test_cases/single_detection/detections.json b/isaac_ros_detectnet/test/test_cases/single_detection/detections.json new file mode 100644 index 0000000..794111f --- /dev/null +++ b/isaac_ros_detectnet/test/test_cases/single_detection/detections.json @@ -0,0 +1,4 @@ +{ + "image": "single_detection.png", + "encoding": "bgr8" +} \ No newline at end of file diff 
--git a/isaac_ros_detectnet/test/test_cases/single_detection/expected_detections.txt b/isaac_ros_detectnet/test/test_cases/single_detection/expected_detections.txt new file mode 100644 index 0000000..63e3f8c --- /dev/null +++ b/isaac_ros_detectnet/test/test_cases/single_detection/expected_detections.txt @@ -0,0 +1 @@ +nvidia_mug 0.00 0 0.00 437.56 13.29 483.04 49.58 0.00 0.00 0.00 0.00 0.00 0.00 0.00 diff --git a/isaac_ros_detectnet/test/test_cases/single_detection/single_detection.png b/isaac_ros_detectnet/test/test_cases/single_detection/single_detection.png new file mode 100644 index 0000000..581fdac --- /dev/null +++ b/isaac_ros_detectnet/test/test_cases/single_detection/single_detection.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:30fc4f687852c45253fa1c8f025e70b81bcbbabb22563f44556476f006e4bd75 +size 405452 diff --git a/resources/header-image.png b/resources/header-image.png new file mode 100644 index 0000000..945e2a1 --- /dev/null +++ b/resources/header-image.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:8db273f7dcd11c39e0bb73c0c7979d3beda9477cfc65397e34f637781b8bb819 +size 376620 diff --git a/resources/ros2_detectnet_node_setup.svg b/resources/ros2_detectnet_node_setup.svg new file mode 100644 index 0000000..488b7f6 --- /dev/null +++ b/resources/ros2_detectnet_node_setup.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/resources/rqt_visualizer.png b/resources/rqt_visualizer.png new file mode 100644 index 0000000..701a2c6 --- /dev/null +++ b/resources/rqt_visualizer.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:fb81c26967ac397e69b5bc92afdfedb921045b80fe4c2b52820912236a4c39cc +size 683481 diff --git a/resources/tlt_matrix2_updated.png b/resources/tlt_matrix2_updated.png new file mode 100644 index 0000000..35e32b5 --- /dev/null +++ b/resources/tlt_matrix2_updated.png @@ -0,0 +1,3 @@ +version https://git-lfs.github.com/spec/v1 +oid sha256:63c3224b163e1a01ec6f3e6f6d2e262c715ecd40b88cdb19c52d32aaec4809bd +size 66217