diff --git a/docs/source/docs/apriltag-pipelines/index.md b/docs/source/docs/apriltag-pipelines/index.md index 1a9044e964..23a59f39a2 100644 --- a/docs/source/docs/apriltag-pipelines/index.md +++ b/docs/source/docs/apriltag-pipelines/index.md @@ -4,6 +4,7 @@ about-apriltags detector-types 2D-tracking-tuning +ml-tag 3D-tracking multitag coordinate-systems diff --git a/docs/source/docs/apriltag-pipelines/ml-tag.md b/docs/source/docs/apriltag-pipelines/ml-tag.md new file mode 100644 index 0000000000..dd6ae2c95f --- /dev/null +++ b/docs/source/docs/apriltag-pipelines/ml-tag.md @@ -0,0 +1,55 @@ +# ML-Tag (ML AprilTag Acceleration) + +## How does it work? + +ML-Tag is an acceleration mode within the AprilTag pipeline. Instead of running the AprilTag detector across the full camera frame, an ML model running on the coprocessor's NPU first proposes bounding-box regions where AprilTags appear to be in the image, and the standard AprilTag decoder then runs only inside those regions. On supported hardware this reduces the work the classical detector has to do, leaving more headroom for higher resolutions, higher framerates, and lower latency. + +The ML model is purely a localizer: it answers "is there an AprilTag here?" and nothing else. It does not read the tag's ID, identify its family, recover its orientation, or estimate its pose — all of that is still done by the same classical detector used in the standard AprilTag pipeline. ML-Tag is therefore transparent to your robot code: detections come out as the same targets, and {ref}`3D Tracking ` and {ref}`MultiTag ` work without any changes. 
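The work-saving idea behind the two-stage pipeline can be sketched in a few lines of plain Java (a conceptual illustration only; `Box` and `roiWorkFraction` are hypothetical names, not PhotonVision classes): the classical decoder's cost scales with the pixels it must examine, and restricting it to the proposed regions means it touches only a small fraction of the frame.

```java
import java.util.List;

// Conceptual sketch only: Box and roiWorkFraction are illustrative names,
// not part of PhotonVision. The point is that the classical decoder's work
// shrinks to the total area of the ML-proposed regions.
public class MlTagSketch {
    record Box(int x, int y, int w, int h) {}

    /** Fraction of full-frame pixels the decoder touches when restricted to ROIs. */
    static double roiWorkFraction(int frameW, int frameH, List<Box> rois) {
        long roiPixels = rois.stream().mapToLong(b -> (long) b.w() * b.h()).sum();
        return (double) roiPixels / ((long) frameW * frameH);
    }

    public static void main(String[] args) {
        // Two proposed tag regions in a 1280x800 frame
        List<Box> rois = List.of(new Box(100, 200, 160, 160), new Box(900, 400, 120, 120));
        System.out.println(roiWorkFraction(1280, 800, rois)); // ~0.039, i.e. ~4% of the frame
    }
}
```

In this illustrative case the decoder examines roughly 4% of the frame's pixels, which is where the headroom for higher resolutions and framerates comes from.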
+ +## Hardware Requirements + +ML-Tag runs on the same NPUs PhotonVision uses for object detection: + +- RK3588 boards — Orange Pi 5 / 5 Plus, Rock 5C, CoolPi 4B (RKNN backend) +- QCS6490 boards — Rubik Pi 3 (TFLite backend) + +For installation, model conversion, and other platform-specific notes, see the coprocessor pages used for object detection: {ref}`Orange Pi 5 ` and {ref}`Rubik Pi 3 `. + +:::{note} +The ML-Tag controls only appear in the AprilTag pipeline tab when a supported NPU backend is detected. On other coprocessors the AprilTag pipeline runs in its standard CPU-only configuration. +::: + +## Enabling ML-Tag + +In the AprilTag pipeline tab, scroll past the standard AprilTag tuning settings to the "ML-Tag (ML AprilTag Acceleration)" section. Toggle **Enable ML-Tag**, and the remaining ML-Tag controls will appear below. + +## Tuning ML-Tag + +The standard AprilTag tuning parameters described in {ref}`2D AprilTag Tuning / Tracking ` still apply — they control how the detector decodes each region the ML model hands it. The settings below only control the ML localizer stage. + +### Model + +Selects which ML model the NPU uses to locate AprilTag regions in the image. PhotonVision ships with a YoloV11 AprilTag model for each supported NPU; any compatible models you have uploaded will also appear here. Only models compatible with the detected NPU backend are listed. + +### Confidence Threshold + +The minimum confidence score (between 0 and 1) the ML model must report for a proposed region before it is passed to the classical decoder. Higher values reject more weak proposals — good for cutting false positives, but at the risk of missing tags the model is less sure about. The default of 0.5 is a reasonable starting point; lower it if tags are being missed and the ROI overlay confirms the model is hesitating on them. + +### NMS Threshold + +The non-maximum suppression overlap cutoff (between 0 and 1) used to merge overlapping proposals of the same tag. 
Higher values allow more overlapping boxes through; lower values are stricter about merging duplicates. The default of 0.45 works well in most cases. + +### ROI Padding (px) + +The number of pixels of padding added around each proposed region before the classical decoder runs on it. Padding is applied in pixels, so it naturally adapts to tag size in the image: small, far-away tags receive proportionally more expansion (helping the decoder see their borders), while large, nearby tags receive less. Raise this if tag corners are being clipped at the edges of the ROIs; lower it if neighboring tags are being merged into a single region. + +### Show ROI Boxes + +When enabled, draws the ML model's proposed bounding boxes onto the processed stream. This is valuable when tuning Confidence Threshold and ROI Padding: you can see exactly which regions the model is finding and how tightly they fit each tag. Turn this off in competition for a cleaner stream. + +## Limitations + +- ML-Tag is only available on the supported NPU coprocessors listed above. On any other hardware the AprilTag pipeline behaves in the standard manner. +- When several overlapping ROIs decode the same tag, only the detection with the highest decision margin is kept. In practice this is the desired behavior, but it does mean alternate solutions from the same tag are not surfaced. +- The ML model is purely a localizer. If a tag is in the frame but the model fails to propose a region around it, the classical decoder will never see it and the tag will not be detected. When tuning, use **Show ROI Boxes** to confirm the model is finding the tags you care about, and lower **Confidence Threshold** if it is not. +- In testing, the ML model has been able to locate AprilTags at distances where the region of the image lacks the pixel density needed for the classical detector to decode which specific tag it has found.
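The adaptive effect of fixed-pixel padding is easy to see with some quick arithmetic (a standalone sketch using the default 40 px padding; `relativeExpansion` is an illustrative helper, not PhotonVision API):

```java
// Standalone sketch: fixed-pixel padding grows small ROIs proportionally more
// than large ones. relativeExpansion is an illustrative helper, not PhotonVision API.
public class RoiPaddingSketch {
    /** Ratio of the padded ROI width to the original width. */
    static double relativeExpansion(double roiWidth, double paddingPx) {
        // Padding is added on each side, so the width grows by 2 * paddingPx.
        return (roiWidth + 2 * paddingPx) / roiWidth;
    }

    public static void main(String[] args) {
        double pad = 40; // the default ROI Padding value
        System.out.println(relativeExpansion(40, pad));  // far/small tag: 3.0 (ROI triples in width)
        System.out.println(relativeExpansion(400, pad)); // near/large tag: 1.2 (only 20% wider)
    }
}
```

A 40 px-wide ROI around a distant tag triples in width, giving the decoder plenty of margin around the tag border, while a 400 px ROI around a close tag grows only 20%, keeping neighboring tags out of the region.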
diff --git a/photon-client/src/components/dashboard/tabs/AprilTagTab.vue b/photon-client/src/components/dashboard/tabs/AprilTagTab.vue index f653bc6674..2b6a366ff5 100644 --- a/photon-client/src/components/dashboard/tabs/AprilTagTab.vue +++ b/photon-client/src/components/dashboard/tabs/AprilTagTab.vue @@ -1,11 +1,13 @@ diff --git a/photon-client/src/types/PipelineTypes.ts b/photon-client/src/types/PipelineTypes.ts index 64abdc80fa..f0662c0119 100644 --- a/photon-client/src/types/PipelineTypes.ts +++ b/photon-client/src/types/PipelineTypes.ts @@ -276,6 +276,13 @@ export interface AprilTagPipelineSettings extends PipelineSettings { tagFamily: AprilTagFamily; doMultiTarget: boolean; doSingleTargetAlways: boolean; + // ML-assisted detection settings + useMLDetection: boolean; + mlConfidenceThreshold: number; + mlNmsThreshold: number; + mlRoiPaddingPixels: number; + model: ObjectDetectionModelProperties; + showDetectionBoxes: boolean; } export type ConfigurableAprilTagPipelineSettings = Partial< Omit @@ -299,7 +306,14 @@ export const DefaultAprilTagPipelineSettings: AprilTagPipelineSettings = { threads: 4, tagFamily: AprilTagFamily.Family36h11, doMultiTarget: false, - doSingleTargetAlways: false + doSingleTargetAlways: false, + // ML-assisted detection defaults + useMLDetection: false, + mlConfidenceThreshold: 0.5, + mlNmsThreshold: 0.45, + mlRoiPaddingPixels: 40, + model: {} as ObjectDetectionModelProperties, + showDetectionBoxes: true }; export interface ArucoPipelineSettings extends PipelineSettings { diff --git a/photon-core/src/main/java/org/photonvision/common/configuration/NeuralNetworkModelManager.java b/photon-core/src/main/java/org/photonvision/common/configuration/NeuralNetworkModelManager.java index b1df50fa44..eeda8df560 100644 --- a/photon-core/src/main/java/org/photonvision/common/configuration/NeuralNetworkModelManager.java +++ b/photon-core/src/main/java/org/photonvision/common/configuration/NeuralNetworkModelManager.java @@ -169,6 +169,16 @@ private 
NeuralNetworkModelsSettings getShippedProperties(File modelsDirectory) { Family.RKNN, Version.YOLOV11)); + nnProps.addModelProperties( + new ModelProperties( + Path.of(modelsDirectory.getAbsolutePath(), "apriltagV4-yolo11.rknn"), + "AprilTag V4", + new LinkedList(List.of("AprilTag")), + 640, + 640, + Family.RKNN, + Version.YOLOV11)); + nnProps.addModelProperties( new ModelProperties( Path.of(modelsDirectory.getAbsolutePath(), "yolov8nCOCO.tflite"), @@ -189,6 +199,16 @@ private NeuralNetworkModelsSettings getShippedProperties(File modelsDirectory) { Family.RUBIK, Version.YOLOV11)); + nnProps.addModelProperties( + new ModelProperties( + Path.of(modelsDirectory.getAbsolutePath(), "apriltagV4-yolo11.tflite"), + "AprilTag V4", + new LinkedList(List.of("AprilTag")), + 640, + 640, + Family.RUBIK, + Version.YOLOV11)); + return nnProps; } @@ -311,6 +331,54 @@ public Optional getDefaultModel() { return models.get(supportedBackends.get(0)).stream().findFirst(); } + /** + * Gets the default AprilTag detection model. Searches for a model with "apriltag" in its nickname + * (case-insensitive). + * + * @return Optional containing the AprilTag model if found + */ + public Optional getDefaultAprilTagModel() { + if (models == null || supportedBackends.isEmpty()) { + return Optional.empty(); + } + + for (Family backend : supportedBackends) { + if (models.containsKey(backend)) { + Optional model = + models.get(backend).stream() + .filter(m -> m.getNickname().toLowerCase().contains("apriltag")) + .findFirst(); + if (model.isPresent()) { + return model; + } + } + } + return Optional.empty(); + } + + /** + * Gets a model by its exact nickname. 
+ * + * @param name The nickname of the model to retrieve + * @return Optional containing the model if found + */ + public Optional getModelByName(String name) { + if (models == null || supportedBackends.isEmpty() || name == null) { + return Optional.empty(); + } + + for (Family backend : supportedBackends) { + if (models.containsKey(backend)) { + Optional model = + models.get(backend).stream().filter(m -> m.getNickname().equals(name)).findFirst(); + if (model.isPresent()) { + return model; + } + } + } + return Optional.empty(); + } + // Do checking later on, when we create the model object private void loadModel(Path path) { if (models == null) { diff --git a/photon-core/src/main/java/org/photonvision/vision/frame/Frame.java b/photon-core/src/main/java/org/photonvision/vision/frame/Frame.java index b37c03a8a2..6b0a5f2e2c 100644 --- a/photon-core/src/main/java/org/photonvision/vision/frame/Frame.java +++ b/photon-core/src/main/java/org/photonvision/vision/frame/Frame.java @@ -17,6 +17,8 @@ package org.photonvision.vision.frame; +import java.util.List; +import org.opencv.core.RotatedRect; import org.photonvision.common.logging.LogGroup; import org.photonvision.common.logging.Logger; import org.photonvision.common.util.math.MathUtils; @@ -36,6 +38,9 @@ public class Frame implements Releasable { public final FrameStaticProperties frameStaticProperties; + /** Optional ML detection ROI bounding boxes for visualization. Set by ML-assisted pipelines. */ + public List mlDetectionRois = List.of(); + public Frame( long sequenceID, CVMat color, diff --git a/photon-core/src/main/java/org/photonvision/vision/pipe/impl/AprilTagROIDecodePipe.java b/photon-core/src/main/java/org/photonvision/vision/pipe/impl/AprilTagROIDecodePipe.java new file mode 100644 index 0000000000..5468e403dc --- /dev/null +++ b/photon-core/src/main/java/org/photonvision/vision/pipe/impl/AprilTagROIDecodePipe.java @@ -0,0 +1,503 @@ +/* + * Copyright (C) Photon Vision. 
+ * + * This program is free software: you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation, either version 3 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see . + */ + +package org.photonvision.vision.pipe.impl; + +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import org.opencv.core.Mat; +import org.opencv.core.Point; +import org.opencv.core.Rect; +import org.opencv.core.RotatedRect; +import org.opencv.core.Size; +import org.opencv.imgproc.Imgproc; +import org.photonvision.common.logging.LogGroup; +import org.photonvision.common.logging.Logger; +import org.photonvision.vision.apriltag.AprilTagFamily; +import org.photonvision.vision.opencv.CVMat; +import org.photonvision.vision.opencv.Releasable; +import org.photonvision.vision.pipe.CVPipe; +import org.wpilib.vision.apriltag.AprilTagDetection; +import org.wpilib.vision.apriltag.AprilTagDetector; + +/** + * Pipe that decodes AprilTags within ROIs detected by ML. For each ROI from ML detection, it + * extracts a sub-image, runs the traditional WPILib AprilTag detector, and maps the detected corner + * coordinates and homography back to full-frame coordinates. + * + *

Both corner coordinates AND the homography matrix must be transformed from ROI coordinates to + * full-frame coordinates. + */ +public class AprilTagROIDecodePipe + extends CVPipe< + AprilTagROIDecodePipe.ROIDecodeInput, + List, + AprilTagROIDecodePipe.ROIDecodeParams> + implements Releasable { + private static final Logger logger = + new Logger(AprilTagROIDecodePipe.class, LogGroup.VisionModule); + private static final boolean DEBUG_COORDINATE_MAPPING = false; + + /** Context object containing ATR scaling information for a single ROI. */ + private static class ATRContext { + final double scaleFactor; + final boolean wasScaled; + + ATRContext(double scaleFactor) { + this.scaleFactor = scaleFactor; + this.wasScaled = scaleFactor < 1.0; + } + + static ATRContext noScaling() { + return new ATRContext(1.0); + } + } + + /** Input container for ROI decode pipe */ + public static class ROIDecodeInput { + public final CVMat grayFrame; + public final List rois; + + public ROIDecodeInput(CVMat grayFrame, List rois) { + this.grayFrame = grayFrame; + this.rois = rois; + } + } + + /** Parameters for ROI decode pipe */ + public static class ROIDecodeParams { + public AprilTagFamily tagFamily = AprilTagFamily.kTag36h11; + public AprilTagDetector.Config detectorConfig; + public AprilTagDetector.QuadThresholdParameters quadParams; + public int maxHammingDistance = 0; + public double minDecisionMargin = 35; + + // Adaptive Tag Resizing (ATR) parameters + /** Enable ATR (adaptive tag resizing) */ + public boolean atrEnabled = true; + + /** Target dimension for ATR - tags wider than this will be downscaled */ + public int atrTargetDimension = 160; + + /** Minimum scale factor for ATR (prevents extreme downscaling) */ + public double atrMinScaleFactor = 0.25; + + public ROIDecodeParams() { + detectorConfig = new AprilTagDetector.Config(); + detectorConfig.numThreads = 1; + detectorConfig.quadDecimate = 1; + quadParams = new AprilTagDetector.QuadThresholdParameters(); + // Match the 
defaults from AprilTagPipeline + quadParams.minClusterPixels = 5; + quadParams.maxNumMaxima = 10; + quadParams.criticalAngle = 45 * Math.PI / 180.0; + quadParams.maxLineFitMSE = 10.0f; + quadParams.minWhiteBlackDiff = 5; + quadParams.deglitch = false; + } + } + + private AprilTagDetector detector; + private AprilTagFamily currentFamily; + + public AprilTagROIDecodePipe() { + detector = new AprilTagDetector(); + } + + @Override + protected List process(ROIDecodeInput input) { + List allDetections = new ArrayList<>(); + + if (input.grayFrame == null || input.grayFrame.getMat().empty()) { + return allDetections; + } + + Mat fullFrame = input.grayFrame.getMat(); + int frameWidth = fullFrame.cols(); + int frameHeight = fullFrame.rows(); + + for (RotatedRect roi : input.rois) { + // ROIs are already expanded by the caller (AprilTagPipeline.processMLHybrid) + Rect roiRect = toIntRect(roi); + + if (roiRect.width <= 0 || roiRect.height <= 0) { + continue; + } + + // Clamp to frame bounds (defensive) + roiRect = clampToFrame(roiRect, frameWidth, frameHeight); + if (roiRect.width <= 0 || roiRect.height <= 0) { + continue; + } + + if (DEBUG_COORDINATE_MAPPING) { + logger.debug( + "Original ROI: center=(" + + roi.center.x + + ", " + + roi.center.y + + "), size=" + + roi.size.width + + "x" + + roi.size.height + + ", angle=" + + roi.angle); + logger.debug( + "Expanded ROI (used for mapping): x=" + + roiRect.x + + ", y=" + + roiRect.y + + ", w=" + + roiRect.width + + ", h=" + + roiRect.height); + } + + // Extract full-resolution submat + Mat roiMat = fullFrame.submat(roiRect); + + // === STAGE 1: COARSE - Calculate scaling and resize === + ATRContext atrContext; + Mat processingMat; + + if (params.atrEnabled && roiRect.width > params.atrTargetDimension) { + // Calculate scale factor: S = min(1.0, T_dim / w) + double S = (double) params.atrTargetDimension / roiRect.width; + S = Math.max(S, params.atrMinScaleFactor); // Clamp to prevent extreme scaling + + int scaledWidth = (int) 
Math.round(roiRect.width * S); + int scaledHeight = (int) Math.round(roiRect.height * S); + + atrContext = new ATRContext(S); + + // Downscale ROI to target dimension + processingMat = new Mat(); + Imgproc.resize( + roiMat, + processingMat, + new Size(scaledWidth, scaledHeight), + 0, + 0, + Imgproc.INTER_AREA); // INTER_AREA best for downscaling + + if (DEBUG_COORDINATE_MAPPING) { + logger.debug( + "ATR: ROI " + + roiRect.width + + "x" + + roiRect.height + + " -> " + + scaledWidth + + "x" + + scaledHeight + + " (S=" + + S + + ")"); + } + } else { + // No scaling needed (tag already small or ATR disabled) + atrContext = ATRContext.noScaling(); + processingMat = roiMat; + } + + // Run AprilTag detection on (possibly scaled) image + AprilTagDetection[] roiDetections = detector.detect(processingMat); + + for (AprilTagDetection det : roiDetections) { + if (det.getHamming() > params.maxHammingDistance) continue; + if (det.getDecisionMargin() < params.minDecisionMargin) continue; + + // === STAGE 2: PROJECTION - Map coordinates back to full-frame === + AprilTagDetection mappedDetection; + if (atrContext.wasScaled) { + mappedDetection = mapToFullFrameWithATR(det, roiRect, atrContext); + } else { + mappedDetection = mapToFullFrame(det, roiRect); + } + + if (DEBUG_COORDINATE_MAPPING) { + logger.debug( + "Tag " + + det.getId() + + " corner 0: ROI=(" + + det.getCornerX(0) + + ", " + + det.getCornerY(0) + + "), Full=(" + + mappedDetection.getCornerX(0) + + ", " + + mappedDetection.getCornerY(0) + + ")"); + } + + allDetections.add(mappedDetection); + } + + // Cleanup + roiMat.release(); + if (atrContext.wasScaled) { + processingMat.release(); + } + } + + // === STAGE 4: POSE ESTIMATION happens in AprilTagPoseEstimatorPipe === + return deduplicateByTagId(allDetections); + } + + /** + * Maps detection coordinates from ROI space to full-frame space. + * + *

Both corners AND homography must be transformed. The pose estimator uses the homography + * matrix internally for pose estimation, not just the corners. + * + * @param det The detection in ROI coordinates + * @param roiOffset The ROI rectangle defining the offset from full frame origin + * @return A new AprilTagDetection with coordinates mapped to full frame + */ + private AprilTagDetection mapToFullFrame(AprilTagDetection det, Rect roiOffset) { + // Map all 4 corners from ROI coordinates to full-frame coordinates + double[] mappedCorners = new double[8]; + for (int i = 0; i < 4; i++) { + mappedCorners[i * 2] = det.getCornerX(i) + roiOffset.x; + mappedCorners[i * 2 + 1] = det.getCornerY(i) + roiOffset.y; + } + + // Map center coordinate + double centerX = det.getCenterX() + roiOffset.x; + double centerY = det.getCenterY() + roiOffset.y; + + // Transform homography from ROI coordinates to full-frame coordinates + double[] transformedHomography = + transformHomography(det.getHomography(), roiOffset.x, roiOffset.y); + + return new AprilTagDetection( + det.getFamily(), + det.getId(), + det.getHamming(), + det.getDecisionMargin(), + transformedHomography, + centerX, + centerY, + mappedCorners); + } + + /** + * Maps detection coordinates from scaled ROI space to full-frame space. Handles both translation + * AND scaling transformations. 
+ * + * @param det The detection in scaled ROI coordinates + * @param roiOffset The ROI rectangle defining the offset from full frame origin + * @param atrContext The ATR scaling context + * @return A new AprilTagDetection with coordinates mapped to full frame + */ + private AprilTagDetection mapToFullFrameWithATR( + AprilTagDetection det, Rect roiOffset, ATRContext atrContext) { + double invS = 1.0 / atrContext.scaleFactor; + + // Map all 4 corners from scaled ROI coordinates to full-frame coordinates + // x_full = (x_scaled / S) + roi_x + double[] mappedCorners = new double[8]; + for (int i = 0; i < 4; i++) { + mappedCorners[i * 2] = det.getCornerX(i) * invS + roiOffset.x; + mappedCorners[i * 2 + 1] = det.getCornerY(i) * invS + roiOffset.y; + } + + // Map center coordinate + double centerX = det.getCenterX() * invS + roiOffset.x; + double centerY = det.getCenterY() * invS + roiOffset.y; + + // Transform homography with scaling + double[] transformedHomography = + transformHomographyWithScale( + det.getHomography(), roiOffset.x, roiOffset.y, atrContext.scaleFactor); + + return new AprilTagDetection( + det.getFamily(), + det.getId(), + det.getHamming(), + det.getDecisionMargin(), + transformedHomography, + centerX, + centerY, + mappedCorners); + } + + /** + * Transforms a homography matrix from ROI coordinates to full-frame coordinates. + * + *

The homography H satisfies: [x_roi, y_roi, 1]^T ~ H * [X_tag, Y_tag, 1]^T + * + *

To convert to full-frame coordinates where x_full = x_roi + offsetX and y_full = y_roi + + * offsetY, we compute H_full = T * H_roi where T is the translation matrix: + * + *

+     * | 1  0  offsetX |
+     * | 0  1  offsetY |
+     * | 0  0  1       |
+     * 
+ * + *

The UMich AprilTag library stores homography as row-major 3x3: [h00, h01, h02, h10, h11, + * h12, h20, h21, h22] source: https://april.eecs.umich.edu/media/pdfs/olson2011tags.pdf + * + * @param h The original homography (9 elements, row-major 3x3) + * @param offsetX The x offset from ROI origin to full-frame origin + * @param offsetY The y offset from ROI origin to full-frame origin + * @return The transformed homography in full-frame coordinates + */ + private double[] transformHomography(double[] h, int offsetX, int offsetY) { + // T * H where T = [[1,0,tx],[0,1,ty],[0,0,1]] + // (T*H)[0][j] = 1*H[0][j] + 0*H[1][j] + tx*H[2][j] = H[0][j] + tx*H[2][j] + // (T*H)[1][j] = 0*H[0][j] + 1*H[1][j] + ty*H[2][j] = H[1][j] + ty*H[2][j] + // (T*H)[2][j] = 0*H[0][j] + 0*H[1][j] + 1*H[2][j] = H[2][j] + + double[] result = new double[9]; + // Row 0: H[0][j] + offsetX * H[2][j] + result[0] = h[0] + offsetX * h[6]; + result[1] = h[1] + offsetX * h[7]; + result[2] = h[2] + offsetX * h[8]; + // Row 1: H[1][j] + offsetY * H[2][j] + result[3] = h[3] + offsetY * h[6]; + result[4] = h[4] + offsetY * h[7]; + result[5] = h[5] + offsetY * h[8]; + // Row 2: unchanged + result[6] = h[6]; + result[7] = h[7]; + result[8] = h[8]; + return result; + } + + /** + * Transforms a homography matrix from scaled ROI coordinates to full-frame coordinates. + * + *

The transformation combines inverse scaling and translation: H_full = T * S_inv * H_scaled + * + *

Where: - S_inv = [[1/S, 0, 0], [0, 1/S, 0], [0, 0, 1]] (inverse scale) - T = [[1, 0, + * offsetX], [0, 1, offsetY], [0, 0, 1]] (translation) + * + *

Combined formula: Row 0: H[0..2] / S + offsetX * H[6..8] Row 1: H[3..5] / S + offsetY * + * H[6..8] Row 2: unchanged (H[6..8]) + * + * @param h The original homography (9 elements, row-major 3x3) + * @param offsetX The x offset from ROI origin to full-frame origin + * @param offsetY The y offset from ROI origin to full-frame origin + * @param S The scale factor used for ROI resizing (< 1 means downscaled) + * @return The transformed homography in full-frame coordinates + */ + private double[] transformHomographyWithScale(double[] h, int offsetX, int offsetY, double S) { + double invS = 1.0 / S; + double[] result = new double[9]; + + // Row 0: H[i]/S + offsetX * H[6+j] + result[0] = h[0] * invS + offsetX * h[6]; + result[1] = h[1] * invS + offsetX * h[7]; + result[2] = h[2] * invS + offsetX * h[8]; + + // Row 1: H[i]/S + offsetY * H[6+j] + result[3] = h[3] * invS + offsetY * h[6]; + result[4] = h[4] * invS + offsetY * h[7]; + result[5] = h[5] * invS + offsetY * h[8]; + + // Row 2: unchanged + result[6] = h[6]; + result[7] = h[7]; + result[8] = h[8]; + + return result; + } + + /** + * Expands a bounding box by adding fixed pixel padding on each side, then clamping to image + * bounds. Additive padding naturally provides more relative expansion for small/far tags and less + * for large/close tags. 
+ * + * @param bbox Original bounding box + * @param paddingPixels Number of pixels to add on each side of the bounding box + * @param imageWidth Image width for clamping + * @param imageHeight Image height for clamping + * @return Expanded and clamped bounding box + */ + public static RotatedRect expandBbox( + RotatedRect bbox, int paddingPixels, int imageWidth, int imageHeight) { + double newWidth = bbox.size.width + 2.0 * paddingPixels; + double newHeight = bbox.size.height + 2.0 * paddingPixels; + + // Clamp center so the (axis-aligned) half-extents stay within frame + double halfW = newWidth / 2.0; + double halfH = newHeight / 2.0; + double cx = Math.max(halfW, Math.min(bbox.center.x, imageWidth - halfW)); + double cy = Math.max(halfH, Math.min(bbox.center.y, imageHeight - halfH)); + + return new RotatedRect(new Point(cx, cy), new Size(newWidth, newHeight), bbox.angle); + } + + /** Converts a RotatedRect to integer Rect (axis-aligned bounding rectangle). */ + private Rect toIntRect(RotatedRect r) { + return r.boundingRect(); + } + + /** Clamps a rectangle to frame bounds. */ + private Rect clampToFrame(Rect r, int frameWidth, int frameHeight) { + int x = Math.max(0, r.x); + int y = Math.max(0, r.y); + int w = Math.min(r.width, frameWidth - x); + int h = Math.min(r.height, frameHeight - y); + return new Rect(x, y, w, h); + } + + /** + * Deduplicates detections by tag ID, keeping the one with highest decision margin. This handles + * cases where overlapping ROIs detect the same tag. 
+ */ + private List deduplicateByTagId(List detections) { + Map bestByTagId = new HashMap<>(); + + for (AprilTagDetection det : detections) { + int tagId = det.getId(); + AprilTagDetection existing = bestByTagId.get(tagId); + + if (existing == null || det.getDecisionMargin() > existing.getDecisionMargin()) { + bestByTagId.put(tagId, det); + } + } + + return new ArrayList<>(bestByTagId.values()); + } + + @Override + public void setParams(ROIDecodeParams newParams) { + if (newParams.tagFamily != currentFamily) { + detector.clearFamilies(); + detector.addFamily(newParams.tagFamily.getNativeName()); + currentFamily = newParams.tagFamily; + } + + detector.setConfig(newParams.detectorConfig); + detector.setQuadThresholdParameters(newParams.quadParams); + + super.setParams(newParams); + } + + @Override + public void release() { + if (detector != null) { + detector.close(); + detector = null; + } + } +} diff --git a/photon-core/src/main/java/org/photonvision/vision/pipe/impl/AprilTagROIDetectionPipe.java b/photon-core/src/main/java/org/photonvision/vision/pipe/impl/AprilTagROIDetectionPipe.java new file mode 100644 index 0000000000..cc1d74eab0 --- /dev/null +++ b/photon-core/src/main/java/org/photonvision/vision/pipe/impl/AprilTagROIDetectionPipe.java @@ -0,0 +1,106 @@ +/* + * Copyright (C) Photon Vision. + * + * This program is free software: you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation, either version 3 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see . 
+ */ + +package org.photonvision.vision.pipe.impl; + +import java.util.ArrayList; +import java.util.List; +import org.opencv.core.RotatedRect; +import org.photonvision.vision.objects.Model; +import org.photonvision.vision.objects.NullModel; +import org.photonvision.vision.objects.ObjectDetector; +import org.photonvision.vision.opencv.CVMat; +import org.photonvision.vision.opencv.Releasable; +import org.photonvision.vision.pipe.CVPipe; + +/** + * Pipe that runs YOLO model on NPU to detect AprilTag bounding boxes in the full frame. This is + * Stage 1 of the ML-assisted AprilTag detection pipeline. + */ +public class AprilTagROIDetectionPipe + extends CVPipe, AprilTagROIDetectionPipe.AprilTagROIDetectionParams> + implements Releasable { + private ObjectDetector detector; + private Model currentModel; + + public static class AprilTagROIDetectionParams { + public final Model model; + public final double confidenceThreshold; + public final double nmsThreshold; + + public AprilTagROIDetectionParams(Model model, double confidence, double nms) { + this.model = model; + this.confidenceThreshold = confidence; + this.nmsThreshold = nms; + } + } + + public AprilTagROIDetectionPipe() { + detector = NullModel.getInstance(); + } + + @Override + protected List process(CVMat in) { + List rois = new ArrayList<>(); + + if (detector == null || detector instanceof NullModel) { + return rois; + } + + if (in.getMat().empty()) { + return rois; + } + + List detections = + detector.detect(in.getMat(), params.nmsThreshold, params.confidenceThreshold); + + for (NeuralNetworkPipeResult detection : detections) { + rois.add(detection.bbox()); + } + + return rois; + } + + @Override + public void setParams(AprilTagROIDetectionParams newParams) { + if (newParams.model != null && newParams.model != currentModel) { + if (detector != null && !(detector instanceof NullModel)) { + detector.release(); + } + detector = newParams.model.load(); + currentModel = newParams.model; + } + 
super.setParams(newParams); + } + + /** + * Returns whether ML detection is available (a valid model is loaded). + * + * @return true if ML detection is available + */ + public boolean isAvailable() { + return detector != null && !(detector instanceof NullModel); + } + + @Override + public void release() { + if (detector != null && !(detector instanceof NullModel)) { + detector.release(); + detector = null; + } + } +} diff --git a/photon-core/src/main/java/org/photonvision/vision/pipe/impl/DrawMLROIPipe.java b/photon-core/src/main/java/org/photonvision/vision/pipe/impl/DrawMLROIPipe.java new file mode 100644 index 0000000000..d62d10a19c --- /dev/null +++ b/photon-core/src/main/java/org/photonvision/vision/pipe/impl/DrawMLROIPipe.java @@ -0,0 +1,73 @@ +/* + * Copyright (C) Photon Vision. + * + * This program is free software: you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation, either version 3 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see . + */ + +package org.photonvision.vision.pipe.impl; + +import java.util.List; +import org.opencv.core.*; +import org.opencv.imgproc.Imgproc; +import org.photonvision.vision.frame.FrameDivisor; +import org.photonvision.vision.pipe.MutatingPipe; +import org.wpilib.math.util.Pair; + +/** + * Draws ML detection ROI bounding boxes on the output image. Used to visualize where the ML model + * detected potential targets before traditional decoding. 
+ */ +public class DrawMLROIPipe + extends MutatingPipe>, DrawMLROIPipe.DrawMLROIParams> { + private static final Scalar ROI_COLOR = new Scalar(255, 255, 0); // Cyan in BGR + + @Override + protected Void process(Pair> in) { + if (!params.shouldDraw || !params.showDetectionBoxes) return null; + + var mat = in.getFirst(); + var rois = in.getSecond(); + if (rois == null || rois.isEmpty()) return null; + + var imageSize = Math.sqrt((double) mat.rows() * mat.cols()); + var thickness = (int) Math.ceil(imageSize * 0.007); + + Point[] corners = new Point[4]; + for (RotatedRect roi : rois) { + roi.points(corners); + double d = params.divisor.value; + MatOfPoint scaled = + new MatOfPoint( + new Point(corners[0].x / d, corners[0].y / d), + new Point(corners[1].x / d, corners[1].y / d), + new Point(corners[2].x / d, corners[2].y / d), + new Point(corners[3].x / d, corners[3].y / d)); + Imgproc.polylines(mat, List.of(scaled), true, ROI_COLOR, thickness); + } + + return null; + } + + public static class DrawMLROIParams { + public final boolean shouldDraw; + public final boolean showDetectionBoxes; + public final FrameDivisor divisor; + + public DrawMLROIParams(boolean shouldDraw, boolean showDetectionBoxes, FrameDivisor divisor) { + this.shouldDraw = shouldDraw; + this.showDetectionBoxes = showDetectionBoxes; + this.divisor = divisor; + } + } +} diff --git a/photon-core/src/main/java/org/photonvision/vision/pipe/impl/MLDetectionResult.java b/photon-core/src/main/java/org/photonvision/vision/pipe/impl/MLDetectionResult.java new file mode 100644 index 0000000000..f886d23025 --- /dev/null +++ b/photon-core/src/main/java/org/photonvision/vision/pipe/impl/MLDetectionResult.java @@ -0,0 +1,26 @@ +/* + * Copyright (C) Photon Vision. 
+ * + * This program is free software: you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation, either version 3 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see <https://www.gnu.org/licenses/>. + */ + +package org.photonvision.vision.pipe.impl; + +import java.util.List; +import org.opencv.core.RotatedRect; +import org.wpilib.vision.apriltag.AprilTagDetection; + +/** Result container for ML hybrid detection */ +public record MLDetectionResult( + List<AprilTagDetection> detections, List<RotatedRect> rois, long nanosElapsed) {} diff --git a/photon-core/src/main/java/org/photonvision/vision/pipeline/AprilTagPipeline.java b/photon-core/src/main/java/org/photonvision/vision/pipeline/AprilTagPipeline.java index 5ae3539aaf..3d154d80b4 100644 --- a/photon-core/src/main/java/org/photonvision/vision/pipeline/AprilTagPipeline.java +++ b/photon-core/src/main/java/org/photonvision/vision/pipeline/AprilTagPipeline.java @@ -20,8 +20,11 @@ import java.util.ArrayList; import java.util.List; import java.util.Optional; +import org.opencv.core.RotatedRect; import org.photonvision.common.configuration.ConfigManager; +import org.photonvision.common.configuration.NeuralNetworkModelManager; import org.photonvision.common.dataflow.structures.Packet; +import org.photonvision.common.hardware.Platform; import org.photonvision.common.logging.LogGroup; import org.photonvision.common.logging.Logger; import org.photonvision.common.util.math.MathUtils; @@ -30,12 +33,16 @@ import org.photonvision.vision.apriltag.AprilTagFamily; import org.photonvision.vision.frame.Frame; import
org.photonvision.vision.frame.FrameThresholdType; +import org.photonvision.vision.objects.Model; import org.photonvision.vision.pipe.CVPipe.CVPipeResult; import org.photonvision.vision.pipe.impl.AprilTagDetectionPipe; import org.photonvision.vision.pipe.impl.AprilTagDetectionPipe.AprilTagDetectionPipeParams; import org.photonvision.vision.pipe.impl.AprilTagPoseEstimatorPipe; import org.photonvision.vision.pipe.impl.AprilTagPoseEstimatorPipe.AprilTagPoseEstimatorPipeParams; +import org.photonvision.vision.pipe.impl.AprilTagROIDecodePipe; +import org.photonvision.vision.pipe.impl.AprilTagROIDetectionPipe; import org.photonvision.vision.pipe.impl.CalculateFPSPipe; +import org.photonvision.vision.pipe.impl.MLDetectionResult; import org.photonvision.vision.pipe.impl.MultiTargetPNPPipe; import org.photonvision.vision.pipe.impl.MultiTargetPNPPipe.MultiTargetPNPPipeParams; import org.photonvision.vision.pipeline.result.CVPipelineResult; @@ -60,6 +67,12 @@ public class AprilTagPipeline extends CVPipeline<CVPipelineResult, AprilTagPipelineSettings> - CVPipeResult<List<AprilTagDetection>> tagDetectionPipeResult = - aprilTagDetectionPipe.run(frame.processedImage); - sumPipeNanosElapsed += tagDetectionPipeResult.nanosElapsed; + // Perform AprilTag detection (traditional or ML-assisted) + List<AprilTagDetection> detections; + long detectionNanos; + + if (settings.useMLDetection && mlAvailable) { + // Use ML-assisted hybrid detection + var mlDetectionResult = processMLHybrid(frame); + detections = mlDetectionResult.detections(); + detectionNanos = mlDetectionResult.nanosElapsed(); + + // Preserve ROIs for visualization in the output stream + frame.mlDetectionRois = mlDetectionResult.rois(); + } else { + // Use traditional detection + CVPipeResult<List<AprilTagDetection>> tagDetectionPipeResult = + aprilTagDetectionPipe.run(frame.processedImage); + detections = tagDetectionPipeResult.output; + detectionNanos = tagDetectionPipeResult.nanosElapsed; + } + sumPipeNanosElapsed += detectionNanos; - List<AprilTagDetection> detections = tagDetectionPipeResult.output; List<AprilTagDetection> usedDetections = new ArrayList<>(); List<TrackedTarget> targetList = new
ArrayList<>(); @@ -168,8 +248,8 @@ protected CVPipelineResult process(Frame frame, AprilTagPipelineSettings setting targetList.add(target); } - // Do multi-tag pose estimation Optional<MultiTargetPNPResult> multiTagResult = Optional.empty(); + if (settings.solvePNPEnabled && settings.doMultiTarget) { var multiTagOutput = multiTagPNPPipe.run(targetList); sumPipeNanosElapsed += multiTagOutput.nanosElapsed; @@ -194,7 +274,7 @@ protected CVPipelineResult process(Frame frame, AprilTagPipelineSettings setting tagPoseEstimate = poseResult.output; } - // If single-tag estimation was not done, this is a multi-target tag from the layout + // If single-tag estimation was not done, this tag was used in multi-tag estimation if (tagPoseEstimate == null && multiTagResult.isPresent()) { // compute this tag's camera-to-tag transform using the multitag result var tagPose = atfl.getTagPose(detection.getId()); @@ -250,10 +330,73 @@ protected CVPipelineResult process(Frame frame, AprilTagPipelineSettings setting frame.sequenceID, sumPipeNanosElapsed, fps, targetList, multiTagResult, frame); } + /** + * Performs ML-assisted hybrid AprilTag detection.
Stage 1: ML model detects ROIs. Stage 2: + Traditional detector decodes tags within ROIs. + */ + private MLDetectionResult processMLHybrid(Frame frame) { + long totalNanos = 0; + + // Stage 1: ML detection to find ROIs + CVPipeResult<List<RotatedRect>> mlResult = mlDetectionPipe.run(frame.colorImage); + totalNanos += mlResult.nanosElapsed; + List<RotatedRect> rawRois = mlResult.output; + + if (rawRois.isEmpty()) { + return new MLDetectionResult(new ArrayList<>(), List.of(), totalNanos); + } + + // Expand ROIs before passing to decode pipe and visualization + int frameWidth = frame.colorImage.getMat().cols(); + int frameHeight = frame.colorImage.getMat().rows(); + List<RotatedRect> expandedRois = new ArrayList<>(rawRois.size()); + for (RotatedRect roi : rawRois) { + expandedRois.add( + AprilTagROIDecodePipe.expandBbox( + roi, settings.mlRoiPaddingPixels, frameWidth, frameHeight)); + } + + // Stage 2: Decode tags within expanded ROIs using traditional detector + AprilTagROIDecodePipe.ROIDecodeInput decodeInput = + new AprilTagROIDecodePipe.ROIDecodeInput(frame.processedImage, expandedRois); + + CVPipeResult<List<AprilTagDetection>> decodeResult = mlDecodePipe.run(decodeInput); + totalNanos += decodeResult.nanosElapsed; + + return new MLDetectionResult(decodeResult.output, expandedRois, totalNanos); + } + + /** + * Checks if ML detection is available on the current platform. Currently supported: RK3588 + * (Orange Pi 5, Rock 5C, CoolPi 4B) and QCS6490 (Rubik Pi 3). + */ + private boolean checkMLAvailability() { + Platform platform = Platform.getCurrentPlatform(); + return platform == Platform.LINUX_QCS6490 || platform == Platform.LINUX_RK3588_64; + } + + /** + * Gets the AprilTag detection model, either by name or the default model.
+ * + * @param modelName Optional model name to look up, or null for default + * @return The model, or null if no suitable model is found + */ + private Model getAprilTagModel(String modelName) { + NeuralNetworkModelManager manager = NeuralNetworkModelManager.getInstance(); + + if (modelName != null && !modelName.isEmpty()) { + return manager.getModelByName(modelName).orElse(null); + } + + return manager.getDefaultAprilTagModel().orElse(null); + } + @Override public void release() { aprilTagDetectionPipe.release(); singleTagPoseEstimatorPipe.release(); + mlDetectionPipe.release(); + mlDecodePipe.release(); super.release(); } } diff --git a/photon-core/src/main/java/org/photonvision/vision/pipeline/AprilTagPipelineSettings.java b/photon-core/src/main/java/org/photonvision/vision/pipeline/AprilTagPipelineSettings.java index 0c4b054759..24dc6a361b 100644 --- a/photon-core/src/main/java/org/photonvision/vision/pipeline/AprilTagPipelineSettings.java +++ b/photon-core/src/main/java/org/photonvision/vision/pipeline/AprilTagPipelineSettings.java @@ -18,7 +18,10 @@ package org.photonvision.vision.pipeline; import com.fasterxml.jackson.annotation.JsonTypeName; +import org.photonvision.common.configuration.NeuralNetworkModelManager; +import org.photonvision.common.configuration.NeuralNetworkModelsSettings.ModelProperties; import org.photonvision.vision.apriltag.AprilTagFamily; +import org.photonvision.vision.objects.Model; import org.photonvision.vision.target.TargetModel; @JsonTypeName("AprilTagPipelineSettings") @@ -35,6 +38,28 @@ public class AprilTagPipelineSettings extends AdvancedPipelineSettings { public boolean doMultiTarget = false; public boolean doSingleTargetAlways = false; + // ML-assisted detection settings + public boolean useMLDetection = false; + public double mlConfidenceThreshold = 0.5; + public double mlNmsThreshold = 0.45; + public int mlRoiPaddingPixels = 40; + public ModelProperties model = + NeuralNetworkModelManager.getInstance() + .getDefaultModel() 
+ .map(Model::getProperties) + .orElse(null); + public boolean showDetectionBoxes = true; + + // Adaptive Tag Resizing (ATR) settings + /** Enable adaptive tag resizing for ML-assisted detection */ + public boolean atrEnabled = true; + + /** Target dimension (pixels) for ATR resizing. Tags larger than this will be downscaled. */ + public int atrTargetDimension = 200; + + /** Minimum scale factor - prevents extreme downscaling. Default: 0.25 (4x max downscale) */ + public double atrMinScaleFactor = 0.25; + // 3d settings public AprilTagPipelineSettings() { @@ -63,6 +88,20 @@ public int hashCode() { result = prime * result + decisionMargin; result = prime * result + (doMultiTarget ? 1231 : 1237); result = prime * result + (doSingleTargetAlways ? 1231 : 1237); + // ML-assisted detection fields + result = prime * result + (useMLDetection ? 1231 : 1237); + temp = Double.doubleToLongBits(mlConfidenceThreshold); + result = prime * result + (int) (temp ^ (temp >>> 32)); + temp = Double.doubleToLongBits(mlNmsThreshold); + result = prime * result + (int) (temp ^ (temp >>> 32)); + result = prime * result + mlRoiPaddingPixels; + // Guard against null: model defaults to orElse(null) when no default model exists + result = prime * result + ((model == null || model.modelPath() == null) ? 0 : model.modelPath().hashCode()); + result = prime * result + (showDetectionBoxes ? 1231 : 1237); + // ATR fields + result = prime * result + (atrEnabled ?
1231 : 1237); + result = prime * result + atrTargetDimension; + temp = Double.doubleToLongBits(atrMinScaleFactor); + result = prime * result + (int) (temp ^ (temp >>> 32)); return result; } @@ -83,6 +122,22 @@ public boolean equals(Object obj) { if (decisionMargin != other.decisionMargin) return false; if (doMultiTarget != other.doMultiTarget) return false; if (doSingleTargetAlways != other.doSingleTargetAlways) return false; + // ML-assisted detection fields + if (useMLDetection != other.useMLDetection) return false; + if (Double.doubleToLongBits(mlConfidenceThreshold) + != Double.doubleToLongBits(other.mlConfidenceThreshold)) return false; + if (Double.doubleToLongBits(mlNmsThreshold) != Double.doubleToLongBits(other.mlNmsThreshold)) + return false; + if (mlRoiPaddingPixels != other.mlRoiPaddingPixels) return false; + if (model == null) { + if (other.model != null) return false; + } else if (!model.equals(other.model)) return false; + if (showDetectionBoxes != other.showDetectionBoxes) return false; + // ATR fields + if (atrEnabled != other.atrEnabled) return false; + if (atrTargetDimension != other.atrTargetDimension) return false; + if (Double.doubleToLongBits(atrMinScaleFactor) + != Double.doubleToLongBits(other.atrMinScaleFactor)) return false; return true; } } diff --git a/photon-core/src/main/java/org/photonvision/vision/pipeline/ArucoPipeline.java b/photon-core/src/main/java/org/photonvision/vision/pipeline/ArucoPipeline.java index 52c2f9f5bd..0901d5f6fc 100644 --- a/photon-core/src/main/java/org/photonvision/vision/pipeline/ArucoPipeline.java +++ b/photon-core/src/main/java/org/photonvision/vision/pipeline/ArucoPipeline.java @@ -160,8 +160,8 @@ protected CVPipelineResult process(Frame frame, ArucoPipelineSettings settings) false, null, null, null, null, frameStaticProperties))); } - // Do multi-tag pose estimation Optional<MultiTargetPNPResult> multiTagResult = Optional.empty(); + if (settings.solvePNPEnabled && settings.doMultiTarget) { var multiTagOutput =
multiTagPNPPipe.run(targetList); sumPipeNanosElapsed += multiTagOutput.nanosElapsed; @@ -186,7 +186,7 @@ protected CVPipelineResult process(Frame frame, ArucoPipelineSettings settings) tagPoseEstimate = poseResult.output; } - // If single-tag estimation was not done, this is a multi-target tag from the layout + // If single-tag estimation was not done, this tag was used in multi-tag estimation if (tagPoseEstimate == null && multiTagResult.isPresent()) { // compute this tag's camera-to-tag transform using the multitag result var tagPose = atfl.getTagPose(detection.getId()); diff --git a/photon-core/src/main/java/org/photonvision/vision/pipeline/OutputStreamPipeline.java b/photon-core/src/main/java/org/photonvision/vision/pipeline/OutputStreamPipeline.java index 74d46d7b63..eb9c8ae52f 100644 --- a/photon-core/src/main/java/org/photonvision/vision/pipeline/OutputStreamPipeline.java +++ b/photon-core/src/main/java/org/photonvision/vision/pipeline/OutputStreamPipeline.java @@ -41,6 +41,7 @@ public class OutputStreamPipeline { private final Draw2dArucoPipe draw2dArucoPipe = new Draw2dArucoPipe(); private final Draw3dArucoPipe draw3dArucoPipe = new Draw3dArucoPipe(); + private final DrawMLROIPipe drawMLROIPipe = new DrawMLROIPipe(); private final CalculateFPSPipe calculateFPSPipe = new CalculateFPSPipe(); private final ResizeImagePipe resizeImagePipe = new ResizeImagePipe(); @@ -112,6 +113,14 @@ protected void setPipeParams( new DrawCalibrationPipe.DrawCalibrationPipeParams( pipelineSettings.streamingFrameDivisor, pipelineSettings.drawAllSnapshots)); } + + if (settings instanceof AprilTagPipelineSettings atSettings) { + drawMLROIPipe.setParams( + new DrawMLROIPipe.DrawMLROIParams( + settings.outputShouldDraw, + atSettings.showDetectionBoxes, + settings.streamingFrameDivisor)); + } } public CVPipelineResult process( @@ -181,6 +190,9 @@ public CVPipelineResult process( pipeProfileNanos[8] = 0; } else if (settings instanceof AprilTagPipelineSettings) { + // Draw ML detection 
ROI boxes (underneath tag overlays) + drawMLROIPipe.run(Pair.of(outMat, inputAndOutputFrame.mlDetectionRois)); + + // If we are doing apriltags... if (settings.solvePNPEnabled) { // Draw 3d Apriltag markers (camera is calibrated and running in 3d mode) diff --git a/photon-core/src/test/java/org/photonvision/vision/pipe/AprilTagROIDecodePipeTest.java b/photon-core/src/test/java/org/photonvision/vision/pipe/AprilTagROIDecodePipeTest.java new file mode 100644 index 0000000000..c8db588a3f --- /dev/null +++ b/photon-core/src/test/java/org/photonvision/vision/pipe/AprilTagROIDecodePipeTest.java @@ -0,0 +1,1029 @@ +/* + * Copyright (C) Photon Vision. + * + * This program is free software: you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation, either version 3 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see <https://www.gnu.org/licenses/>.
+ */ + +package org.photonvision.vision.pipe; + +import static org.junit.jupiter.api.Assertions.*; +import static org.junit.jupiter.api.Assumptions.*; + +import java.lang.reflect.Method; +import java.util.ArrayList; +import java.util.List; +import java.util.stream.Stream; +import org.junit.jupiter.api.BeforeAll; +import org.junit.jupiter.api.Test; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.Arguments; +import org.junit.jupiter.params.provider.MethodSource; +import org.opencv.core.Mat; +import org.opencv.core.Point; +import org.opencv.core.RotatedRect; +import org.opencv.core.Size; +import org.photonvision.common.LoadJNI; +import org.photonvision.common.configuration.ConfigManager; +import org.photonvision.common.util.TestUtils; +import org.photonvision.vision.apriltag.AprilTagFamily; +import org.photonvision.vision.frame.provider.FileFrameProvider; +import org.photonvision.vision.opencv.CVMat; +import org.photonvision.vision.pipe.impl.AprilTagROIDecodePipe; +import org.photonvision.vision.pipe.impl.AprilTagROIDecodePipe.ROIDecodeInput; +import org.photonvision.vision.pipe.impl.AprilTagROIDecodePipe.ROIDecodeParams; +import org.wpilib.vision.apriltag.AprilTagDetection; +import org.wpilib.vision.apriltag.AprilTagDetector; + +/** + * Unit tests for AprilTagROIDecodePipe, focusing on coordinate mapping accuracy. These tests verify + * that corners detected within ROIs are correctly mapped back to full-frame coordinates. + */ +public class AprilTagROIDecodePipeTest { + @BeforeAll + public static void init() { + LoadJNI.loadLibraries(); + ConfigManager.getInstance().load(); + } + + /** Convenience helper: build an axis-aligned RotatedRect (angle=0) from x/y/w/h. */ + private static RotatedRect roiFromXYWH(double x, double y, double w, double h) { + return new RotatedRect(new Point(x + w / 2.0, y + h / 2.0), new Size(w, h), 0); + } + + /** + * CRITICAL TEST: Verify coordinate mapping from ROI to full frame. 
This test compares ML-assisted + detection (simulated with manual ROI) against traditional full-frame detection. The mapped + corners should match within a small tolerance. + */ + @Test + public void testCoordinateMappingAccuracy() { + // Load a test image with a known AprilTag + var frameProvider = + new FileFrameProvider( + TestUtils.getApriltagImagePath(TestUtils.ApriltagTestImages.kTag1_640_480, false), + TestUtils.WPI2020Image.FOV, + TestUtils.get2020LifeCamCoeffs(false)); + + var frame = frameProvider.get(); + Mat grayMat = frame.processedImage.getMat(); + + // First, run full-frame detection to get ground truth corners + AprilTagDetector fullFrameDetector = new AprilTagDetector(); + fullFrameDetector.addFamily(AprilTagFamily.kTag36h11.getNativeName()); + + AprilTagDetection[] fullFrameDetections; + try { + fullFrameDetections = fullFrameDetector.detect(grayMat); + } catch (Exception e) { + // Skip test if native library doesn't work on this platform + assumeTrue(false, "Native AprilTag library not available: " + e.getMessage()); + return; + } + assumeTrue(fullFrameDetections != null, "Native AprilTag detector returned null"); + assumeTrue(fullFrameDetections.length > 0, "Should detect at least one tag in full frame"); + + AprilTagDetection groundTruth = fullFrameDetections[0]; + + // Create an ROI that contains the tag (simulating ML detection) + // The ROI should be slightly larger than the tag's bounding box + // (seed the max accumulators with -MAX_VALUE; MIN_VALUE is the smallest positive double) + double minX = Double.MAX_VALUE, maxX = -Double.MAX_VALUE; + double minY = Double.MAX_VALUE, maxY = -Double.MAX_VALUE; + for (int i = 0; i < 4; i++) { + minX = Math.min(minX, groundTruth.getCornerX(i)); + maxX = Math.max(maxX, groundTruth.getCornerX(i)); + minY = Math.min(minY, groundTruth.getCornerY(i)); + maxY = Math.max(maxY, groundTruth.getCornerY(i)); + } + + // Add some padding around the tag (simulating ML detection bbox) + double padding = 20; + RotatedRect simulatedMLBbox = + roiFromXYWH( + minX - padding, minY - padding, maxX - minX + 2 *
padding, maxY - minY + 2 * padding); + + // Run the ROI decode pipe + AprilTagROIDecodePipe decodePipe = new AprilTagROIDecodePipe(); + + ROIDecodeParams params = new ROIDecodeParams(); + params.tagFamily = AprilTagFamily.kTag36h11; + params.maxHammingDistance = 0; + params.minDecisionMargin = 35; + decodePipe.setParams(params); + + List<RotatedRect> rois = new ArrayList<>(); + rois.add(simulatedMLBbox); + + ROIDecodeInput input = new ROIDecodeInput(frame.processedImage, rois); + var result = decodePipe.run(input); + + List<AprilTagDetection> roiDetections = result.output; + assertEquals(1, roiDetections.size(), "Should detect exactly one tag in ROI"); + + AprilTagDetection mappedDetection = roiDetections.get(0); + + // Verify corner mapping accuracy (should be within 0.5 pixel) + double tolerance = 0.5; + for (int i = 0; i < 4; i++) { + assertEquals( + groundTruth.getCornerX(i), + mappedDetection.getCornerX(i), + tolerance, + "Corner " + i + " X coordinate should match"); + assertEquals( + groundTruth.getCornerY(i), + mappedDetection.getCornerY(i), + tolerance, + "Corner " + i + " Y coordinate should match"); + } + + // Verify center mapping + assertEquals( + groundTruth.getCenterX(), mappedDetection.getCenterX(), tolerance, "Center X should match"); + assertEquals( + groundTruth.getCenterY(), mappedDetection.getCenterY(), tolerance, "Center Y should match"); + + // Verify tag ID matches + assertEquals(groundTruth.getId(), mappedDetection.getId(), "Tag ID should match"); + + fullFrameDetector.close(); + decodePipe.release(); + } + + /** Test that ROI expansion correctly clamps to image bounds when near edges.
*/ + @Test + public void testExpandBboxClamping() { + AprilTagROIDecodePipe decodePipe = new AprilTagROIDecodePipe(); + + ROIDecodeParams params = new ROIDecodeParams(); + params.tagFamily = AprilTagFamily.kTag36h11; + decodePipe.setParams(params); + + // Create a test image + int imageWidth = 640; + int imageHeight = 480; + Mat testMat = Mat.zeros(imageHeight, imageWidth, org.opencv.core.CvType.CV_8UC1); + CVMat cvMat = new CVMat(testMat); + + // ROI at edge that would expand outside bounds + RotatedRect edgeRoi = roiFromXYWH(600, 440, 30, 30); // Near bottom-right corner + + List<RotatedRect> rois = new ArrayList<>(); + rois.add(edgeRoi); + + ROIDecodeInput input = new ROIDecodeInput(cvMat, rois); + var result = decodePipe.run(input); + + // Should not throw exception - ROI should be clamped + assertNotNull(result.output, "Result should not be null"); + // Empty because there's no actual tag, but no crash + + testMat.release(); + decodePipe.release(); + } + + /** Test that deduplication keeps the detection with highest decision margin.
*/ + @Test + public void testDeduplicationKeepsHighestMargin() { + // Load a test image with a known AprilTag + var frameProvider = + new FileFrameProvider( + TestUtils.getApriltagImagePath(TestUtils.ApriltagTestImages.kTag1_640_480, false), + TestUtils.WPI2020Image.FOV, + TestUtils.get2020LifeCamCoeffs(false)); + + var frame = frameProvider.get(); + + AprilTagROIDecodePipe decodePipe = new AprilTagROIDecodePipe(); + + ROIDecodeParams params = new ROIDecodeParams(); + params.tagFamily = AprilTagFamily.kTag36h11; + params.maxHammingDistance = 0; + params.minDecisionMargin = 0; // Accept all + decodePipe.setParams(params); + + // Create two overlapping ROIs that both contain the same tag + // This simulates overlapping ML detections + RotatedRect roi1 = roiFromXYWH(200, 150, 200, 150); + RotatedRect roi2 = roiFromXYWH(220, 170, 200, 150); // Overlapping + + List<RotatedRect> rois = new ArrayList<>(); + rois.add(roi1); + rois.add(roi2); + + ROIDecodeInput input = new ROIDecodeInput(frame.processedImage, rois); + var result = decodePipe.run(input); + + // Should deduplicate to a single detection + assertTrue( + result.output.size() <= 1, "Overlapping ROIs detecting same tag should be deduplicated"); + + decodePipe.release(); + } + + /** Test that empty ROI list returns empty result without errors.
*/ + @Test + public void testEmptyROIListReturnsEmpty() { + // Create a simple test image + Mat testMat = Mat.zeros(480, 640, org.opencv.core.CvType.CV_8UC1); + CVMat cvMat = new CVMat(testMat); + + AprilTagROIDecodePipe decodePipe = new AprilTagROIDecodePipe(); + + ROIDecodeParams params = new ROIDecodeParams(); + params.tagFamily = AprilTagFamily.kTag36h11; + decodePipe.setParams(params); + + List<RotatedRect> emptyRois = new ArrayList<>(); + ROIDecodeInput input = new ROIDecodeInput(cvMat, emptyRois); + + var result = decodePipe.run(input); + + assertNotNull(result.output, "Result should not be null"); + assertTrue(result.output.isEmpty(), "Empty ROI list should produce empty result"); + + testMat.release(); + decodePipe.release(); + } + + /** Test that ROIs with zero or negative dimensions are skipped. */ + @Test + public void testInvalidROISkipped() { + Mat testMat = Mat.zeros(480, 640, org.opencv.core.CvType.CV_8UC1); + CVMat cvMat = new CVMat(testMat); + + AprilTagROIDecodePipe decodePipe = new AprilTagROIDecodePipe(); + + ROIDecodeParams params = new ROIDecodeParams(); + params.tagFamily = AprilTagFamily.kTag36h11; + decodePipe.setParams(params); + + List<RotatedRect> rois = new ArrayList<>(); + rois.add(roiFromXYWH(100, 100, 0, 50)); // Zero width + rois.add(roiFromXYWH(100, 100, 50, -10)); // Negative height + rois.add(roiFromXYWH(-50, 100, 50, 50)); // Negative x (will be clamped) + + ROIDecodeInput input = new ROIDecodeInput(cvMat, rois); + + // Should not throw exception + var result = decodePipe.run(input); + assertNotNull(result.output, "Result should not be null even with invalid ROIs"); + + testMat.release(); + decodePipe.release(); + } + + /** + * Test integer conversion for rectangles preserves area by using floor for position and ceil for + * size.
+ */ + @Test + public void testToIntRectPreservesArea() { + // Test with floating point values (documents expected floor/ceil behavior) + double inX = 10.3, inY = 20.7, inW = 100.2, inH = 100.8; + + // Expected: x=10 (floor), y=20 (floor), w=101 (ceil), h=101 (ceil) + // This ensures we don't clip any part of the tag. + // toIntRect now delegates to RotatedRect.boundingRect() which provides the + // same guarantee for axis-aligned (angle=0) rects. + + int expectedX = 10; + int expectedY = 20; + int expectedW = 101; + int expectedH = 101; + + // Verify that floor/ceil logic matches expectations + assertEquals(expectedX, (int) Math.floor(inX)); + assertEquals(expectedY, (int) Math.floor(inY)); + assertEquals(expectedW, (int) Math.ceil(inW)); + assertEquals(expectedH, (int) Math.ceil(inH)); + } + + // ==================== Homography Transformation Tests ==================== + + /** Helper method to access the private transformHomography method via reflection. */ + private double[] invokeTransformHomography( + AprilTagROIDecodePipe pipe, double[] h, int offsetX, int offsetY) throws Exception { + Method method = + AprilTagROIDecodePipe.class.getDeclaredMethod( + "transformHomography", double[].class, int.class, int.class); + method.setAccessible(true); + return (double[]) method.invoke(pipe, h, offsetX, offsetY); + } + + /** Test that zero offset returns the homography unchanged. */ + @Test + public void testTransformHomography_ZeroOffset_ReturnsUnchanged() throws Exception { + AprilTagROIDecodePipe pipe = new AprilTagROIDecodePipe(); + + double[] h = {0.9, 0.1, 50.0, -0.1, 0.9, 60.0, 0.001, 0.002, 1.0}; + double[] result = invokeTransformHomography(pipe, h, 0, 0); + + double tolerance = 1e-15; + for (int i = 0; i < 9; i++) { + assertEquals( + h[i], result[i], tolerance, "Element " + i + " should be unchanged with zero offset"); + } + + pipe.release(); + } + + /** Test that identity homography with offset becomes a pure translation matrix. 
*/ + @Test + public void testTransformHomography_IdentityHomography_AddsTranslationOnly() throws Exception { + AprilTagROIDecodePipe pipe = new AprilTagROIDecodePipe(); + + // Identity matrix: [[1,0,0],[0,1,0],[0,0,1]] + double[] identity = {1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0}; + int offsetX = 50; + int offsetY = 75; + + double[] result = invokeTransformHomography(pipe, identity, offsetX, offsetY); + + double tolerance = 1e-15; + // Expected: [[1,0,50],[0,1,75],[0,0,1]] + assertEquals(1.0, result[0], tolerance, "h[0] should remain 1"); + assertEquals(0.0, result[1], tolerance, "h[1] should remain 0"); + assertEquals(50.0, result[2], tolerance, "h[2] should become offsetX"); + assertEquals(0.0, result[3], tolerance, "h[3] should remain 0"); + assertEquals(1.0, result[4], tolerance, "h[4] should remain 1"); + assertEquals(75.0, result[5], tolerance, "h[5] should become offsetY"); + assertEquals(0.0, result[6], tolerance, "h[6] should remain 0"); + assertEquals(0.0, result[7], tolerance, "h[7] should remain 0"); + assertEquals(1.0, result[8], tolerance, "h[8] should remain 1"); + + pipe.release(); + } + + /** Test that the third row of the homography is always preserved unchanged. 
*/ + @Test + public void testTransformHomography_ThirdRowUnchanged() throws Exception { + AprilTagROIDecodePipe pipe = new AprilTagROIDecodePipe(); + + // Test various homographies with different third rows + double[][] testCases = { + {0.9, 0.1, 50.0, -0.1, 0.9, 60.0, 0.001, 0.002, 1.0}, + {1.5, -0.3, 100.0, 0.2, 1.8, -50.0, -0.005, 0.003, 1.2}, + {0.5, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 1.0} + }; + int[][] offsets = {{100, 150}, {500, 300}, {1920, 1080}}; + + double tolerance = 1e-15; + for (int t = 0; t < testCases.length; t++) { + double[] h = testCases[t]; + int[] offset = offsets[t]; + double[] result = invokeTransformHomography(pipe, h, offset[0], offset[1]); + + assertEquals(h[6], result[6], tolerance, "h[6] should be unchanged for test case " + t); + assertEquals(h[7], result[7], tolerance, "h[7] should be unchanged for test case " + t); + assertEquals(h[8], result[8], tolerance, "h[8] should be unchanged for test case " + t); + } + + pipe.release(); + } + + /** Test the standard transformation case with manually computed expected values. 
*/ + @Test + public void testTransformHomography_StandardCase_CorrectComputation() throws Exception { + AprilTagROIDecodePipe pipe = new AprilTagROIDecodePipe(); + + // Input homography + double[] h = {0.9, 0.1, 50.0, -0.1, 0.9, 60.0, 0.001, 0.002, 1.0}; + int offsetX = 100; + int offsetY = 150; + + // Manually compute expected T*H where T = [[1,0,100],[0,1,150],[0,0,1]] + // Row 0: H[0][j] + offsetX * H[2][j] + double expected0 = 0.9 + 100 * 0.001; // 0.9 + 0.1 = 1.0 + double expected1 = 0.1 + 100 * 0.002; // 0.1 + 0.2 = 0.3 + double expected2 = 50.0 + 100 * 1.0; // 50 + 100 = 150 + // Row 1: H[1][j] + offsetY * H[2][j] + double expected3 = -0.1 + 150 * 0.001; // -0.1 + 0.15 = 0.05 + double expected4 = 0.9 + 150 * 0.002; // 0.9 + 0.3 = 1.2 + double expected5 = 60.0 + 150 * 1.0; // 60 + 150 = 210 + // Row 2: unchanged + double expected6 = 0.001; + double expected7 = 0.002; + double expected8 = 1.0; + + double[] result = invokeTransformHomography(pipe, h, offsetX, offsetY); + + double tolerance = 1e-10; + assertEquals(expected0, result[0], tolerance, "h[0] computation incorrect"); + assertEquals(expected1, result[1], tolerance, "h[1] computation incorrect"); + assertEquals(expected2, result[2], tolerance, "h[2] computation incorrect"); + assertEquals(expected3, result[3], tolerance, "h[3] computation incorrect"); + assertEquals(expected4, result[4], tolerance, "h[4] computation incorrect"); + assertEquals(expected5, result[5], tolerance, "h[5] computation incorrect"); + assertEquals(expected6, result[6], tolerance, "h[6] computation incorrect"); + assertEquals(expected7, result[7], tolerance, "h[7] computation incorrect"); + assertEquals(expected8, result[8], tolerance, "h[8] computation incorrect"); + + pipe.release(); + } + + /** + * Test point mapping consistency: for a point p, project(H_roi, p) + offset == project(H_full, p) + * This verifies the mathematical correctness of the transformation. 
+ */ + @Test + public void testTransformHomography_PointMapping_Consistency() throws Exception { + AprilTagROIDecodePipe pipe = new AprilTagROIDecodePipe(); + + // A realistic homography from a tag detection + double[] hRoi = {0.95, 0.05, 45.0, -0.03, 0.98, 55.0, 0.0002, 0.0001, 1.0}; + int offsetX = 200; + int offsetY = 150; + + double[] hFull = invokeTransformHomography(pipe, hRoi, offsetX, offsetY); + + // Test points (normalized tag corners: -1 to 1) + double[][] testPoints = { + {-1.0, -1.0}, + {1.0, -1.0}, + {1.0, 1.0}, + {-1.0, 1.0}, + {0.0, 0.0} // center + }; + + double tolerance = 1e-10; + for (double[] point : testPoints) { + // Project with ROI homography and add offset + double[] roiProjected = projectPoint(hRoi, point[0], point[1]); + double roiFullX = roiProjected[0] + offsetX; + double roiFullY = roiProjected[1] + offsetY; + + // Project with full-frame homography + double[] fullProjected = projectPoint(hFull, point[0], point[1]); + + assertEquals( + roiFullX, + fullProjected[0], + tolerance, + "X coordinate mismatch for point (" + point[0] + ", " + point[1] + ")"); + assertEquals( + roiFullY, + fullProjected[1], + tolerance, + "Y coordinate mismatch for point (" + point[0] + ", " + point[1] + ")"); + } + + pipe.release(); + } + + /** + * Helper method to project a point using a homography. Computes [x', y'] from H * [X, Y, 1]^T + * with perspective division. + */ + private double[] projectPoint(double[] h, double X, double Y) { + double w = h[6] * X + h[7] * Y + h[8]; + double x = (h[0] * X + h[1] * Y + h[2]) / w; + double y = (h[3] * X + h[4] * Y + h[5]) / w; + return new double[] {x, y}; + } + + /** Test numerical stability with large offsets (4K resolution). 
+   */
+  @Test
+  public void testTransformHomography_LargeOffsets_NumericalStability() throws Exception {
+    AprilTagROIDecodePipe pipe = new AprilTagROIDecodePipe();
+
+    // A homography with typical values
+    double[] h = {1.2, 0.05, 30.0, -0.02, 1.15, 40.0, 0.0003, 0.0002, 1.0};
+
+    // 4K resolution offset
+    int offsetX = 3840;
+    int offsetY = 2160;
+
+    double[] result = invokeTransformHomography(pipe, h, offsetX, offsetY);
+
+    // Verify no NaN or Inf values
+    for (int i = 0; i < 9; i++) {
+      assertFalse(Double.isNaN(result[i]), "Result contains NaN at index " + i);
+      assertFalse(Double.isInfinite(result[i]), "Result contains Infinity at index " + i);
+    }
+
+    // Verify the transformation is mathematically correct
+    double tolerance = 1e-10;
+
+    // Row 0
+    assertEquals(h[0] + offsetX * h[6], result[0], tolerance);
+    assertEquals(h[1] + offsetX * h[7], result[1], tolerance);
+    assertEquals(h[2] + offsetX * h[8], result[2], tolerance);
+    // Row 1
+    assertEquals(h[3] + offsetY * h[6], result[3], tolerance);
+    assertEquals(h[4] + offsetY * h[7], result[4], tolerance);
+    assertEquals(h[5] + offsetY * h[8], result[5], tolerance);
+    // Row 2 unchanged
+    assertEquals(h[6], result[6], tolerance);
+    assertEquals(h[7], result[7], tolerance);
+    assertEquals(h[8], result[8], tolerance);
+
+    pipe.release();
+  }
+
+  // ==================== Parameterized Tests ====================
+
+  private static Stream<Arguments> offsetTestCases() {
+    return Stream.of(
+        Arguments.of(0, 0, "Zero offset"),
+        Arguments.of(100, 0, "X only"),
+        Arguments.of(0, 100, "Y only"),
+        Arguments.of(100, 100, "Equal offsets"),
+        Arguments.of(640, 480, "VGA resolution"),
+        Arguments.of(1920, 1080, "1080p resolution"));
+  }
+
+  /**
+   * Parameterized test for various offset values. Verifies the transformation formula holds for
+   * different offset combinations.
+   */
+  @ParameterizedTest(name = "{2}: offset=({0}, {1})")
+  @MethodSource("offsetTestCases")
+  public void testTransformHomography_VariousOffsets(int offsetX, int offsetY, String description)
+      throws Exception {
+    AprilTagROIDecodePipe pipe = new AprilTagROIDecodePipe();
+
+    // Use a non-trivial homography
+    double[] h = {0.85, 0.12, 35.0, -0.08, 0.92, 45.0, 0.0005, 0.0003, 1.0};
+
+    double[] result = invokeTransformHomography(pipe, h, offsetX, offsetY);
+
+    double tolerance = 1e-10;
+
+    // Verify the transformation formula: T * H
+    // Row 0: h[i] + offsetX * h[i + 6] for i in 0,1,2
+    assertEquals(h[0] + offsetX * h[6], result[0], tolerance, description + ": h[0] incorrect");
+    assertEquals(h[1] + offsetX * h[7], result[1], tolerance, description + ": h[1] incorrect");
+    assertEquals(h[2] + offsetX * h[8], result[2], tolerance, description + ": h[2] incorrect");
+
+    // Row 1: h[i] + offsetY * h[i + 3] for i in 3,4,5
+    assertEquals(h[3] + offsetY * h[6], result[3], tolerance, description + ": h[3] incorrect");
+    assertEquals(h[4] + offsetY * h[7], result[4], tolerance, description + ": h[4] incorrect");
+    assertEquals(h[5] + offsetY * h[8], result[5], tolerance, description + ": h[5] incorrect");
+
+    // Row 2: unchanged
+    assertEquals(h[6], result[6], tolerance, description + ": h[6] should be unchanged");
+    assertEquals(h[7], result[7], tolerance, description + ": h[7] should be unchanged");
+    assertEquals(h[8], result[8], tolerance, description + ": h[8] should be unchanged");
+
+    pipe.release();
+  }
+
+  /**
+   * Integration test: Verify that homography transformation produces consistent results when
+   * comparing full-frame detection vs ROI-based detection with transformation. This extends
+   * testCoordinateMappingAccuracy to also verify homography values.
+   */
+  @Test
+  public void testTransformHomography_Integration_HomographyProducesCorrectPoseInput() {
+    // Load a test image with a known AprilTag
+    var frameProvider =
+        new FileFrameProvider(
+            TestUtils.getApriltagImagePath(TestUtils.ApriltagTestImages.kTag1_640_480, false),
+            TestUtils.WPI2020Image.FOV,
+            TestUtils.get2020LifeCamCoeffs(false));
+
+    var frame = frameProvider.get();
+    Mat grayMat = frame.processedImage.getMat();
+
+    // First, run full-frame detection to get ground truth
+    AprilTagDetector fullFrameDetector = new AprilTagDetector();
+    fullFrameDetector.addFamily(AprilTagFamily.kTag36h11.getNativeName());
+
+    AprilTagDetection[] fullFrameDetections;
+    try {
+      fullFrameDetections = fullFrameDetector.detect(grayMat);
+    } catch (Exception e) {
+      assumeTrue(false, "Native AprilTag library not available: " + e.getMessage());
+      return;
+    }
+    assumeTrue(fullFrameDetections != null, "Native AprilTag detector returned null");
+    assumeTrue(fullFrameDetections.length > 0, "Should detect at least one tag in full frame");
+
+    AprilTagDetection groundTruth = fullFrameDetections[0];
+    double[] groundTruthHomography = groundTruth.getHomography();
+
+    // Create an ROI that contains the tag (max accumulators start at -Double.MAX_VALUE)
+    double minX = Double.MAX_VALUE, maxX = -Double.MAX_VALUE;
+    double minY = Double.MAX_VALUE, maxY = -Double.MAX_VALUE;
+    for (int i = 0; i < 4; i++) {
+      minX = Math.min(minX, groundTruth.getCornerX(i));
+      maxX = Math.max(maxX, groundTruth.getCornerX(i));
+      minY = Math.min(minY, groundTruth.getCornerY(i));
+      maxY = Math.max(maxY, groundTruth.getCornerY(i));
+    }
+
+    double padding = 20;
+    RotatedRect simulatedMLBbox =
+        roiFromXYWH(
+            minX - padding, minY - padding, maxX - minX + 2 * padding, maxY - minY + 2 * padding);
+
+    // Run the ROI decode pipe
+    AprilTagROIDecodePipe decodePipe = new AprilTagROIDecodePipe();
+
+    ROIDecodeParams params = new ROIDecodeParams();
+    params.tagFamily = AprilTagFamily.kTag36h11;
+    params.maxHammingDistance = 0;
+    params.minDecisionMargin = 35;
+
+    decodePipe.setParams(params);
+
+    List<RotatedRect> rois = new ArrayList<>();
+    rois.add(simulatedMLBbox);
+
+    ROIDecodeInput input = new ROIDecodeInput(frame.processedImage, rois);
+    var result = decodePipe.run(input);
+
+    List<AprilTagDetection> roiDetections = result.output;
+    assertEquals(1, roiDetections.size(), "Should detect exactly one tag in ROI");
+
+    AprilTagDetection mappedDetection = roiDetections.get(0);
+    double[] mappedHomography = mappedDetection.getHomography();
+
+    // Verify that the transformed homography produces the same point projections
+    // as the ground truth homography for the tag corners
+    double tolerance = 1.0; // Allow 1 pixel tolerance due to subpixel differences
+
+    // Test projection of normalized tag corners
+    double[][] tagCorners = {{-1, -1}, {1, -1}, {1, 1}, {-1, 1}};
+    for (int i = 0; i < 4; i++) {
+      double[] gtProjected =
+          projectPoint(groundTruthHomography, tagCorners[i][0], tagCorners[i][1]);
+      double[] mappedProjected = projectPoint(mappedHomography, tagCorners[i][0], tagCorners[i][1]);
+
+      assertEquals(
+          gtProjected[0],
+          mappedProjected[0],
+          tolerance,
+          "Corner " + i + " X projection should match");
+      assertEquals(
+          gtProjected[1],
+          mappedProjected[1],
+          tolerance,
+          "Corner " + i + " Y projection should match");
+    }
+
+    fullFrameDetector.close();
+    decodePipe.release();
+  }
+
+  // ==================== ATR (Adaptive Tag Resizing) Tests ====================
+
+  /** Helper method to access the private transformHomographyWithScale method via reflection.
+   */
+  private double[] invokeTransformHomographyWithScale(
+      AprilTagROIDecodePipe pipe, double[] h, int offsetX, int offsetY, double S) throws Exception {
+    Method method =
+        AprilTagROIDecodePipe.class.getDeclaredMethod(
+            "transformHomographyWithScale", double[].class, int.class, int.class, double.class);
+    method.setAccessible(true);
+    return (double[]) method.invoke(pipe, h, offsetX, offsetY, S);
+  }
+
+  /**
+   * Test ATR scale factor calculation: S = min(1.0, T_dim / w) Verifies the formula produces
+   * correct values for various ROI widths.
+   */
+  @Test
+  public void testATR_ScaleFactorCalculation() {
+    int targetDimension = 160;
+
+    // Small tag (w < T_dim): S = 1.0 (no scaling)
+    double S1 = Math.min(1.0, (double) targetDimension / 100);
+    assertEquals(1.0, S1, 1e-10, "Small tag should have S=1.0");
+
+    // Exact match (w == T_dim): S = 1.0
+    double S2 = Math.min(1.0, (double) targetDimension / 160);
+    assertEquals(1.0, S2, 1e-10, "Exact match should have S=1.0");
+
+    // Large tag (w > T_dim): S < 1.0
+    double S3 = Math.min(1.0, (double) targetDimension / 320);
+    assertEquals(0.5, S3, 1e-10, "320px tag should have S=0.5");
+
+    // Very large tag
+    double S4 = Math.min(1.0, (double) targetDimension / 800);
+    assertEquals(0.2, S4, 1e-10, "800px tag should have S=0.2");
+
+    // 1000px tag (from documentation example)
+    double S5 = Math.min(1.0, (double) targetDimension / 1000);
+    assertEquals(0.16, S5, 1e-10, "1000px tag should have S=0.16");
+  }
+
+  /**
+   * Test ATR coordinate mapping: x_full = (x_scaled / S) + roi_x Verifies corner coordinates are
+   * correctly mapped from scaled space to full-frame.
+   */
+  @Test
+  public void testATR_CoordinateMappingWithScale() {
+    // Simulated scaled detection at (40, 30) in a 160x160 scaled ROI
+    // Original ROI at (100, 150) with width 400 (S = 0.4)
+    double scaledX = 40.0;
+    double scaledY = 30.0;
+    double S = 0.4;
+    int roiX = 100;
+    int roiY = 150;
+
+    // Expected: x_full = (40 / 0.4) + 100 = 100 + 100 = 200
+    //           y_full = (30 / 0.4) + 150 = 75 + 150 = 225
+    double expectedX = (scaledX / S) + roiX;
+    double expectedY = (scaledY / S) + roiY;
+
+    assertEquals(200.0, expectedX, 1e-10, "X coordinate mapping incorrect");
+    assertEquals(225.0, expectedY, 1e-10, "Y coordinate mapping incorrect");
+  }
+
+  /**
+   * Test that transformHomographyWithScale with S=1.0 produces same result as transformHomography.
+   */
+  @Test
+  public void testATR_TransformHomographyWithScale_NoScaling_MatchesOriginal() throws Exception {
+    AprilTagROIDecodePipe pipe = new AprilTagROIDecodePipe();
+
+    double[] h = {0.9, 0.1, 50.0, -0.1, 0.9, 60.0, 0.001, 0.002, 1.0};
+    int offsetX = 100;
+    int offsetY = 150;
+    double S = 1.0; // No scaling
+
+    double[] resultWithScale = invokeTransformHomographyWithScale(pipe, h, offsetX, offsetY, S);
+    double[] resultOriginal = invokeTransformHomography(pipe, h, offsetX, offsetY);
+
+    double tolerance = 1e-10;
+    for (int i = 0; i < 9; i++) {
+      assertEquals(
+          resultOriginal[i],
+          resultWithScale[i],
+          tolerance,
+          "Element " + i + " should match original transform when S=1.0");
+    }
+
+    pipe.release();
+  }
+
+  /**
+   * Test homography transformation with scaling: H_full = T * S_inv * H_scaled Verifies the
+   * combined scale+translation transformation is mathematically correct.
+   */
+  @Test
+  public void testATR_TransformHomographyWithScale_Computation() throws Exception {
+    AprilTagROIDecodePipe pipe = new AprilTagROIDecodePipe();
+
+    // Input homography (in scaled ROI space)
+    double[] h = {0.5, 0.0, 40.0, 0.0, 0.5, 30.0, 0.0, 0.0, 1.0};
+    int offsetX = 100;
+    int offsetY = 150;
+    double S = 0.5; // ROI was downscaled by 50%
+    double invS = 1.0 / S; // = 2.0
+
+    double[] result = invokeTransformHomographyWithScale(pipe, h, offsetX, offsetY, S);
+
+    double tolerance = 1e-10;
+
+    // Expected formula, row 0: h[i] / S + offsetX * h[i + 6] for i in 0,1,2
+    // result[0] = 0.5 * 2 + 100 * 0 = 1.0
+    assertEquals(h[0] * invS + offsetX * h[6], result[0], tolerance, "h[0] incorrect");
+    // result[1] = 0.0 * 2 + 100 * 0 = 0.0
+    assertEquals(h[1] * invS + offsetX * h[7], result[1], tolerance, "h[1] incorrect");
+    // result[2] = 40.0 * 2 + 100 * 1 = 80 + 100 = 180
+    assertEquals(h[2] * invS + offsetX * h[8], result[2], tolerance, "h[2] incorrect");
+
+    // Expected formula, row 1: h[i] / S + offsetY * h[i + 3] for i in 3,4,5
+    // result[3] = 0.0 * 2 + 150 * 0 = 0.0
+    assertEquals(h[3] * invS + offsetY * h[6], result[3], tolerance, "h[3] incorrect");
+    // result[4] = 0.5 * 2 + 150 * 0 = 1.0
+    assertEquals(h[4] * invS + offsetY * h[7], result[4], tolerance, "h[4] incorrect");
+    // result[5] = 30.0 * 2 + 150 * 1 = 60 + 150 = 210
+    assertEquals(h[5] * invS + offsetY * h[8], result[5], tolerance, "h[5] incorrect");
+
+    // Row 2: unchanged
+    assertEquals(h[6], result[6], tolerance, "h[6] should be unchanged");
+    assertEquals(h[7], result[7], tolerance, "h[7] should be unchanged");
+    assertEquals(h[8], result[8], tolerance, "h[8] should be unchanged");
+
+    pipe.release();
+  }
+
+  /**
+   * Test point mapping consistency with ATR scaling. For a point p: project(H_scaled, p) -> scale
+   * up -> translate == project(H_full, p)
+   */
+  @Test
+  public void testATR_TransformHomographyWithScale_PointMappingConsistency() throws Exception {
+    AprilTagROIDecodePipe pipe = new AprilTagROIDecodePipe();
+
+    // A homography in scaled ROI space
+    double[] hScaled = {0.8, 0.05, 35.0, -0.03, 0.85, 40.0, 0.0002, 0.0001, 1.0};
+    int offsetX = 200;
+    int offsetY = 150;
+    double S = 0.4; // 40% scale (large tag was downscaled)
+    double invS = 1.0 / S;
+
+    double[] hFull = invokeTransformHomographyWithScale(pipe, hScaled, offsetX, offsetY, S);
+
+    // Test points (normalized tag corners)
+    double[][] testPoints = {{-1.0, -1.0}, {1.0, -1.0}, {1.0, 1.0}, {-1.0, 1.0}, {0.0, 0.0}};
+
+    double tolerance = 1e-10;
+    for (double[] point : testPoints) {
+      // Project with scaled homography, then scale up and translate
+      double[] scaledProjected = projectPoint(hScaled, point[0], point[1]);
+      double scaledFullX = scaledProjected[0] * invS + offsetX;
+      double scaledFullY = scaledProjected[1] * invS + offsetY;
+
+      // Project with full-frame homography
+      double[] fullProjected = projectPoint(hFull, point[0], point[1]);
+
+      assertEquals(
+          scaledFullX,
+          fullProjected[0],
+          tolerance,
+          "X coordinate mismatch for point (" + point[0] + ", " + point[1] + ")");
+      assertEquals(
+          scaledFullY,
+          fullProjected[1],
+          tolerance,
+          "Y coordinate mismatch for point (" + point[0] + ", " + point[1] + ")");
+    }
+
+    pipe.release();
+  }
+
+  /**
+   * Test ATR with minimum scale factor clamping. Verifies that extreme downscaling is prevented.
+   */
+  @Test
+  public void testATR_MinScaleFactorClamping() {
+    int targetDimension = 160;
+    double minScaleFactor = 0.25;
+
+    // Very large tag that would require extreme scaling
+    int tagWidth = 2000;
+    double S = Math.min(1.0, (double) targetDimension / tagWidth); // = 0.08
+    S = Math.max(S, minScaleFactor); // Clamp to 0.25
+
+    assertEquals(0.25, S, 1e-10, "Scale factor should be clamped to minimum");
+  }
+
+  /**
+   * Test ATR enabled vs disabled produces consistent results for small tags. When tag is smaller
+   * than target dimension, ATR should have no effect.
+   */
+  @Test
+  public void testATR_DisabledMatchesOriginalBehavior() {
+    // Load a test image with a known AprilTag
+    var frameProvider =
+        new FileFrameProvider(
+            TestUtils.getApriltagImagePath(TestUtils.ApriltagTestImages.kTag1_640_480, false),
+            TestUtils.WPI2020Image.FOV,
+            TestUtils.get2020LifeCamCoeffs(false));
+
+    var frame = frameProvider.get();
+    Mat grayMat = frame.processedImage.getMat();
+
+    // First, run full-frame detection to find the tag location
+    AprilTagDetector fullFrameDetector = new AprilTagDetector();
+    fullFrameDetector.addFamily(AprilTagFamily.kTag36h11.getNativeName());
+
+    AprilTagDetection[] fullFrameDetections;
+    try {
+      fullFrameDetections = fullFrameDetector.detect(grayMat);
+    } catch (Exception e) {
+      assumeTrue(false, "Native AprilTag library not available");
+      return;
+    }
+    assumeTrue(fullFrameDetections != null && fullFrameDetections.length > 0);
+
+    AprilTagDetection groundTruth = fullFrameDetections[0];
+
+    // Create an ROI around the tag (max accumulators start at -Double.MAX_VALUE)
+    double minX = Double.MAX_VALUE, maxX = -Double.MAX_VALUE;
+    double minY = Double.MAX_VALUE, maxY = -Double.MAX_VALUE;
+    for (int i = 0; i < 4; i++) {
+      minX = Math.min(minX, groundTruth.getCornerX(i));
+      maxX = Math.max(maxX, groundTruth.getCornerX(i));
+      minY = Math.min(minY, groundTruth.getCornerY(i));
+      maxY = Math.max(maxY, groundTruth.getCornerY(i));
+    }
+
+    double padding = 20;
+    RotatedRect simulatedMLBbox =
+        roiFromXYWH(
+            minX - padding, minY - padding, maxX - minX + 2 * padding, maxY - minY + 2 * padding);
+
+    // Run with ATR enabled (should not scale since tag is small)
+    AprilTagROIDecodePipe pipeWithATR = new AprilTagROIDecodePipe();
+    ROIDecodeParams paramsWithATR = new ROIDecodeParams();
+    paramsWithATR.tagFamily = AprilTagFamily.kTag36h11;
+    paramsWithATR.atrEnabled = true;
+    paramsWithATR.atrTargetDimension = 160; // Tag in test image is ~100px, so no scaling
+    pipeWithATR.setParams(paramsWithATR);
+
+    // Run with ATR disabled
+    AprilTagROIDecodePipe pipeWithoutATR = new AprilTagROIDecodePipe();
+    ROIDecodeParams paramsWithoutATR = new ROIDecodeParams();
+    paramsWithoutATR.tagFamily = AprilTagFamily.kTag36h11;
+    paramsWithoutATR.atrEnabled = false;
+    pipeWithoutATR.setParams(paramsWithoutATR);
+
+    List<RotatedRect> rois = new ArrayList<>();
+    rois.add(simulatedMLBbox);
+
+    ROIDecodeInput input = new ROIDecodeInput(frame.processedImage, rois);
+
+    var resultWithATR = pipeWithATR.run(input);
+    var resultWithoutATR = pipeWithoutATR.run(input);
+
+    // Both should detect the same tag with same corners (within tolerance)
+    assertEquals(
+        resultWithATR.output.size(),
+        resultWithoutATR.output.size(),
+        "Same number of detections expected");
+
+    if (!resultWithATR.output.isEmpty() && !resultWithoutATR.output.isEmpty()) {
+      AprilTagDetection detWithATR = resultWithATR.output.get(0);
+      AprilTagDetection detWithoutATR = resultWithoutATR.output.get(0);
+
+      double tolerance = 0.5;
+      for (int i = 0; i < 4; i++) {
+        assertEquals(
+            detWithATR.getCornerX(i),
+            detWithoutATR.getCornerX(i),
+            tolerance,
+            "Corner " + i + " X should match with/without ATR for small tag");
+        assertEquals(
+            detWithATR.getCornerY(i),
+            detWithoutATR.getCornerY(i),
+            tolerance,
+            "Corner " + i + " Y should match with/without ATR for small tag");
+      }
+    }
+
+    fullFrameDetector.close();
+    pipeWithATR.release();
+    pipeWithoutATR.release();
+  }
+
+  /**
+   * Test computeHomographyFromCorners produces a valid homography.
+   * Verifies that the computed homography maps normalized tag coords to corners.
+   */
+  @Test
+  public void testATR_ComputeHomographyFromCorners() throws Exception {
+    AprilTagROIDecodePipe pipe = new AprilTagROIDecodePipe();
+    ROIDecodeParams params = new ROIDecodeParams();
+    params.tagFamily = AprilTagFamily.kTag36h11;
+    pipe.setParams(params);
+
+    // Access the private method via reflection
+    Method method =
+        AprilTagROIDecodePipe.class.getDeclaredMethod(
+            "computeHomographyFromCorners", org.opencv.core.Point[].class);
+    method.setAccessible(true);
+
+    // Define test corners (a simple quadrilateral)
+    org.opencv.core.Point[] corners =
+        new org.opencv.core.Point[] {
+          new org.opencv.core.Point(100, 100), // Corner 0: maps from (-1, -1)
+          new org.opencv.core.Point(200, 100), // Corner 1: maps from ( 1, -1)
+          new org.opencv.core.Point(200, 200), // Corner 2: maps from ( 1,  1)
+          new org.opencv.core.Point(100, 200) // Corner 3: maps from (-1,  1)
+        };
+
+    double[] homography = (double[]) method.invoke(pipe, (Object) corners);
+
+    assertNotNull(homography, "Homography should not be null");
+    assertEquals(9, homography.length, "Homography should have 9 elements");
+
+    // Verify the homography maps normalized coords to corners
+    double tolerance = 1.0; // Allow 1 pixel tolerance
+
+    // Test corner 0: (-1, -1) -> (100, 100)
+    double[] projected0 = projectPoint(homography, -1, -1);
+    assertEquals(100.0, projected0[0], tolerance, "Corner 0 X projection incorrect");
+    assertEquals(100.0, projected0[1], tolerance, "Corner 0 Y projection incorrect");
+
+    // Test corner 1: (1, -1) -> (200, 100)
+    double[] projected1 = projectPoint(homography, 1, -1);
+    assertEquals(200.0, projected1[0], tolerance, "Corner 1 X projection incorrect");
+    assertEquals(100.0, projected1[1], tolerance, "Corner 1 Y projection incorrect");
+
+    // Test corner 2: (1, 1) -> (200, 200)
+    double[] projected2 = projectPoint(homography, 1, 1);
+    assertEquals(200.0, projected2[0], tolerance, "Corner 2 X projection incorrect");
+    assertEquals(200.0, projected2[1], tolerance, "Corner 2 Y projection incorrect");
+
+    // Test corner 3: (-1, 1) -> (100, 200)
+    double[] projected3 = projectPoint(homography, -1, 1);
+    assertEquals(100.0, projected3[0], tolerance, "Corner 3 X projection incorrect");
+    assertEquals(200.0, projected3[1], tolerance, "Corner 3 Y projection incorrect");
+
+    pipe.release();
+  }
+}
diff --git a/photon-core/src/test/java/org/photonvision/vision/pipeline/AprilTagPipelineMLFlagSafetyTest.java b/photon-core/src/test/java/org/photonvision/vision/pipeline/AprilTagPipelineMLFlagSafetyTest.java
new file mode 100644
index 0000000000..1b6b81b5aa
--- /dev/null
+++ b/photon-core/src/test/java/org/photonvision/vision/pipeline/AprilTagPipelineMLFlagSafetyTest.java
@@ -0,0 +1,154 @@
+/*
+ * Copyright (C) Photon Vision.
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <https://www.gnu.org/licenses/>.
+ */
+
+package org.photonvision.vision.pipeline;
+
+import static org.junit.jupiter.api.Assertions.*;
+
+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+import org.photonvision.common.LoadJNI;
+import org.photonvision.common.configuration.ConfigManager;
+import org.photonvision.common.util.TestUtils;
+import org.photonvision.vision.apriltag.AprilTagFamily;
+import org.photonvision.vision.camera.QuirkyCamera;
+import org.photonvision.vision.frame.provider.FileFrameProvider;
+import org.photonvision.vision.pipeline.result.CVPipelineResult;
+import org.photonvision.vision.target.TargetModel;
+
+/**
+ * Safety tests for the ML-assisted AprilTag pipeline's flag handling. None of these tests actually
+ * exercise the ML code path — they run on hosts without an NPU, so {@code useMLDetection = true}
+ * falls through to traditional detection. The tests verify that toggling ML-related flags does not
+ * crash the pipeline and that ML-related settings serialise correctly.
+ */
+public class AprilTagPipelineMLFlagSafetyTest {
+  @BeforeEach
+  public void setup() {
+    LoadJNI.loadLibraries();
+    ConfigManager.getInstance().load();
+  }
+
+  /**
+   * Test that ML settings are properly handled in pipeline settings. Verifies equals works
+   * correctly with new ML fields.
+   */
+  @Test
+  public void testMLSettingsEquality() {
+    var settings1 = new AprilTagPipelineSettings();
+    var settings2 = new AprilTagPipelineSettings();
+
+    // Both should be equal initially
+    assertEquals(settings1, settings2);
+
+    // Change ML setting
+    settings1.useMLDetection = true;
+    assertNotEquals(settings1, settings2);
+
+    settings2.useMLDetection = true;
+    assertEquals(settings1, settings2);
+
+    // Change threshold
+    settings1.mlConfidenceThreshold = 0.7;
+    assertNotEquals(settings1, settings2);
+
+    settings2.mlConfidenceThreshold = 0.7;
+    assertEquals(settings1, settings2);
+
+    // Change model name
+    settings1.mlModelName = "custom-model";
+    assertNotEquals(settings1, settings2);
+
+    settings2.mlModelName = "custom-model";
+    assertEquals(settings1, settings2);
+  }
+
+  /**
+   * Test that the pipeline still detects tags when ML detection is enabled on a platform without
+   * ML support (e.g., no NPU). In that case the pipeline uses traditional detection.
+   */
+  @Test
+  public void testMLEnabledOnNonNpuPlatform() {
+    var pipeline = new AprilTagPipeline();
+
+    pipeline.getSettings().useMLDetection = true;
+    pipeline.getSettings().inputShouldShow = true;
+    pipeline.getSettings().outputShouldDraw = true;
+    pipeline.getSettings().solvePNPEnabled = true;
+    pipeline.getSettings().targetModel = TargetModel.kAprilTag6p5in_36h11;
+    pipeline.getSettings().tagFamily = AprilTagFamily.kTag36h11;
+
+    var frameProvider =
+        new FileFrameProvider(
+            TestUtils.getApriltagImagePath(TestUtils.ApriltagTestImages.kTag1_640_480, false),
+            TestUtils.WPI2020Image.FOV,
+            TestUtils.get2020LifeCamCoeffs(false));
+    frameProvider.requestFrameThresholdType(pipeline.getThresholdType());
+
+    CVPipelineResult pipelineResult = pipeline.run(frameProvider.get(), QuirkyCamera.DefaultCamera);
+
+    assertFalse(pipelineResult.targets.isEmpty(), "Should detect tag");
+    assertEquals(1, pipelineResult.targets.size(), "Should detect exactly one tag");
+
+    // Verify tag ID
+    var target = pipelineResult.targets.get(0);
+    assertEquals(1, target.getFiducialId(), "Should detect tag ID 1");
+
+    pipeline.release();
+  }
+
+  /**
+   * Test that traditional detection still works when ML is disabled. This ensures the existing
+   * pipeline behavior is unchanged.
+   */
+  @Test
+  public void testTraditionalDetectionWhenMLDisabled() {
+    var pipeline = new AprilTagPipeline();
+
+    // Explicitly disable ML detection
+    pipeline.getSettings().useMLDetection = false;
+    pipeline.getSettings().inputShouldShow = true;
+    pipeline.getSettings().outputShouldDraw = true;
+    pipeline.getSettings().solvePNPEnabled = true;
+    pipeline.getSettings().targetModel = TargetModel.kAprilTag6p5in_36h11;
+    pipeline.getSettings().tagFamily = AprilTagFamily.kTag36h11;
+
+    var frameProvider =
+        new FileFrameProvider(
+            TestUtils.getApriltagImagePath(TestUtils.ApriltagTestImages.kTag1_640_480, false),
+            TestUtils.WPI2020Image.FOV,
+            TestUtils.get2020LifeCamCoeffs(false));
+    frameProvider.requestFrameThresholdType(pipeline.getThresholdType());
+
+    CVPipelineResult pipelineResult = pipeline.run(frameProvider.get(), QuirkyCamera.DefaultCamera);
+
+    // Should detect the tag
+    assertFalse(pipelineResult.targets.isEmpty(), "Should detect tag with traditional detection");
+    assertEquals(1, pipelineResult.targets.size(), "Should detect exactly one tag");
+
+    // Verify pose estimation works
+    var target = pipelineResult.targets.get(0);
+    var pose = target.getBestCameraToTarget3d();
+    assertNotNull(pose, "Pose should be estimated");
+
+    // Known approximate values from existing tests
+    assertEquals(2, pose.getTranslation().getX(), 0.3, "X translation should be approximately 2m");
+
+    pipeline.release();
+  }
+
+}
diff --git a/photon-server/src/main/resources/models/apriltagV4-yolo11.rknn b/photon-server/src/main/resources/models/apriltagV4-yolo11.rknn
new file mode 100644
index 0000000000..2cf840c305
Binary files /dev/null and b/photon-server/src/main/resources/models/apriltagV4-yolo11.rknn differ
diff --git a/photon-server/src/main/resources/models/apriltagV4-yolo11.tflite b/photon-server/src/main/resources/models/apriltagV4-yolo11.tflite
new file mode 100644
index 0000000000..349c553091
Binary files /dev/null and b/photon-server/src/main/resources/models/apriltagV4-yolo11.tflite differ