PhotonVision · DoctorFogarty · Jan 19, 2026 · Jan 19, 2026 · Jan 19, 2026 · Jan 19, 2026
@@ -4,6 +4,7 @@
 about-apriltags
 detector-types
 2D-tracking-tuning
+ml-tag
 3D-tracking
 multitag
 coordinate-systems

@@ -0,0 +1,55 @@
+# ML-Tag (ML AprilTag Acceleration)
+
+## How does it work?
+
+ML-Tag is an acceleration mode within the AprilTag pipeline. Instead of running the AprilTag detector across the full camera frame, an ML model running on the coprocessor's NPU first proposes bounding-box regions where AprilTags appear to be in the image, and the standard AprilTag decoder then runs only inside those regions. On supported hardware this reduces the work the classical detector has to do, leaving more headroom for higher resolutions, higher framerates, and lower latency.
+
+The ML model is purely a localizer: it answers "is there an AprilTag here?" and nothing else. It does not read the tag's ID, identify its family, recover its orientation, or estimate its pose — all of that is still done by the same classical detector used in the standard AprilTag pipeline. ML-Tag is therefore transparent to your robot code: detections come out as the same targets, and {ref}`3D Tracking <docs/apriltag-pipelines/3D-tracking:3D Tracking>` and {ref}`MultiTag <docs/apriltag-pipelines/multitag:MultiTag Localization>` work without any changes.
+
+## Hardware Requirements
+
+ML-Tag runs on the same NPUs PhotonVision uses for object detection:
+
+- RK3588 boards — Orange Pi 5 / 5 Plus, Rock 5C, CoolPi 4B (RKNN backend)
+- QCS6490 boards — Rubik Pi 3 (TFLite backend)
+
+For installation, model conversion, and other platform-specific notes, see the coprocessor pages used for object detection: {ref}`Orange Pi 5 <docs/objectDetection/opi:Orange Pi 5 (and variants) Object Detection>` and {ref}`Rubik Pi 3 <docs/objectDetection/rubik:Rubik Pi 3 Object Detection>`.
+
+:::{note}
+The ML-Tag controls only appear in the AprilTag pipeline tab when a supported NPU backend is detected. On other coprocessors the AprilTag pipeline runs in its standard CPU-only configuration.
+:::
+
+## Enabling ML-Tag
+
+In the AprilTag pipeline tab, scroll past the standard AprilTag tuning settings to the "ML-Tag (ML AprilTag Acceleration)" section. Toggle **Enable ML-Tag**, and the remaining ML-Tag controls will appear below.
+
+## Tuning ML-Tag
+
+The standard AprilTag tuning parameters described in {ref}`2D AprilTag Tuning / Tracking <docs/apriltag-pipelines/2D-tracking-tuning:Tuning AprilTags>` still apply — they control how the detector decodes each region the ML model hands it. The settings below only control the ML localizer stage.
+
+### Model
+
+Selects which ML model the NPU uses to locate AprilTag regions in the image. PhotonVision ships with a YoloV11 AprilTag model for each supported NPU; any compatible models you have uploaded will also appear here. Only models compatible with the detected NPU backend are listed.
+
+### Confidence Threshold
+
+The minimum confidence score (between 0 and 1) the ML model must report for a proposed region before it is passed to the classical decoder. Higher values reject more weak proposals — good for cutting false positives, but at the risk of missing tags the model is less sure about. The default of 0.5 is a reasonable starting point; lower it if tags are being missed and the ROI overlay confirms the model is hesitating on them.
+
+### NMS Threshold
+
+The non-maximum suppression overlap cutoff (between 0 and 1) used to merge overlapping proposals of the same tag. Higher values allow more overlapping boxes through; lower values are stricter about merging duplicates. The default of 0.45 works well in most cases.
+
+### ROI Padding (px)
+
+The number of pixels of padding added around each proposed region before the classical decoder runs on it. Padding is applied in pixels, so it naturally adapts to tag size in the image: small, far-away tags receive proportionally more expansion (helping the decoder see their borders), while large, nearby tags receive less. Raise this if tag corners are being clipped at the edges of the ROIs; lower it if neighboring tags are being merged into a single region.
+
+### Show ROI Boxes
+
+When enabled, draws the ML model's proposed bounding boxes onto the processed stream. This is valuable when tuning Confidence Threshold and ROI Padding — you can see exactly which regions the model is finding and how tightly they fit each tag. Turn this off in competition for a cleaner stream.
+
+## Limitations
+
+- ML-Tag is only available on the supported NPU coprocessors listed above. On any other hardware the AprilTag pipeline behaves in the standard manner.
+- When several overlapping ROIs decode the same tag, only the detection with the highest decision margin is kept. In practice this is the desired behavior, but it does mean alternate solutions from the same tag are not surfaced.
+- The ML model is purely a localizer. If a tag is in the frame but the model fails to propose a region around it, the classical decoder will never see it and the tag will not be detected. When tuning, use **Show ROI Boxes** to confirm the model is finding the tags you care about, and lower **Confidence Threshold** if it is not.
+- The ML model has shown in testing to be able to identify AprilTag at distances where there is not enough pixel density from the region of image to makeout what specific Tag the detector is finding. 
@@ -1,11 +1,13 @@
 <script setup lang="ts">
 import { PipelineType, type AprilTagPipelineSettings, AprilTagFamily } from "@/types/PipelineTypes";
-import PvSelect from "@/components/common/pv-select.vue";
+import PvSelect, { type SelectItem } from "@/components/common/pv-select.vue";
 import PvSlider from "@/components/common/pv-slider.vue";
 import PvSwitch from "@/components/common/pv-switch.vue";
 import { computed } from "vue";
 import { useStateStore } from "@/stores/StateStore";
 import { useCameraSettingsStore } from "@/stores/settings/CameraSettingsStore";
+import { useSettingsStore } from "@/stores/settings/GeneralSettingsStore";
+import type { ObjectDetectionModelProperties } from "@/types/SettingTypes";
 import { useDisplay } from "vuetify";
 
 // TODO fix pipeline typing in order to fix this, the store settings call should be able to infer that only valid pipeline type settings are exposed based on pre-checks for the entire config section
@@ -17,6 +19,34 @@ const { mdAndDown } = useDisplay();
 const interactiveCols = computed(() =>
   mdAndDown.value && (!useStateStore().sidebarFolded || useCameraSettingsStore().isDriverMode) ? 8 : 7
 );
+
+// Check if ML detection is available on this platform
+const mlDetectionAvailable = computed(() => useSettingsStore().general.supportedBackends.length > 0);
+
+const modelWrapper = computed<SelectItem<string>[]>(() =>
+  supportedModels.value.map((model) => ({
+    name: model.nickname,
+    value: model.modelPath
+  }))
+);
+
+const selectedModel = computed<string>({
+  get: () => currentPipelineSettings.value.model?.modelPath ?? "",
+  set: (value) => {
+    const model = supportedModels.value.find((supportedModel) => supportedModel.modelPath === value);
+    if (model) {
+      useCameraSettingsStore().changeCurrentPipelineSetting({ model }, true);
+    }
+  }
+});
+// Filter models to only those supported by available backends
+const supportedModels = computed<ObjectDetectionModelProperties[]>(() => {
+  const { availableModels, supportedBackends } = useSettingsStore().general;
+  const isSupported = (model: ObjectDetectionModelProperties) => {
+    return supportedBackends.some((backend: string) => backend.toLowerCase() === model.family.toLowerCase());
+  };
+  return availableModels.filter(isSupported);
+});
 </script>
 
 <template>
@@ -90,5 +120,74 @@ const interactiveCols = computed(() =>
         (value) => useCameraSettingsStore().changeCurrentPipelineSetting({ refineEdges: value }, false)
       "
     />
+
+    <!-- ML-Tag (ML AprilTag Acceleration) Section -->
+    <v-divider v-if="mlDetectionAvailable" class="mt-3 mb-2" />
+    <div v-if="mlDetectionAvailable" class="ml-settings-section">
+      <p class="text-subtitle-2 mb-2">ML-Tag (ML AprilTag Acceleration)</p>
+      <pv-switch
+        v-model="currentPipelineSettings.useMLDetection"
+        :switch-cols="interactiveCols"
+        label="Enable ML-Tag"
+        tooltip="ML-Tag uses an NPU-accelerated ML model for faster AprilTag detection. Requires compatible hardware (RK3588 or QCS6490)."
+        @update:modelValue="
+          (value) => useCameraSettingsStore().changeCurrentPipelineSetting({ useMLDetection: value }, false)
+        "
+      />
+      <div v-if="currentPipelineSettings.useMLDetection">
+        <pv-select
+          v-model="selectedModel"
+          label="Model"
+          tooltip="The model used for ML-Tag AprilTag detection"
+          :select-cols="interactiveCols"
+          :items="modelWrapper"
+        />
+        <pv-slider
+          v-model="currentPipelineSettings.mlConfidenceThreshold"
+          :slider-cols="interactiveCols"
+          label="Confidence Threshold"
+          tooltip="Minimum confidence score for ML detection (0-1). Higher values reduce false positives"
+          :min="0.1"
+          :max="1.0"
+          :step="0.05"
+          @update:modelValue="
+            (value) => useCameraSettingsStore().changeCurrentPipelineSetting({ mlConfidenceThreshold: value }, false)
+          "
+        />
+        <pv-slider
+          v-model="currentPipelineSettings.mlNmsThreshold"
+          :slider-cols="interactiveCols"
+          label="NMS Threshold"
+          tooltip="Non-maximum suppression threshold for overlapping detections (0-1)"
+          :min="0.1"
+          :max="1.0"
+          :step="0.05"
+          @update:modelValue="
+            (value) => useCameraSettingsStore().changeCurrentPipelineSetting({ mlNmsThreshold: value }, false)
+          "
+        />
+        <pv-slider
+          v-model="currentPipelineSettings.mlRoiPaddingPixels"
+          :slider-cols="interactiveCols"
+          label="ROI Padding (px)"
+          tooltip="Pixels of padding added around each detected region. Naturally adapts: small/far tags get more relative expansion, large/close tags get less"
+          :min="10"
+          :max="150"
+          :step="5"
+          @update:modelValue="
+            (value) => useCameraSettingsStore().changeCurrentPipelineSetting({ mlRoiPaddingPixels: value }, false)
+          "
+        />
+        <pv-switch
+          v-model="currentPipelineSettings.showDetectionBoxes"
+          :switch-cols="interactiveCols"
+          label="Show ROI Boxes"
+          tooltip="Draw the ML model's detected bounding boxes on the processed image for tuning visualization"
+          @update:modelValue="
+            (value) => useCameraSettingsStore().changeCurrentPipelineSetting({ showDetectionBoxes: value }, false)
+          "
+        />
+      </div>
+    </div>
   </div>
 </template>
@@ -276,6 +276,13 @@ export interface AprilTagPipelineSettings extends PipelineSettings {
   tagFamily: AprilTagFamily;
   doMultiTarget: boolean;
   doSingleTargetAlways: boolean;
+  // ML-assisted detection settings
+  useMLDetection: boolean;
+  mlConfidenceThreshold: number;
+  mlNmsThreshold: number;
+  mlRoiPaddingPixels: number;
+  model: ObjectDetectionModelProperties;
+  showDetectionBoxes: boolean;
 }
 export type ConfigurableAprilTagPipelineSettings = Partial<
   Omit<AprilTagPipelineSettings, "pipelineType" | "hammingDist" | "debug">
@@ -299,7 +306,14 @@ export const DefaultAprilTagPipelineSettings: AprilTagPipelineSettings = {
   threads: 4,
   tagFamily: AprilTagFamily.Family36h11,
   doMultiTarget: false,
-  doSingleTargetAlways: false
+  doSingleTargetAlways: false,
+  // ML-assisted detection defaults
+  useMLDetection: false,
+  mlConfidenceThreshold: 0.5,
+  mlNmsThreshold: 0.45,
+  mlRoiPaddingPixels: 40,
+  model: {} as ObjectDetectionModelProperties,
+  showDetectionBoxes: true
 };
 
 export interface ArucoPipelineSettings extends PipelineSettings {

@@ -169,6 +169,16 @@ private NeuralNetworkModelsSettings getShippedProperties(File modelsDirectory) {
                         Family.RKNN,
                         Version.YOLOV11));
 
+        nnProps.addModelProperties(
+                new ModelProperties(
+                        Path.of(modelsDirectory.getAbsolutePath(), "apriltagV4-yolo11.rknn"),
+                        "AprilTag V4",
+                        new LinkedList<String>(List.of("AprilTag")),
+                        640,
+                        640,
+                        Family.RKNN,
+                        Version.YOLOV11));
+
         nnProps.addModelProperties(
                 new ModelProperties(
                         Path.of(modelsDirectory.getAbsolutePath(), "yolov8nCOCO.tflite"),
@@ -189,6 +199,16 @@ private NeuralNetworkModelsSettings getShippedProperties(File modelsDirectory) {
                         Family.RUBIK,
                         Version.YOLOV11));
 
+        nnProps.addModelProperties(
+                new ModelProperties(
+                        Path.of(modelsDirectory.getAbsolutePath(), "apriltagV4-yolo11.tflite"),
+                        "AprilTag V4",
+                        new LinkedList<String>(List.of("AprilTag")),
+                        640,
+                        640,
+                        Family.RUBIK,
+                        Version.YOLOV11));
+
         return nnProps;
     }
 
@@ -311,6 +331,54 @@ public Optional<Model> getDefaultModel() {
         return models.get(supportedBackends.get(0)).stream().findFirst();
     }
 
+    /**
+     * Gets the default AprilTag detection model. Searches for a model with "apriltag" in its nickname
+     * (case-insensitive).
+     *
+     * @return Optional containing the AprilTag model if found
+     */
+    public Optional<Model> getDefaultAprilTagModel() {
+        if (models == null || supportedBackends.isEmpty()) {
+            return Optional.empty();
+        }
+
+        for (Family backend : supportedBackends) {
+            if (models.containsKey(backend)) {
+                Optional<Model> model =
+                        models.get(backend).stream()
+                                .filter(m -> m.getNickname().toLowerCase().contains("apriltag"))
+                                .findFirst();
+                if (model.isPresent()) {
+                    return model;
+                }
+            }
+        }
+        return Optional.empty();
+    }
+
+    /**
+     * Gets a model by its exact nickname.
+     *
+     * @param name The nickname of the model to retrieve
+     * @return Optional containing the model if found
+     */
+    public Optional<Model> getModelByName(String name) {
+        if (models == null || supportedBackends.isEmpty() || name == null) {
+            return Optional.empty();
+        }
+
+        for (Family backend : supportedBackends) {
+            if (models.containsKey(backend)) {
+                Optional<Model> model =
+                        models.get(backend).stream().filter(m -> m.getNickname().equals(name)).findFirst();
+                if (model.isPresent()) {
+                    return model;
+                }
+            }
+        }
+        return Optional.empty();
+    }
+
     // Do checking later on, when we create the model object
     private void loadModel(Path path) {
         if (models == null) {

@@ -17,6 +17,8 @@
 
 package org.photonvision.vision.frame;
 
+import java.util.List;
+import org.opencv.core.RotatedRect;
 import org.photonvision.common.logging.LogGroup;
 import org.photonvision.common.logging.Logger;
 import org.photonvision.common.util.math.MathUtils;
@@ -36,6 +38,9 @@ public class Frame implements Releasable {
 
     public final FrameStaticProperties frameStaticProperties;
 
+    /** Optional ML detection ROI bounding boxes for visualization. Set by ML-assisted pipelines. */
+    public List<RotatedRect> mlDetectionRois = List.of();
+
     public Frame(
             long sequenceID,
             CVMat color,