31 commits
3a5dd04
Created Apriltag ML assisted Apriltag settings
DoctorFogarty Jan 19, 2026
d05a07c
Small change to stop spamming logs every time a pipe setting check oc…
DoctorFogarty Jan 19, 2026
d5834a7
remove duplicated .tflite, remove tool-versions.yaml
DoctorFogarty Jan 19, 2026
f839059
Homography transform added to requirements
DoctorFogarty Jan 19, 2026
385efbe
Match existing pattern for settingsStore
DoctorFogarty Jan 19, 2026
bb6226e
Minor comment cleanup, added source umich documentation to where matr…
DoctorFogarty Jan 19, 2026
7242710
Unit testing homography transformation
DoctorFogarty Jan 19, 2026
dd405a2
ROI Decimate should always be 1
DoctorFogarty Jan 20, 2026
c1d3558
testing enhancements to ROI size fallback
DoctorFogarty Jan 21, 2026
bab0fdd
Thread pooling
DoctorFogarty Jan 23, 2026
fcb648e
Revert "testing enhancements to ROI size fallback"
DoctorFogarty Jan 23, 2026
caf1ddb
Adaptive Tag Resizing to fix poor near field performance
DoctorFogarty Jan 23, 2026
0a7a283
AprilTags are Y down, UI selector
DoctorFogarty Jan 24, 2026
032a635
Removed Subpix refinement as it was not well informed. Removed auto p…
DoctorFogarty Feb 4, 2026
6188bc1
feat: Model selector on AprilTag screen after choosing AI Acceleration
DoctorFogarty Feb 16, 2026
da50228
Slightly higher atr
DoctorFogarty Feb 20, 2026
33d48bf
AprilTag Pipeline ROI box viewer
DoctorFogarty Feb 20, 2026
b5d40e7
Additive Pixel Padding strategy change
DoctorFogarty Feb 21, 2026
e0a72ca
Synced Changes from 2026.3.2
judsonjames Mar 21, 2026
ff5f586
Add TFLite and RKNN models to source for review
judsonjames Mar 24, 2026
6fb5e95
Removed old model weights.
DoctorFogarty Mar 26, 2026
3ff9b45
Added a configurable multi-tag estimate ambiguity filter
DoctorFogarty Mar 8, 2026
fd40986
lint
samfreund Mar 30, 2026
f2565b3
Move `MLDetectionResult` to standalone record
judsonjames Mar 31, 2026
bd602f3
WPIFormat on MLDetectionResult
judsonjames Mar 31, 2026
a609bcf
backend 27 migration
samfreund May 6, 2026
588de64
migrate to proper model references
samfreund May 6, 2026
7db30e1
Remove the single tag high ambiguity rejection controls from multi-ta…
DoctorFogarty May 12, 2026
63f6848
Make the ML AprilTag pipeline use the UI settings instead of hard cod…
DoctorFogarty May 13, 2026
3da4a13
Removed fallback to traditional pipeline flag since its existence wa…
DoctorFogarty May 13, 2026
2257007
Documentation
DoctorFogarty May 13, 2026
1 change: 1 addition & 0 deletions docs/source/docs/apriltag-pipelines/index.md
@@ -4,6 +4,7 @@
about-apriltags
detector-types
2D-tracking-tuning
ml-tag
3D-tracking
multitag
coordinate-systems
55 changes: 55 additions & 0 deletions docs/source/docs/apriltag-pipelines/ml-tag.md
@@ -0,0 +1,55 @@
# ML-Tag (ML AprilTag Acceleration)

## How does it work?

ML-Tag is an acceleration mode within the AprilTag pipeline. Instead of running the AprilTag detector across the full camera frame, an ML model running on the coprocessor's NPU first proposes bounding-box regions where AprilTags appear to be in the image, and the standard AprilTag decoder then runs only inside those regions. On supported hardware this reduces the work the classical detector has to do, leaving more headroom for higher resolutions, higher framerates, and lower latency.

The ML model is purely a localizer: it answers "is there an AprilTag here?" and nothing else. It does not read the tag's ID, identify its family, recover its orientation, or estimate its pose — all of that is still done by the same classical detector used in the standard AprilTag pipeline. ML-Tag is therefore transparent to your robot code: detections come out as the same targets, and {ref}`3D Tracking <docs/apriltag-pipelines/3D-tracking:3D Tracking>` and {ref}`MultiTag <docs/apriltag-pipelines/multitag:MultiTag Localization>` work without any changes.
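
The two-stage hand-off can be sketched as follows. This is an illustrative outline only; the class and method names are assumptions, not PhotonVision's actual code:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative two-stage ML-Tag flow: an ML localizer proposes regions,
 *  then the unchanged classical decoder runs only inside them. */
public class MlTagFlow {
    record Roi(int x, int y, int w, int h) {}
    record TagDetection(int id, double decisionMargin) {}

    /** Stage 1: the NPU model answers only "is there a tag here?". */
    static List<Roi> proposeRegions(byte[][] frame) {
        // Stand-in for YOLO inference on the NPU.
        return List.of(new Roi(120, 80, 64, 64));
    }

    /** Stage 2: the classical detector decodes ID, family, and pose. */
    static List<TagDetection> decodeInside(byte[][] frame, List<Roi> rois) {
        List<TagDetection> detections = new ArrayList<>();
        for (Roi roi : rois) {
            // Crop the frame to roi (plus padding) and run the standard
            // AprilTag detector on the much smaller crop.
        }
        return detections;
    }

    public static void main(String[] args) {
        byte[][] frame = new byte[480][640];
        List<Roi> rois = proposeRegions(frame);
        System.out.println("Proposed ROIs: " + rois.size()); // prints "Proposed ROIs: 1"
        decodeInside(frame, rois);
    }
}
```

Because stage 2 is the same detector as the standard pipeline, the targets it emits are indistinguishable from non-accelerated detections.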

## Hardware Requirements

ML-Tag runs on the same NPUs PhotonVision uses for object detection:

- RK3588 boards — Orange Pi 5 / 5 Plus, Rock 5C, CoolPi 4B (RKNN backend)
- QCS6490 boards — Rubik Pi 3 (TFLite backend)

For installation, model conversion, and other platform-specific notes, see the coprocessor pages used for object detection: {ref}`Orange Pi 5 <docs/objectDetection/opi:Orange Pi 5 (and variants) Object Detection>` and {ref}`Rubik Pi 3 <docs/objectDetection/rubik:Rubik Pi 3 Object Detection>`.

:::{note}
The ML-Tag controls only appear in the AprilTag pipeline tab when a supported NPU backend is detected. On other coprocessors the AprilTag pipeline runs in its standard CPU-only configuration.
:::

## Enabling ML-Tag

In the AprilTag pipeline tab, scroll past the standard AprilTag tuning settings to the "ML-Tag (ML AprilTag Acceleration)" section. Toggle **Enable ML-Tag**, and the remaining ML-Tag controls will appear below.

## Tuning ML-Tag

The standard AprilTag tuning parameters described in {ref}`2D AprilTag Tuning / Tracking <docs/apriltag-pipelines/2D-tracking-tuning:Tuning AprilTags>` still apply — they control how the detector decodes each region the ML model hands it. The settings below only control the ML localizer stage.

### Model

Selects which ML model the NPU uses to locate AprilTag regions in the image. PhotonVision ships with a YoloV11 AprilTag model for each supported NPU; any compatible models you have uploaded will also appear here. Only models compatible with the detected NPU backend are listed.

### Confidence Threshold

The minimum confidence score (between 0 and 1) the ML model must report for a proposed region before it is passed to the classical decoder. Higher values reject more weak proposals — good for cutting false positives, but at the risk of missing tags the model is less sure about. The default of 0.5 is a reasonable starting point; lower it if tags are being missed and the ROI overlay confirms the model is hesitating on them.
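
The gate itself is a simple filter. A minimal sketch, with the `Proposal` record and method names assumed for illustration:

```java
import java.util.List;

/** Hypothetical sketch of the confidence gate applied to ML proposals. */
public class ConfidenceGate {
    record Proposal(int x, int y, int w, int h, double confidence) {}

    /** Only proposals at or above the threshold reach the classical decoder. */
    static List<Proposal> gate(List<Proposal> proposals, double threshold) {
        return proposals.stream().filter(p -> p.confidence() >= threshold).toList();
    }

    public static void main(String[] args) {
        List<Proposal> proposals = List.of(
            new Proposal(120, 80, 64, 64, 0.91),  // confident proposal: kept
            new Proposal(400, 60, 30, 30, 0.35)); // hesitant proposal: dropped at the 0.5 default
        System.out.println(gate(proposals, 0.5).size()); // prints 1
    }
}
```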

### NMS Threshold

The non-maximum suppression overlap cutoff (between 0 and 1) used to merge overlapping proposals of the same tag. Higher values allow more overlapping boxes through; lower values are stricter about merging duplicates. The default of 0.45 works well in most cases.
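
Under the hood this is standard greedy NMS on intersection-over-union. A hedged sketch of the idea (names are illustrative, not PhotonVision's API):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative greedy non-maximum suppression over tag proposals. */
public class NmsSketch {
    record Box(double x, double y, double w, double h, double confidence) {}

    /** Intersection-over-union of two axis-aligned boxes. */
    static double iou(Box a, Box b) {
        double ix = Math.max(0, Math.min(a.x() + a.w(), b.x() + b.w()) - Math.max(a.x(), b.x()));
        double iy = Math.max(0, Math.min(a.y() + a.h(), b.y() + b.h()) - Math.max(a.y(), b.y()));
        double inter = ix * iy;
        return inter / (a.w() * a.h() + b.w() * b.h() - inter);
    }

    /** Keep the most confident box; drop any later box whose IoU with a
     *  kept box exceeds the NMS threshold. */
    static List<Box> nms(List<Box> boxes, double nmsThreshold) {
        List<Box> sorted = new ArrayList<>(boxes);
        sorted.sort(Comparator.comparingDouble(Box::confidence).reversed());
        List<Box> kept = new ArrayList<>();
        for (Box candidate : sorted) {
            boolean overlapsKept = kept.stream().anyMatch(k -> iou(k, candidate) > nmsThreshold);
            if (!overlapsKept) kept.add(candidate);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Box> boxes = List.of(
            new Box(100, 100, 60, 60, 0.90),  // strongest proposal for a tag
            new Box(105, 102, 60, 60, 0.70),  // near-duplicate of the same tag
            new Box(300, 200, 50, 50, 0.80)); // a second, separate tag
        System.out.println(nms(boxes, 0.45).size()); // prints 2: the near-duplicate is merged
    }
}
```

Raising the threshold toward 1.0 lets the near-duplicate survive; lowering it merges boxes even at modest overlap.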

### ROI Padding (px)

The number of pixels of padding added around each proposed region before the classical decoder runs on it. Padding is applied in pixels, so it naturally adapts to tag size in the image: small, far-away tags receive proportionally more expansion (helping the decoder see their borders), while large, nearby tags receive less. Raise this if tag corners are being clipped at the edges of the ROIs; lower it if neighboring tags are being merged into a single region.
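
A worked sketch of additive padding, assuming the expanded region is clamped to the frame bounds (the `Rect` record and method names are illustrative):

```java
/** Illustrative additive pixel padding on an ROI, clamped to the frame. */
public class RoiPadding {
    record Rect(int x, int y, int w, int h) {}

    static Rect pad(Rect roi, int paddingPx, int frameWidth, int frameHeight) {
        int x = Math.max(0, roi.x() - paddingPx);
        int y = Math.max(0, roi.y() - paddingPx);
        int x2 = Math.min(frameWidth, roi.x() + roi.w() + paddingPx);
        int y2 = Math.min(frameHeight, roi.y() + roi.h() + paddingPx);
        return new Rect(x, y, x2 - x, y2 - y);
    }

    public static void main(String[] args) {
        // A small, far-away tag: 40 px of padding quintuples its 20 px width.
        Rect far = pad(new Rect(300, 200, 20, 20), 40, 640, 480);
        // A large, nearby tag: the same 40 px is a modest relative expansion.
        Rect near = pad(new Rect(100, 100, 300, 300), 40, 640, 480);
        System.out.println(far.w() + " " + near.w()); // prints "100 380"
    }
}
```

The fixed pixel count is what makes the behavior adaptive: 40 px is a 5x area-side change for a 20 px tag but barely a 27% change for a 300 px one.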

### Show ROI Boxes

When enabled, draws the ML model's proposed bounding boxes onto the processed stream. This is valuable when tuning Confidence Threshold and ROI Padding — you can see exactly which regions the model is finding and how tightly they fit each tag. Turn this off in competition for a cleaner stream.

## Limitations

- ML-Tag is only available on the supported NPU coprocessors listed above. On any other hardware the AprilTag pipeline behaves in the standard manner.
- When several overlapping ROIs decode the same tag, only the detection with the highest decision margin is kept. In practice this is the desired behavior, but it does mean alternate solutions from the same tag are not surfaced.
- The ML model is purely a localizer. If a tag is in the frame but the model fails to propose a region around it, the classical decoder will never see it and the tag will not be detected. When tuning, use **Show ROI Boxes** to confirm the model is finding the tags you care about, and lower **Confidence Threshold** if it is not.
- In testing, the ML model has been able to identify AprilTags at distances where the region of the image lacks the pixel density the decoder needs to make out which specific tag it is seeing. In that regime the model proposes an ROI but the classical decoder cannot produce a detection from it.
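
The highest-decision-margin deduplication described in the limitations above can be sketched as follows (assuming each decoded detection carries the decoder's decision margin; names are illustrative):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative dedup: when overlapping ROIs decode the same tag ID,
 *  keep only the detection with the highest decision margin. */
public class TagDedup {
    record TagDetection(int id, double decisionMargin) {}

    static List<TagDetection> dedupe(List<TagDetection> detections) {
        Map<Integer, TagDetection> best = new HashMap<>();
        for (TagDetection d : detections) {
            // merge() keeps whichever detection has the larger margin.
            best.merge(d.id(), d, (a, b) -> a.decisionMargin() >= b.decisionMargin() ? a : b);
        }
        return List.copyOf(best.values());
    }

    public static void main(String[] args) {
        List<TagDetection> detections = List.of(
            new TagDetection(7, 55.0),  // tag 7 decoded in one ROI
            new TagDetection(7, 80.0),  // same tag decoded in an overlapping ROI
            new TagDetection(3, 60.0)); // a different tag
        System.out.println(dedupe(detections).size()); // prints 2
    }
}
```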
101 changes: 100 additions & 1 deletion photon-client/src/components/dashboard/tabs/AprilTagTab.vue
@@ -1,11 +1,13 @@
<script setup lang="ts">
import { PipelineType, type AprilTagPipelineSettings, AprilTagFamily } from "@/types/PipelineTypes";
import PvSelect from "@/components/common/pv-select.vue";
import PvSelect, { type SelectItem } from "@/components/common/pv-select.vue";
import PvSlider from "@/components/common/pv-slider.vue";
import PvSwitch from "@/components/common/pv-switch.vue";
import { computed } from "vue";
import { useStateStore } from "@/stores/StateStore";
import { useCameraSettingsStore } from "@/stores/settings/CameraSettingsStore";
import { useSettingsStore } from "@/stores/settings/GeneralSettingsStore";
import type { ObjectDetectionModelProperties } from "@/types/SettingTypes";
import { useDisplay } from "vuetify";

// TODO fix pipeline typing in order to fix this, the store settings call should be able to infer that only valid pipeline type settings are exposed based on pre-checks for the entire config section
@@ -17,6 +19,34 @@ const { mdAndDown } = useDisplay();
const interactiveCols = computed(() =>
mdAndDown.value && (!useStateStore().sidebarFolded || useCameraSettingsStore().isDriverMode) ? 8 : 7
);

// Check if ML detection is available on this platform
const mlDetectionAvailable = computed(() => useSettingsStore().general.supportedBackends.length > 0);

const modelWrapper = computed<SelectItem<string>[]>(() =>
supportedModels.value.map((model) => ({
name: model.nickname,
value: model.modelPath
}))
);

const selectedModel = computed<string>({
get: () => currentPipelineSettings.value.model?.modelPath ?? "",
set: (value) => {
const model = supportedModels.value.find((supportedModel) => supportedModel.modelPath === value);
if (model) {
useCameraSettingsStore().changeCurrentPipelineSetting({ model }, true);
}
}
});
// Filter models to only those supported by available backends
const supportedModels = computed<ObjectDetectionModelProperties[]>(() => {
const { availableModels, supportedBackends } = useSettingsStore().general;
const isSupported = (model: ObjectDetectionModelProperties) => {
return supportedBackends.some((backend: string) => backend.toLowerCase() === model.family.toLowerCase());
};
return availableModels.filter(isSupported);
});
</script>

<template>
@@ -90,5 +120,74 @@ const interactiveCols = computed(() =>
(value) => useCameraSettingsStore().changeCurrentPipelineSetting({ refineEdges: value }, false)
"
/>

<!-- ML-Tag (ML AprilTag Acceleration) Section -->
<v-divider v-if="mlDetectionAvailable" class="mt-3 mb-2" />
<div v-if="mlDetectionAvailable" class="ml-settings-section">
<p class="text-subtitle-2 mb-2">ML-Tag (ML AprilTag Acceleration)</p>
<pv-switch
v-model="currentPipelineSettings.useMLDetection"
:switch-cols="interactiveCols"
label="Enable ML-Tag"
tooltip="ML-Tag uses an NPU-accelerated ML model for faster AprilTag detection. Requires compatible hardware (RK3588 or QCS6490)."
@update:modelValue="
(value) => useCameraSettingsStore().changeCurrentPipelineSetting({ useMLDetection: value }, false)
"
/>
<div v-if="currentPipelineSettings.useMLDetection">
<pv-select
v-model="selectedModel"
label="Model"
tooltip="The model used for ML-Tag AprilTag detection"
:select-cols="interactiveCols"
:items="modelWrapper"
/>
<pv-slider
v-model="currentPipelineSettings.mlConfidenceThreshold"
:slider-cols="interactiveCols"
label="Confidence Threshold"
tooltip="Minimum confidence score for ML detection (0-1). Higher values reduce false positives"
:min="0.1"
:max="1.0"
:step="0.05"
@update:modelValue="
(value) => useCameraSettingsStore().changeCurrentPipelineSetting({ mlConfidenceThreshold: value }, false)
"
/>
<pv-slider
v-model="currentPipelineSettings.mlNmsThreshold"
:slider-cols="interactiveCols"
label="NMS Threshold"
tooltip="Non-maximum suppression threshold for overlapping detections (0-1)"
:min="0.1"
:max="1.0"
:step="0.05"
@update:modelValue="
(value) => useCameraSettingsStore().changeCurrentPipelineSetting({ mlNmsThreshold: value }, false)
"
/>
<pv-slider
v-model="currentPipelineSettings.mlRoiPaddingPixels"
:slider-cols="interactiveCols"
label="ROI Padding (px)"
tooltip="Pixels of padding added around each detected region. Naturally adapts: small/far tags get more relative expansion, large/close tags get less"
:min="10"
:max="150"
:step="5"
@update:modelValue="
(value) => useCameraSettingsStore().changeCurrentPipelineSetting({ mlRoiPaddingPixels: value }, false)
"
/>
<pv-switch
v-model="currentPipelineSettings.showDetectionBoxes"
:switch-cols="interactiveCols"
label="Show ROI Boxes"
tooltip="Draw the ML model's detected bounding boxes on the processed image for tuning visualization"
@update:modelValue="
(value) => useCameraSettingsStore().changeCurrentPipelineSetting({ showDetectionBoxes: value }, false)
"
/>
</div>
</div>
</div>
</template>
16 changes: 15 additions & 1 deletion photon-client/src/types/PipelineTypes.ts
@@ -276,6 +276,13 @@ export interface AprilTagPipelineSettings extends PipelineSettings {
tagFamily: AprilTagFamily;
doMultiTarget: boolean;
doSingleTargetAlways: boolean;
// ML-assisted detection settings
useMLDetection: boolean;
mlConfidenceThreshold: number;
mlNmsThreshold: number;
mlRoiPaddingPixels: number;
model: ObjectDetectionModelProperties;
showDetectionBoxes: boolean;
}
export type ConfigurableAprilTagPipelineSettings = Partial<
Omit<AprilTagPipelineSettings, "pipelineType" | "hammingDist" | "debug">
@@ -299,7 +306,14 @@ export const DefaultAprilTagPipelineSettings: AprilTagPipelineSettings = {
threads: 4,
tagFamily: AprilTagFamily.Family36h11,
doMultiTarget: false,
doSingleTargetAlways: false
doSingleTargetAlways: false,
// ML-assisted detection defaults
useMLDetection: false,
mlConfidenceThreshold: 0.5,
mlNmsThreshold: 0.45,
mlRoiPaddingPixels: 40,
model: {} as ObjectDetectionModelProperties,
showDetectionBoxes: true
};

export interface ArucoPipelineSettings extends PipelineSettings {
Original file line number Diff line number Diff line change
@@ -169,6 +169,16 @@ private NeuralNetworkModelsSettings getShippedProperties(File modelsDirectory) {
Family.RKNN,
Version.YOLOV11));

nnProps.addModelProperties(
new ModelProperties(
Path.of(modelsDirectory.getAbsolutePath(), "apriltagV4-yolo11.rknn"),
"AprilTag V4",
new LinkedList<String>(List.of("AprilTag")),
640,
640,
Family.RKNN,
Version.YOLOV11));

nnProps.addModelProperties(
new ModelProperties(
Path.of(modelsDirectory.getAbsolutePath(), "yolov8nCOCO.tflite"),
@@ -189,6 +199,16 @@ private NeuralNetworkModelsSettings getShippedProperties(File modelsDirectory) {
Family.RUBIK,
Version.YOLOV11));

nnProps.addModelProperties(
new ModelProperties(
Path.of(modelsDirectory.getAbsolutePath(), "apriltagV4-yolo11.tflite"),
"AprilTag V4",
new LinkedList<String>(List.of("AprilTag")),
640,
640,
Family.RUBIK,
Version.YOLOV11));

return nnProps;
}

@@ -311,6 +331,54 @@ public Optional<Model> getDefaultModel() {
return models.get(supportedBackends.get(0)).stream().findFirst();
}

/**
* Gets the default AprilTag detection model. Searches for a model with "apriltag" in its nickname
* (case-insensitive).
*
* @return Optional containing the AprilTag model if found
*/
public Optional<Model> getDefaultAprilTagModel() {
if (models == null || supportedBackends.isEmpty()) {
return Optional.empty();
}

for (Family backend : supportedBackends) {
if (models.containsKey(backend)) {
Optional<Model> model =
models.get(backend).stream()
.filter(m -> m.getNickname().toLowerCase().contains("apriltag"))
.findFirst();
if (model.isPresent()) {
return model;
}
}
}
return Optional.empty();
}

/**
* Gets a model by its exact nickname.
*
* @param name The nickname of the model to retrieve
* @return Optional containing the model if found
*/
public Optional<Model> getModelByName(String name) {
if (models == null || supportedBackends.isEmpty() || name == null) {
return Optional.empty();
}

for (Family backend : supportedBackends) {
if (models.containsKey(backend)) {
Optional<Model> model =
models.get(backend).stream().filter(m -> m.getNickname().equals(name)).findFirst();
if (model.isPresent()) {
return model;
}
}
}
return Optional.empty();
}

// Do checking later on, when we create the model object
private void loadModel(Path path) {
if (models == null) {
Original file line number Diff line number Diff line change
@@ -17,6 +17,8 @@

package org.photonvision.vision.frame;

import java.util.List;
import org.opencv.core.RotatedRect;
import org.photonvision.common.logging.LogGroup;
import org.photonvision.common.logging.Logger;
import org.photonvision.common.util.math.MathUtils;
@@ -36,6 +38,9 @@ public class Frame implements Releasable {

public final FrameStaticProperties frameStaticProperties;

/** Optional ML detection ROI bounding boxes for visualization. Set by ML-assisted pipelines. */
public List<RotatedRect> mlDetectionRois = List.of();
Contributor:
Frame isn't the right place to maintain this state. Can it move to the pipeline result?


public Frame(
long sequenceID,
CVMat color,