AI-Based AprilTag Pipeline Acceleration #2410
Open
DoctorFogarty wants to merge 31 commits into PhotonVision:main from DoctorFogarty:apriltag-ml-experimental-sync
31 commits
3a5dd04 Created Apriltag ML assisted Apriltag settings (DoctorFogarty)
d05a07c Small change to stop spamming logs every time a pipe setting check oc… (DoctorFogarty)
d5834a7 remove duplicated .tflite, remove tool-versions.yaml (DoctorFogarty)
f839059 Homography transform added to requirements (DoctorFogarty)
385efbe Match existing pattern for settingsStore (DoctorFogarty)
bb6226e Minor comment cleanup, added source umich documentation to where matr… (DoctorFogarty)
7242710 Unit testing homography transformation (DoctorFogarty)
dd405a2 ROI Decimate should always be 1 (DoctorFogarty)
c1d3558 testing enhancements to ROI size fallback (DoctorFogarty)
bab0fdd Thread pooling (DoctorFogarty)
fcb648e Revert "testing enhancements to ROI size fallback" (DoctorFogarty)
caf1ddb Adaptive Tag Resizing to fix poor near field performance (DoctorFogarty)
0a7a283 AprilTags are Y down, UI selector (DoctorFogarty)
032a635 Removed Subpix refinement as it was not well informed. Removed auto p… (DoctorFogarty)
6188bc1 feat: Model selector on AprilTag screen after choosing AI Acceleration (DoctorFogarty)
da50228 Slightly higher atr (DoctorFogarty)
33d48bf AprilTag Pipeline ROI box viewer (DoctorFogarty)
b5d40e7 Additive Pixel Padding strategy change (DoctorFogarty)
e0a72ca Synced Changes from 2026.3.2 (judsonjames)
ff5f586 Add TFLite and RKNN models to source for review (judsonjames)
6fb5e95 Removed old model weights. (DoctorFogarty)
3ff9b45 Added a configurable multi-tag estimate ambiguity filter (DoctorFogarty)
fd40986 lint (samfreund)
f2565b3 Move `MLDetectionResult` to standalone record (judsonjames)
bd602f3 WPIFormat on MLDetectionResult (judsonjames)
a609bcf backend 27 migratoin (samfreund)
588de64 migrate to proper model references (samfreund)
7db30e1 Remove the single tag high ambiguity rejection controls from multi-ta… (DoctorFogarty)
63f6848 Make the ML AprilTag pipeline use the UI settings instead of hard cod… (DoctorFogarty)
3da4a13 Removed fallback to traditional pipeline flag since it's existance wa… (DoctorFogarty)
2257007 Documentation (DoctorFogarty)
@@ -4,6 +4,7 @@
about-apriltags
detector-types
2D-tracking-tuning
ml-tag
3D-tracking
multitag
coordinate-systems
@@ -0,0 +1,55 @@
# ML-Tag (ML AprilTag Acceleration)

## How does it work?

ML-Tag is an acceleration mode within the AprilTag pipeline. Instead of running the AprilTag detector across the full camera frame, an ML model running on the coprocessor's NPU first proposes bounding-box regions where AprilTags appear to be in the image, and the standard AprilTag decoder then runs only inside those regions. On supported hardware this reduces the work the classical detector has to do, leaving more headroom for higher resolutions, higher framerates, and lower latency.

The ML model is purely a localizer: it answers "is there an AprilTag here?" and nothing else. It does not read the tag's ID, identify its family, recover its orientation, or estimate its pose. All of that is still done by the same classical detector used in the standard AprilTag pipeline. ML-Tag is therefore transparent to your robot code: detections come out as the same targets, and {ref}`3D Tracking <docs/apriltag-pipelines/3D-tracking:3D Tracking>` and {ref}`MultiTag <docs/apriltag-pipelines/multitag:MultiTag Localization>` work without any changes.

## Hardware Requirements

ML-Tag runs on the same NPUs PhotonVision uses for object detection:

- RK3588 boards: Orange Pi 5 / 5 Plus, Rock 5C, CoolPi 4B (RKNN backend)
- QCS6490 boards: Rubik Pi 3 (TFLite backend)

For installation, model conversion, and other platform-specific notes, see the coprocessor pages used for object detection: {ref}`Orange Pi 5 <docs/objectDetection/opi:Orange Pi 5 (and variants) Object Detection>` and {ref}`Rubik Pi 3 <docs/objectDetection/rubik:Rubik Pi 3 Object Detection>`.

:::{note}
The ML-Tag controls only appear in the AprilTag pipeline tab when a supported NPU backend is detected. On other coprocessors the AprilTag pipeline runs in its standard CPU-only configuration.
:::

## Enabling ML-Tag

In the AprilTag pipeline tab, scroll past the standard AprilTag tuning settings to the "ML-Tag (ML AprilTag Acceleration)" section. Toggle **Enable ML-Tag**, and the remaining ML-Tag controls will appear below.

## Tuning ML-Tag

The standard AprilTag tuning parameters described in {ref}`2D AprilTag Tuning / Tracking <docs/apriltag-pipelines/2D-tracking-tuning:Tuning AprilTags>` still apply; they control how the detector decodes each region the ML model hands it. The settings below only control the ML localizer stage.

### Model

Selects which ML model the NPU uses to locate AprilTag regions in the image. PhotonVision ships with a YoloV11 AprilTag model for each supported NPU; any compatible models you have uploaded will also appear here. Only models compatible with the detected NPU backend are listed.

### Confidence Threshold

The minimum confidence score (between 0 and 1) the ML model must report for a proposed region before it is passed to the classical decoder. Higher values reject more weak proposals, which cuts false positives at the risk of missing tags the model is less sure about. The default of 0.5 is a reasonable starting point; lower it if tags are being missed and the ROI overlay confirms the model is hesitating on them.

### NMS Threshold

The non-maximum suppression (NMS) overlap cutoff (between 0 and 1) used to suppress overlapping proposals of the same tag. Higher values allow more overlapping boxes through; lower values suppress duplicates more aggressively. The default of 0.45 works well in most cases.
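
For illustration only, not code from this PR: the two thresholds above behave like the confidence filter and greedy non-maximum suppression found in most YOLO-style detectors. The sketch below assumes overlap is measured as intersection-over-union (IoU). `ProposalFilter`, `Proposal`, and their fields are hypothetical names, and the PR's actual post-processing may differ.

```java
import java.util.ArrayList;
import java.util.List;

final class ProposalFilter {
    /** Hypothetical proposal: axis-aligned box in pixels plus a model confidence in [0, 1]. */
    record Proposal(double x, double y, double w, double h, double confidence) {}

    /** Intersection-over-union of two boxes; 0 when they do not overlap. */
    static double iou(Proposal a, Proposal b) {
        double ix = Math.max(0, Math.min(a.x() + a.w(), b.x() + b.w()) - Math.max(a.x(), b.x()));
        double iy = Math.max(0, Math.min(a.y() + a.h(), b.y() + b.h()) - Math.max(a.y(), b.y()));
        double inter = ix * iy;
        double union = a.w() * a.h() + b.w() * b.h() - inter;
        return union <= 0 ? 0 : inter / union;
    }

    /** Drops low-confidence proposals, then greedily suppresses overlapping ones. */
    static List<Proposal> filter(List<Proposal> in, double confThreshold, double nmsThreshold) {
        List<Proposal> candidates = new ArrayList<>();
        for (Proposal p : in) {
            if (p.confidence() >= confThreshold) {
                candidates.add(p);
            }
        }
        // Highest confidence first, so the strongest box of each cluster survives.
        candidates.sort((a, b) -> Double.compare(b.confidence(), a.confidence()));

        List<Proposal> kept = new ArrayList<>();
        for (Proposal p : candidates) {
            boolean suppressed = false;
            for (Proposal k : kept) {
                // A higher nmsThreshold lets more overlapping boxes through.
                if (iou(p, k) > nmsThreshold) {
                    suppressed = true;
                    break;
                }
            }
            if (!suppressed) {
                kept.add(p);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Proposal> out = filter(List.of(
                new Proposal(0, 0, 10, 10, 0.9),
                new Proposal(1, 1, 10, 10, 0.8),      // overlaps the first box heavily
                new Proposal(50, 50, 10, 10, 0.3),    // below the confidence threshold
                new Proposal(100, 100, 10, 10, 0.6)), 0.5, 0.45);
        System.out.println(out.size() + " proposals kept");
    }
}
```

In this example the second box overlaps the first with an IoU of about 0.68, so at the default NMS threshold of 0.45 it is suppressed; raising the threshold above 0.68 would let both through.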

### ROI Padding (px)

The number of pixels of padding added around each proposed region before the classical decoder runs on it. Padding is applied in pixels, so it naturally adapts to tag size in the image: small, far-away tags receive proportionally more expansion (helping the decoder see their borders), while large, nearby tags receive less. Raise this if tag corners are being clipped at the edges of the ROIs; lower it if neighboring tags are being merged into a single region.
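
For illustration only, not code from this PR: a minimal sketch of fixed-pixel padding, showing why the same padding expands small boxes proportionally more than large ones. `RoiPadding` and `Roi` are hypothetical names, and clamping the padded box to the frame bounds is an assumption about how the edges are handled.

```java
final class RoiPadding {
    /** Hypothetical ROI: top-left corner plus size, all in pixels. */
    record Roi(int x, int y, int width, int height) {}

    /** Expands the box by paddingPx on every side, clamped to the frame. */
    static Roi pad(Roi box, int paddingPx, int frameWidth, int frameHeight) {
        int x0 = Math.max(0, box.x() - paddingPx);
        int y0 = Math.max(0, box.y() - paddingPx);
        int x1 = Math.min(frameWidth, box.x() + box.width() + paddingPx);
        int y1 = Math.min(frameHeight, box.y() + box.height() + paddingPx);
        return new Roi(x0, y0, x1 - x0, y1 - y0);
    }

    /** Ratio of padded width to original width, ignoring frame clamping. */
    static double relativeGrowth(Roi box, int paddingPx) {
        return (box.width() + 2.0 * paddingPx) / box.width();
    }

    public static void main(String[] args) {
        Roi far = new Roi(100, 100, 20, 20);    // small, far-away tag
        Roi near = new Roi(200, 100, 200, 200); // large, nearby tag
        System.out.println(pad(far, 10, 640, 480) + " growth " + relativeGrowth(far, 10));
        System.out.println(pad(near, 10, 640, 480) + " growth " + relativeGrowth(near, 10));
    }
}
```

With 10 px of padding, a 20 px box grows to 40 px (2x), while a 200 px box grows to 220 px (1.1x).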

### Show ROI Boxes

When enabled, draws the ML model's proposed bounding boxes onto the processed stream. This is valuable when tuning Confidence Threshold and ROI Padding: you can see exactly which regions the model is finding and how tightly they fit each tag. Turn this off in competition for a cleaner stream.

## Limitations

- ML-Tag is only available on the supported NPU coprocessors listed above. On any other hardware the AprilTag pipeline behaves in the standard manner.
- When several overlapping ROIs decode the same tag, only the detection with the highest decision margin is kept. In practice this is the desired behavior, but it does mean alternate solutions from the same tag are not surfaced.
- The ML model is purely a localizer. If a tag is in the frame but the model fails to propose a region around it, the classical decoder will never see it and the tag will not be detected. When tuning, use **Show ROI Boxes** to confirm the model is finding the tags you care about, and lower **Confidence Threshold** if it is not.
- In testing, the ML model has been able to locate AprilTags at distances where the region does not contain enough pixels for the classical decoder to make out which specific tag it is, so a proposed ROI does not guarantee a decoded tag.
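
For illustration only, not code from this PR: the duplicate handling described in the Limitations list (keep only the detection with the highest decision margin) could look roughly like the sketch below. `TagDedup` and `Detection` are hypothetical names, and keying on tag ID alone is a simplifying assumption; an actual implementation may also compare positions to tell apart two physical tags that share an ID.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TagDedup {
    /** Hypothetical decoded detection: tag ID plus the decoder's decision margin. */
    record Detection(int tagId, double decisionMargin) {}

    /** Keeps one detection per tag ID, preferring the highest decision margin. */
    static List<Detection> keepBestPerTag(List<Detection> detections) {
        // LinkedHashMap preserves the order in which tag IDs were first seen.
        Map<Integer, Detection> best = new LinkedHashMap<>();
        for (Detection d : detections) {
            best.merge(d.tagId(), d,
                    (a, b) -> a.decisionMargin() >= b.decisionMargin() ? a : b);
        }
        return new ArrayList<>(best.values());
    }

    public static void main(String[] args) {
        List<Detection> out = keepBestPerTag(List.of(
                new Detection(3, 40.0),   // tag 3 decoded from one ROI
                new Detection(3, 75.0),   // same tag decoded from an overlapping ROI
                new Detection(7, 50.0)));
        System.out.println(out);
    }
}
```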
Frame isn't the right place to maintain this state. Can it move to the pipeline result?