feat(data-processing): add isolation forest anomaly detection #468
Open
David-patrick-chuks wants to merge 3 commits into Pulsefy:main
Conversation
Contributor
Please resolve conflicts
…-forest # Conflicts: # apps/data-processing/src/main.py
bac686d to 296973e
Author
@Cedarich Done
Contributor
Kindly fix workflow
Contributor
Kindly fix failing workflow
Summary
Replaced the existing Z-score-only anomaly detector with an ML-based approach using scikit-learn's Isolation Forest in `apps/data-processing/src/anomaly_detector.py`. The update keeps the legacy Z-score logic available for comparison, adds configurable `contamination` support, and updates the pipeline flow so new samples are evaluated against the existing rolling baseline before being added.

To preserve prior spike-detection behavior while meeting the new ML requirement, the detector now uses Isolation Forest as the primary multi-feature model and still surfaces strong legacy Z-score anomalies as part of the final decision. This allows comparison against the old logic while avoiding regressions on obvious spike cases.
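For reviewers, the evaluate-then-append flow and the combined decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `RollingDetector`, `observe`, `model_flag`, and all default values are hypothetical names and numbers chosen for the example. The ML model is represented by a pluggable callable so the sketch runs with the standard library only; in the PR that role is played by scikit-learn's Isolation Forest, whose `predict()` returns -1 for outliers.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Callable, Deque, List, Optional

# Hypothetical signature: (baseline_history, new_value) -> True if anomalous.
# A real implementation could wrap a fitted IsolationForest here.
ModelFlag = Callable[[List[float], float], bool]


class RollingDetector:
    """Illustrative detector: scores a sample against the baseline, then stores it."""

    def __init__(
        self,
        window: int = 50,          # assumed rolling-window size
        z_threshold: float = 3.0,  # assumed legacy Z-score cutoff
        min_samples: int = 5,      # assumed warm-up length
        model_flag: Optional[ModelFlag] = None,
    ) -> None:
        self.baseline: Deque[float] = deque(maxlen=window)
        self.z_threshold = z_threshold
        self.min_samples = min_samples
        self.model_flag = model_flag

    def zscore(self, value: float) -> float:
        """Legacy Z-score of `value` against the current baseline only."""
        history = list(self.baseline)
        if len(history) < self.min_samples:
            return 0.0  # not enough history yet; treat as normal
        sigma = pstdev(history)
        if sigma == 0:
            return 0.0  # simplification: a flat baseline yields no score
        return (value - mean(history)) / sigma

    def observe(self, value: float) -> bool:
        """Return True if `value` is anomalous, then add it to the baseline."""
        history = list(self.baseline)

        # 1. Evaluate against the baseline as it was BEFORE this sample.
        z_flag = abs(self.zscore(value)) >= self.z_threshold
        m_flag = False
        if self.model_flag is not None and len(history) >= self.min_samples:
            m_flag = bool(self.model_flag(history, value))

        # 2. Only now does the sample join the rolling baseline.
        self.baseline.append(value)

        # 3. Either signal is enough: strong Z-score spikes still surface.
        return z_flag or m_flag


if __name__ == "__main__":
    detector = RollingDetector()
    for sample in [10.0, 10.5, 9.5, 10.2, 9.8, 10.1, 9.9, 50.0]:
        print(sample, detector.observe(sample))
```

The ordering in `observe` is the point of the sketch: if the sample were appended first, a spike would inflate the mean and standard deviation it is then measured against, which can pull its own Z-score below the threshold.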
Linked Issue
Closes #453
Type of Change
Validation
Commands Run
Results
- `python3 -m compileall apps/data-processing/src/anomaly_detector.py apps/data-processing/src/main.py apps/data-processing/src/scheduler.py` completed successfully.
- `python -m pytest --version` returned `pytest 9.0.2`.
- `python -m pytest tests/test_anomaly_detector.py` was executed locally.
Screenshots / Test Evidence
Attach terminal screenshots here.
[attach here] [attach here] [attach here]
Documentation
Documentation note: N/A for separate docs files; this is a backend/data-processing change.
Screenshots/videos note: N/A for UI, but terminal validation screenshots are attached in the Validation section.
Checklist
feat/, fix/, or docs/