-
Notifications
You must be signed in to change notification settings - Fork 150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] Replace SFA with SFAFast in REDCOMETS #2418
base: main
Are you sure you want to change the base?
Conversation
Thank you for contributing to
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, could you do a small evaluation on how this impacts the words output and the overall accuracy/speed of the classifier? Some of the smaller UCR datasets should be fine.
May be of interest @zy18811. Don't need to do so if they are fine with the change, but wouldn't want to make a big change without either them knowing or evaluating.
Hi, very cool. I am a bit surprised that there are no changes required on the test cases for REDCOMETS. @MatthewMiddlehurst is there a way to check, if the |
@itsdivya1309 thanks for doing this PR. My (very brief) look suggests that the word outputs are identical between SFA and SFAFast in the case of the whole series transform that REDCOMETS uses so absolutely no difference in accuracy (hence the old test cases still pass). It seems to give a small speedup most of the time, but is sometimes slower - not sure if this is additonal numba overhead or something else. It could do with a little bit more thorough of an analysis to confirm SFAFast is an improvement overall wrt. speed. I also remember last time I looked at doing this there were some issues with particular alphabet sizes and word lengths on some datasets due to how SFA/SFAFast stores the words and numba having a max integer size. I'll have a look and see if I can replicate any of this - but I can see that @patrickzib has made a number of improvements to SFAFast since I last looked so might be fixed anyway. |
@zy18811 the class will still be limited in sizes due to the 64 bit limit for words required in the functions. The original does not have this. |
I've done some quick analysis, 40 datasets 10 resamples, and it looks like SFA Fast has identical accuracy with a speedup. Regarding the numba errors discussed above, I think I ran into the issues when experimenting with different alphabet size ranges but the current range should have no problems I believe. Overall, PR looks good to me - thanks @itsdivya1309. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks. Wonder if the same can be done with TDE?
That is very cool! |
Reference Issues/PRs
Closes: #1742
What does this implement/fix? Explain your changes.
Replace SFA with the faster version, SFAFast.
Does your contribution introduce a new dependency? If yes, which one?
None
Any other comments?
PR checklist
For all contributions