Can classifier update() be faster than training from scratch? #123

DSA101 · 2016-04-15T18:54:28Z

I am building a dataset and training a NaiveBayesClassifier as the dataset grows. Instead of retraining the classifier from scratch every time I add a few new entries, I was hoping to use the update() method to train on just the new entries and cut training time. What I discovered is that loading a pickled, trained classifier and updating it with only the new entries is no faster than retraining it from scratch. Re-reading the docs, they say that update() will "Update the classifier with new training data and re-trains the classifier", which implies retraining on the entire dataset...

Question: is there such a thing as incremental retraining, or does the classifier realistically process the entire dataset from scratch every time I want to update it with new data?
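
For context on why incremental updates are possible in principle: Naive Bayes is built from per-label counts, so new examples can be folded in by incrementing those counts without revisiting old data. Below is a minimal sketch of that idea in plain Python. It is not TextBlob's implementation; the class `IncrementalNB` and its methods are hypothetical names made up for illustration, and the whitespace tokenizer and add-one smoothing are simplifying assumptions.

```python
import math
from collections import Counter, defaultdict


class IncrementalNB:
    """Toy multinomial Naive Bayes (hypothetical, for illustration only).

    It stores raw counts rather than derived probabilities, so new
    examples can be added without touching previously seen data.
    """

    def __init__(self):
        self.label_counts = Counter()            # documents seen per label
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.vocab = set()                       # all words seen so far

    def update(self, examples):
        """Fold new (text, label) pairs into the counts.

        The cost depends only on the size of `examples`,
        not on how much data was seen before.
        """
        for text, label in examples:
            self.label_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def classify(self, text):
        """Return the label with the highest log-posterior (add-one smoothing)."""
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.label_counts.items():
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in text.lower().split():
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


clf = IncrementalNB()
clf.update([("I love this library", "pos"), ("this is terrible", "neg")])
clf.update([("what a great tool", "pos")])  # only the new entry is processed
print(clf.classify("great library"))
```

Because the model state is just counts, updating in two steps yields the same state as training once on all the data. Whether a given library's update() works this way depends on how it stores its model, so measuring it (as done above in this thread) is the reliable check.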

IvRRimum · 2016-06-06T14:52:59Z

Hey, I have a question. How do you save the classifications? I have an SQLite database with a string and a status for each entry, but how do I save the classifications themselves?
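
One possible approach, sketched with only the standard library: keep each predicted label in a column next to the row it belongs to, and store the trained model itself as a pickled BLOB (the first post in this thread mentions pickling the classifier). The table names, column names, and the `model` dictionary below are all hypothetical stand-ins; a real trained classifier object would take the place of `model`, assuming it is picklable.

```python
import pickle
import sqlite3

# In-memory database for the example; pass a file path to persist it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (text TEXT, status TEXT, predicted TEXT)")
conn.execute("CREATE TABLE models (name TEXT PRIMARY KEY, data BLOB)")

# Hypothetical stand-in for a trained classifier object.
model = {"pos": {"great": 2}, "neg": {"terrible": 1}}

# Save a classification result alongside the original string and status.
conn.execute(
    "INSERT INTO entries VALUES (?, ?, ?)",
    ("great tool", "reviewed", "pos"),
)

# Save the serialized model itself.
conn.execute(
    "INSERT INTO models VALUES (?, ?)",
    ("nb-v1", pickle.dumps(model)),
)
conn.commit()

# Later: load the model back from the database.
(data,) = conn.execute(
    "SELECT data FROM models WHERE name = ?", ("nb-v1",)
).fetchone()
restored = pickle.loads(data)
```

Only unpickle data you wrote yourself or otherwise trust, since loading a pickle can execute arbitrary code.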

sloria · 2017-08-16T03:47:09Z

#136 is now merged and released.

DSA101 changed the title ~~Can classifier update() be faster than train from scratch?~~ Can classifier update() be faster than training from scratch? Apr 15, 2016

This was referenced Sep 1, 2016

NaiveBayesClassifier taking too long #63

Closed

Attempting to fix slow NaiveBayes #136

Merged

sloria closed this as completed Aug 16, 2017
