If you build Naive Bayes classifiers with packages such as NLTK, you may notice that training on a large data set can take hours. To avoid losing these results between work sessions, you can save the trained classifier to a disk file using the pickle commands listed further below.
This is useful if you ‘train’ your classifier on one day and then want to use it to ‘predict’ classification results on a new test data set on another day. You can simply reload the classifier from disk into memory without rebuilding it. Note: these pickle files can get extremely large.
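If file size becomes a problem, one possible mitigation is to write the pickle through Python's standard gzip module. The sketch below is only an illustration: the helper names save_compressed and load_compressed, the '.pickle.gz' file name, and the dictionary standing in for a trained classifier are all assumptions, not part of NLTK.

```python
import gzip
import os
import pickle
import tempfile


def save_compressed(obj, path):
    """Pickle obj into a gzip-compressed file (hypothetical helper)."""
    with gzip.open(path, 'wb') as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_compressed(path):
    """Read back an object written by save_compressed (hypothetical helper)."""
    with gzip.open(path, 'rb') as f:
        return pickle.load(f)


# Stand-in for a trained classifier: a large, repetitive dictionary.
stand_in = {'feature_%d' % i: 0.5 for i in range(10000)}

path = os.path.join(tempfile.mkdtemp(), 'my_classifier.pickle.gz')
save_compressed(stand_in, path)
restored = load_compressed(path)
```

How much space this saves depends on the classifier, and compression adds some time to both saving and loading, so it is worth measuring on your own data.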
The examples below store a Python classifier object called ‘classifier’ into a pickle file called ‘my_classifier.pickle’.
To Save
import pickle
# Open in binary write mode; the with block closes the file automatically.
with open('my_classifier.pickle', 'wb') as f:
    pickle.dump(classifier, f)
To Load Later
import pickle
# Pickle files are binary, so open with 'rb' (plain text mode fails in Python 3).
with open('my_classifier.pickle', 'rb') as f:
    classifier = pickle.load(f)
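The save and load steps can also be wrapped in two small functions and checked with a round trip. This is a minimal sketch: save_classifier and load_classifier are hypothetical helper names, and a plain dictionary stands in for a trained classifier so the example runs without NLTK installed.

```python
import os
import pickle
import tempfile


def save_classifier(classifier, path):
    """Write the classifier object to path as a pickle (hypothetical helper)."""
    with open(path, 'wb') as f:
        pickle.dump(classifier, f)


def load_classifier(path):
    """Load a classifier object previously saved with save_classifier."""
    with open(path, 'rb') as f:
        return pickle.load(f)


# Stand-in for a trained classifier; a real NLTK classifier object
# would be passed to save_classifier in the same way.
classifier = {'labels': ['pos', 'neg'], 'word_weights': {'good': 0.9, 'bad': 0.1}}

path = os.path.join(tempfile.mkdtemp(), 'my_classifier.pickle')
save_classifier(classifier, path)
reloaded = load_classifier(path)
```

After the round trip, reloaded should hold the same data as the original object. Keep in mind that pickle.load can execute arbitrary code from the file, so only load pickle files that you created yourself or otherwise trust.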