0.8.0

@steppi steppi released this 16 Nov 00:20

This release fixes several bugs and makes some small updates.

Fixes have been made for

  1. A bug in AdeftMiner.prune that broke this method but went undiscovered due to a lack of testing. The bug has been fixed and a test has been added.
  2. Training adeft models throwing an error for the edge case where there are more than two labels with only one positive label.
  3. The longform scorer throwing an error when there are punctuation characters in the shortform.
  4. The GUI not working when the multiprocessing start method is set to spawn. This caused the GUI to fail on Windows, where fork is unavailable. This should resolve issue #49.
  5. A deprecation warning caused by internal use of the deprecated parameter iid in Scikit-learn's GridSearchCV. The parameter is no longer passed.

The following other changes have been made

  1. AdeftLabeler now requires unique identifiers along with the texts passed into process_texts. Instead of passing in a list of texts, the process_texts method now takes a list of tuples of the form (text, identifier). The output list now contains tuples of the form (text, label, identifier). This is useful for mapping back from texts in the generated corpus to texts in the input. Texts without defining patterns are filtered out completely and those with defining patterns have the defining patterns replaced with only the shortform, making mapping backwards nontrivial without the identifiers.
  2. Adeft's home folder can now be specified by setting the environment variable ADEFT_HOME in the user's profile. The default is now the hidden folder ".adeft" in the user's home directory, with subfolders for different adeft versions.
  3. The parameter class_weight from Scikit-learn's implementation of logistic regression is now exposed as a parameter of AdeftClassifier. This allows different weights to be provided in the loss function for different class labels.
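
To illustrate why the identifiers in the first change above are useful, here is a minimal sketch of the input and output shapes. It does not use adeft itself: toy_process_texts, GROUNDINGS, and the document identifiers are hypothetical stand-ins that only mimic the (text, identifier) to (text, label, identifier) convention described above, assuming a simple "longform (shortform)" defining pattern.

```python
import re

# Illustrative grounding map for the shortform "ER" (an assumption for this sketch).
GROUNDINGS = {
    "estrogen receptor": "HGNC:3467",
    "endoplasmic reticulum": "MESH:D004721",
}


def toy_process_texts(texts, shortform="ER"):
    """Toy stand-in for a labeler, NOT adeft's implementation.

    Takes (text, identifier) pairs and returns (text, label, identifier)
    triples. Texts without a defining pattern are dropped, and each
    matched defining pattern is replaced by the shortform alone.
    """
    corpus = []
    for text, identifier in texts:
        for longform, label in GROUNDINGS.items():
            pattern = re.compile(
                re.escape(longform) + r"\s*\(" + re.escape(shortform) + r"\)",
                flags=re.IGNORECASE,
            )
            if pattern.search(text):
                corpus.append((pattern.sub(shortform, text), label, identifier))
                break
    return corpus


texts = [
    ("The estrogen receptor (ER) binds estradiol.", "doc-1"),
    ("No defining pattern appears in this text about ER.", "doc-2"),
    ("Proteins fold in the endoplasmic reticulum (ER).", "doc-3"),
]
corpus = toy_process_texts(texts)

# The identifier lets each corpus entry be traced back to its input text,
# even though the text was rewritten and some inputs were filtered out.
originals = dict((identifier, text) for text, identifier in texts)
for new_text, label, identifier in corpus:
    print(identifier, label, repr(new_text), "<-", repr(originals[identifier]))
```

Because "doc-2" is filtered out and the remaining texts are modified, positions and text contents in the output no longer line up with the input; only the identifier gives a reliable way back.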