Oh, My Trees

Oh, My Trees (ohmt) is a library for hyperplane-based Decision Tree induction, which allows you to induce both Univariate (e.g., CART, C4.5) and Multivariate (e.g., OC1, Geometric) Decision Trees. It currently supports only single-class classification trees and does not support categorical variables, which don't play well with hyperplanes.
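To make the univariate/multivariate distinction concrete, here is a minimal sketch of a hyperplane test. It is not ohmt code: the `goes_left` helper, the `(coefficients, bound)` representation, and the `<=` routing convention are assumptions made for illustration.

```python
from typing import Sequence


def goes_left(coefficients: Sequence[float], bound: float, x: Sequence[float]) -> bool:
    """Route x to the left child when coefficients . x <= bound (assumed convention)."""
    return sum(w * v for w, v in zip(coefficients, x)) <= bound


# Univariate (axis-parallel) split "x[0] <= 2.5": a single non-zero coefficient.
univariate = ([1.0, 0.0], 2.5)
# Multivariate (oblique) split "x[0] + x[1] <= 4.0": several non-zero coefficients.
multivariate = ([1.0, 1.0], 4.0)

point = [2.0, 3.0]
print(goes_left(*univariate, point))    # True: 2.0 <= 2.5
print(goes_left(*multivariate, point))  # False: 5.0 > 4.0
```

Seen this way, a univariate split is just a hyperplane with one non-zero coefficient, which is why both families fit in the same framework.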

Quickstart

Installation

Installation through git:

git clone https://github.com/msetzu/oh-my-trees
mkvirtualenv -p python3.11 omt  # optional, creates virtual environment

cd oh-my-trees
pip install -r src/requirements.txt

or directly through pip:

pip install ohmt

Training trees

OMT follows the classic sklearn fit/predict interface.
You can find a full example in the examples notebook notebooks/examples.ipynb.

from ohmt.trees.multivariate import OmnivariateDT

dt = OmnivariateDT()
x = ...
y = ...

# trees all follow a similar sklearn-like training interface, with max_depth, min_samples, and min_eps as available parameters
dt.fit(x, y, max_depth=4)

OMT also offers a pruning toolkit, handled by trees.pruning.Gardener, which allows you to prune the induced tree. Find out more in the example notebook.

Induction algorithms

OMT offers several tree induction algorithms:

| Algorithm | Type | Reference | Info |
|---|---|---|---|
| C4.5 | Univariate | | |
| CART | Univariate | | |
| DKM | Univariate | | |
| OC1 | Multivariate | Paper | |
| Geometric | Multivariate | Paper | Only traditional SVM cut |
| Omnivariate | Multivariate | | Test all possible splits, pick the best one |
| Model tree | Multivariate | Paper | |
| Linear tree | Multivariate | Paper | |
| Optimal trees* | Multivariate | Paper | Mirror of Interpretable AI's implementation |

*Optimal trees mirror Interpretable AI's implementation, so you need to install the appropriate license to use them.
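The "test all possible splits, pick the best one" idea behind the Omnivariate entry can be sketched as follows. This is an illustration, not ohmt's implementation: the candidate set, the `(coefficients, bound)` representation, the helper names, and the use of Gini impurity as the selection criterion are all assumptions.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical representation of a split: (coefficients, bound).
Hyperplane = Tuple[Sequence[float], float]


def gini(labels: Sequence[int]) -> float:
    """Gini impurity of a label set; 0.0 for an empty or pure set."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(plane: Hyperplane, data, labels) -> float:
    """Size-weighted Gini impurity of the two children induced by the hyperplane."""
    coefficients, bound = plane
    left, right = [], []
    for x, y in zip(data, labels):
        side = left if sum(w * v for w, v in zip(coefficients, x)) <= bound else right
        side.append(y)
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_split(candidates: List[Hyperplane], data, labels) -> Hyperplane:
    """Score every candidate and keep the one with the lowest impurity."""
    return min(candidates, key=lambda plane: split_impurity(plane, data, labels))


data = [[0.0, 0.0], [1.0, 3.0], [3.0, 1.0], [4.0, 4.0]]
labels = [0, 0, 1, 1]
candidates = [
    ([0.0, 1.0], 2.0),  # univariate: x[1] <= 2.0
    ([1.0, 0.0], 2.0),  # univariate: x[0] <= 2.0
    ([1.0, 1.0], 4.0),  # multivariate: x[0] + x[1] <= 4.0
]
chosen = best_split(candidates, data, labels)
print(chosen)  # ([1.0, 0.0], 2.0): the only candidate that separates the classes
```

In practice each candidate would come from a different induction algorithm run on the same node; here they are hard-coded to keep the example short.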

Using Trees

You can get an explicit view of a tree by accessing:

  • tree.nodes: Dict[int, Node], its nodes,
  • tree.parent: Dict[int, int] and tree.ancestors: Dict[int, List[int]], its parent and ancestors,
  • tree.descendants: Dict[int, List[int]], its descendants,
  • tree.depth: Dict[int, int], the depth of its nodes.
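To show how these maps relate to one another, the sketch below derives ancestors, descendants, and depth from a parent map for a small hand-written tree. The node ids, the ordering of each list, and the choice of depth 1 for the root are assumptions for illustration; the actual contents of an ohmt tree may differ.

```python
from typing import Dict, List

# Hypothetical tree: node 1 is the root, nodes 2 and 3 are its children,
# nodes 4 and 5 are the children of node 2.
parent: Dict[int, int] = {2: 1, 3: 1, 4: 2, 5: 2}
node_ids = [1, 2, 3, 4, 5]


def ancestors_of(node: int) -> List[int]:
    """Walk up the parent map, nearest ancestor first."""
    path = []
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


ancestors = {n: ancestors_of(n) for n in node_ids}
# Assumption: the root sits at depth 1.
depth = {n: len(ancestors[n]) + 1 for n in node_ids}
# A node's descendants are all nodes that list it among their ancestors.
descendants = {n: [m for m in node_ids if n in ancestors[m]] for n in node_ids}

print(ancestors[4])    # [2, 1]
print(descendants[2])  # [4, 5]
print(depth[4])        # 3
```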

Trees can also be JSONized:

tree.json()

Growing your own Tree

Greedy trees all share the same basic algorithmic core:

  • learning step: induce a node
  • if induction should continue:
    • generate two children
    • recurse on the given children
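The loop above can be sketched in a few lines of plain Python. This is a toy stand-in, not ohmt's Tree class: the `step` and `grow` functions, the dict-based nodes, and the error-count criterion are assumptions chosen to keep the example self-contained.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence, Tuple

Split = Tuple[int, float]  # (feature index, threshold): a toy axis-parallel split


def errors(labels: Sequence[int]) -> int:
    """Number of samples outside the majority class."""
    return len(labels) - max(Counter(labels).values()) if labels else 0


def step(data: List[List[float]], labels: List[int]) -> Optional[Split]:
    """Learning step: return the split with the fewest errors, or None if none helps."""
    best, best_errors = None, errors(labels)
    for feature in range(len(data[0])):
        # Skip the largest value so that the right child is never empty.
        for threshold in sorted({x[feature] for x in data})[:-1]:
            left = [y for x, y in zip(data, labels) if x[feature] <= threshold]
            right = [y for x, y in zip(data, labels) if x[feature] > threshold]
            if errors(left) + errors(right) < best_errors:
                best, best_errors = (feature, threshold), errors(left) + errors(right)
    return best


def grow(data, labels, depth=1, max_depth=16, min_samples=1) -> Dict:
    """Induce a node, then recurse on its two children while a useful split exists."""
    leaf = {"prediction": Counter(labels).most_common(1)[0][0]}
    if depth >= max_depth or len(labels) < min_samples:
        return leaf
    split = step(data, labels)
    if split is None:  # no improving split: stop here
        return leaf
    feature, threshold = split
    goes_left = [x[feature] <= threshold for x in data]
    children = []
    for side in (True, False):  # generate two children and recurse on each
        child_data = [x for x, left in zip(data, goes_left) if left == side]
        child_labels = [y for y, left in zip(labels, goes_left) if left == side]
        children.append(grow(child_data, child_labels, depth + 1, max_depth, min_samples))
    return {"split": split, "left": children[0], "right": children[1]}


tree = grow([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])
print(tree)
# {'split': (0, 1.0), 'left': {'prediction': 0}, 'right': {'prediction': 1}}
```

Only `step` knows how a node is induced; `grow` handles the stopping checks and the recursion, so swapping in a different induction strategy means replacing `step` alone.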

We incorporate this algorithm in Tree, where step implements the node induction. Most greedy induction algorithms can therefore be implemented by simply overriding the step function:

    def step(self, parent_node: Optional[InternalNode],
             data: numpy.ndarray, labels: numpy.ndarray, classes: numpy.ndarray,
             direction: Optional[str] = None, depth: int = 1,
             min_eps: float = 0.000001, max_depth: int = 16, min_samples: int = 10,
             node_fitness_function: Optional[Callable] = None,
             node_hyperparameters: Optional[Dict] = None, **step_hyperparameters) -> Optional[Node]:
        ...  # induce and return the node, or None to stop growing