(Terence teaches in University of San Francisco's MS in Data Science program and Prince is an alumnus. You might know Terence as the creator of the AN

1 How to visualize decision trees

submited by
Style Pass
2021-09-28 09:30:04

(Terence teaches in University of San Francisco's MS in Data Science program and Prince is an alumnus. You might know Terence as the creator of the ANTLR parser generator.)

Update July 2020 Tudor Lapusan has become a major contributor to dtreeviz and, thanks to his work, dtreeviz can now visualize XGBoost and Spark decision trees as well as sklearn. Beyond what is described in this article, the library now also includes the following features. See dtreeviz_sklearn_visualisations.ipynb for examples.

Decision trees are the fundamental building block of gradient boosting machines and Random Forests™, probably the two most popular machine learning models for structured data. Visualizing decision trees is a tremendous aid when learning how these models work and when interpreting models. Unfortunately, current visualization packages are rudimentary and not immediately helpful to the novice. For example, we couldn't find a library that visualizes how decision nodes split up the feature space. It is also uncommon for libraries to support visualizing a specific feature vector as it weaves down through a tree's decision nodes; we could only find one image showing this.

So, we've created a general package for scikit-learn decision tree visualization and model interpretation, which we'll be using heavily in an upcoming machine learning book (written with Jeremy Howard). Here's a sample visualization for a tiny decision tree (click to enlarge):

Leave a Comment