Incremental decision tree


An incremental decision tree algorithm is an online machine learning algorithm that outputs a decision tree. Many decision tree methods, such as C4.5, construct a tree using a complete dataset. Incremental decision tree methods allow an existing tree to be updated using only new individual data instances, without having to re-process past instances. This may be useful when the entire dataset is not available at the time the tree is updated, when the original dataset is too large to process, or when the characteristics of the data change over time.
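As an illustration of the idea (the LeafNode class below is a hypothetical toy, not any published method): instead of retaining past instances, a node can maintain sufficient statistics, here simple class counts, that are updated one instance at a time.

```python
from collections import Counter

class LeafNode:
    """Toy tree node that keeps only class counts, not the instances themselves."""
    def __init__(self):
        self.class_counts = Counter()

    def update(self, label):
        # Incorporate one new instance; past instances are never re-processed.
        self.class_counts[label] += 1

    def predict(self):
        # Majority class among all instances seen so far.
        return self.class_counts.most_common(1)[0][0]

# Stream instances one at a time, as they arrive.
leaf = LeafNode()
for label in ["spam", "ham", "spam"]:
    leaf.update(label)
```

Because only the counts are stored, the memory cost is independent of how many instances have been processed, which is what makes such updates feasible on streams too large to hold in memory.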

Methods

Here is a short list of incremental decision tree methods, organized by their parent algorithms.

CART family

CART is a nonincremental decision tree inducer for both classification and regression problems, developed in the mathematics and statistics communities. CART traces its roots to AID.
ID3/C4.5 family

ID3 and C4.5 were developed by Quinlan and have roots in Hunt's Concept Learning System. The ID3 family of tree inducers was developed in the engineering and computer science communities.
Note: ID6NB is not incremental.

Other Incremental Learning Systems

Several incremental concept learning systems did not build decision trees, but predated and influenced the development of the earliest incremental decision tree learners, such as ID4. Notable among these earlier systems was Schlimmer and Granger's STAGGER, which learned disjunctive concepts incrementally. STAGGER was developed to examine concepts that change over time. Prior to STAGGER, Michalski and Larson investigated an incremental variant of AQ, a supervised system for learning concepts in disjunctive normal form. Experience with these earlier systems and others, including incremental tree-structured unsupervised learning, contributed to a conceptual framework for evaluating incremental decision tree learners specifically, and incremental concept learning generally, along four dimensions that reflect the inherent tradeoffs between learning cost and quality: the cost of updating the knowledge base, the number of observations required to converge on a knowledge base with given characteristics, the total effort the system exerts, and the quality of the final knowledge base. Some of the historical context in which incremental decision tree learners emerged is given in Fisher and Schlimmer, which also expands on the four-dimension framework used to evaluate and design incremental learning systems.

VFDT Algorithm

The Very Fast Decision Tree (VFDT) learner reduces training time for large incremental datasets by subsampling the incoming data stream.
The Extremely Fast Decision Tree (EFDT) learner is statistically more powerful than VFDT, allowing it to learn more detailed trees from less data. It differs from VFDT in the method for deciding when to insert a new branch into the tree: VFDT waits until it is confident that the best available branch is better than any alternative, whereas EFDT splits as soon as it is confident that the best available branch is better than the current alternative. Initially, the current alternative is no branch at all. This allows EFDT to insert branches much more rapidly than VFDT, which means that during incremental learning EFDT can deploy useful trees much sooner.
However, the new branch selection method greatly increases the likelihood of selecting a suboptimal branch. In consequence, EFDT keeps monitoring the performance of all branches and will replace a branch as soon as it is confident there is a better alternative.
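Both learners base the split decision on the Hoeffding bound. The following is a minimal illustrative sketch (the function names, the default delta, and the use of a gain measure with range [0, 1] are assumptions for illustration, not details taken from the published algorithms) of how the two split tests differ:

```python
import math

def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the observed mean of n
    samples of a variable with the given range is within epsilon of the true mean."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

def vfdt_should_split(gain_best, gain_second_best, n, value_range=1.0, delta=1e-7):
    # VFDT-style test: split only when the best candidate attribute beats
    # the runner-up by more than the Hoeffding bound.
    return gain_best - gain_second_best > hoeffding_bound(value_range, delta, n)

def efdt_should_split(gain_best, gain_current, n, value_range=1.0, delta=1e-7):
    # EFDT-style test: split as soon as the best candidate beats the *current*
    # split (initially "no split", i.e. gain 0) by more than the bound.
    return gain_best - gain_current > hoeffding_bound(value_range, delta, n)
```

With two closely matched candidate attributes (gains 0.30 and 0.28 after 1000 instances), the VFDT-style test keeps waiting, while the EFDT-style test splits immediately because 0.30 comfortably exceeds a no-split gain of 0. This is why EFDT must keep re-evaluating its branches afterwards: the early winner may turn out not to be the best attribute.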

OLIN and IFN