Inductive logic programming

Inductive logic programming is a subfield of symbolic artificial intelligence which uses logic programming as a uniform representation for examples, background knowledge and hypotheses. Given an encoding of the known background knowledge and a set of examples represented as a logical database of facts, an ILP system will derive a hypothesised logic program which entails all the positive and none of the negative examples.

Schema: positive examples + negative examples + background knowledge ⇒ hypothesis.

Inductive logic programming is particularly useful in bioinformatics and natural language processing. Gordon Plotkin and Ehud Shapiro laid the initial theoretical foundation for inductive machine learning in a logical setting. Shapiro built a first implementation of these ideas (the Model Inference System, MIS) in 1981: a Prolog program that inductively inferred logic programs from positive and negative examples. The term "Inductive Logic Programming" was first introduced in a paper by Stephen Muggleton in 1991. Muggleton also founded the annual international conference on Inductive Logic Programming and introduced the theoretical ideas of predicate invention, inverse resolution, and inverse entailment. He first implemented inverse entailment in the PROGOL system. The term "inductive" here refers to philosophical induction (suggesting a theory to explain observed facts) rather than mathematical induction (proving a property for all members of a well-ordered set).

Formal definition

The background knowledge is given as a logic theory $B$, commonly in the form of Horn clauses used in logic programming.
The positive and negative examples are given as conjunctions $E^+$ and $E^-$ of unnegated and negated ground literals, respectively.
A correct hypothesis $H$ is a logic proposition satisfying the following requirements.
* Necessity: $B \not\models E^+$
* Sufficiency: $B \land H \models E^+$
* Weak consistency: $B \land H \not\models \mathit{false}$
* Strong consistency: $B \land H \land E^- \not\models \mathit{false}$
"Necessity" does not impose a restriction on $H$, but forbids any generation of a hypothesis as long as the positive facts are explainable without it.
"Sufficiency" requires any generated hypothesis $H$ to explain all positive examples $E^+$.
"Weak consistency" forbids generation of any hypothesis $H$ that contradicts the background knowledge $B$.
"Strong consistency" also forbids generation of any hypothesis $H$ that is inconsistent with the negative examples $E^-$, given the background knowledge $B$; it implies "Weak consistency"; if no negative examples are given, both requirements coincide. Džeroski requires only "Sufficiency" and "Strong consistency".

Example

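The following well-known example about learning definitions of family relations uses the abbreviations
$\mathit{par}$: parent, $\mathit{fem}$: female, $\mathit{dau}$: daughter, $g$: George, $h$: Helen, $m$: Mary, $t$: Tom, $n$: Nancy, and $e$: Eve.
It starts from the background knowledge
$\mathit{par}(h,m) \land \mathit{par}(h,t) \land \mathit{par}(g,m) \land \mathit{par}(t,e) \land \mathit{par}(n,e) \land \mathit{fem}(h) \land \mathit{fem}(m) \land \mathit{fem}(n) \land \mathit{fem}(e)$,
the positive examples
$\mathit{dau}(m,h) \land \mathit{dau}(e,t)$,
and the trivial proposition
$\mathit{true}$
to denote the absence of negative examples.
Plotkin's "relative least general generalization" (rlgg) approach to inductive logic programming shall be used to obtain a suggestion about how to formally define the daughter relation $\mathit{dau}$.
This approach uses the following steps.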

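Relativize each positive example literal with the complete background knowledge:
* $\mathit{dau}(m,h) \leftarrow \mathit{par}(h,m) \land \mathit{par}(h,t) \land \mathit{par}(g,m) \land \mathit{par}(t,e) \land \mathit{par}(n,e) \land \mathit{fem}(h) \land \mathit{fem}(m) \land \mathit{fem}(n) \land \mathit{fem}(e)$
* $\mathit{dau}(e,t) \leftarrow \mathit{par}(h,m) \land \mathit{par}(h,t) \land \mathit{par}(g,m) \land \mathit{par}(t,e) \land \mathit{par}(n,e) \land \mathit{fem}(h) \land \mathit{fem}(m) \land \mathit{fem}(n) \land \mathit{fem}(e)$
Convert into clause normal form:
* $\mathit{dau}(m,h) \lor \lnot\mathit{par}(h,m) \lor \lnot\mathit{par}(h,t) \lor \lnot\mathit{par}(g,m) \lor \lnot\mathit{par}(t,e) \lor \lnot\mathit{par}(n,e) \lor \lnot\mathit{fem}(h) \lor \lnot\mathit{fem}(m) \lor \lnot\mathit{fem}(n) \lor \lnot\mathit{fem}(e)$
* $\mathit{dau}(e,t) \lor \lnot\mathit{par}(h,m) \lor \lnot\mathit{par}(h,t) \lor \lnot\mathit{par}(g,m) \lor \lnot\mathit{par}(t,e) \lor \lnot\mathit{par}(n,e) \lor \lnot\mathit{fem}(h) \lor \lnot\mathit{fem}(m) \lor \lnot\mathit{fem}(n) \lor \lnot\mathit{fem}(e)$
Anti-unify each compatible pair of literals:
* $\mathit{dau}(x_{me},x_{ht})$ from $\mathit{dau}(m,h)$ and $\mathit{dau}(e,t)$,
* $\lnot\mathit{par}(x_{ht},x_{me})$ from $\lnot\mathit{par}(h,m)$ and $\lnot\mathit{par}(t,e)$,
* $\lnot\mathit{fem}(x_{me})$ from $\lnot\mathit{fem}(m)$ and $\lnot\mathit{fem}(e)$,
* $\lnot\mathit{par}(g,m)$ from $\lnot\mathit{par}(g,m)$ and $\lnot\mathit{par}(g,m)$, similar for all other background-knowledge literals,
* $\lnot\mathit{par}(x_{gt},x_{me})$ from $\lnot\mathit{par}(g,m)$ and $\lnot\mathit{par}(t,e)$, and many more negated literals.
Delete all negated literals containing variables that don't occur in a positive literal:
* after deleting all negated literals containing variables other than $x_{me}, x_{ht}$, only $\mathit{dau}(x_{me},x_{ht}) \lor \lnot\mathit{par}(x_{ht},x_{me}) \lor \lnot\mathit{fem}(x_{me})$ remains, together with all ground literals from the background knowledge.
Convert clauses back to Horn form:
* $\mathit{dau}(x_{me},x_{ht}) \leftarrow \mathit{par}(x_{ht},x_{me}) \land \mathit{fem}(x_{me}) \land (\text{all background knowledge facts})$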
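
The anti-unification and deletion steps above can be sketched in Python. This is a minimal illustration, not the implementation used by any particular ILP system: the tuple encoding of atoms, the `X_` prefix for variables, and all function names are assumptions made for this sketch, and the family facts are an assumed encoding of the example (par: parent, fem: female, dau: daughter).

```python
# Constants and variables are strings; atoms are tuples (predicate, arg1, ..., argN).
# Literals are pairs (sign, atom); sign False marks a negated literal.
# Assumed encoding of the family background knowledge.
BACKGROUND = [
    ("par", "h", "m"), ("par", "h", "t"), ("par", "g", "m"),
    ("par", "t", "e"), ("par", "n", "e"),
    ("fem", "h"), ("fem", "m"), ("fem", "n"), ("fem", "e"),
]


def anti_unify(s, t, table):
    """Least general generalization of two terms or atoms.

    `table` maps each ordered pair of differing subterms to one variable,
    so the same pair is always replaced by the same variable.
    """
    if s == t:
        return s
    if isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0] and len(s) == len(t):
        return (s[0],) + tuple(anti_unify(a, b, table) for a, b in zip(s[1:], t[1:]))
    if (s, t) not in table:
        # Readable names such as X_me; assumes short, distinct constant names.
        both_constants = isinstance(s, str) and isinstance(t, str)
        table[(s, t)] = f"X_{s}{t}" if both_constants else f"X_{len(table)}"
    return table[(s, t)]


def variables(term):
    """Set of variables (strings starting with 'X_') occurring in a term or atom."""
    if isinstance(term, tuple):
        return set().union(*(variables(a) for a in term[1:]))
    return {term} if term.startswith("X_") else set()


def relativize(example):
    """Clause normal form of `example <- background`: one positive literal, rest negated."""
    return [(True, example)] + [(False, fact) for fact in BACKGROUND]


def lgg_clauses(c1, c2):
    """Anti-unify every compatible pair of literals (same sign, predicate, arity)."""
    table, result = {}, []
    for sign1, atom1 in c1:
        for sign2, atom2 in c2:
            if sign1 == sign2 and atom1[0] == atom2[0] and len(atom1) == len(atom2):
                literal = (sign1, anti_unify(atom1, atom2, table))
                if literal not in result:
                    result.append(literal)
    return result


clause = lgg_clauses(relativize(("dau", "m", "h")), relativize(("dau", "e", "t")))

# Deletion step: keep a negated literal only if all its variables occur in a positive literal.
head_vars = set().union(*(variables(atom) for sign, atom in clause if sign))
reduced = [(sign, atom) for sign, atom in clause if sign or variables(atom) <= head_vars]
non_ground = [literal for literal in reduced if variables(literal[1])]
```

In this sketch, `non_ground` holds the three literals that carry the learned definition, while the remaining entries of `reduced` are the negated ground background facts. Keying the variable table on ordered pairs is what makes `par(h,m)`/`par(t,e)` and `dau(m,h)`/`dau(e,t)` share the same variables.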

The resulting Horn clause is the hypothesis $H$ obtained by the rlgg approach. Ignoring the background knowledge facts, the clause informally reads "$x_{me}$ is called a daughter of $x_{ht}$ if $x_{ht}$ is the parent of $x_{me}$ and $x_{me}$ is female", which is a commonly accepted definition.
Concerning the above requirements, "Necessity" was satisfied because the predicate $\mathit{dau}$ doesn't appear in the background knowledge, which hence cannot imply any property containing this predicate, such as the positive examples.
"Sufficiency" is satisfied by the computed hypothesis $H$, since it, together with $\mathit{par}(h,m) \land \mathit{fem}(m)$ from the background knowledge, implies the first positive example $\mathit{dau}(m,h)$, and similarly $H$ and $\mathit{par}(t,e) \land \mathit{fem}(e)$ from the background knowledge imply the second positive example $\mathit{dau}(e,t)$. "Weak consistency" is satisfied by $H$, since $H$ holds in the (finite) Herbrand structure described by the background knowledge; the same holds for "Strong consistency".
The common definition of the grandmother relation, viz. $\mathit{gra}(x,z) \leftarrow \mathit{fem}(x) \land \mathit{par}(x,y) \land \mathit{par}(y,z)$, cannot be learned using the above approach, since the variable $y$ occurs in the clause body only; the corresponding literals would have been deleted in the 4th step of the approach. To overcome this flaw, that step has to be modified such that it can be parametrized with different literal post-selection heuristics. Historically, the GOLEM implementation is based on the rlgg approach.
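
The requirement checks discussed for the daughter hypothesis can be reproduced mechanically for this small example. The following Python sketch computes the least Herbrand model by naive forward chaining and tests "Necessity", "Sufficiency" and "Strong consistency". It rests on several assumptions: the background knowledge consists of ground facts only, rules are range-restricted definite clauses, variables are written as upper-case strings, and all function names are hypothetical. The negative example `dau(t,h)` is not part of the example above; it is added here only to show a failing check. "Weak consistency" is omitted because a set of definite clauses always has a model.

```python
# Assumed encoding of the family facts (par: parent, fem: female, dau: daughter).
BACKGROUND = {
    ("par", "h", "m"), ("par", "h", "t"), ("par", "g", "m"),
    ("par", "t", "e"), ("par", "n", "e"),
    ("fem", "h"), ("fem", "m"), ("fem", "n"), ("fem", "e"),
}
POSITIVES = {("dau", "m", "h"), ("dau", "e", "t")}
NEGATIVES = {("dau", "t", "h")}  # hypothetical negative example, for illustration only

# A rule is (head, body); upper-case arguments are variables.
HYPOTHESIS = [(("dau", "X", "Y"), [("par", "Y", "X"), ("fem", "X")])]
TOO_GENERAL = [(("dau", "X", "Y"), [("par", "Y", "X")])]


def match(pattern, fact, binding):
    """Extend `binding` so that `pattern` equals the ground `fact`, or return None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    binding = dict(binding)
    for p, f in zip(pattern[1:], fact[1:]):
        if p[0].isupper():  # variable
            if binding.setdefault(p, f) != f:
                return None
        elif p != f:  # constant mismatch
            return None
    return binding


def least_model(facts, rules):
    """Naive forward chaining until no new ground atom can be derived."""
    model = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            bindings = [{}]
            for literal in body:
                bindings = [b2 for b in bindings for fact in model
                            if (b2 := match(literal, fact, b)) is not None]
            for b in bindings:
                derived = (head[0],) + tuple(b.get(a, a) for a in head[1:])
                if derived not in model:
                    model.add(derived)
                    changed = True
    return model


def check(background, hypothesis, positives, negatives):
    """Evaluate the requirements on the least Herbrand model."""
    model = least_model(background, hypothesis)
    return {
        # Necessity: the background alone does not entail all positive examples.
        "necessity": not positives <= least_model(background, []),
        # Sufficiency: background plus hypothesis entails every positive example.
        "sufficiency": positives <= model,
        # Strong consistency: no negative example is derivable.
        "strong_consistency": not (negatives & model),
    }


result = check(BACKGROUND, HYPOTHESIS, POSITIVES, NEGATIVES)
result_too_general = check(BACKGROUND, TOO_GENERAL, POSITIVES, NEGATIVES)
```

Under these assumptions, `HYPOTHESIS` passes all three checks, while `TOO_GENERAL` (which drops the `fem` condition) still explains the positive examples but derives the hypothetical negative example `dau(t,h)` and therefore fails the strong-consistency check.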

Inductive Logic Programming system

An inductive logic programming system is a program that takes logic theories $B, E^+, E^-$ (background knowledge, positive examples and negative examples) as input and outputs a correct hypothesis $H$ with respect to these theories. The algorithm of an ILP system consists of two parts: hypothesis search and hypothesis selection. First, hypotheses are searched for with an inductive logic programming procedure; then a subset of the found hypotheses is chosen by a selection algorithm. A selection algorithm scores each of the found hypotheses and returns the ones with the highest score. An example of a score function is minimal compression length, where the hypothesis with the lowest Kolmogorov complexity has the highest score and is returned. An ILP system is complete if and only if, for any input logic theories $B, E^+, E^-$, any correct hypothesis $H$ with respect to these input theories can be found with its hypothesis search procedure.
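
The selection step can be illustrated with a short Python sketch. Kolmogorov complexity is not computable, so the sketch substitutes a simple proxy, the total number of literals in a hypothesis; this proxy, the candidate hypotheses, and the names `description_length` and `select` are assumptions made for illustration and are not taken from any specific ILP system. The candidates are assumed to have been returned already by a hypothesis search step.

```python
# A hypothesis is a tuple of clauses; a clause is (head, body) with literals kept as strings.
H1 = (("dau(X,Y)", ("par(Y,X)", "fem(X)")),)
H2 = (("dau(X,Y)", ("par(Y,X)", "fem(X)", "par(Z,X)")),)  # redundant extra literal
H3 = (
    ("dau(X,Y)", ("par(Y,X)", "fem(X)", "fem(Y)")),
    ("dau(X,Y)", ("par(Y,X)", "fem(X)", "par(h,Y)")),
)
CANDIDATES = [H3, H1, H2]  # hypothetical output of the search step


def description_length(hypothesis):
    """Proxy for compression length: one unit per head plus one per body literal."""
    return sum(1 + len(body) for _head, body in hypothesis)


def select(candidates, score):
    """Return every candidate that attains the highest score."""
    best = max(score(h) for h in candidates)
    return [h for h in candidates if score(h) == best]


# Shorter hypotheses score higher, so the score is the negated length.
selected = select(CANDIDATES, lambda h: -description_length(h))
```

With these candidates, `H1` has length 3, `H2` length 4 and `H3` length 8, so `select` returns only `H1`. Returning a list rather than a single hypothesis mirrors the description above, in which the selection algorithm returns all hypotheses with the highest score.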

Hypothesis search

Modern ILP systems like Progol, Hail and Imparo find a hypothesis $H$ using the principle of inverse entailment for theories $B$ (background knowledge), $E$ (examples) and $H$ (hypothesis): $B \land H \models E \iff B \land \lnot E \models \lnot H$. First they construct an intermediate theory $F$, called a bridge theory, satisfying the conditions $B \land \lnot E \models F$ and $F \models \lnot H$. Then, as $H \models \lnot F$, they generalize the negation of the bridge theory $F$ with anti-entailment. However, anti-entailment is highly non-deterministic and therefore computationally expensive. An alternative hypothesis search can therefore be conducted using inverse subsumption instead, which is less non-deterministic than anti-entailment.
Questions of completeness arise for the hypothesis search procedure of a specific ILP system. For example, Progol's hypothesis search procedure based on the inverse entailment inference rule is not complete, as shown by Yamamoto's example. On the other hand, Imparo is complete by both its anti-entailment procedure and its extended inverse subsumption procedure.

Implementations

Imparo
MIS by Ehud Shapiro
Warmr
