Keyword extraction


Keyword extraction is the task of automatically identifying the terms that best describe the subject of a document.
Key phrases, key terms, key segments, and keywords are all names for the terms that represent the most relevant information contained in a document. Although the terminology differs, the function is the same: to characterize the topic discussed in the document. Keyword extraction is an important problem in text mining, information retrieval, and natural language processing.

Keyword assignment vs. extraction

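Methods for obtaining keywords can be roughly divided into keyword assignment, in which keywords are chosen from a controlled vocabulary or taxonomy, and keyword extraction, in which keywords are chosen from words that explicitly appear in the original text.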
Methods for automatic keyword extraction can be supervised, semi-supervised, or unsupervised. Unsupervised methods can be further divided into simple statistical methods, linguistic methods, graph-based methods, and ensemble methods that combine some or most of these approaches.
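
As an illustration of the simple statistical family of unsupervised methods, the following minimal sketch ranks candidate words by raw term frequency after removing stopwords. It is not a reference implementation of any published method: the function name extract_keywords, the tiny STOPWORDS list, the minimum word length, and the alphabetical tie-breaking rule are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stopword list (an assumption for this
# sketch; practical systems use much larger, language-specific lists).
STOPWORDS = frozenset({
    "a", "an", "and", "are", "as", "by", "for", "in", "is", "it",
    "of", "on", "or", "that", "the", "this", "to", "with",
})


def extract_keywords(text, top_k=3):
    """Return the top_k most frequent non-stopword tokens in text."""
    # Lowercase the text and keep only runs of ASCII letters as tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Candidates are tokens that are not stopwords and longer than two letters.
    candidates = [t for t in tokens if t not in STOPWORDS and len(t) > 2]
    counts = Counter(candidates)
    # Sort by descending frequency, breaking ties alphabetically so the
    # result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_k]]


if __name__ == "__main__":
    sample = ("Graph methods rank words in a graph. "
              "The graph links words that co-occur.")
    print(extract_keywords(sample, top_k=2))
```

Because the keywords are drawn only from words that occur in the input, this is extraction rather than assignment. Raw frequency is the simplest possible score; linguistic, graph-based, and ensemble methods replace or combine it with richer evidence.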