Active learning (machine learning)


Active learning is a special case of machine learning in which a learning algorithm can interactively query a user (or some other information source) to label new data points with the desired outputs. In the statistics literature, it is sometimes also called optimal experimental design. The information source is also called a teacher or an oracle.
There are situations in which unlabeled data is abundant but manual labeling is expensive. In such a scenario, learning algorithms can actively query the user or teacher for labels. This type of iterative supervised learning is called active learning. Since the learner chooses the examples, the number of examples needed to learn a concept can often be much lower than the number required in normal supervised learning. With this approach, there is a risk that the algorithm is overwhelmed by uninformative examples. Recent developments are dedicated to multi-label active learning, hybrid active learning, and active learning in a single-pass context, combining concepts from the field of machine learning with adaptive, incremental learning policies from the field of online machine learning.
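As an illustration of how choosing the examples can reduce the number of labels needed, consider learning a threshold on a sorted list of points. This is a minimal sketch under strong assumptions that are not part of the article: labels are noise-free, and every point at or above some unknown threshold is positive. The function name `learn_threshold` and the `oracle` callback are hypothetical.

```python
from typing import Callable, Sequence, Tuple


def learn_threshold(
    points: Sequence[float], oracle: Callable[[float], bool]
) -> Tuple[int, int]:
    """Return (index of the first positive point, number of label queries).

    Assumes `points` is sorted ascending and the labels are monotone:
    False, ..., False, True, ..., True (an illustrative assumption).
    """
    lo, hi = 0, len(points)  # the first positive index lies in [lo, hi]
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1  # one label request to the oracle
        if oracle(points[mid]):
            hi = mid  # boundary is at mid or to its left
        else:
            lo = mid + 1  # boundary is strictly to the right
    return lo, queries
```

Under these assumptions the learner locates the boundary among 1,000 sorted points with at most 10 label queries, whereas a learner that cannot choose its examples may need labels for many more points to pin down the same boundary.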

Definitions

Let $T$ be the total set of all data under consideration. For example, in a protein engineering problem, $T$ would include all proteins that are known to have a certain interesting activity and all additional proteins that one might want to test for that activity.
During each iteration, $i$, $T$ is broken up into three subsets:
  1. $T_{K,i}$: Data points where the label is known.
  2. $T_{U,i}$: Data points where the label is unknown.
  3. $T_{C,i}$: A subset of $T_{U,i}$ that is chosen to be labeled.
Most of the current research in active learning involves the best method to choose the data points for $T_{C,i}$.
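The bookkeeping for one iteration (the known set, the unknown set, and the subset chosen for labeling) can be sketched as follows. This is only an illustration: the names `run_iteration`, `score`, `oracle`, and `batch_size` are hypothetical, and the rule "pick the highest-scoring unlabeled points" stands in for whichever selection method is actually used.

```python
from typing import Callable, Dict, Hashable, List, Set, Tuple


def run_iteration(
    known: Dict[Hashable, object],
    unknown: Set[Hashable],
    score: Callable[[Hashable], float],
    oracle: Callable[[Hashable], object],
    batch_size: int,
) -> Tuple[Dict[Hashable, object], Set[Hashable], List[Hashable]]:
    """Run one round: choose points from the unknown set, query their labels,
    and return the updated known set, the updated unknown set, and the chosen points."""
    # Chosen subset: the `batch_size` unlabeled points with the highest score.
    chosen = sorted(unknown, key=score, reverse=True)[:batch_size]

    # Ask the oracle (teacher) for the label of each chosen point.
    new_known = dict(known)
    for x in chosen:
        new_known[x] = oracle(x)

    # Chosen points move from the unknown set to the known set.
    new_unknown = unknown - set(chosen)
    return new_known, new_unknown, chosen
```

Repeating this step until a labeling budget is exhausted gives a basic pool-based loop. The inputs are left unmodified, so each iteration's sets can be inspected separately.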

Scenarios

Algorithms for determining which data points should be labeled can be organized into a number of different categories based upon their purpose, and a wide variety of algorithms have been studied that fall into these categories.

Minimum Marginal Hyperplane

Some active learning algorithms are built upon support-vector machines (SVMs) and exploit the structure of the SVM to determine which data points to label. Such methods usually calculate the margin, $W$, of each unlabeled datum in the unlabeled set $T_{U,i}$ and treat $W$ as an $n$-dimensional distance from that datum to the separating hyperplane.
Minimum Marginal Hyperplane methods assume that the data with the smallest $W$ are those that the SVM is most uncertain about and therefore should be placed in the chosen set $T_{C,i}$ to be labeled. Other similar methods, such as Maximum Marginal Hyperplane, choose data with the largest $W$. Tradeoff methods choose a mix of the smallest and largest $W$s.
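The three margin-based selection rules described above can be sketched in a few lines. This is a minimal illustration, assuming a linear SVM has already been trained and its hyperplane is given by a weight vector `w` and bias `b`, so that the distance from a point `x` to the hyperplane is |w·x + b| / ||w||. The function names, the `strategy` argument, and the even split used for the tradeoff rule are assumptions made for the example, not part of any particular published method.

```python
import math
from typing import List, Sequence


def margin(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Geometric distance from point x to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(dot + b) / norm


def select_by_margin(
    unlabeled: Sequence[Sequence[float]],
    w: Sequence[float],
    b: float,
    k: int,
    strategy: str = "min",
) -> List[int]:
    """Return indices of up to k unlabeled points to send for labeling."""
    k = min(k, len(unlabeled))
    # Indices sorted by increasing distance to the hyperplane.
    order = sorted(range(len(unlabeled)), key=lambda j: margin(unlabeled[j], w, b))
    if strategy == "min":  # smallest margins: most uncertain points
        return order[:k]
    if strategy == "max":  # largest margins, farthest point first
        return order[::-1][:k]
    if strategy == "tradeoff":  # mix of both (even split is an assumption)
        half = k // 2
        return order[: k - half] + order[::-1][:half]
    raise ValueError(f"unknown strategy: {strategy!r}")
```

For a kernel SVM the distance would be computed from the decision function in feature space rather than from an explicit `w`; the selection logic itself would stay the same.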

Meetings