Algorithm selection

Algorithm selection is a meta-algorithmic technique to choose an algorithm from a portfolio on an instance-by-instance basis. It is motivated by the observation that on many practical problems, algorithms have different performances. That is, while one algorithm performs well on some instances, it performs poorly on others and vice versa for another algorithm. If we can identify when to use which algorithm, we can get the best of both worlds and improve overall performance. This is what algorithm selection aims to do. The only prerequisite for applying algorithm selection techniques is that there exists a set of complementary algorithms.

Definition

Given a portfolio of algorithms, a set of instances and a cost metric, the algorithm selection problem consists of finding a mapping from instances to algorithms such that the cost across all instances is optimized.

Examples

Boolean satisfiability problem (and other hard combinatorial problems)

A well-known application of algorithm selection is the Boolean satisfiability problem. Here, the portfolio of algorithms is a set of SAT solvers, the instances are Boolean formulas, the cost metric is for example average runtime or number of unsolved instances. So, the goal is to select a well-performing SAT solver for each individual instance. In the same way, algorithm selection can be applied to many other -hard problems. Competition-winning systems in SAT are SATzilla, 3S and CSHC

Machine learning

In machine learning, algorithm selection is better known as meta-learning. The portfolio of algorithms consists of machine learning algorithms, the instances are data sets and the cost metric is for example the error rate. So, the goal is to predict which machine learning algorithm will have a small error on each data set.

Instance features

The algorithm selection problem is mainly solved with machine learning techniques. By representing the problem instances by numerical features, algorithm selection can be seen as a multi-class classification problem by learning a mapping for a given instance.
Instance features are numerical representations of instances. For example, we can count the number of variables, clauses, average clause length for Boolean formulas, or number of samples, features, class balance for ML data sets to get an impression about their characteristics.

Static vs. probing features

We distinguish between two kinds of features:

Static features are in most cases some counts and statistics. These features ranges from very cheap features to very complex features.
Probing features are computed by running some analysis of algorithm behavior on an instance. These feature often cost more than simple static features.
Feature costs

Depending on the used performance metric, feature computation can be associated with costs.
For example, if we use running time as performance metric, we include the time to compute our instance features into the performance of an algorithm selection system.
SAT solving is a concrete example, where such feature costs cannot be neglected, since instance features for CNF formulas can be either very cheap or very expensive.
It is important to take the overhead of feature computation into account in practice in such scenarios; otherwise a misleading impression of the performance of the algorithm selection approach is created. For example, if the decision which algorithm to choose can be made with perfect accuracy, but the features are the running time of the portfolio algorithms, there is no benefit to the portfolio approach. This would not be obvious if feature costs were omitted.

Approaches

Regression approach

One of the first successful algorithm selection approaches predicted the performance of each algorithm and selected the algorithm with the best predicted performance for an instance.

Clustering approach

A common assumption is that the given set of instances can be clustered into homogeneous subsets
and for each of these subsets, there is one well-performing algorithm for all instances in there.
So, the training consists of identifying the homogeneous clusters via an unsupervised clustering approach and associating an algorithm with each cluster.
A new instance is assigned to a cluster and the associated algorithm selected.
A more modern approach is cost-sensitive hierarchical clustering using supervised learning to identify the homogeneous instance subsets.

Pairwise cost-sensitive classification approach

A common approach for multi-class classification is to learn pairwise models between every pair of classes
and choose the class that was predicted most often by the pairwise models.
We can weight the instances of the pairwise prediction problem by the performance difference between the two algorithms.
This is motivated by the fact that we care most about getting predictions with large differences correct, but the penalty for an incorrect prediction is small if there is almost no performance difference.
Therefore, each instance for training a classification model vs is associated with a cost.

Requirements

The algorithm selection problem can be effectively applied under the following assumptions:

The portfolio of algorithms is complementary with respect to the instance set, i.e., there is no single algorithm that dominates the performance of all other algorithms over .
In some application, the computation of instance features is associated with a cost. For example, if the cost metric is running time, we have also to consider the time to compute the instance features. In such cases, the cost to compute features should not be larger than the performance gain through algorithm selection.
Application domains

Algorithm selection is not limited to single domains but can be applied to any kind of algorithm if the above requirements are satisfied.
Application domains include:

hard combinatorial problems: SAT, Mixed Integer Programming, CSP, AI Planning, TSP, MAXSAT, QBF and Answer Set Programming
combinatorial auctions
in machine learning, the problem is known as meta-learning
software design
black-box optimization
multi-agent systems
numerical optimization
linear algebra, differential equations
evolutionary algorithms
vehicle routing problem
power systems

For an extensive list of literature about algorithm selection, we refer to a literature overview.

Variants of algorithm selection

Online selection

Online algorithm selection refers to switching between different algorithms during the solving process. This is useful as a hyper-heuristic. In contrast, offline algorithm selection selects an algorithm for a given instance only once and before the solving process.

Computation of schedules

An extension of algorithm selection is the per-instance algorithm scheduling problem, in which we do not select only one solver, but we select a time budget for each algorithm on a per-instance base. This approach improves the performance of selection systems in particular if the instance features are not very informative and a wrong selection of a single solver is likely.

Selection of parallel portfolios

Given the increasing importance of parallel computation,
an extension of algorithm selection for parallel computation is parallel portfolio selection,
in which we select a subset of the algorithms to simultaneously run in a parallel portfolio.

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...