Conjunctive query

In database theory, a conjunctive query is a restricted form of first-order queries using the logical conjunction operator. Many first-order queries can be written as conjunctive queries. In particular, a large part of queries issued on relational databases can be expressed in this way. Conjunctive queries also have a number of desirable theoretical properties that larger classes of queries do not share.

Definition

The conjunctive queries are simply the fragment of first-order logic given by the set of
formulae that can be constructed from atomic formulae using conjunction ∧ and
existential quantification ∃, but not using disjunction ∨, negation ¬,
or universal quantification ∀.
Each such formula can be rewritten into an equivalent formula in prenex normal form, thus this form is usually simply assumed.
Thus conjunctive queries are of the following general form:
with the free variables being called distinguished variables, and the bound variables being called undistinguished variables. are atomic formulae.
As an example of why the restriction to domain independent first-order logic is important, consider, which is not domain independent; see Codd's theorem. This formula cannot be implemented in the select-project-join fragment of relational algebra, and hence should not be considered a conjunctive query.
Conjunctive queries can express a large part of queries, which are frequently issued on relational databases. To give an example, imagine a relational database for storing information about students, their address, the courses they take and their gender. Finding all male students and their addresses who attend a course that is also attended by a female student is expressed by the following conjunctive query:
. ∃ .
attends ∧ gender ∧
attends ∧
gender ∧ lives
Note that since the only entity of interest is the male student and his address, these are the only distinguished variables, while the variables course, student2 are only existentially quantified, i.e. undistinguished.

Fragments

Conjunctive queries without distinguished variables are called boolean conjunctive queries. Conjunctive queries where all variables are distinguished are called equi-join queries, because they are the equivalent, in the relational calculus, of the equi-join queries in the relational algebra.

Relationship to other query languages

Conjunctive queries also correspond to select-project-join queries in relational algebra and to select-from-where queries in SQL in which the where-condition uses exclusively conjunctions of atomic equality conditions, i.e. conditions constructed from column names and constants using no comparison operators other than "=", combined using "and". Notably, this excludes the use of aggregation and subqueries. For example, the above query can be written as an SQL query of the conjunctive query fragment as

select l.student, l.address
from attends a1, gender g1,
attends a2, gender g2,
lives l
where a1.student = g1.student and
a2.student = g2.student and
l.student = g1.student and
a1.course = a2.course and
g1.gender = 'male' and
g2.gender = 'female';

Datalog

Besides their logical notation, conjunctive queries can also be written as Datalog rules. Many authors in fact prefer the following Datalog notation for the query above:

result :- attends, gender,
attends, gender,
lives.

Although there are no quantifiers in this notation, variables appearing in the head of the rule are still implicitly universally quantified, while variables only appearing in the body of the rule are still implicitly existentially quantified.
While any conjunctive query can be written as a Datalog rule, not every Datalog program can be written as a conjunctive query. In fact, only single rules over extensional predicate symbols can be easily rewritten as an equivalent conjunctive query. The problem of deciding whether for a given Datalog program there is an equivalent nonrecursive program is known as the Datalog boundedness problem and is undecidable.

Extensions

Extensions of conjunctive queries capturing more expressive power include:

unions of conjunctive queries, which are equivalent to positive relational algebra
conjunctive queries extended by union and negation, which by Codd's theorem correspond to relational algebra and first-order logic
conjunctive queries with built-in predicates, e.g., arithmetic predicates
conjunctive queries with aggregate functions.

The formal study of all of these extensions is justified by their application in relational databases and is in the realm of database theory.

Complexity

For the study of the computational complexity of evaluating conjunctive queries, two problems have to be distinguished. The first is the problem of evaluating a conjunctive query on a relational database where both the query and the database are considered part of the input. The complexity of this problem is usually referred to as combined complexity, while the complexity of the problem of evaluating a query on a relational database, where the query is assumed fixed, is
called data complexity.
Conjunctive queries are NP-complete with respect to combined complexity, while the data complexity of conjunctive queries is very low, in the parallel complexity class AC0, which is contained in LOGSPACE and thus in polynomial time. The NP-hardness of conjunctive queries may appear surprising, since relational algebra and SQL strictly subsume the conjunctive queries and are thus at least as hard. However, in the usual application scenario, databases are large, while queries are very small, and the data complexity model may be appropriate for studying and describing their difficulty.
The problem of listing all answers to a non-Boolean conjunctive query has been studied in the context of enumeration algorithms, with a characterization of the queries for which enumeration can be performed with linear time preprocessing and constant delay between each solution. Specifically, these are the acyclic conjunctive queries which also satisfy a free-connex condition.

Formal properties

Conjunctive queries are one of the great success stories of database theory in that many interesting problems that are computationally hard or undecidable for larger classes of queries are feasible for conjunctive queries. For example, consider the query containment problem. We write for two database relations of the same schema if and only if each tuple occurring in also occurs in. Given a query and a relational database instance, we write the result relation of evaluating the query on the instance simply as. Given two queries and and a database schema, the query containment problem is the problem of deciding whether for all possible database instances over the input database schema,. The main application of query containment is in query optimization: Deciding whether two queries are equivalent is possible by simply checking mutual containment.
The query containment problem is undecidable for relational algebra and SQL but is decidable and NP-complete for conjunctive queries. In fact, it turns out that the query containment problem for conjunctive queries is exactly the same problem as the query evaluation problem. Since queries tend to be small, NP-completeness here is usually considered acceptable. The query containment problem for conjunctive queries is also equivalent to the constraint satisfaction problem.
An important class of conjunctive queries that have polynomial-time combined complexity are the acyclic conjunctive queries. The query evaluation, and thus query containment, is LOGCFL-complete and thus in polynomial time. Acyclicity of conjunctive queries is a structural property of queries that is defined with respect to the query's hypergraph: a conjunctive query is acyclic if and only if it has hypertree-width 1. For the special case of conjunctive queries in which all relations used are binary, this notion corresponds to the treewidth of the dependency graph of the variables in the query and the conjunctive query is acyclic if and only if its dependency graph is acyclic.
An important generalization of acyclicity is the notion of bounded hypertree-width, which is a measure of how close to acyclic a hypergraph is, analogous to bounded treewidth in graphs. Conjunctive queries of bounded tree-width have LOGCFL combined complexity.
Unrestricted conjunctive queries over tree data have polynomial time combined complexity.

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...