Dialogue system


A dialogue system, or conversational agent, is a computer system intended to converse with a human. Dialogue systems employ one or more of text, speech, graphics, haptics, gestures, and other modes of communication on both the input and output channels.
The elements of a dialogue system are not strictly defined, but they differ from those of a chatbot. The typical GUI wizard engages in a sort of dialog, but it includes very few of the common dialogue system components, and its dialog state is trivial.

Background

After dialogue systems based only on written text processing, starting in the early 1960s, the first spoken dialogue system was issued by the DARPA project in the USA in 1977. After the end of this five-year project, some European projects issued the first dialogue systems able to speak many languages. Those first systems were used in the telecom industry to provide various phone services in specific domains, e.g. automated agenda and train timetable services.

Components

Which components are included in a dialogue system, and how those components divide up responsibilities, differs from system to system. Central to any dialogue system is the dialog manager, the component that manages the state of the dialog and the dialog strategy. A typical activity cycle in a dialogue system contains the following phases:
  1. The user speaks, and the input is converted to plain text by the system's input recognizer/decoder, which may include:
     * automatic speech recognizer
     * gesture recognizer
     * handwriting recognizer
  2. The text is analyzed by a natural language understanding unit, which may include:
     * proper name identification
     * part-of-speech tagging
     * syntactic/semantic parser
  3. The semantic information is analyzed by the dialog manager, which keeps the history and state of the dialog and manages the general flow of the conversation.
  4. Usually, the dialog manager contacts one or more task managers that have knowledge of the specific task domain.
  5. The dialog manager produces output using an output generator, which may include:
     * natural language generator
     * gesture generator
     * layout manager
  6. Finally, the output is rendered using an output renderer, which may include:
     * text-to-speech engine
     * talking head
     * robot or avatar
Dialogue systems that are based on a text-only interface contain only stages 2–5.
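The activity cycle above can be sketched as a pipeline. The following is a minimal, illustrative Python sketch with hypothetical stand-in components (a pass-through recognizer, a keyword-based understanding unit, and a template-based generator); real systems would replace each stage with a full recognizer, parser, or renderer.

```python
def recognize(raw_input: str) -> str:
    """Stage 1: input recognizer/decoder (here: text passes through unchanged)."""
    return raw_input.strip()

def understand(text: str) -> dict:
    """Stage 2: NLU unit producing a crude semantic frame from keywords."""
    tokens = text.lower().split()
    intent = "greet" if "hello" in tokens else "ask_time" if "time" in tokens else "unknown"
    return {"intent": intent, "tokens": tokens}

class DialogManager:
    """Stages 3-4: keeps the dialog history/state and decides the next dialog act."""
    def __init__(self):
        self.history = []

    def handle(self, frame: dict) -> str:
        self.history.append(frame)
        if frame["intent"] == "greet":
            return "greeting"
        if frame["intent"] == "ask_time":
            return "time_answer"  # a real system would consult a task manager here
        return "fallback"

def generate(act: str) -> str:
    """Stage 5: output generator (template-based natural language generation)."""
    templates = {
        "greeting": "Hello! How can I help you?",
        "time_answer": "Let me check the time for you.",
        "fallback": "Sorry, I did not understand that.",
    }
    return templates[act]

def render(text: str) -> str:
    """Stage 6: output renderer (here: plain text instead of text-to-speech)."""
    return text

# One full activity cycle for a single user turn.
dm = DialogManager()
reply = render(generate(dm.handle(understand(recognize("Hello there")))))
print(reply)
```

In a text-only system, stages 1 and 6 collapse to identity functions, matching the observation that such systems contain only the middle stages.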

Types of systems

Dialogue systems fall into the following categories, which are listed here along a few dimensions. Many of the categories overlap, and the distinctions may not be well established.
"A Natural Dialogue System is a form of dialogue system that tries to improve usability and user satisfaction by imitating human behaviour." It addresses the features of a human-to-human dialog and aims to integrate them into dialogue systems for human-machine interaction. Often, dialogue systems require the user to adapt to the system because the system is only able to understand a very limited vocabulary, is not able to react to topic changes, and does not allow the user to influence the dialogue flow. Mixed-initiative is a way to enable the user to take an active part in the dialogue instead of only answering questions. However, the mere existence of mixed-initiative is not sufficient for a system to be classified as a natural dialogue system; other aspects, such as adaptive formulation and sub-dialogues, are also important.
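To make the mixed-initiative idea concrete, here is an illustrative sketch (the function name, the "actually" trigger phrase, and the slot names are assumptions for the example, not from the text): the system has asked a question, but the user may seize the initiative by changing the topic instead of answering.

```python
def mixed_initiative_turn(state: dict, user_input: str) -> str:
    """Answer the pending system question, unless the user takes the initiative."""
    text = user_input.lower()
    # User initiative: a topic-change request overrides the pending question.
    if text.startswith("actually"):
        state["pending"] = None
        state["topic"] = text.removeprefix("actually").strip()
        return f"Okay, switching to: {state['topic']}"
    # System initiative: the answer fills the slot the system asked about.
    if state["pending"]:
        state["slots"][state["pending"]] = user_input
        state["pending"] = None
        return "Got it."
    return "What would you like to do?"

# The system has asked for a destination, but the user changes the topic.
state = {"pending": "destination", "slots": {}, "topic": "booking"}
print(mixed_initiative_turn(state, "actually I want to cancel my ticket"))
```

A purely system-initiative design would ignore the topic change and keep re-asking its pending question; allowing the override is what lets the user influence the dialogue flow.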
Although many of these aspects are the subject of different research projects, there is a lack of tools that support the development of dialogue systems addressing them. Apart from VoiceXML, which focuses on interactive voice response systems and is the basis for many spoken dialogue systems in industry, and AIML, which is famous for the A.L.I.C.E. chatbot, none of these tools integrate linguistic features like dialog acts or language generation. Work addressing that gap combines some of the aforementioned aspects, such as natural language generation, adaptive formulation, and sub-dialogues.

Performance

Some authors measure the dialogue system's performance in terms of the percentage of sentences that are completely correct, by comparing each sentence against a reference model of sentences.
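This sentence-level accuracy metric can be sketched as follows (the function name and sample data are assumptions for illustration): a hypothesis sentence counts as correct only when it matches its reference sentence exactly.

```python
def sentence_accuracy(references: list[str], hypotheses: list[str]) -> float:
    """Percentage of hypothesis sentences that exactly match their references."""
    if not references:
        return 0.0
    correct = sum(1 for ref, hyp in zip(references, hypotheses) if ref == hyp)
    return 100.0 * correct / len(references)

refs = ["book a flight", "what time is it", "play music"]
hyps = ["book a flight", "what time is hit", "play music"]
print(sentence_accuracy(refs, hyps))  # 2 of 3 sentences match exactly
```

Note that this is a strict, all-or-nothing criterion: a single wrong word makes the whole sentence count as incorrect, unlike word-level metrics such as word error rate.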

Applications

Dialogue systems can support a broad range of applications in business enterprises, education, government, healthcare, and entertainment.
In some cases, conversational agents can interact with users using artificial characters. These agents are then referred to as embodied agents.

Toolkits and architectures

The following is a survey of current frameworks, languages, and technologies for defining dialogue systems.
* AIML (chatterbot language): XML dialect for creating natural language software agents. Affiliation: Richard Wallace; Pandorabots, Inc.
* ChatScript (chatterbot language): language/engine for creating natural language software agents. Affiliation: Bruce Wilcox.
* CSLU Toolkit: a state-based speech interface prototyping environment. Affiliation: OGI School of Science and Engineering; M. McTear; Ron Cole. Comments: publications are from 1999.
* Unnamed domain-independent toolkit: complete multilingual framework for building natural language user interface systems. Affiliation: LinguaSys. Comments: out-of-the-box support of mixed-initiative dialogs.
* Olympus: complete framework for implementing spoken dialogue systems. Affiliation: Carnegie Mellon University.
* Unnamed multimodal platform: platform for developing multimodal software applications, based on State Chart XML. Affiliation: Ponvia Technology, Inc.
* VXML (VoiceXML) (spoken dialog markup language): multimodal dialog markup language, developed initially by AT&T, then administered by an industry consortium, and finally a W3C specification. Comments: primarily for telephony.
* SALT (markup language): multimodal dialog markup language. Affiliation: Microsoft. Comments: "has not reached the level of maturity of VoiceXML in the standards process".
* Quack.com QXML (development environment). Comments: company bought by AOL.
* Unnamed domain-independent toolkit: hybrid symbolic/statistical framework for spoken dialogue systems, implemented in Java. Affiliation: University of Oslo.
* Unnamed dialog engine and dialog modeling tool: for creating natural dialogs/dialogue systems; supports dialogue acts, mixed initiative, and natural language generation; implemented in Java. Affiliation: Markus M. Berg. Comments: creates XML-based dialog files, no need to specify grammars; publications are from 2014.