Message passing


In computer science, message passing is a technique for invoking behavior on a computer. The invoking program sends a message to a process and relies on that process and its supporting infrastructure to then select and run some appropriate code. Message passing differs from conventional programming where a process, subroutine, or function is directly invoked by name. Message passing is key to some models of concurrency and object-oriented programming.
Message passing is used ubiquitously in modern computer software. It is used as a way for the objects that make up a program to work with each other and as a means for objects and systems running on different computers to interact. Message passing may be implemented by various mechanisms, including channels.

Overview

Message passing is a technique for invoking behavior on a computer. In contrast to the traditional technique of calling a program by name, message passing uses an object model to distinguish the general function from the specific implementations. The invoking program sends a message and relies on the object to select and execute the appropriate code. The justifications for using an intermediate layer essentially falls into two categories: encapsulation and distribution.
Encapsulation is the idea that software objects should be able to invoke services on other objects without knowing or caring about how those services are implemented. Encapsulation can reduce the amount of coding logic and make systems more maintainable. E.g., rather than having IF-THEN statements that determine which subroutine or function to call a developer can just send a message to the object and the object will select the appropriate code based on its type.
One of the first examples of how this can be used was in the domain of computer graphics. There are various complexities involved in manipulating graphic objects. For example, simply using the right formula to compute the area of an enclosed shape will vary depending on if the shape is a triangle, rectangle, ellipse, or circle. In traditional computer programming this would result in long IF-THEN statements testing what sort of object the shape was and calling the appropriate code. The object-oriented way to handle this is to define a class called Shape with subclasses such as Rectangle and Ellipse and then to simply send a message to any Shape asking it to compute its area. Each Shape object will then invoke the subclass's method with the formula appropriate for that kind of object.
Distributed message passing provides developers with a layer of the architecture that provides common services to build systems made up of sub-systems that run on disparate computers in different locations and at different times. When a distributed object is sending a message, the messaging layer can take care of issues such as:

Synchronous message passing

Synchronous message passing occurs between objects that are running at the same time. It is used by object-oriented programming languages such as Java and Smalltalk.
Synchronous messaging is analogous to a synchronous function call; just as the function caller waits until the function completes, the sending process waits until the receiving process completes. This can make synchronous communication unworkable for some applications. For example, large, distributed systems may not perform well enough to be usable. Such large, distributed systems may need to operate while some of their subsystems are down for maintenance, etc.
Imagine a busy business office having 100 desktop computers that send emails to each other using synchronous message passing exclusively. One worker turning off their computer can cause the other 99 computers to freeze until the worker turns their computer back on to process a single email.

Asynchronous message passing

With asynchronous message passing the receiving object can be down or busy when the requesting object sends the message. Continuing the function call analogy, it is like a function call that returns immediately, without waiting for the called function to complete. Messages are sent to a queue where they are stored until the receiving process requests them. The receiving process processes its messages and sends results to a queue for pickup by the original process.
Asynchronous messaging requires additional capabilities for storing and retransmitting data for systems that may not run concurrently, and are generally handled by an intermediary level of software ; a common type being Message-oriented middleware.
The buffer required in asynchronous communication can cause problems when it is full. A decision has to be made whether to block the sender or whether to discard future messages. A blocked sender may lead to deadlock. If messages are dropped, communication is no longer reliable.

Hybrids

Synchronous communication can be built on top of asynchronous communication by using a Synchronizer. For example, the α-Synchronizer works by ensuring that the sender always waits for an acknowledgement message from the receiver. The sender only sends the next message after the acknowledgement has been received. On the other hand, asynchronous communication can also be built on top of synchronous communication. For example, modern microkernels generally only provide a synchronous messaging primitive and asynchronous messaging can be implemented on top by using helper threads.

Distributed objects

Message-passing systems use either distributed or local objects. With distributed objects the sender and receiver may be on different computers, running different operating systems, using different programming languages, etc. In this case the bus layer takes care of details about converting data from one system to another, sending and receiving data across the network, etc. The Remote Procedure Call protocol in Unix was an early example of this. Note that with this type of message passing it is not a requirement that sender nor receiver use object-oriented programming. Procedural language systems can be wrapped and treated as large grained objects capable of sending and receiving messages.
Examples of systems that support distributed objects are: Emerald, ONC RPC, CORBA, Java RMI, DCOM, SOAP,.NET Remoting, CTOS, QNX Neutrino RTOS, OpenBinder and D-Bus. Distributed object systems have been called "shared nothing" systems because the message passing abstraction hides underlying state changes that may be used in the implementation of sending messages.

Message-passing versus calling

Distributed, or asynchronous, message-passing has additional overhead compared to calling a procedure. In message-passing, arguments must be copied to the new message. Some arguments can contain megabytes of data, all of which must be copied and transmitted to the receiving object.
Traditional procedure calls differ from message-passing in terms of memory usage, transfer time and locality. Arguments are passed to the receiver typically by general purpose registers requiring no additional storage nor transfer time, or in a parameter list containing the arguments' addresses. Address-passing is not possible for distributed systems since the systems use separate address spaces.
Web browsers and web servers are examples of processes that communicate by message-passing. A URL is an example of referencing a resource without exposing process internals.
A subroutine call or method invocation will not exit until the invoked computation has terminated. Asynchronous message-passing, by contrast, can result in a response arriving a significant time after the request message was sent.
A message-handler will, in general, process messages from more than one sender. This means its state can change for reasons unrelated to the behavior of a single sender or client process. This is in contrast to the typical behavior of an object upon which methods are being invoked: the latter is expected to remain in the same state between method invocations. In other words, the message-handler behaves analogously to a volatile object.

Mathematical models

The prominent mathematical models of message passing are the Actor model and Pi calculus. In mathematical terms a message is the single means to pass control to an object. If the object responds to the message, it has a method for that message.
Alan Kay has argued that message passing is more important than objects in OOP, and that objects themselves are often over-emphasized. The live distributed objects programming model builds upon this observation; it uses the concept of a distributed data flow to characterize the behavior of a complex distributed system in terms of message patterns, using high-level, functional-style specifications.

Examples