Social sequence analysis


Social sequence analysis is a special application of sequence analysis, a set of methods that were originally designed in bioinformatics to analyze DNA, RNA, and peptide sequences. Social sequence analysis involves the examination of ordered social processes, ranging from microsocial interaction patterns and interpersonal contact dynamics to the development of social hierarchies and macrosocial temporal patterns. The analysis of such patterns can involve descriptive accounts of sequence patterns, statistical event history analysis, optimal matching analysis, narrative or event structure analysis, and dynamic social network sequencing. Social sequence methods were introduced to the social sciences in the 1980s and, after a period of slow growth during the 1990s, have become increasingly prevalent.

History

Sequence analysis methods were first imported into the social sciences from the biological sciences by the University of Chicago sociologist Andrew Abbott in the 1980s, and they have since developed in ways that are unique to the social sciences. Scholars in psychology, economics, anthropology, demography, communication, political science, and especially sociology have been using sequence methods ever since.
Psychologists have used those methods to study how the order of information affects learning, and to identify structure in interactions between individuals. In sociology, sequence techniques are most commonly employed in studies of patterns of life-course development, cycles, and life histories. There has been a great deal of work on the sequential development of careers, and there is increasing interest in how career trajectories intertwine with life-course sequences. Many scholars have used sequence techniques to model how work and family activities are linked in household divisions of labor and the problem of schedule synchronization within families. The study of interaction patterns is increasingly centered on sequential concepts, such as turn-taking, the predominance of reciprocal utterances, and the strategic solicitation of preferred types of responses. Social network analysts have begun to turn to sequence methods and concepts to understand how social contacts and activities are enacted in real time, and to model and depict how whole networks evolve. Social network epidemiologists have begun to examine social contact sequencing to better understand the spread of disease.
The use of sequence methods was initially met with criticism by sociologists who objected to the descriptive and data-reducing orientation of early sequence methods, as well as to a lack of fit between bioinformatic sequence methods and uniquely social phenomena. Since 2000 there has been a surge of interest in refining sequence methods, leading to some major improvements. Many of the methodological developments in social sequence analysis came on the heels of a 2000 special issue devoted to the topic in Sociological Methods & Research, which hosted a debate over methods for comparing sequences. That debate has given rise to several methodological innovations that address limitations of sequence comparison methods that were developed in the 20th century. In a 2006 article in the American Journal of Sociology, David Stark and Balazs Vedres proposed the term "social sequence analysis" to distinguish the approach from bioinformatic sequence analysis. Sociological Methods & Research organized another special issue on social sequence analysis in 2010, leading to what scholars have dubbed the “second wave” of sequence analysis. Scholars who have made important contributions to social sequence analysis theory, techniques, data collection, or software development include Andrew Abbott, Roger Bakeman, Peter Bearman, Benjamin Cornwell, John Gottman, Laurent Lesnard, Christian Brzinsky-Fay and Ulrich Kohler, Cees Elzinga, Jonathan Gershuny, David Heise, Raffaella Piccarreta, and Katherine Stovel.

Theoretical foundations

The analysis of social sequence patterns has foundations in sociological theories that emerged in the middle of the 20th century. Structural theorists argued that society is a system that is characterized by regular patterns. Even seemingly trivial social phenomena are ordered in highly predictable ways. This idea serves as an implicit motivation behind social sequence analysts’ use of optimal matching, clustering, and related methods to identify common “classes” of sequences at all levels of social organization, a form of pattern search. This focus on regularized patterns of social action has become an increasingly influential framework for understanding microsocial interaction and contact sequences, or “microsequences.” This is closely related to Anthony Giddens’s theory of structuration, which holds that social actors’ behaviors are predominantly structured by routines, which in turn provide predictability and a sense of stability in an otherwise chaotic and rapidly moving social world. This idea is also echoed in Pierre Bourdieu’s concept of habitus, which emphasizes the emergence and influence of stable worldviews that guide everyday action and thus produce predictable, orderly sequences of behavior. The influence of routine as a structuring force on social phenomena was first illustrated empirically by Pitirim Sorokin, who led a 1939 study that found that daily life is so routinized that a given person is able to predict with about 75% accuracy how much time they will spend doing certain things the following day. Talcott Parsons’s argument that all social actors are mutually oriented to their larger social systems through social roles also underlies social sequence analysts’ interest in the linkages that exist between different social actors’ schedules and ordered experiences, which has given rise to a considerable body of work on synchronization between social actors and their social contacts and larger communities.
All of these theoretical orientations together warrant critiques of the general linear model of social reality, which as applied in most work implies either that society is static or that it is highly stochastic in a manner that conforms to Markov processes. This concern inspired the initial framing of social sequence analysis as an antidote to general linear models. It has also motivated recent attempts to model sequences of activities or events as elements that link social actors in non-linear network structures. This work, in turn, is rooted in Georg Simmel’s theory that experiencing similar activities, experiences, and statuses serves as a link between social actors.

Concepts and methods

Sequence analysis techniques are increasingly used to study a wide variety of sequenced, or ordered, social phenomena. Sequences usually refer to phenomena that are ordered temporally, but a sequence may also reflect spatial order, preference order, hierarchical order, logical order, or other types of order. A variety of techniques have been designed to describe, quantify, and predict these orders.

Step-by-step sequence processes

Numerous texts that are devoted to the examination of social sequence phenomena have been published. Some have focused on describing the structure of stochastic microsocial processes - such as turn-taking during conversations - while others have focused on methods for detecting sequential patterns in social behavior. These approaches usually adopt a "step-by-step" perspective on sequential phenomena by focusing on how a given social act or phenomenon is shaped by a preceding event or experience. In this work, analysts are interested in particular types of social transitions or first-order dependence between states. Here, researchers often rely on methods that are designed to identify causal processes, especially time-series, Markov, and event-history and duration regression methods. Research that adopts step-by-step sequence analysis approaches treats a given transition as the outcome of a short causal chain with antecedent causes. This work therefore provides insight into common relationships between social states or statuses.
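
The notion of first-order dependence between states can be made concrete with a small example. The following Python sketch estimates transition probabilities from a handful of sequences by counting how often each state is immediately followed by each other state. The function name, the state codes (S, E, U), and the example histories are hypothetical and serve only as an illustration; this is not a substitute for the time-series or event-history models mentioned above.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate first-order transition probabilities from a list of sequences.

    Returns a nested dict: probs[state][next_state] is the share of
    transitions out of `state` that lead directly to `next_state`.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair each element with the one that immediately follows it.
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1

    probs = {}
    for state, followers in counts.items():
        total = sum(followers.values())
        probs[state] = {nxt: n / total for nxt, n in followers.items()}
    return probs


# Hypothetical histories: S = in school, E = employed, U = unemployed.
histories = ["SSEEE", "SEEUE", "SSUEE"]
probs = transition_probabilities(histories)

for state in sorted(probs):
    print(state, probs[state])
```

Each row of the result describes only what tends to follow a given state; it says nothing about where in the sequence the transition occurs or what came earlier.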

Whole sequences

A limitation of step-by-step approaches is that they divorce transitions from the larger chain of events that precede and follow those transitions. To draw on an example from the Chicago school of sociology, Robert E. Park’s “race-relations cycle” argued that conflict often results when two race groups are exposed to each other. Over time, this often gives way to peaceful coexistence. But this is more likely if inter-group relations involve a gradual development from competition and conflict to accommodation and assimilation. It is less likely if relations fluctuate between positive and negative, as this yields instability and distrust. Only by studying the “whole” sequence of relations is it possible to understand the social context in which future social phenomena may unfold. The whole sequence provides a richer historical context than information about a single antecedent state can provide.
Following this logic, social sequence analysts increasingly view social phenomena as components of larger and more gradual processes, which may or may not exhibit tightly linked sequential elements. This area of social sequence analysis has focused on developing data reduction methods that can detect patterns that underlie complex streams of social phenomena. Andrew Abbott argued that sequence alignment methods in biology and information theory and computer science provided useful models. Both fields had developed combinations of sequence alignment operations to facilitate the comparison of whole sequences. Social scientists adapted these methods in the form of optimal matching analysis, often in conjunction with cluster analysis techniques to aid in the identification of common sequence pattern classes. Sequence alignment methods were first used by social scientists with the goal of identifying commonalities among sequence patterns, categorizing individuals with respect to the classes or “types” of sequences they exhibited. In recent years, sequence analysts have begun to turn to social network analysis methods to depict and measure complex sequence concepts and processes, such as the manner in which sequence states link different social actors together.
These techniques have proved valuable in a variety of contexts. In life-course research, for example, research has shown that retirement plans are affected not just by the last year or two of one’s life, but also by how one’s work and family careers unfolded over a period of several decades. People who followed an “orderly” career path retired earlier than others, including people who had intermittent careers, those who entered the labor force late, as well as those who enjoyed regular employment but who made numerous lateral moves across organizations throughout their careers. In the field of economic sociology, research has shown that firm performance depends not just on a firm’s current or recent social network connectedness, but also on the durability or stability of its connections to other firms. Firms that have more “durably cohesive” ownership network structures attract more foreign investment than less stable or poorly connected structures. Research has also used data on everyday work activity sequences to identify classes of work schedules, finding that the timing of work during the day significantly affects workers' abilities to maintain connections with the broader community, such as through community events. More recently, social sequence analysis has been proposed as a meaningful approach to study trajectories in the domain of creative enterprise, allowing comparison of the idiosyncrasies of unique creative careers. While other methods for constructing and analyzing whole sequence structure have been developed during the past three decades, including event structure analysis, optimal matching (OM) and other sequence comparison methods form the backbone of research on whole sequence structures.
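
As a rough illustration of how whole sequences can be compared, the following Python sketch computes an optimal-matching-style distance: the minimum total cost of the insertions, deletions, and substitutions needed to turn one sequence into another. The function name and the cost values (1 for an insertion or deletion, 2 for a substitution) are assumptions chosen for the example; in practice the costs are set by the analyst, and the resulting pairwise distances are typically passed to a clustering procedure.

```python
def om_distance(a, b, indel=1.0, sub=2.0):
    """Minimum cost of edit operations that transform sequence a into b.

    indel: cost of inserting or deleting one element (assumed value).
    sub:   cost of substituting one element for another (assumed value).
    """
    n, m = len(a), len(b)
    # d[i][j] holds the cheapest way to align a[:i] with b[:j].
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel
    for j in range(1, m + 1):
        d[0][j] = j * indel

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match_cost = 0.0 if a[i - 1] == b[j - 1] else sub
            d[i][j] = min(
                d[i - 1][j] + indel,           # delete a[i-1]
                d[i][j - 1] + indel,           # insert b[j-1]
                d[i - 1][j - 1] + match_cost,  # match or substitute
            )
    return d[n][m]


# Hypothetical histories: S = in school, E = employed, U = unemployed.
histories = ["SSEEE", "SEEEE", "SEEUE"]
distances = [[om_distance(x, y) for y in histories] for x in histories]
for row in distances:
    print(row)
```

With these particular costs a substitution is never cheaper than a deletion followed by an insertion, so the distance depends only on the longest subsequence the two sequences share. Other cost choices weight the operations differently and can produce different groupings.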

Measures

Social sequence analysts are interested in a number of quantifiable properties of ordered social processes, and measures have been developed to reflect these. They include measures of stochasticity or sequential connection, the extent to which states are logical prerequisites of each other, stationarity, the presence of spells, and homogeneity.
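
One of these properties, the presence of spells, is simple to compute. The following Python sketch, assuming that a spell means a run of consecutive positions in the same state, collapses a sequence into its spells and their lengths. The function name and the example sequence are hypothetical.

```python
from itertools import groupby


def spells(sequence):
    """Collapse a sequence into (state, length) pairs for each unbroken run."""
    return [(state, len(list(run))) for state, run in groupby(sequence)]


# Hypothetical history: S = in school, E = employed, U = unemployed.
history = "SSEEUEE"
history_spells = spells(history)
print(history_spells)

# The number of transitions is one less than the number of spells.
n_transitions = len(history_spells) - 1
print(n_transitions)
```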

Visualization

An empirical strength of sequence analysis is its emphasis on methods for visualizing otherwise overly complex social phenomena. A variety of visual aids – especially graphs and network diagrams – make it easier to detect sequence patterns. One visual aid, known as a transition plot, replaces the numbers in the cells of a transition matrix with a visual symbol that reflects the magnitude of the relationship between two given states or phenomena. In this kind of graph, the symbol's size or shape varies with the corresponding transition probabilities. Transitions that occur within a set of sequences can also be depicted using a network-like diagram called a state transition diagram, which displays elements as nodes in a network. This way, relationships between elements can be emphasized using graphical aids, such as by adjusting the thickness of lines between states. Transition plots and state transition diagrams are useful for depicting patterns of first-order transitions. They do not provide information about when transitions occur or overall sequence patterns. One visual aid that is useful in both of these respects is the sequence index plot. This kind of graph displays every sequence in the sample. The y-axis includes all of the observations, stacked on top of each other. The x-axis depicts the sequence positions in order. The observations in the sequence index plot are arranged such that cases with the same sequence order are grouped adjacent to each other on the y-axis. A similar graph, called the state distribution graph, can be used to simplify the patterns that are latent in sequence index plots. Like sequence index plots, state distribution graphs array sequence positions in order along the x-axis. The main difference is that the y-axis contains not individual cases, but the prevalence of each element at each position on the x-axis.
A special type of state distribution graph is the tempogram, which is designed specifically for temporally ordered sequence data. Finally, sequences are often depicted as networks, in which multiple subjects’ sequences are shown to intersect with each other at specific events or instances. This approach is most common in analyses of sequence networks, especially narrative networks.
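
The quantities plotted in a state distribution graph can be computed directly. The following Python sketch, assuming all sequences have the same length, returns the share of cases in each state at each position. The function name and the example histories are hypothetical, and the plotting step is left out.

```python
from collections import Counter


def state_distribution(sequences):
    """Share of sequences in each state at each position.

    Assumes equal-length sequences; zip() would silently truncate
    to the shortest sequence otherwise.
    """
    n = len(sequences)
    return [
        {state: count / n for state, count in Counter(column).items()}
        for column in zip(*sequences)  # one column per sequence position
    ]


# Hypothetical histories: S = in school, E = employed, U = unemployed.
histories = ["SSEE", "SEEE", "SSUE", "SEUE"]
distribution = state_distribution(histories)

for position, shares in enumerate(distribution, start=1):
    print(position, shares)
```

Stacking these shares at each position along the x-axis gives the state distribution graph. Sorting the raw sequences so that identical ones sit next to each other gives the row ordering used in a sequence index plot.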

Institutional development

The first international conference dedicated to social-scientific research that uses sequence analysis methods – the Lausanne Conference on Sequence Analysis, or LaCOSA – was held in Lausanne, Switzerland in June 2012. A second conference was held in Lausanne in June 2016. The Sequence Analysis Association (SAA) was founded at the International Symposium on Sequence Analysis and Related Methods, held in October 2018 at Monte Verità, TI, Switzerland. The SAA is an international organization whose goal is to organize symposia, training courses, and related events, and to facilitate scholars' access to sequence analysis resources.