Moderation (statistics)

In statistics and regression analysis, moderation occurs when the relationship between two variables depends on a third variable. The third variable is referred to as the moderator variable or simply the moderator. The effect of a moderating variable is characterized statistically as an interaction; that is, a categorical or quantitative variable that affects the direction and/or strength of the relation between dependent and independent variables. Specifically within a correlational analysis framework, a moderator is a third variable that affects the zero-order correlation between two other variables, or the value of the slope of the dependent variable on the independent variable. In analysis of variance terms, a basic moderator effect can be represented as an interaction between a focal independent variable and a factor that specifies the appropriate conditions for its operation.

Example

Moderation analysis in the behavioral sciences involves the use of linear multiple regression analysis or causal modelling. To quantify the effect of a moderating variable in multiple regression analyses, regressing random variable Y on X, an additional term is added to the model. This term is the interaction between X and the proposed moderating variable.
Thus, for a response Y and two variables x₁ and moderating variable x₂,:
In this case, the role of x₂ as a moderating variable is accomplished by evaluating b₃, the parameter estimate for the interaction term. See linear regression for discussion of statistical evaluation of parameter estimates in regression analyses.

Multicollinearity in moderated regression

In moderated regression analysis, a new interaction predictor is calculated. However, the new interaction term will be correlated with the two main effects terms used to calculate it. This is the problem of multicollinearity in moderated regression. Multicollinearity tends to cause coefficients to be estimated with higher standard errors and hence greater uncertainty.
Mean-centering has been suggested as a remedy for multicollinearity. However, mean-centering is unnecessary in any regression analysis, as one uses a correlation matrix and the data are already centered after calculating correlations. Correlations are derived from the cross-product of two standard scores or statistical moments. Also see the article by Kromrey & Foster-Johnson on "Mean-centering in Moderated Regression: Much Ado About Nothing".

Post-hoc probing of interactions

Like simple main effect analysis in ANOVA, in post-hoc probing of interactions in regression, we are examining the simple slope of one independent variable at the specific values of the other independent variable. Below is an example of probing two-way interactions.
In what follows the regression equation with two variables A and B and an interaction term A*B,
will be considered.

Two categorical independent variables

If both of the independent variables are categorical variables, we can analyze the results of the regression for one independent variable at a specific level of the other independent variable. For example, suppose that both A and B are single dummy coded variables, and that A represents ethnicity and B represents the condition in the study. Then the interaction effect shows whether the effect of condition on the dependent variable Y is different for European Americans and East Asians and whether the effect of ethnic status is different for the two conditions.
The coefficient of A shows the ethnicity effect on Y for the control condition, while the coefficient of B shows the effect of imposing the experimental condition for European American participants.
To probe if there is any significant difference between European Americans and East Asians in the experimental condition, we can simply run the analysis with the condition variable reverse-coded, so that the coefficient for ethnicity represents the ethnicity effect on Y in the experimental condition. In a similar vein, if we want to see whether the treatment has an effect for East Asian participants, we can reverse code the ethnicity variable.

One categorical and one continuous independent variable

If the first independent variable is a categorical variable and the second is a continuous variable, then b₁ represents the difference in the dependent variable between males and females when life satisfaction is zero. However, a zero score on the Satisfaction With Life Scale is meaningless as the range of the score is from 7 to 35. This is where centering comes in. If we subtract the mean of the SWLS score for the sample from each participant's score, the mean of the resulting centered SWLS score is zero. When the analysis is run again, b₁ now represents the difference between males and females at the mean level of the SWLS score of the sample.
Cohen et al. recommended using the following to probe the simple effect of gender on the dependent variable at three levels of the continuous independent variable: high, moderate, and low. If the scores of the continuous variable are not standardized, one can just calculate these three values by adding or subtracting one standard deviation of the original scores; if the scores of the continuous variable are standardized, one can calculate the three values as follows: high = the standardized score minus 1, moderate, low = the standardized score plus 1. Then one can explore the effects of gender on the dependent variable at high, moderate, and low levels of the SWLS score. As with two categorical independent variables, b₂ represents the effect of the SWLS score on the dependent variable for females. By reverse coding the gender variable, one can get the effect of the SWLS score on the dependent variable for males.

Coding in moderated regression

When treating categorical variables such as ethnic groups and experimental treatments as independent variables in moderated regression, one needs to code the variables so that each code variable represents a specific setting of the categorical variable. There are three basic ways of coding: Dummy-variable coding, Effects coding, and Contrast coding. Below is an introduction to these coding systems.
Dummy coding is used when one has a reference group or one condition in particular that is to be compared to each of the other experimental groups. In this case, the intercept is the mean of the reference group, and each of the unstandardized regression coefficients is the difference in the dependent variable between one of the treatment groups and the mean of the reference group. This coding system is similar to ANOVA analysis, and is appropriate when researchers have a specific reference group and want to compare each of the other groups with it.
Effects coding is used when one does not have a particular comparison or control group and does not have any planned orthogonal contrasts. The intercept is the grand mean. The regression coefficient is the difference between the mean of one group and the mean of all the group means. This coding system is appropriate when the groups represent natural categories.
Contrast coding is used when one has a series of orthogonal contrasts or group comparisons that are to be investigated. In this case, the intercept is the unweighted mean of the individual group means. The unstandardized regression coefficient represents the difference between the unweighted mean of the means of one group and the unweighted mean of another group, where A and B are two sets of groups in the contrast. This coding system is appropriate when researchers have an a priori hypothesis concerning the specific differences among the group means.

Two continuous independent variables

If both of the independent variables are continuous, it is helpful for interpretation to either center or standardize the independent variables, X and Z. By centering or standardizing the independent variables, the coefficient of X or Z can be interpreted as the effect of that variable on Y at the mean level of the other independent variable.
To probe the interaction effect, it is often helpful to plot the effect of X on Y at low and high values of Z. Often values of Z that are one standard deviation above and below the mean are chosen for this, but any sensible values can be used. The plot is usually drawn by evaluating the values of Y for high and low values of both X and Z, and creating two lines to represent the effect of X on Y at the two values of Z. Sometimes this is supplemented by simple slope analysis, which determines whether the effect of X on Y is statistically significant at particular values of Z. Various internet-based tools exist to help researchers plot and interpret such two-way interactions.

Higher-level interactions

The principles for two-way interactions apply when we want to explore three-way or higher-level interactions. For instance, if we have a three-way interaction between A, B, and C, the regression equation will be as follows:

Spurious higher-order effects

It is worth noting that the reliability of the higher-order terms depends on the reliability of the lower-order terms. For example, if the reliability for variable A is 0.70, and reliability for variable B is 0.80, then the reliability for the interaction variable A * B is 0.70 × 0.80 = 0.56. In this case, low reliability of the interaction term leads to low power; therefore, we may not be able to find the interaction effects between A and B that actually exist. The solution for this problem is to use highly reliable measures for each independent variable.
Another caveat for interpreting the interaction effects is that when variable A and variable B are highly correlated, then the A * B term will be highly correlated with the omitted variable A²; consequently what appears to be a significant moderation effect might actually be a significant nonlinear effect of A alone. If this is the case, it is worth testing a nonlinear regression model by adding nonlinear terms in individual variables into the moderated regression analysis to see if the interactions remain significant. If the interaction effect A*B is still significant, we will be more confident in saying that there is indeed a moderation effect; however, if the interaction effect is no longer significant after adding the nonlinear term, we will be less certain about the existence of a moderation effect and the nonlinear model will be preferred because it is more parsimonious.

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...