I have just finished teaching a two-day multivariate modelling course in Brisbane for AMSRS (the Australian market research society), and one topic which led to an interesting discussion was SEM (Structural Equation Modelling) and causality. As Jon Pinnell, of MarketVision, says, SEM is “A source of excitement to some and great frustration to others.”
At one level SEM is very appealing in that it appears to offer a grand theory of everything. The user creates a model, linking inputs to latent variables, to outputs, and then calculates all the relationships and errors in one go.
However, it seems there are two large problems with SEM.
- To use SEM properly you should already have the model before you see the data. SEM can then be used to test whether a given set of data (e.g. this company, this situation, this process) conforms to the model. If the model correctly ascribes causality, and the data fit the model, then we can accept the causality implied by the model. However, if the model has been developed from the data available, SEM should not really be run until a new set of data has been acquired.
- The second problem, and it is one that applies to many other techniques in addition to SEM, is the dependence on assumptions about linear relationships between inputs and outputs. In the real world, we know many things are not linear; for example, a service can get a bit worse and nothing happens, then a bit worse and nothing happens, then a tiny bit worse and the contract is cancelled.
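The cancelled-contract example above can be sketched in a few lines of code. This is a hypothetical illustration (the data, threshold, and variable names are my own, not from the post): retention is flat as quality declines, until quality crosses a threshold and the contract is cancelled, while a straight-line fit of the kind linear path models assume smooths that cliff into a gentle slope.

```python
import numpy as np

# Hypothetical data: service quality from 0 (terrible) to 10 (excellent),
# and whether the customer keeps the contract. Retention holds steady as
# quality worsens, then collapses past a threshold -- the "tiny bit worse
# and the contract is cancelled" effect.
quality = np.linspace(0, 10, 101)
threshold = 4.0
retained = (quality >= threshold).astype(float)  # 1 = keeps contract

# A straight-line fit -- the kind of linear relationship assumed by the
# path coefficients in SEM -- averages the cliff away into a slope.
slope, intercept = np.polyfit(quality, retained, 1)
linear_pred = slope * quality + intercept

# Just below the threshold the true outcome is certain cancellation,
# but the linear model still predicts a middling chance of retention.
print(f"true retention at quality 3.9:   {retained[39]:.1f}")
print(f"linear prediction at quality 3.9: {linear_pred[39]:.2f}")
```

The point is not that the linear model fits badly overall; it is that it is most misleading exactly where it matters, near the threshold where the real decision gets made.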
In reviewing the current state of thinking I came across the quote below about SEM, which I think sums up the causality issue quite well.
“The term causal model must be understood to mean: ‘a model that conveys causal assumptions,’ not necessarily a model that produces validated causal conclusions.”