Gorilla in the room - Beware of snake oil sellers when trying to measure System 1

Dr Ali Goode’s contribution to the debate was featured on Research Live alongside our friends at Simpson Carpenter.

Daniel Kahneman’s Thinking, Fast and Slow popularised the age-old dual-process theory. Within months of its publication, the prevailing opinion was that marketers were ‘missing the other half of the coin’. Where was the insight into gut reactions, the unconscious and the automatic? Quantitative researchers looked to techniques such as the implicit association test (IAT) to reveal what customers were ‘really’ thinking.

IAT is built on the premise that the harder one must work to rectify gut reactions, the slower the response or the more likely one is to make an error. The years have not been kind to this otherwise credible idea. An analysis of 492 studies subsequently found scant evidence linking differences in IAT scores with related behaviours. Being expensive to administer, exhausting to complete, or lacking in benchmarks, implicit techniques with potential had already lost commercial footing.

Implicit reaction time (IRT) won out. Easy to administer, IRT addresses lots of attributes and is fun to do. When Simpson Carpenter trialled IRT in 2012, results were initially encouraging and plausible. When identical questionnaires were re-asked, they were less so. Results differed so substantially, and the data proved so resistant even to severe cleaning and mathematical torture, that IRT was put on ice. Test-retest reliability is a prerequisite for validity. Plainly said, if one cannot return a similar result on the second time of asking, one is not measuring anything real. There is no getting around this, no matter how insightful the results may appear.
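For readers who want to see what a test-retest check looks like in practice, here is a minimal sketch. It assumes reliability is summarised as a Pearson correlation between the scores from two waves of the same questionnaire, which is one common choice among several. The attribute scores are invented purely for illustration and are not Simpson Carpenter data; the function and variable names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary in both waves")
    return cov / sqrt(var_x * var_y)


# Hypothetical brand-attribute scores, invented for illustration only.
wave_1 = [0.62, 0.48, 0.71, 0.35, 0.55]           # first time of asking
wave_2_stable = [0.60, 0.50, 0.69, 0.38, 0.53]    # a measure that repeats
wave_2_unstable = [0.41, 0.66, 0.39, 0.58, 0.47]  # a measure that does not

print("stable measure:  ", round(pearson_r(wave_1, wave_2_stable), 2))
print("unstable measure:", round(pearson_r(wave_1, wave_2_unstable), 2))
```

A measure that repeats returns a correlation close to 1. One that reshuffles the attributes on the second asking returns something near zero or even negative, however plausible either wave looks on its own.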

While it is conceivable that response time can be used to detect duplicity, this is not the context in which it is applied within IRT. Rather, IRT has no rules or incorrect answers to navigate, no struggle between head and heart. So it is hardly surprising to discover that evidence in support of IRT needs to cite experiments that are not based on reaction time.

Paddling the keyboard or screen as fast as one can go, as IRT instructs, still means that answers are subject to highly variable degrees of logical, deliberate filtering, at which point IRT has no advantage over any other form of stated response. Furthermore, dual-process theory maintains that fast mental processing does not, in itself, mean that System 1 is at work.

More recently, I’m guided by the bold and comprehensive GfK paper, which concluded:

The higher differentiation seen in IRT was attributed to higher error rather than reliable differences
IRT suffered low reliability and validity
Stated measures showed stronger relationships with purchase likelihood and recommendation

At best, IRT is too blunt an instrument; at worst, a reading or concentration test, perhaps the drunk cousin of direct questioning: incoherent, fumbling up the stairs, falling at every hurdle. Even if this weren’t so, our hasty judgements, mental blocks, consideration sets, impulse buys or instincts about brands are not functions of reaction time. In this sense, IRT is akin to using a stopwatch, instead of a map, to plot a route.

More troubling is that IRT’s popularity has perpetuated the illusion that these faster, sometimes faulty, sometimes genius, emotionally tinged inclinations hide ‘under the skin’ as clear, stable and distinct concepts, with shared expression across situations and individuals. The hope is that if enough layers were peeled away, these inclinations could be revealed, then quantified. Of course, the neurological and psychological literature strongly refutes this possibility. ‘From everything we know, humans definitely do not work like that,’ would be the polite summary.

Skip forward to the IJMR lecture on Monday (8 July), in which the once-leading advocate of IRT within the UK, Dr Ali Goode, added: “The safest claim is that IRT is a measure of top of mind confidence – it does not measure the unconscious and it’s unhelpful to suggest it does anything of the sort. Given that reaction time adds so much noise, IRT should demonstrate that there are no confounding variables at play, and more importantly, why it should differ from self-reporting… It’s probably time to call time on IRT, or at least be extremely careful how the data is interpreted. It has to be used under advisement depending on the task that it is being applied to. There are better solutions.”

This is not to say that reaction time has no place in modern market research: a case can always be made, but it would be limited. The burden of proof is high, and any such case is too easily challenged. To my mind, it just seems like a lot of faff to collect data which we cannot, even under ideal conditions, use with anything near confidence. Ultimately, this is not how the industry creates value.

So, it is time to hand IRT a pint of water, put it to bed and find consolation in GfK’s third conclusion above: the strength of sober, reflective questioning. Post-rationalised answers still do an impressive job of predicting and diagnosing, measuring much of what it is possible to know. Goodnight IRT. It was fun.

Ryan Howard is director of advanced analytics at Simpson Carpenter
