Computer-aided decision making and human-machine diversity:
The Mammography Case Study
Keywords: diverse redundancy, decision making, decision support,
computer-aided detection, mammography, breast cancer screening,
probabilistic modelling, ethnography, automation bias
Computer-aided decision making involves human operators
supported by automated devices that offer advice, filter or enhance
information, and issue alerts and prompts. It thus involves a human-machine
system whose dependability is affected by the design of the decision
support tools, the procedures for their deployment and use, the training
of operators and the definition of their roles.
The intended role of automation is often purely auxiliary:
the human user retains the authority and responsibility for decisions. The
design intention is to exploit protective redundancy with diversity:
the machine protects against some human errors, and vice versa.
The assumption that "computer support may improve things
but not make them worse" would be very helpful in decisions about
deploying a system. So it is important to ask whether the assumption
holds in a specific case; in which cases it is likely not to hold; and how
to judge, for example, how unreliable a support tool must be before it stops
helping and starts damaging human decision making.
As an example of such systems, DIRC studied the
dependability of Computer Aided Detection
(CAD) in mammography. This is a decision system formed by
clinicians using computer support (CAD tools) in reading X-ray images
(mammograms) for breast cancer screening.
The main features of this case study are the following:
- we used an interdisciplinary approach which combined insights from
reliability engineering, psychology, human factors and ethnography;
- we obtained intriguing, hitherto unreported findings about
potentially detrimental effects of CAD on human decisions for some
categories of cases and users (which we summarise below);
- our findings are unexpected and potentially important, and
have been acknowledged as such by mammography practitioners: one of our
papers [Alberdi et al., 2004] won the 2005 Best Clinical Paper award from the
Association of University Radiologists;
- these results have practical implications
for the design, deployment and evaluation of CAD for mammography, and of
computer-aided decision making in general.
CAD for mammography:
CAD tools are designed to help mammogram readers not to
overlook details that they ought to examine to reach their decisions.
The starting point for DIRC's involvement with this problem
was an earlier
study carried out by University College London (UCL) for the NHS
Health Technology Assessment (HTA) programme, which compared mammogram
readers' performance with and without CAD.
In our study:
- we collected new data specifically to investigate how
clinicians react to incorrect output from the computer aid;
- we conducted more detailed statistical analyses than previous
studies, highlighting how the machine's effect varies between mammograms
and between clinicians, and as a function of whether the machine provides
correct advice or not;
- we used insights from work on software systems
with diverse redundancy to drive probabilistic models; these
underline that variation and co-variation of the "difficulty" of input
cases for different system components substantially affect the
dependability of the overall system, and that focusing on the average
failure probabilities of the components, or assuming statistical
independence among their failures, can be misleading;
- we exploited direct ethnographic observations
of users at work in real and experimental settings, to
provide checks on the realism of experiments, to help identify limits
to the generalisability of results, and to suggest possible
phenomena and causal mechanisms.
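Why co-variation of difficulty matters can be illustrated with a toy simulation (a sketch with hypothetical numbers, not data from this study): if both the reader and the CAD tool fail more often on the same "hard" cases, their joint failure probability exceeds what the independence assumption predicts, even though the average failure rates are unchanged.

```python
import random

random.seed(0)

# Hypothetical model: each case has a "difficulty" d drawn uniformly
# from [0, 1]; both the human reader and the CAD tool fail on a case
# with probability proportional to d, so failures cluster on hard cases.
N = 200_000
human_fail = cad_fail = both_fail = 0
for _ in range(N):
    d = random.random()        # case difficulty
    p_human = 0.2 * d          # harder cases -> more human errors
    p_cad = 0.2 * d            # ...and more CAD errors
    h = random.random() < p_human
    c = random.random() < p_cad
    human_fail += h
    cad_fail += c
    both_fail += h and c

p_h = human_fail / N           # marginal reader failure rate, ~0.10
p_c = cad_fail / N             # marginal CAD failure rate, ~0.10
p_joint = both_fail / N        # observed joint failure rate, ~0.013
print(f"P(human fails)           = {p_h:.4f}")
print(f"P(CAD fails)             = {p_c:.4f}")
print(f"P(both fail), observed   = {p_joint:.4f}")
print(f"P(both fail), if indep.  = {p_h * p_c:.4f}")
```

Here the joint failure probability is roughly a third higher than the independence assumption would suggest, purely because the two components find the same cases difficult; an assessment based only on the average failure rates would miss this.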
Fig. 1. Impact of CAD on different categories of cases.
Fig. 2. Impact of CAD on different categories of readers.
Even though the HTA study
showed no significant overall effect of CAD on human
performance, our analyses suggest that CAD did
affect the decisions of the clinicians who participated in the
trial: CAD was useful for some classes of cases and readers (mostly when
it provided correct advice), but also had a detrimental effect for other
categories of cases and readers (mostly when it provided incorrect
advice). Figures 1 and 2 illustrate the kinds of effects observed.
Our new experiments on expert readers also indicated that
incorrect output from CAD may have detrimental
effects on the decisions of some of its human users, even when they are
experienced and confident, for some categories of mammograms.
Ethnographic observation highlighted how
readers reasoned about the machine's behaviour through the explanations
they gave of its patterns of errors and successes: it showed how
readers anticipated the system's responses, made inferences about how it
worked, and formed judgements as to its effectiveness. Ethnography also
identified ways in which the natural behaviour of readers may violate the
intended use of the CAD tool.
The way the effect of CAD varies over case-reader pairs
may affect its overall effect on a patient population more than its
average rates of false negative and false positive errors [Strigini
et al., 2003]; this
complicates the assessment of a new technology, but may open
cost-effective ways of achieving better overall quality of decisions
[Alberdi et al., 2005(b)].
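To see why the distribution of errors over case-reader pairs can matter more than the average error rates, consider a toy calculation (all numbers hypothetical, not taken from the study): two CAD tools with the same average miss rate can yield very different combined performance, depending on whether their misses coincide with the reader's.

```python
# Hypothetical numbers: a reader alone misses 15% of cancers; two CAD
# tools each miss 10% of cancers on average. Assume a cancer is detected
# if either the reader or the CAD tool flags it, so combined sensitivity
# is 1 - P(both miss).
p_reader_miss = 0.15
p_cad_miss = 0.10

# Tool A: its misses rarely coincide with the reader's (diverse failures).
p_both_miss_A = 0.005   # assumed overlap of misses
# Tool B: its misses fall on the same hard cases the reader misses.
p_both_miss_B = 0.09    # assumed overlap of misses

sens_A = 1 - p_both_miss_A   # combined sensitivity with tool A
sens_B = 1 - p_both_miss_B   # combined sensitivity with tool B
print(f"Combined sensitivity, tool A: {sens_A:.3f}")
print(f"Combined sensitivity, tool B: {sens_B:.3f}")
```

Both tools look identical if judged only by their average false negative rate, yet in this sketch tool A raises combined sensitivity to 99.5% while tool B leaves it at 91%, because tool B fails exactly where the reader needs help most.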
Our interdisciplinary approach has helped in conjecturing
specific cognitive processes that account for the detected behavioural
patterns; this, in turn, may further help designers to avoid undesired
effects [Alberdi et al., 2005(a)].
Further information: Mammography Case Study at CSR, City University, London.
References:
[Alberdi et al., 2005(a)] Alberdi E, Povyakalo AA, Strigini L, Ayton P. Use
of computer-aided detection (CAD) tools in screening mammography: a
multidisciplinary investigation. Br J Radiol. 2005;78 Spec No
[Alberdi et al., 2005(b)] Alberdi E, Ayton P, Povyakalo AA, Strigini L.
Automation bias and system design: a case study in a medical application.
Proc. IEE People & Systems Symposium, London, November 2005.
[Povyakalo et al., 2004] Povyakalo AA, Alberdi E, Strigini L, Ayton P.
Evaluating 'Human + Advisory computer' system: a case study.
HCI2004, 18th British HCI Group Annual Conf. Vol 2. Leeds: British
HCI Group; 2004. p. 93-6.
[Alberdi et al., 2004] Alberdi E, Povyakalo A, Strigini L, Ayton P.
Effects of incorrect computer-aided detection (CAD) output on human
decision-making in mammography. Acad Radiol. 2004
[Hartswood et al., 2003] Hartswood M, Procter R, Rouncefield M, Slack R,
et al. 'Repairing' the machine: a case study of the evaluation of
computer-aided detection tools in breast screening. Proc. Eighth
European Conf. on Computer Supported Cooperative Work (ECSCW 2003),
[Strigini et al., 2003] Strigini L, Povyakalo AA, Alberdi E. Human-machine
diversity in the use of computerised advisory systems: a case study.
Proc. Int. Conf. on Dependable Systems and Networks (DSN 2003), 2003. p. 249.
A complete list of papers is available on the project website.
Contributors include Mark Hartswood.