This research theme deals with the use of diversity - between
systems and between the processes by which they are created and validated -
to reduce risk. Fault tolerance - protective redundancy - is an essential
component of dependability for all kinds of processes and systems. But
within the problem of designing or assessing fault tolerance we can
distinguish two aspects:
- redundancy of structure: ensuring that, for each potential error, there
is someone or something (the redundancy) in charge of dealing with it;
- diversity: containing the risk of the protective elements
failing together with the ones to be protected.
The use of diversity to make things dependable is age-old
and embedded in our notions of common sense: two heads are better than
one, don't put all your eggs in one basket. But beyond the common sense
appeal, difficult questions have to be answered, concerning the two aspects:
- assessment: what we want is a low probability of the
fault-tolerant design being defeated by common failures of the redundant
(protective and protected) components. How can we estimate this in practice?
- achievement: we can pursue low failure correlation
via various common-sense practices meant to
  - isolate the redundant components against common causes,
  - diversify them, in the aspects that have effects
  on their failure behaviour,
but how can we choose effective and cost-effective
combinations of such means?
The second question implies asking
the first question with respect to predicting - with some inevitable
uncertainty - the dependability of the various alternative systems one
could choose to build, rather than for a specific existing system. For
instance, these issues arise when addressing possible trade-offs between
pursuing high dependability of the individual components, and high diversity
and thus effectiveness of fault tolerance.
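
The trade-off just described can be illustrated with a minimal sketch, in the spirit of the "difficulty function" models in the software diversity literature (Eckhardt and Lee; Littlewood and Miller). It assumes that two components fail independently once the class of demand is fixed, but that some demand classes are harder than others. The demand profile, the per-class failure probabilities and all function names are hypothetical, chosen only for illustration; this is not a DIRC model or result.

```python
"""Joint failure probability of a two-component protective pair.

Assumption (hypothetical model): the two components fail independently
*given* the class of demand; each class has its own failure probability.
All numbers below are made up for illustration.
"""


def mean_failure_prob(profile, difficulty):
    """Probability that one component fails on a randomly chosen demand."""
    return sum(p * d for p, d in zip(profile, difficulty))


def joint_failure_prob(profile, difficulty_a, difficulty_b):
    """Probability that both components fail on the same random demand."""
    return sum(p * a * b for p, a, b in zip(profile, difficulty_a, difficulty_b))


# Hypothetical demand profile: two classes of demand, equally likely.
profile = [0.5, 0.5]

# Hypothetical per-class failure probabilities.
# Process A yields the better component on average; process B yields a
# worse one on average, but its weak spot is a different demand class.
difficulty_a = [0.001, 0.01]
difficulty_b = [0.02, 0.001]

pfd_a = mean_failure_prob(profile, difficulty_a)
pfd_b = mean_failure_prob(profile, difficulty_b)

# Option 1: two components from the same process A (no forced diversity).
pair_aa = joint_failure_prob(profile, difficulty_a, difficulty_a)
# Option 2: one component from A and one from B (forced diversity).
pair_ab = joint_failure_prob(profile, difficulty_a, difficulty_b)

# What a naive assumption of unconditional independence would predict.
naive_aa = pfd_a * pfd_a
naive_ab = pfd_a * pfd_b

print(f"single A: {pfd_a:.3e}   single B: {pfd_b:.3e}")
print(f"pair A+A: {pair_aa:.3e}   (independence would predict {naive_aa:.3e})")
print(f"pair A+B: {pair_ab:.3e}   (independence would predict {naive_ab:.3e})")
```

With these made-up numbers, the pair built by the same process fails together more often than an independence assumption would predict, because both components tend to fail on the same hard demands. The diverse pair fails together less often, even though component B is individually less reliable. Whether a real system behaves like either case is exactly the assessment question raised above.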
In computing, diversity has historically been studied as
a topic in its own right with regard to "multiple version software"
[1, 2]. But the principle and the important questions
are relevant for all kinds of applications: for error detection, containment,
recovery processes, whether in a running system or in the process producing
it, and irrespective of whether the fault tolerance functions are performed
by people or machines.
DIRC has set itself the goal of widening the scope of rigorous
knowledge about diversity to a broader range of applications, and of enriching
it by improving the integration of knowledge from engineering disciplines
and from psychology and sociology.
DIRC's work on diversity is probably the most concentrated
research effort anywhere in the world on this important subject. It has
taken advantage of links to previous and ongoing work, and encompassed
both probabilistic modelling and experimental work. The advances produced
extend the scope of knowledge on diversity to cover wider ranges of both
system types and ways that diversity is pursued or threatened. Examples include:
- a case study on the effect of
a decision aid computer system for medical use. The diversity viewpoint
provides important new insight into the limits to the effectiveness of
the tool in reducing errors (or whether it might even cause some errors),
and into directions for improving and correctly assessing this benefit;
- probabilistic modelling of how factors that may violate the separation
between development efforts of diverse designs affect the benefit produced
by diversity. These models give a unified view of many issues affecting
project management decisions in this area; and they provide a useful
modelling template, and some very general insight, about factors that
affect diversity of failures in any system with diverse redundancy;
- modelling the effects of setting complementary, diverse
goals for the members of a development team, to identify under which
conditions this will lead to reducing flaws in the product (this has
been the subject of PhD research at Newcastle: a link to the thesis
will be included here when this has been accepted);
- applying diversity in the arguments
that support dependability claims, and studying how much this diversity
will improve confidence in the claims. This ongoing work aims to bring
clarity to the vague, although intuitively plausible, demands
for such "argument diversity" in current practice and even written guidelines;
- modelling the effects of increasing diversity
of usage in the testing process of a system on the reliability
growth observed by different users of the system, a question prompted
by some claims about the merits of open source project management methods;
- addressing the issue of the extent to which diversity
can bring benefits in security,
a controversial topic in the security community. A DIRC position paper
discusses a more rigorous, probabilistic treatment of the issue, taking
advantage of the results developed in the reliability area;
- exploring the extent of
diversity "naturally" present in off-the-shelf products that nominally
implement a common standard: SQL database servers. This work has highlighted
both the potential and a probable need for adding diversity to the current
solution for fault tolerance in databases;
- exploring the degree and type of spontaneously arising diversity
among programs produced to the same
specification, for several sets of programs submitted to a programming
[1] B. Littlewood, P. Popov, L. Strigini, "Modelling
software design diversity - a review", ACM Computing Surveys, vol.
33, no. 2, 2001, pp. 177-208.
[2] L. Strigini, "Fault
Tolerance Against Design Faults", in Dependable Computing Systems:
Paradigms, Performance Issues, and Applications (Hassan Diab and Albert
Zomaya, Eds.), J. Wiley & Sons, 2005.
Other selected papers from this theme
(more at the links for the individual outcomes)
E. Alberdi, A.A. Povyakalo, L. Strigini, P. Ayton, M. Hartswood, R. Procter, R. Slack,
"Use of computer-aided detection (CAD) tools in screening mammography: a multidisciplinary investigation", British Journal of Radiology, vol. 78, 2005, pp.S31-S40.
Other papers on the DIRC
mammography case study (human-machine diversity in medical systems)
- M.J.P. van der Meulen, L. Strigini,
M. Revilla, "On the Effectiveness of Run-Time Checks", Proc SAFECOMP
2005, 28-30 September 2005, Halden, Norway, (Gran, B.A., Winther, R.,
Eds.), in print. Lecture Notes in Computer Science, Springer-Verlag.
- P. Popov and B. Littlewood, "The
effect of testing on the reliability of fault-tolerant software", Proc
International Conference on Dependable Systems and Networks (DSN2004),
pp. 265-274, IEEE Computer Society, 2004.
- I. Gashi, P. Popov, V. Stankovic
and L. Strigini, "On Designing Dependable Services with Diverse
Off-The-Shelf SQL Servers", in Architecting Dependable Systems, (R.
de Lemos, C. Gacek and A. Romanovsky, Eds.), pp. 196-220, Lecture Notes
in Computer Science, Springer-Verlag
- B. Littlewood and L. Strigini, "Redundancy
and diversity in security", ESORICS 2004, 9th European Symposium on
Research in Computer Security, Sophia Antipolis, France, Springer-Verlag
LNCS 3193, 2004, pp. 423-438.
- L. Strigini, A. Povyakalo and E.
Alberdi. "Human-machine diversity in the use of computerised advisory
systems: a case study", Proc. DSN 2003, International Conference on
Dependable Systems and Networks, San Francisco, U.S.A., 2003, pp. 249-258.
- R. Bloomfield and B. Littlewood, "Multi-legged
arguments: the impact of diversity upon confidence in dependability
arguments", Proc. DSN 2003, International Conference on Dependable Systems
and Networks, San Francisco, U.S.A., IEEE Computer Society, 2003, pp.
- P. Popov, "Reliability Assessment of Legacy
Safety-Critical Systems Upgraded with Off-the-Shelf Components",
SAFECOMP'2002, September 2002, Catania, Italy, Lecture Notes in Computer
- D. Bosio, B. Littlewood, M.J.
Newby and L. Strigini. "Advantages of open source processes
for reliability: clarifying the issues", Presented at Workshop on Open
Source Software Development, Newcastle upon Tyne, February 2002.
related research at City University