Submission to the Workshop on Evaluation of Virtual Environments.


A Spectrum of VR Assessment: From Outside-In to Inside-Out

John A. Waterworth

Department of Informatics,
Umeå University,
S-901 87 UMEÅ, Sweden
jwworth@informatik.umu.se

Abstract

VR constitutes a revolution in how people interact with technology, information, and each other. But its potential is so wide that it is impossible to characterise a typical VR application and its usage. As someone once said, "We don't even know what animals are out there, so how can we start doing botany?" In some highly focused, "professional" settings, evaluation is relatively unproblematic. But in a whole host of others, we are still trying to work out what's going on, and are thus some way from being able to measure it. I will discuss this situation as a spectrum ranging from outside-in observation to inside-out reports of immersive experiences.

Keywords:

Professionals, Artists, Tasks, Pleasure, Experience, Concrete, Consciousness, Animals

Position Statement

The range of applications currently addressed through VR stretches from very constrained practical tasks for dedicated professionals - such as surgeons wanting to plan a minimally-invasive route for brain surgery - to artistic experiences that radically, if temporarily, change how people feel about themselves, their bodies, and the world around them. It makes little sense to try to apply the same evaluation techniques across this broad range. In HCI studies, it has been traditional to think of evaluation as something that revolves around users' purposes in terms of the tasks they wish to perform. However, this is too narrow a focus for recent technologies such as multimedia and VR.

To address this broad range of applications of VR, we can put forward a spectrum of techniques for evaluation, ranging from objective outside-in assessments, to subjective inside-out experiences. I will briefly illustrate this spectrum with a few examples in the following text, and the main points are summarised in Table 1 below.

The most distant, outside-in perspective on VR evaluation is typified by an assessment made by an HCI or other expert who was not involved in designing or developing the VR application in question. This is somewhat analogous to an outsider evaluating a factory or university on the basis of short reports from the institution. The expert does not experience the VR, does not observe others experiencing the VR, and makes no objective measures of performance in the VR - although he or she may rely on reports of such measurements carried out by others. This sort of approach underlies at least parts of a recent survey of applications of VR in medicine [1].

Attempts to extract objective measures of performance in virtual reality represent a step inwards from the expert perspective, above. These are often carried out by the designers and developers themselves, to evaluate the success or otherwise of a particular design or implementation. In [2], we describe the design and evaluation of two types of selector tool for use in medical VR applications (HCI aspects of the environment are described more fully in [3]). In this account, we provide details of the background behind the research and focus on design aspects of selector tools for use in a particular task context, that of medical surgery planning. The evaluation is typical of a traditional HCI approach, since it assumes that there are criteria - such as time and accuracy - against which performance can be measured.
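
As a minimal sketch of this kind of task-based measurement, the following Python fragment summarises time, accuracy and errors for each tool being compared. The Trial record, its field names, the tool labels and the sample values are all hypothetical, invented here for illustration; they are not the data or the analysis reported in [2].

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trial:
    """One attempt at a selection task (hypothetical record layout)."""
    tool: str        # label of the selector tool under test (assumed)
    seconds: float   # time taken to complete the task
    errors: int      # wrong selections made before finishing
    correct: bool    # whether the final selection was the intended target


def summarise(trials):
    """Group trials by tool and report mean time, accuracy and mean errors."""
    by_tool = {}
    for trial in trials:
        by_tool.setdefault(trial.tool, []).append(trial)

    summary = {}
    for tool, group in by_tool.items():
        summary[tool] = {
            "mean_seconds": mean(t.seconds for t in group),
            "accuracy": sum(t.correct for t in group) / len(group),
            "mean_errors": mean(t.errors for t in group),
        }
    return summary


# Illustrative values only, not measurements from any study.
demo_trials = [
    Trial("tool_a", 4.0, 0, True),
    Trial("tool_a", 6.0, 1, True),
    Trial("tool_b", 3.0, 2, False),
    Trial("tool_b", 5.0, 0, True),
]

for tool, stats in summarise(demo_trials).items():
    print(tool, stats)
```

A summary of this sort only makes sense where shorter times and fewer errors can be assumed to be better, which is exactly the assumption that holds for professional tools and breaks down for the experiential applications discussed next.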

The range of VR, and of related technologies such as hypermedia, goes beyond tasks, however, and seems to imply a more subjective approach to evaluation, not least because pleasurable experiences are seldom assessed along the same dimensions as work-related tasks. To put it simply, it's generally good to complete work tasks quickly, but often preferable for pleasant experiences to last as long as possible. VR can serve as both a task-related tool and a provider of experiences, along with other fairly recent innovations such as educational games, Web-surfing, and socialising in MUDs. In these examples, it is very hard to correlate the quality of the environment or tool with time or other performance measures. While it might be good to explore an information space very briefly, because that means you quickly found what you wanted (your purpose was satisfied), a long browse session might mean you have found lots of interesting stuff you weren't even looking for (you had no definite purpose, but you had a satisfying experience). I have suggested the potential value of "traveller's tales" for evaluation of interactive experiences where a task-based approach is not appropriate [4].

But the problem of evaluation is more than just tasks versus experiences. In artistic and some recreational applications of VR, the effect may be to render the immersant more or less incapable of giving a coherent account of his or her experience. People who have experienced emotionally very powerful installations such as Char Davies' Osmose will testify to the difficulties of expressing the nature of the event. In this respect, VR is rather like a recreational drug, and the more powerful and effective it is, the less likely we are to be able to give coherent subjective accounts. There are, however, some possibilities for objective measures, such as skin resistance, heart and breathing rate, and pupil dilation, but these are almost never unambiguous.

Table 1 - Spectrum of VE Applications and Evaluation Types*

           | Nature of VE              | Measures taken                            | Evaluation type
Outside-in | environment for work      | "objective" descriptions                  | expert assessment of suitability for work
Outside-in | professional tool/trainer | speed, accuracy, errors on tasks          | objective assessment of task performance
Inside-out | educational/recreational  | stories, tests and questionnaires         | subjective accounts of experiences
Inside-out | psychotropic/artistic     | skin resistance, breath, heart rate, etc. | physiological assessment of bodily responses

*Note that this does not pretend to be comprehensive, and there will be much combining and crossing-over between cells; for example, physiological measures might be disambiguated with subjective reports of experienced duration of a session, and so on.

VR is the culmination of our recent, technologically-motivated, cultural progression towards the concrete (the bodily) and away from the abstract (the mental) [5]. Look at television advertising. Concepts which were once conveyed by a human expert (in a white coat) explaining the advanced science that had gone into, say, the manufacture of a soap powder and the way in which the science works, are now communicated by apparently-3D animated characters representing both the wonderful qualities of the soap powder and the various different kinds of dirt and other stains that bring out the heroic qualities of the personified powder. To move from a focus on the abstract to the concrete is to change one's level of consciousness, and this perspective provides an indirect route to inside-out evaluation, via assessment of internal states of being. Subjective estimates of duration, for example, can help disambiguate reports of experiences and more objective measures of the same events. This may be appropriate, for example, in some psychotherapeutic applications of VR.

Now we create interactive virtual worlds where things which appear to be objects stand for abstract entities and processes. This means that we do not have to do much mental work ourselves to make the abstract concrete and, hence, comprehensible. Since animals share, and often exceed, our concrete skills but not our capacity for abstract thought, this could be seen as developing a style of information representation applicable in principle to animals as readily as to people. A good inside-out test of a VR of this type would be whether it can convince and be used by a fairly smart mammal - say a dog, a cat or a pig [6].

References

[1] Waterworth, J A (1998) - VR in Medicine: a Survey of the State of the Art.

[2] Serra, L and Waterworth, J A - 'Designing Virtual Selectors for Surgeons'. Applied Ergonomics, 28 (4), 1996, 269-275.

[3] Serra, L, Poston, T, Ng, H, Chua, B C and Waterworth, J A - 'Interaction techniques for a virtual workspace'. Paper and video presented at the International Conference on Artificial Reality and Tele-Existence (ICAT) /Conference on Virtual Reality Software and Technology (VRST) '95, Makuhari Messe, Japan, November 1995.

[4] Waterworth, J A - 'Personal Spaces: 3D Spatial Worlds for Information Exploration, Organisation and Communication'. In R. Earnshaw and J. Vince (eds.): 'The Internet in 3D: Information, Images, and Interaction'. San Diego, USA: Academic Press, 1997.

[5] Waterworth, J A - 'Technology in Support of Returning - From Conscious Doing to Consciously Being'. International Conference on "Science and the Primacy of Consciousness". Lisbon, Portugal, April 1998.

[6] Waterworth, J A - 'VR for Animals'. Presented at Ciber@RT'96, Valencia, Spain, November 1996.


Author Biography

John Waterworth is a Senior Researcher at the Department of Informatics, Umeå University, Sweden. He has a PhD in Experimental Psychology and for the last 18 years has worked mainly on HCI - including eight years with British Telecom Labs and six at the Institute of Systems Science, National University of Singapore. His HCI research has been on three main topics: speech-based interaction, multimedia/CAL, and the design of virtual realities (and VR tools). While at ISS in Singapore he helped design and evaluate interface features for the Virtual Workbench and other VR and multimedia applications. John has over 60 publications in these fields, including four books and several book chapters, and has acted as HCI consultant for a wide variety of organisations, including the Singapore Ministry of Defence, the National Computer Board, the British Library, Hewlett Packard, and Yamaha Music. His current research focuses on the design of multimedia environments, and on psychological aspects of VR and cyberspace.