Submission to the Workshop on Evaluation of Virtual Environments.
John A. Waterworth
VR comprises a revolution in how people interact with technology, information, and each other. But its potential is so wide that it is impossible to characterise a typical VR application and its usage. As someone once said, "We don't even know what animals are out there, so how can we start doing botany?" In some highly focused, "professional" settings, evaluation is relatively unproblematic. But in a whole host of others, we are still trying to work out what's going on, and are thus some way from being able to measure it. I will discuss this situation as a spectrum ranging from outside-in observation to inside-out reports of immersive experiences.
Professionals, Artists, Tasks, Pleasure, Experience, Concrete, Consciousness, Animals
The range of applications currently addressed through VR stretches from very constrained practical tasks for dedicated professionals - such as surgeons wanting to plan a minimally-invasive route for brain surgery - to artistic experiences that radically, if temporarily, change how people feel about themselves, their bodies, and the world around them. It makes little sense to try to apply the same evaluation techniques across this broad range. In HCI studies, it has been traditional to think of evaluation as something that revolves around users' purposes in terms of the tasks they wish to perform. However, this is too narrow a focus for recent technologies such as multimedia and VR.
To address this broad range of applications of VR, we can put forward a spectrum of techniques for evaluation, ranging from objective outside-in assessments, to subjective inside-out experiences. I will briefly illustrate this spectrum with a few examples in the following text, and the main points are summarised in Table 1 below.
The most distant, outside-in perspective on VR evaluation is typified by an HCI or other expert not involved in developing or designing a particular VR application making an expert assessment. This is somewhat analogous to an outsider evaluating a factory or university on the basis of short reports from the institution. The expert does not experience the VR, does not observe others experiencing the VR, and makes no objective measures of performance in the VR - although he may rely on reports of such measurements carried out by others. This sort of approach underlies at least parts of a recent survey of applications of VR in medicine [1].
Attempts to extract objective measures of performance in virtual reality represent a step inwards from the expert perspective, above. These are often carried out by the designers and developers themselves, to evaluate the success or otherwise of a particular design or implementation. In [2], we describe the design and evaluation of two types of selector tool for use in medical VR applications (HCI aspects of the environment are described more fully in [3]). In this account, we provide details of the background behind the research and focus on design aspects of selector tools for use in a particular task context, that of medical surgery planning. The evaluation is typical of a traditional HCI approach, since it assumes that there are criteria - such as time and accuracy - against which performance can be measured.
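As a minimal sketch of what measuring performance against criteria such as time and accuracy might look like in practice, the following Python fragment summarises per-tool completion time and accuracy from a set of trial records. The `Trial` record, its field names, and the `summarise` function are hypothetical illustrations, not the procedure used in [2].

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trial:
    """One selection trial (hypothetical record layout, assumed for illustration)."""
    tool: str        # label of the selector tool used in this trial
    seconds: float   # time taken to complete the selection task
    correct: bool    # whether the intended target was selected


def summarise(trials):
    """Return {tool: (mean completion time, proportion correct)}."""
    by_tool = {}
    for trial in trials:
        by_tool.setdefault(trial.tool, []).append(trial)
    return {
        tool: (
            mean(t.seconds for t in group),
            sum(t.correct for t in group) / len(group),
        )
        for tool, group in by_tool.items()
    }
```

A summary of this kind only makes sense where shorter times and fewer errors are agreed to be better, which is exactly the assumption a task-based evaluation makes.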
The range of VR, and of related technologies such as hypermedia, goes beyond tasks, however, and seems to imply a more subjective approach to evaluation, not least because pleasurable experiences are seldom assessed along the same dimensions as work-related tasks. To put it simply, it's generally good to complete work tasks quickly, but often preferable for pleasant experiences to last as long as possible. VR can serve as both a task-related tool and a provider of experiences, along with other fairly recent innovations such as educational games, Web-surfing, and socialising in MUDs. In these examples, it is very hard to correlate the quality of the environment or tool with time or other performance measures. While it might be good to explore an information space very briefly, because that means you quickly found what you wanted (your purpose was satisfied), a long browse session might mean you have found lots of interesting stuff you weren't even looking for (you had no definite purpose, but you had a satisfying experience). I have suggested the potential value of "traveller's tales" for evaluation of interactive experiences where a task-based approach is not appropriate [4].
But the problem of evaluation is more than just tasks versus experiences. In artistic and some recreational applications of VR, the effect may be to render the immersant more or less incapable of giving a coherent account of his or her experience. People who have experienced emotionally very powerful installations such as Char Davies' Osmose will testify to the difficulties of expressing the nature of the event. In this respect, VR is rather like a recreational drug, and the more powerful and effective it is, the less likely we are to be able to give coherent subjective accounts. There are, however, some possibilities for objective measures, such as skin resistance, heart and breathing rate, and pupil dilation, but these are almost never unambiguous.
Table 1 - Spectrum of VE Applications and Evaluation Types*
| | Nature of VE | Measures taken | Evaluation type |
|---|---|---|---|
| Outside-in | environment for work | "objective" descriptions | expert assessment of suitability for work |
| Outside-in | professional tool/trainer | speed, accuracy, errors on tasks | objective assessment of task performance |
| Inside-out | educational/recreational | stories, tests and questionnaires | subjective accounts of experiences |
| Inside-out | psychotropic/artistic | skin resistance, breath, heart rate, etc. | physiological assessment of bodily responses |
*Note that this does not pretend to be comprehensive, and there will be much combining and crossing-over between cells; for example, physiological measures might be disambiguated with subjective reports of experienced duration of a session, and so on.
VR is the culmination of our recent, technologically-motivated, cultural progression towards the concrete (the bodily) and away from the abstract (the mental) [5]. Look at television advertising. Concepts which were once conveyed by a human expert (in a white coat) explaining the advanced science that had gone into, say, the manufacture of a soap powder and the way in which the science works, are now communicated by apparently-3D animated characters representing both the wonderful qualities of the soap powder and the various different kinds of dirt and other stains that bring out the heroic qualities of the personified powder. To move from a focus on the abstract to the concrete is to change one's level of consciousness, and this perspective provides an indirect route to inside-out evaluation, via assessment of internal states of being. Subjective estimates of duration, for example, can help disambiguate reports of experiences and more objective measures of the same events. This may be appropriate, for example, in some psychotherapeutic applications of VR.
Now we create interactive virtual worlds where things which appear to be objects stand for abstract entities and processes. This means that we do not have to do much mental work ourselves to make the abstract concrete and, hence, comprehensible. Since animals share, and often exceed, our concrete skills but not our capacity for abstract thought, this could be seen as developing a style of information representation applicable in principle to animals as readily as to people. A good inside-out test of a VR of this type would be whether it can convince and be used by a fairly smart mammal - say a dog, a cat or a pig [6].
[1] Waterworth, J A (1998) - VR in Medicine: a Survey of the State of the Art.
[2] Serra, L and Waterworth, J A - 'Designing Virtual Selectors for Surgeons'. Applied Ergonomics, 28 (4), 1996, 269-275.
[3] Serra, L, Poston, T, Ng, H, Chua, B C and Waterworth, J A - 'Interaction techniques for a virtual workspace'. Paper and video presented at the International Conference on Artificial Reality and Tele-Existence (ICAT) /Conference on Virtual Reality Software and Technology (VRST) '95, Makuhari Messe, Japan, November 1995.
[4] Waterworth, J A - 'Personal Spaces: 3D Spatial Worlds for Information Exploration, Organisation and Communication'. In R. Earnshaw and J. Vince (eds.): 'The Internet in 3D: Information, Images, and Interaction'. San Diego, USA: Academic Press, 1997.
[5] Waterworth, J A - 'Technology in Support of Returning - From Conscious Doing to Consciously Being'. International Conference on "Science and the Primacy of Consciousness". Lisbon, Portugal, April 1998.
[6] Waterworth, J A - 'VR for Animals'. Presented at Ciber@RT'96, Valencia, Spain, November 1996.