[Back to CONTENTS Page - Survey of Medical VR]


2. Techniques of Medical VR:

details of available approaches


John Waterworth

Informatik

last update: July 1999



CONTENTS


2.1 Virtual Bodies and Body Parts

2.2 Modelling Objects and Simulating Behaviours

2.3 Display and Interaction Technologies

2.4 Augmented Reality

2.5 Challenges



2.0 Introduction


What is medical virtual reality (VR)? Simply, VR with a medical application. It involves at a minimum the visualisation of data (which is usually anatomical) in three dimensions, and interaction with that visualised data. In other words, the 3D visualisation must be manipulable through user interaction with the computer system providing the visualisations.

Using VR involves interactive examination and manipulation of data about 3D reality, presented as a simulated 3D reality. Additional important features are the simulation of behaviour of the visualised data (increasingly including behaviour as a result of interaction), and feedback in modalities other than vision (sound, force, touch, smell). The data are then not merely visualised, and it is useful to think in terms of computer "realisation" of data in forms that can engage several of the user's senses.

Weaknesses of current medical VRs include low fidelity of the simulations (level of detail needed to distinguish different tissue types, for example) as compared with the real thing, speed of interaction (which trades off against fidelity) and lack of realism (restricted modalities, unrealistic feedback from interactions).

Another aspect of realism and fidelity that needs further development for surgical applications is the co-registering of data from different sources (MRI and CAT, for example) to produce an accurate, composite realisation with which surgeons can interact in real time. There has also been relatively little work on the human-computer design aspects of medical VR. A focus on these (including more user trials with medical practitioners) would improve usability in actual medical settings.


2.1 Virtual Bodies and Body Parts


Surgery is heavily dependent on patient data. In surgery planning, surgeons interact with models of patient anatomy. In surgery training, they operate on a model that is built from patient data. This obviously requires that the models be as accurate as possible for the available data.

Patient data used in VR may come from several sources familiar to medical practitioners:

Computer (Aided) Tomography (CT) (CAT)

Magnetic Resonance Imaging (MRI)

Ultrasound

Physiological Imaging (PET) (SPECT)

Others: range finders, etc.

Imaging involves the collection of anatomical or physiological data from the patient. Computer graphics techniques - rendering and modelling - are then used to display that data as (part of) a virtual body so that it can be examined and manipulated. The rest of this section deals with imaging and rendering techniques. The next section (2.2) treats the modelling of objects and the simulation of their behaviour.


2.1.1 Volume Imaging Technologies

Data sets originate as voxels (volume elements) derived from the imaging of anatomy (CT, MRI, MRA, MRV, ultrasound) or of physiological function (PET, SPECT, fMRI).

Visible Human as 2D slices (a stack of CT scans)

With current (pre-VR) technology, surgeons view what is essentially 3D information as large sets of 2D "slices". One of the many skills that must be developed by the surgeon is that of mentally imagining how the 2D pictures relate to the 3D anatomical reality. Put very simply, VR does the job of realising data in 3D that was previously done only in the surgeon's imagination. This 3D VR representation can then be examined in detail, shared and discussed with others, and related precisely to physical reality (for example, in stereotactic surgery).
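The step from a set of 2D slices to a 3D data set can be illustrated with a minimal sketch. It assumes a tiny, made-up stack of three slices indexed as volume[z][y][x]; the function names stack_slices and reslice_along_y are hypothetical and chosen only for this illustration, not taken from any particular system.

```python
# Toy illustration: three 2x3 "slices" stacked along z to form a volume.
# The volume[z][y][x] indexing convention is an assumption of this sketch.

def stack_slices(slices):
    """Return a 3D volume (a list of 2D slices) after checking shapes agree."""
    rows, cols = len(slices[0]), len(slices[0][0])
    for s in slices:
        if len(s) != rows or any(len(r) != cols for r in s):
            raise ValueError("all slices must have the same shape")
    # Copy the rows so the volume does not alias the input slices.
    return [[row[:] for row in s] for s in slices]

def reslice_along_y(volume, y):
    """Extract the plane at a fixed y: one row from every slice (z by x)."""
    return [slice_[y][:] for slice_ in volume]

slices = [
    [[0, 1, 2], [3, 4, 5]],
    [[10, 11, 12], [13, 14, 15]],
    [[20, 21, 22], [23, 24, 25]],
]
volume = stack_slices(slices)
# A view that cuts across the original slices, which the surgeon would
# otherwise have to assemble mentally from the 2D pictures.
cross_section = reslice_along_y(volume, 1)
```

Once the slices are held as one volume, views in any axis direction are just different ways of indexing the same data.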

Another advantage of VR is that data from different sources (CT, MRI, MRA and X-ray, for example) can be combined (or registered together) in the same 3D realisation. This is extremely useful in understanding how different aspects of anatomy relate to each other - blood vessels and bones, for example.
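As a rough illustration of what registering two data sets involves, the sketch below aligns two tiny 2D images by exhaustively trying integer shifts and keeping the one with the smallest sum of squared differences. This is a deliberately simplified stand-in: it assumes both images have comparable intensities and differ only by a translation, whereas real multi-modality registration generally needs more general transforms and other similarity measures. All names (ssd, shift, register_translation) are hypothetical.

```python
def ssd(a, b):
    """Sum of squared differences between two equally sized 2D images."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def shift(img, dy, dx, fill=0):
    """Translate an image by (dy, dx) pixels, padding with `fill`."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out

def register_translation(fixed, moving, max_shift=2):
    """Search integer shifts of `moving` for the best match to `fixed`."""
    candidates = [
        (dy, dx)
        for dy in range(-max_shift, max_shift + 1)
        for dx in range(-max_shift, max_shift + 1)
    ]
    return min(candidates, key=lambda s: ssd(fixed, shift(moving, s[0], s[1])))

# Made-up test data: a small two-pixel feature in a 5x5 image.
fixed = [[0] * 5 for _ in range(5)]
fixed[2][2], fixed[2][3] = 9, 5
# The "second scan" is the same feature displaced by one pixel up and right.
moving = shift(fixed, -1, 1)
best = register_translation(fixed, moving)
```

Here best is the shift that moves the second image back onto the first; applying it makes the two data sets occupy the same coordinate frame.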

Voxelman showing registration of several data sources

Computer Tomography (CT or CAT), from tomo, meaning cut or section, and graph, meaning display, is based on low-intensity X-ray projections taken from many angles. A computer then reconstructs a graphical representation of each slice from these projections. CT works by measuring the absorption of X-rays and is particularly good for displaying bones. Variations include Spiral CT and Open CT.

CT scans from the Visible Human

Nuclear Magnetic Resonance Imaging (NMR, MRI), despite the "nuclear" in its name, is based on magnetism and, unlike X-ray imaging, involves no harmful radiation. It relies on the differential decay and recovery characteristics of the proton NMR signal. It gives high contrast among various soft tissues and organs, and so is good for the head, spine and joints. Variations include tagged MRI, MRA (arteries), MRV (veins), and Open MR.

Magnetic resonance techniques are particularly useful in visualising temperature changes for so-called thermal surgery procedures. MR can show changes in tissues resulting from heating, and so can be used to monitor these procedures.

MRA

MRI stack from the Visible Human

Ultrasound is based on echo imaging. An acoustic wave is launched into the body, where it interacts with tissue and blood, and some of the energy returns as echoes. There is no ionizing radiation, and acquisition is in real time. Ultrasound equipment is also relatively inexpensive. Applications of ultrasound include cardiology, neurosurgery, gynaecology, abdominal imaging, and vascular imaging. Because it is harmless and instant, it is often used as a real-time guiding tool in surgical interventions.

Ultrasound views from Fraunhofer

Physiological (functional) imaging is a form of nuclear medicine: it images the decay of radio-isotopes bound to molecules with known biological properties, using a rotating camera. In brain studies, for example, the amount of brain activity determines the level of absorption.

With SPECT (Single Photon Emission Computer Tomography), a gamma-ray emitting radio-isotope chemical is administered, and the photons emitted by the decaying isotope are recorded.

SPECT image

PET (Positron-Emission Tomography) is more precise but also more expensive, since an on-site cyclotron is needed to provide positron-emitting isotopes. Scanners for PET are also more expensive than single-photon cameras.

PET scans


2.1.2 Surface Rendering and Hybrid Models

Surface rendering converts volumes into geometric primitives using a form of isocontouring. Isocontouring relies on "thresholding", which requires knowledge of the data, since noise blurs the boundaries between regions. Reducing a volume to its surfaces, of course, results in a significant loss of information from the original volumetric data. Surface rendering is nevertheless popular because it is highly tractable in computer graphics terms: it capitalises on known polygonal geometry implemented in readily available hardware and software, and it allows shading and other calculations, including deformations of the surfaces. The complexity of the data is reduced, and so rendering is faster, because the surfaces have no contents within them.
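The idea of thresholding a volume and keeping only its boundary can be sketched as follows. This is not an isocontouring algorithm that produces polygons; it is a simpler voxel-level illustration, assuming a made-up 5x5x5 volume and a hand-picked threshold, that marks the voxels lying on the surface of the thresholded region. The function names are hypothetical.

```python
def threshold(volume, level):
    """Binary mask: True where the voxel value is at or above `level`."""
    return [[[v >= level for v in row] for row in sl] for sl in volume]

def surface_voxels(mask):
    """Return (z, y, x) of inside voxels with at least one of their six
    face neighbours outside the region (or outside the grid)."""
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])

    def inside(z, y, x):
        return 0 <= z < nz and 0 <= y < ny and 0 <= x < nx and mask[z][y][x]

    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    found = []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if mask[z][y][x] and any(
                    not inside(z + dz, y + dy, x + dx) for dz, dy, dx in steps
                ):
                    found.append((z, y, x))
    return found

# Made-up data: a bright 3x3x3 block inside an empty 5x5x5 volume.
volume = [
    [[100 if all(1 <= c <= 3 for c in (z, y, x)) else 0 for x in range(5)]
     for y in range(5)]
    for z in range(5)
]
mask = threshold(volume, 50)
surface = surface_voxels(mask)
```

Of the 27 voxels above the threshold, only the single central one is discarded here, but for realistic volumes the interior makes up most of the data, which is why a surface is so much cheaper to render and why the contents are lost.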

Visible Productions skull

From Erlangen: skull and mesh

From PTI

Hybrid Models

Hybrid models are currently very widely used and are composed of polygonal surface models with 2D textures "mapped" onto them. This can produce a quite realistic effect, although inaccuracies can be seen when objects are viewed from different angles - because the surface textures applied tend to be uniform and/or are not changed according to viewing angle or distance.

EVL eye

The most successful surgery trainers - particularly for endoscopic surgery, but also for some (relatively simple) open surgery procedures - use such hybrid models.

Boston Dynamics Suture Trainer


2.1.3 Volume Rendering

Volume rendering depends on casting rays between the eye and the volume in question. The advantages lie in the immediate conversion of volumetric data into a truly 3D rendering. This supports such features as selective transparency, cut views, and so on. Stereoscopic vision is important here to assist in clarifying relative positions of rendered features. The disadvantages revolve around the huge amount of data that must be handled, which requires specialised and expensive hardware.
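A minimal sketch of ray casting through a volume is given below. It assumes the simplest possible geometry (parallel rays running along the z axis, one per pixel, with the viewer on the z = 0 side) and a made-up opacity rule; the names composite_ray and render_along_z are hypothetical. Each ray accumulates the samples it passes through, front to back, weighting each by its opacity and by how much light is still getting through.

```python
def composite_ray(samples, opacity_of):
    """Front-to-back alpha compositing of scalar samples along one ray."""
    colour = 0.0
    transmittance = 1.0  # fraction of light still reaching the eye
    for s in samples:
        a = opacity_of(s)
        colour += transmittance * a * s
        transmittance *= 1.0 - a
        if transmittance < 1e-3:  # early termination: the ray is opaque
            break
    return colour

def render_along_z(volume, opacity_of):
    """Cast one axis-aligned ray per (y, x) pixel through volume[z][y][x]."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    return [
        [composite_ray([volume[z][y][x] for z in range(nz)], opacity_of)
         for x in range(nx)]
        for y in range(ny)
    ]

def semi_transparent(value):
    """Assumed transfer function: low values invisible, the rest half opaque."""
    return 0.0 if value < 50 else 0.5

# Made-up 3-deep volume with a single row of two pixels.
volume = [[[0, 0]], [[100, 0]], [[200, 0]]]
image = render_along_z(volume, semi_transparent)
```

Changing the opacity rule changes which structures show through, which is how selective transparency is obtained. Note that every voxel along every ray is visited for each image, which is the source of the processing cost.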

Volume rendering from CT, MRI data

Volume rendering from MRA, MRI data

Volume rendering works directly on the volumetric data, and renders by traversing the whole data set every time. Surface rendering avoids that by first extracting a surface out of the volumetric data, and then manipulating the surface. This is fine if the surgeon only wants to look at the surface of things (for example, looking at a skeleton), but whenever he wants to look inside, he needs the data from within to be rendered. Volume rendering does that, but the processing is demanding (data sets can go up to hundreds of megabytes) - which is why surface rendering was invented in the first place - to only ever display part of the data.

The other problem with surface rendering is that extracting the surface is not easy, since a threshold that determines inside and outside must be calculated and identified. That can take some time, of the order of hours. Every time the surface is created (using a well-established computer graphics technique called "Marching Cubes") millions of polygons are generated. In other words, although rendering the surface is fast, because graphics machines deal with millions of polygons in a few seconds or less, creating the surfaces takes much longer.

To work on volumes directly, many techniques have been devised and the field is already 15 years old. The trick is to do it in real time, and that is where the Silicon Graphics "volume rendering using 3D textures" technique is useful, since it takes advantage of their specialised hardware to render volumes fast. There are currently several companies trying to do volume rendering in real time, with and without special hardware. VOXAR is one. There is a German company called VolumeGraphics that is developing software for the PC that displays volumes using four CPUs.

In summary, volume rendering is recommended because it is faithful to the original data: it avoids the thresholding step, which empties the data. However, it requires a lot of processing power to transfer 256x256x100 data elements to the screen. Volume rendering using 3D Textures (the name of the SGI technique) is restricted to expensive SGI machines, although prices are becoming more affordable with the introduction of the Octane series.
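To give a feel for the scale involved, the arithmetic for the 256x256x100 data set can be worked through as below. The voxel count follows directly from the figures in the text; the sample size of 2 bytes per voxel and the update rate of 25 frames per second are assumptions made purely for illustration.

```python
# Size of the data set quoted in the text.
nx, ny, nz = 256, 256, 100
voxels = nx * ny * nz                    # 6,553,600 voxels

# Assumptions for illustration only (not from the text):
bytes_per_voxel = 2                      # e.g. 16-bit samples
frames_per_second = 25                   # a modest interactive update rate

volume_bytes = voxels * bytes_per_voxel  # 13,107,200 bytes
volume_mib = volume_bytes / 2**20        # 12.5 MiB
voxels_per_second = voxels * frames_per_second  # 163,840,000 voxels/s
```

Under these assumptions, a renderer that traverses the whole data set for every image must process well over a hundred million voxels each second.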


2.2 Modelling Objects and Simulating Behaviours


The modelling of objects is the first step in adding meaning to what the computer scans and displays. The problem is initially one of identification - which are the objects in a set of volumetric data? Once the objects are known, the simulation of behaviours associated with different objects becomes possible. That means, in a simple case, that if an object is known and not just shown, the surgeon can interact with it (moving one highlighted structure relative to the others, for example).

Many electronic anatomical atlases exist, mostly based on the Visible Human or on standard anatomical atlases used in surgery.

The Visible Human

Segmented and labelled at KRDL, Singapore, from Visible Human data

Talairach and Tournoux Atlas (left) and Schaltenbrand and Wahren Atlas (right) - both courtesy of KRDL

Volume rendering is also a natural way of combining data from different sources - i.e. more than one image modality (CT, MRI, etc.) - a process known as "registration".

VIVIAN from KRDL

Voxelman from University of Hamburg

Having identified objects, it is possible to do things with them, such as highlight them, render them transparent, rotate or otherwise move them. Further than that, through physically-based modelling, it becomes possible to model behaviours inherent in the objects themselves - such as the way they compact or bend when pressure is applied in particular ways.
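A very small example of physically-based modelling is a single point of tissue attached to a damped spring and pushed by a constant force, stepped forward in time. This is only a sketch under assumed, made-up parameter values (stiffness, damping, mass and time step are not taken from any real tissue model), and real simulators connect many such elements together. The function name settle is hypothetical.

```python
def settle(force, stiffness=100.0, damping=5.0, mass=0.1, dt=0.001, steps=5000):
    """Displacement of a damped mass-spring element under a constant force.

    Uses semi-implicit Euler integration: the velocity is updated first,
    then the position is advanced with the new velocity.
    """
    x, v = 0.0, 0.0
    for _ in range(steps):
        acceleration = (force - stiffness * x - damping * v) / mass
        v += dt * acceleration
        x += dt * v
    return x

# With these assumed values the element comes to rest where the spring
# force balances the applied force, i.e. at force / stiffness.
rest_light = settle(2.0)
rest_heavy = settle(4.0)
```

Pressing twice as hard compresses the element twice as far in this linear model. The stiffness parameter is where differences between types of tissue would enter.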

Catheter simulation (da Vinci) from KRDL

A big advantage of object and behaviour simulation is that it allows prediction of outcomes, not just planning the interventions. For example, a face can be visualised after reconstructive plastic surgery to assess the appeal of the results.

Craniofacial surgery simulation from Erlangen, Germany


2.3 Display and Interaction Technologies


Interaction techniques for VR involve a close relationship between output to the user and input from the user. Often, they have been dealt with separately (with different devices), but increasingly it is impossible to separate the two. In this section, we deal with both input and output, sometimes separately, sometimes in an integrated way. For example, the next subsection on visual display is mostly about output from the system, whereas the subsection on tracking is about input, and the subsection on force feedback combines both.


2.3.1 Visual Displays

The centre of any Virtual Reality today is the visual display. VR developed out of computer graphics, and is still largely concerned with how data can best be presented visually to impart a sense of realism. An important aspect of this is how a sense of 3D is conveyed, the most common approach being to mimic binocular vision by rendering two slightly different displays and, by some means, presenting them separately to the two eyes to give a stereoscopic effect.
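The way two slightly different views arise can be sketched with a simple pinhole projection. The example assumes two eyes on the x axis looking straight ahead along z, an eye separation of 0.065 m and an image plane at unit distance; these values and the function names are assumptions for illustration, and practical stereo rendering uses more careful viewing geometry.

```python
def project(point, eye_x, focal_length=1.0):
    """Pinhole projection of a 3D point for an eye at (eye_x, 0, 0)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the viewer")
    return (focal_length * (x - eye_x) / z, focal_length * y / z)

def stereo_pair(point, eye_separation=0.065, focal_length=1.0):
    """Image positions of the same point as seen by the left and right eye."""
    half = eye_separation / 2.0
    left = project(point, -half, focal_length)
    right = project(point, +half, focal_length)
    return left, right

def disparity(point, eye_separation=0.065, focal_length=1.0):
    """Horizontal offset between the left-eye and right-eye images."""
    left, right = stereo_pair(point, eye_separation, focal_length)
    return left[0] - right[0]

near = disparity((0.0, 0.0, 1.0))
far = disparity((0.0, 0.0, 2.0))
```

The nearer point produces the larger offset between the two images, and it is this difference that the visual system interprets as depth.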

When the head is tracked and an appropriate display is used, head motion parallax also serves as a depth cue. Opinions vary about how useful head tracking is. Some studies have shown little benefit, and the computational costs are high. Object motion parallax also provides a cue to depth, but only when the objects move relative to each other. Other cues contributing to human depth perception (accommodation, muscle contraction, eyeball shape and pressure, texture flows, etc.) are generally not used.

Other issues include the resolution of the display (how much detail it can show) and whether accurate manipulation of displayed objects is needed.

Resolution is higher with screen-based, desktop VR, but the sense of immersion is less strong and head-tracking is often considered to be less useful. Because the user is not immersed in the space produced, he cannot conveniently reach into the space to manipulate objects - either his hands get in the way of the display if the stereo image appears in front of the screen, or the screen gets in the way if the volume is presented as if behind the screen. These problems arise, of course, because the apparently 3D space is really produced on a 2D screen. This can be avoided by using a mirror to produce an apparent 3D volume in a place where the hands can physically reach.

Head-mounted displays avoid this problem in another way, by placing the images directly in front of the eyes, tracking the head position and altering the display appropriately. The user can then manipulate things displayed in an apparent 3D space without obstruction by the hands and without the space being out of reach behind the screen. But resolution is low, and tracking head and unconstrained hand positions is inaccurate and computationally expensive - and therefore slow. Typically, users of such immersive VRs report some nausea due to slow updating of the images when they move their heads. The accuracy of head tracking also tends to reduce with time. However, HMD technology is improving all the time, and new techniques - such as Retinal Laser Scanning and Light Pipes - show considerable promise.

HMD from Virtual Research

Vista Medical: HMD for operating theatre, microscope image

The main problem with Head Up Displays, where views of the world (usually video images) are combined with views of data in a form of Augmented Reality, is registration of the different sources. They also tend to be extremely cumbersome.

HUD from University of North Carolina

Projection Systems, using walls and tables, are useful for teaching groups or for collaborative discussions, but are less suitable for individual examination of data or for training of surgical skills in a realistic way.

The CAVE from EVL, University of Illinois

Immersadesk, aka Immersive Workdesk from Pyramid Systems

Responsive Workbench, from GMD, Germany

Mechanically tethered displays attempt to avoid the problems of low resolution and tracking head position inherent in HMDs, by attaching the screen to the front of the face and tracking its position mechanically. This again is rather cumbersome, as shown in the pictures of the BOOM from Fakespace Labs, who also produce a smaller, desktop version called the PUSH.

The BOOM from Fakespace Labs

Reflection systems use the virtual image in a mirror to produce an apparent space with high resolution and into which the user can reach.

Boston Dynamics (left) and CMU Enhanced Reality (right)

KRDL Virtual Workbench

Penn State University mirror display


2.3.2 Tracking

Tracking concerns the various ways in which the hands, the head, and occasionally the rest of the body can be used to manipulate or inspect the data. In other words, it is about detecting the position of parts of the body, so that the display can be changed to reflect user actions.

Trackers in common use include various "wired" gloves, props, joysticks, and specialised trackers with buttons.

When used in surgery, trackers report back 3D position and orientation in space. They tend to rely on connection by radio-frequency (RF) signals rather than wires, since this gives a wide range of action, and allows unencumbered interaction. General purpose trackers, such as the Polhemus stylus and the Ascension Bird, are often used. Coils are used to track catheters. Other connection methods include ultrasound, which is low-cost but suffers from blind areas; mechanical coupling, which is accurate, but bulky and also has blind areas; and infrared, which is low-cost, but requires a clear line of sight and so also results in blind areas.
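How a reported position and orientation are used can be shown with a short sketch that converts a point fixed on a tracked instrument into world coordinates. For brevity the orientation is assumed to be a rotation about the z axis only; a real tracker reports a full 3D orientation, and the numbers and function names here are made up for illustration.

```python
import math

def rot_z(angle_rad):
    """3x3 rotation matrix for a rotation about the z axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def to_world(rotation, position, point_local):
    """Apply the tracker pose: rotate the local point, then translate it."""
    return tuple(
        sum(rotation[i][j] * point_local[j] for j in range(3)) + position[i]
        for i in range(3)
    )

# Assumed reading: sensor at (10, 0, 0), turned a quarter turn about z.
sensor_rotation = rot_z(math.pi / 2)
sensor_position = (10.0, 0.0, 0.0)
# Assumed instrument geometry: the tip is 1 unit along the sensor's x axis.
tip_offset = (1.0, 0.0, 0.0)
tip_world = to_world(sensor_rotation, sensor_position, tip_offset)
```

The display is then updated using tip_world, so that the virtual instrument follows the real one.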

Gloves report finger position and angle and provide a very natural interface. They also require a tracker to determine the absolute position of the hand. However, they are hard to calibrate (and the calibration slips progressively with time) and cumbersome to put on.

Wired glove from UC Berkeley Robotics Department

University of Virginia "Props" Interface

Bat (U. Alberta)

Joysticks (Division)

Polhemus stylus


2.3.3 Tactile Feedback

The purpose of tactile feedback is to convey a sense of the feel of an object or surface - its texture, weight, response to pressure, etc.

Vibro-tactile devices depend on either voice coils or piezo-electric vibrators to vibrate a surface against a finger tip at various frequencies. Similarly, electro-tactile devices create vibrations in the finger tip - they function like the pads that physiotherapists use to stimulate the muscles electrically.

Micro pin arrays are more sophisticated and can give an impression of complex surface textures. They involve a matrix of tiny pins each of which can apply pressure to the skin, reflecting the texture of the surface. Unlike the vibration-based devices, they can also provide information about edges.

Pneumatic systems go a stage further, allowing shape as well as textures and edges to be conveyed. They usually work in combination with a glove, by dynamically filling pockets in the glove with air to convey the feel of the object being displayed. However, they do not deal with forces, and so the illusion is destroyed when pressure is applied by the fingers, which then seem to pass directly through the object.

UC Berkeley, Robotics and Intelligent Machines Laboratory


2.3.4 Force Feedback

Rutgers Force Feedback Glove

Force feedback systems combine output of forces from the system with input of positions and forces to the system. This means that the user feels the force of objects in response to the forces he applies. Objects have apparent weight and inertia. To work, they require a structure against which forces can be generated. This is sometimes achieved by means of an exoskeleton that fits over the hand or glove, or by means of a specialised "gantry" through which all manipulations must be made.
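The loop of position in, force out can be sketched with a simple penalty model, in which the force pushes the probe back out of a virtual object in proportion to how far it has penetrated. This is one common simplification, not the method of any particular device; the spherical object, the stiffness value and the function name are assumptions for illustration.

```python
import math

def sphere_contact_force(probe, centre, radius, stiffness):
    """Force on a probe touching a virtual sphere (zero when outside).

    The force acts along the outward surface normal and grows linearly
    with penetration depth, like a spring.
    """
    offset = [p - c for p, c in zip(probe, centre)]
    dist = math.sqrt(sum(o * o for o in offset))
    if dist >= radius:
        return (0.0, 0.0, 0.0)
    if dist == 0.0:
        # Direction is undefined at the exact centre; pick +x arbitrarily.
        normal = [1.0, 0.0, 0.0]
    else:
        normal = [o / dist for o in offset]
    depth = radius - dist
    return tuple(stiffness * depth * n for n in normal)

centre, radius, stiffness = (0.0, 0.0, 0.0), 1.0, 200.0
touching = sphere_contact_force((0.5, 0.0, 0.0), centre, radius, stiffness)
free = sphere_contact_force((2.0, 0.0, 0.0), centre, radius, stiffness)
```

In a device, this calculation would be repeated continuously with the latest tracked position, and the resulting force sent to the motors.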

Finger Force Feedback - SensAble's PHANToM

Boston Dynamics Surgical Skills Simulator; Immersion Laparoscopic Impulse Engine

To prevent the hand going through the object if large forces are applied, the exoskeleton or gantry must be bolted onto a more-or-less immoveable object, such as a heavy desk or the floor.

As well as the manipulation of data, such devices can form a component in robotic surgery.

SRI Robotic Input Device

Force feedback systems are good for certain types of application, notably endoscopy simulation, but they are still too crude to convey subtle tactile differences between soft tissue types.


2.3.5 Auditory Displays

Sound is currently used only to signal relatively simple information - when a certain structure has been selected, for example - or to attract attention. However, the potential of auditory displays is much greater. People are very good at detecting changes in an auditory signal, and this allows sounds to be used to convey a range of subtle variations in texture. Sounds can also convey position information and provide feedback on whether a path is being accurately followed.
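As an illustration of how sound might give feedback on path following, the sketch below maps the distance of an instrument from a planned path onto the pitch of a tone, so that the tone rises as the instrument strays. The mapping (one octave per 10 mm, capped at two octaves above a 440 Hz base) is an arbitrary assumption, as is the function name; it is not taken from any existing system.

```python
def deviation_to_pitch(deviation_mm, base_hz=440.0, mm_per_octave=10.0,
                       max_octaves=2.0):
    """Map distance from the planned path to a tone frequency in Hz.

    On the path the base tone is heard; the pitch rises by one octave
    for every `mm_per_octave` of deviation, up to `max_octaves`.
    """
    octaves = min(abs(deviation_mm) / mm_per_octave, max_octaves)
    return base_hz * 2.0 ** octaves

on_path = deviation_to_pitch(0.0)
slightly_off = deviation_to_pitch(10.0)
far_off = deviation_to_pitch(50.0)
```

A rising pitch exploits the listener's sensitivity to changes in an auditory signal without requiring the eyes to leave the display.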

Auditory displays have the potential to be used in all areas of medical VR - from endoscopic trainers to complex surgery planners. The main drawback is that they are unnatural, but it remains to be tested how important this is in practice.


2.4 Augmented Reality


Augmented reality refers to the blending of the simulated virtual world (from medical data) with the real world (from normal vision or video). One popular application is in obstetrics, using data obtained from real-time ultrasound scans.

University of North Carolina

Ultrasound data seen in HUD superimposed on real world

The main problem is that of registering the two or more sources together accurately. There is also a suggestion that surgeons are unhappy with augmented reality because it implies a degradation of their direct view of the patient. They may prefer to switch between the two, rather than have the two fused together.
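The blending itself, once the sources are registered, can be sketched as a weighted mix of each video pixel with the corresponding pixel of the rendered data. The example assumes tiny made-up greyscale images in which None marks overlay pixels that carry no data; the function name blend is hypothetical.

```python
def blend(video, overlay, alpha):
    """Mix a rendered overlay into a video frame, pixel by pixel.

    alpha = 0 shows only the real view, alpha = 1 shows only the data
    wherever the overlay has any. Overlay pixels set to None leave the
    video untouched.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return [
        [v if o is None else (1.0 - alpha) * v + alpha * o
         for v, o in zip(video_row, overlay_row)]
        for video_row, overlay_row in zip(video, overlay)
    ]

video = [[100, 100]]
overlay = [[200, None]]
fused = blend(video, overlay, 0.5)
real_only = blend(video, overlay, 0.0)
data_only = blend(video, overlay, 1.0)
```

Setting alpha to 0 or 1 corresponds to switching between the two views rather than fusing them, which may suit surgeons who do not want their direct view degraded.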


2.5 Challenges




See also: Section 4 Resources and Sources in Medical VR



[On to NEXT - Section 3 Applications]
