DyPERS: A Dynamic Personal Enhanced Reality System
DyPERS, the 'Dynamic Personal Enhanced Reality System', uses
augmented reality and computer vision to autonomously retrieve 'media
memories' based on associations with real objects the user
encounters. These memories are evoked as audio and video clips
relevant to the user and overlaid on top of the associated real
objects. The system runs as an adaptive, audio-visual learning system
on a tetherless wearable computer. Upon request, the user's visual
and auditory scene is recorded in real time and then associated, by
user input, with a snapshot of a visual object. The object acts as a
key: when the real-time vision system detects its presence in the
scene again, DyPERS plays back the appropriate audio-visual sequence.
DyPERS MPEG, October 1998 (27 MB)
DyPERS MPEG, October 1998 (compressed, 6 MB)
System Overview
As depicted in the figure below, the system consists of the following
main components:
- The generic object recognition system
- A simple wearable computer interface (consisting of a
three-button mouse)
- An audio-visual associative memory
- A wearable system including a microphone, a head-mounted camera,
and a head-mounted display.
The object recognition system is a
probabilistic algorithm capable of discriminating between
hundreds of everyday objects under varying viewing conditions
(lighting, view changes, etc.). See the accompanying publications for
more details about the recognition system. The system currently runs
at a rate of approximately 10 Hz on an SGI O2.
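The recognition code itself is not shown here. As a rough
illustration only, the following is a minimal sketch of one plausible
histogram-based recognizer; the class, its methods, and the
histogram-intersection matching rule are assumptions made for
illustration, not the actual DyPERS algorithm.

```python
import numpy as np

class HistogramRecognizer:
    """Toy object recognizer: compares color histograms of incoming
    frames against stored object prototypes (illustrative only)."""

    def __init__(self, bins=8, threshold=0.7):
        self.bins = bins
        self.threshold = threshold   # minimum match score to accept
        self.prototypes = {}         # object name -> histogram

    def _histogram(self, image):
        # Normalized 3-D color histogram over an H x W x 3 RGB array.
        hist, _ = np.histogramdd(
            image.reshape(-1, 3),
            bins=(self.bins,) * 3,
            range=((0, 256),) * 3,
        )
        return hist / hist.sum()

    def learn(self, name, image):
        # Store a snapshot of the object as its prototype histogram.
        self.prototypes[name] = self._histogram(image)

    def recognize(self, image):
        # Histogram intersection: a high score means the frame's color
        # distribution closely matches a stored prototype.
        h = self._histogram(image)
        best_name, best_score = None, 0.0
        for name, proto in self.prototypes.items():
            score = np.minimum(h, proto).sum()
            if score > best_score:
                best_name, best_score = name, score
        return best_name if best_score >= self.threshold else None
```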
Once an audio-visual clip is stored,
the vision system automatically recalls it and plays it back when it
detects the object that the user chose to serve as a reminder of the
sequence. At the moment the association strategy is very simple: only
many-to-one associations (multiple objects mapped to a single
sequence) are allowed. However, using strategies developed in the
context of the Remembrance Agent, more complex associations will be
addressed.
The current interface consists of three mouse buttons: one for
recording the audio/visual sequence, one for associating an object
with the sequence, and one for creating a rest-of-the-world model,
referred to in the following as the garbage class. This latter class
enables the user to tell the system when it should not play back a
sequence.
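To make the record/associate/garbage workflow concrete, here is a
minimal sketch of how the three-button interface could drive the
associative memory. The class and method names are assumptions for
illustration, not the actual DyPERS code.

```python
class AssociativeMemory:
    """Maps recognized objects to recorded A/V clips; objects assigned
    to the garbage class suppress playback (illustrative sketch)."""

    GARBAGE = "__garbage__"

    def __init__(self):
        self.clip_for_object = {}   # object name -> clip id
        self.pending_clip = None    # last recorded, not yet associated

    def on_record_button(self, clip):
        # Button 1: capture an audio/visual sequence.
        self.pending_clip = clip

    def on_associate_button(self, object_name):
        # Button 2: bind the pending clip to a recognized object.
        # Several objects may point to the same clip (many-to-one).
        if self.pending_clip is not None:
            self.clip_for_object[object_name] = self.pending_clip

    def on_garbage_button(self, object_name):
        # Button 3: mark an object as "rest of the world" so that
        # detecting it never triggers playback.
        self.clip_for_object[object_name] = self.GARBAGE

    def on_object_detected(self, object_name):
        # Called by the vision system each time it recognizes an object;
        # returns a clip to play, or None.
        clip = self.clip_for_object.get(object_name)
        return None if clip in (None, self.GARBAGE) else clip
```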
Hardware
In the current hardware setup, images are transmitted wirelessly to an
SGI O2 workstation. The code is currently being ported to an ordinary PC.
Possible applications of the system
The current system has been used in a museum tour scenario:
A small gallery was created using 20 poster-sized images of various
famous works ranging from the early 16th century to contemporary
art. Three groups of users in different interaction modes were asked
to walk through the gallery while a guide read a script describing
the paintings individually. The guide presented
biographical, stylistic, and other
information for each of the paintings while the subjects
either used DyPERS (group A), took notes (group B), or
simply listened to the explanations (group C).
After the completion of the guide's presentation, the subjects were
required to take a 20-question multiple-choice test containing one
query per painting presented. The following table indicates that users
of the DyPERS system attained slightly higher accuracy on the
multiple-choice test.
Group                        Test accuracy
DyPERS users (group A)       92.5%
with notepad (group B)       83.75%
without any tool (group C)   79.0%
Further possible application scenarios, which have not been explored
yet, include the following:
- Recollection - Remembrance Aid:
One possibility is to use the system for remembering day-to-day
information in an active setting. This could include daily scheduling
and to-do list encoding. The user may record his/her calendar
or some notes indicating important things that need
to be attended to. This recording can then be associated with a
visual snapshot of the user's watch or a clock, which would then
trigger playback.
- Education:
DyPERS has several interesting educational applications.
These introduce a variation to the system's
usual operation, since here the recordings are
performed by an expert while the learner
uses the system in playback
mode. For instance, the expert could be an individual with knowledge
of a foreign language (e.g., French), who would use DyPERS to record a
variety of audio pronunciations of everyday objects and to associate them
with visual snapshots of the objects. Thus, a
novice French learner facing an object of interest could hear the
corresponding French phrase played back.
- Online Instruction - Procedural Information:
Consider the completion of an activity or operation which
involves many sequential steps and their corresponding actions.
DyPERS could be trained by an expert to show a novice how to
perform the complex activity online and interactively.
At each landmark in the activity, the expert would record the next
required sub-action (which would bring the user to the
following state or landmark). For instance, consider the assembly of
some pre-packaged furniture.
- Augmented Perception:
This category includes the variety of further
sensory dimensions we may wish to attach to the inanimate objects we
encounter. For instance, a compact disc could be associated with a
small clip of the music it contains;
a person with poor vision could benefit by listening
to an audio description of the objects in his/her field
of view.
- Virtual Advertising:
One could associate everyday objects with sales pitches, and in
entertainment settings objects could be made to come to life.
Associating Audio/Visual Sequences to Objects
The following example associations are shown (a minimal usage sketch
follows the list):
- association of a personal object (a business card) with the
corresponding person
- association of the wrist watch with the computer-stored schedule
- associations of several images with the garbage class,
in order to teach the system when not to play back a sequence
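Under the same illustrative assumptions as the interface sketch
above, these example associations could be expressed roughly as
follows; all object and clip names here are hypothetical.

```python
# Hypothetical usage of the AssociativeMemory sketch above.
memory = AssociativeMemory()

memory.on_record_button("intro_for_anne.mpg")
memory.on_associate_button("business_card_anne")   # card -> person clip

memory.on_record_button("todays_schedule.mpg")
memory.on_associate_button("wrist_watch")          # watch -> schedule

memory.on_garbage_button("office_door")            # garbage class: never
memory.on_garbage_button("coffee_mug")             # trigger playback

assert memory.on_object_detected("wrist_watch") == "todays_schedule.mpg"
assert memory.on_object_detected("coffee_mug") is None
```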
Context and Links
DyPERS is part of our wearable computer projects and a continuation of
a series of efforts including the Remembrance Agent. The ultimate goal
is to enable computers to act like invisible butlers.
Bernt Schiele,
bernt@media.mit.edu
Last modified: Mon Jul 13 10:05:36 EST 1998