Design Goals
The original design considered a play-based environment in which the child would drive the interaction. In this concept, the doll initialized the system when picked up. Picking up the doll would prompt it to express itself by making an affective sound, lighting up, or jiggling. These mechanisms were meant to reinforce the affective expression on the doll's face and demonstrate that doll's emotional quality. Picking up the doll would also establish a feedback loop between the doll and the system and retrieve a video clip matching that emotion. When the doll was set down in front of the screen, the video would play a scene in which an actor or animated character expressed the same emotion as the doll. Each time the character on the screen expressed an emotion, the doll would express that same emotion as well; for example, when the character on the screen giggled, the doll would giggle too. Each time the doll expressed itself, a new scene would appear on the screen showing another way that emotion could be shown. When a new character appeared on the screen, the doll would express itself again, a new scene would appear, and so on, completing a system loop.
The advantages of this design were its child-driven approach and the entertaining, child-selected interaction. Though this approach could be fun, there was concern that it might confuse an autistic child. In addition, an autistic child's ability to recognize emotion could not be measured with this style of interaction. Because meaningful data could not be collected on how well the child distinguished the basic emotions, a different approach was implemented.
ASQ displays an animated show and offers pedagogical picture cues -- the dwarf's face, word, and Mayer-Johnson icon -- as well as an online guide that provides audio prompts to encourage appropriate response behavior from the child. The task was to have the system act as an ever-patient teacher. This led to a design focused on modeling antecedent interventions used in operant conditioning. In essence, ASQ represented an automated discrete-trial intervention tool for teaching emotion recognition. In this design, the video clip initializes the interaction instead of the doll.
The current system starts with a clip displaying a scene with a primary emotion (antecedent) for the child to identify and match with the appropriate doll (target behavior). After the short video clip plays, it returns to a location in the clip and freezes on that frame, which reinforces the emotion that the child is prompted to select. The child then indicates which emotion he recognizes in the clip, or frame, by selecting the doll matching that expression. The doll interface, which is the only input device to the system, creates a playful interaction for the child.
Surprisingly, a play-mode still remained part of the system and promoted social interaction when used in a different context: play rather than training. Assigning a doll to each person seated around the computer creates social play among group members, with each doll serving as its holder's avatar. When an emotion displays on the screen, anyone can use his or her doll to match the emotion shown. For example, if Johnny recognizes the happy emotion on the screen and player Jody has that doll, Johnny can say, "Jody, you have the happy doll," thus promoting joint-attention. Johnny and Jody now share communication and affect: communication when Johnny tells Jody to match the emotion with his doll, and affect when the doll is matched to the displayed emotion.
The system, Affective Social Quest, gives a child the ability to view a sequence of movie segments on the screen that demonstrate basic emotions. Through the system prompts, the child also sees multiple representations of the emotion presented. Within the application the child can see several expressions of each emotion drawn from a variety of clips. The dolls and characters associated with a particular emotion will hopefully encourage the child to mimic the vocalization, facial expression, posture, or gesture seen on the screen.
The screen interaction has two different parts: child and practitioner. The child interface displays elements that were configured by the practitioner. The number of available options enhances the interaction capabilities by allowing the practitioner to set up a specific training plan for each child session, as done in manual behavior-analytic trials. The screen layouts for the practitioner are presented below, along with the reasons for choosing that interface.
This section presents the elements of the design, shown in screen captures, to provide an inside view of the application windows. The practitioner can register a new child or select the pre-set profile for the child and then set up the child session. The session will display the presentation on the child’s screen based on the practitioner’s selections in the configuration window.
Figure 5: Create Profile Screen
The practitioner interface contains four main windows: Session Profile, Add New Profile, Success History, and Configuration. A fictitious set-up, using the name Johnny, demonstrates the flow of how the practitioner interacts with the following windows.
Figure 7: Success History Screen
Success History displays the child's success overview and sessions to date. The window presents the original interface designed for the application. The Emotion indices indicate the emotions presented to the child along with the child's performance rating for each emotion. The performance rating, Fluency, shows the child's percentage of correct responses to date for each emotion. Fluency is discussed later with the other performance measures. The overview helps the practitioner view the overall success for that child to date. Instead of averaging the overall data to date, more detail was gathered trial by trial for each session. The final software version used the data differently than shown here, so the emotion indices are not updated by the system. More will be said later about these measures and the method used in the pilot test conducted at the Dan Marino Center.
The Success History screen is the gateway to other resources in the system. The options Configuration, View Statistics, Different Profile, and Start Session allow access to information stored by the system. Different Profile returns to the Session Profile screen and Start Session begins a child session. Configuration and View Statistics are illustrated and explained below. This screen reflects the original design for how the child's success could have been viewed.
Configuration brings up a window for configuring the session interaction. The window contains two sections that can be set up for the session: Interface Options and Clip Configuration. Interface Options lists the different cue options for each Cue number, and Clip Configuration governs which clips are used.
Many different Cues can be displayed in the interface screen for the child interaction. Visual aids can be selected to display on the child screen: icons, word, dwarf, guide, and pop-up. These form the first category of selectable options. The next category is Doll Cues. Dolls can cue the child with one of three choices: affect sound, hatband lights, or internal vibration. Continuing to the right, the next category is Guide Cues, prompts spoken by an online guide.
Figure 8: Interface Options section of Configuration Screen
The guide audibly plays one of three different prompt sequences to encourage the child to select a doll matching the emotion in the video clip. For instance, when a happy clip plays, the guide will say "match happy" when Match is chosen, "put with same" when Same is chosen, or "touch happy" when Touch is chosen. Likewise, the prompt following an incorrect doll selection will say, for Match, "that's sad, match happy." One row of selections sets up one of seven configurable cue options for the interface. After the selections for one Cue row are complete, another set of hints can be selected.
Seven different cue set-ups are configurable for one session with the timing, sequence, and repeat rate tailored for each Cue. The Seconds Until Cue select box allows the practitioner to set the time interval between each Cue series. The variety and flexibility of Cue options give the practitioner a tiered approach for setting up the interaction most effective for a particular child.
The order in which the Cues will occur can be set in the Next Cue selection box. The flexibility allows the practitioner to experiment with different Cue approaches to aid each child towards his best performance in emotion recognition.
The last Cue category includes the option of a reinforcement clip. When the child selects the appropriate doll, the guide reinforces that selection by saying "That's good, that's <emotion>," naming the correct choice. An option to reward the child with a reinforcement clip, which repeats for five consecutive correct matches, can be enabled by clicking that check box.
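To make the prompt behavior concrete, the guide's spoken phrases can be thought of as templates keyed on the configured prompt style and the target emotion of the current clip. The Java sketch below only illustrates that idea; the class, enum, and method names are hypothetical and are not taken from the ASQ source code.

    // Hypothetical sketch of how guide prompts could be composed from the
    // configured cue style and the emotion of the current clip.
    public class GuidePrompts {
        public enum Style { MATCH, SAME, TOUCH }

        // Prompt played after the stimulus clip freezes.
        static String initialPrompt(Style style, String emotion) {
            switch (style) {
                case MATCH: return "Match " + emotion;
                case SAME:  return "Put with same";
                case TOUCH: return "Touch " + emotion;
                default:    return "";
            }
        }

        // Prompt played after an incorrect doll selection.
        static String correctionPrompt(String selected, String target) {
            return "That's " + selected + ", match " + target;
        }

        // Reinforcement played after a correct selection.
        static String reinforcement(String emotion) {
            return "That's good, that's " + emotion;
        }
    }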
Special clips are selected and stored as reinforcement clips in the database. Reinforcement clips are not focused on emotion as much as on rewarding the child with entertainment -- for example, the Winnie the Pooh Tigger Bounce song -- and may reinforce the child's match of the correct doll and motivate the child. A reinforcement clip plays after the child touches the correct doll. After the next stimulus clip plays and the child matches that emotion, the same reinforcement clip repeats; it repeats five times before a different reinforcement clip is used. The reinforcement clips are selectable from the clip configuration screen.
Figure 9: Clip Configuration section of the Configuration Screen
Clip Configuration offers customization as well. The clip list can be sorted by clicking any one of the column headers: Title, Source, Complexity, Duration, Primary Emotion, or Filename. Sorting gives the practitioner a quick view of different aspects to choose between or group together. Clips can be deselected by highlighting the clip or series of clips and hitting the space bar. For example, Cinderella may not be the best stimulus source for one child because its clips may be too complex or the child has watched them many times at home. Alternatively, certain emotions can be deselected in early trials.
The design objective was to offer as much flexibility to the practitioner as possible for customizing the screen interface for a particular session or child. This is especially important because autistic children usually have unique idiosyncratic behaviors. Clicking the Done button returns the practitioner to the Success History screen. A session using the configuration just set up is started by clicking the Start Session button.
The practitioner interface sets up the child screen interface. The child screen interface provides a therapeutic environment with sub-systems that create varied ways for the child to interact with the application. The media elements of the child interface follow.
The screen interface serves as an output device for the application. This session screen was designed to allow the child to view images in a consistent location. Images of the icon, dwarf, word, and guide always appear in the same spot. The panel allows enough area for each to be displayed unequivocally and frames the video clip nicely. Initially, the idea was to have all the images appear in the bottom bar, but this crowded the screen and could distract the child by drawing unnecessary attention to that area and away from the video.
Figure 11: Icons
Figure 12: Armbands
The different cues are intended to complement existing teaching materials and to reinforce the images in the system. Mayer-Johnson is the originator of Picture Communication Symbols (PCS), a set of 3,200 symbols used in many schools to visually aid communication by nonverbal and autistic individuals (Mayer-Johnson 99). Each doll comes with its own removable icon that can be used as a matching cue or removed for advanced interaction. Incorporating these icons was intended to complement certain standardized teaching methods.
Words are coupled with each icon picture. Nonverbal and autistic children often learn words from pictures, such as the PCS, and will sometimes carry a book containing these images to communicate with when they are not able to articulate verbally. Speech and language pathologists help children use the pictures to learn words and to construct a story. In keeping with this model, the word appears over the icon as well as in its own screen frame.
The guide animates to engage the child either with a positive reinforcement or a prompt to help the child make the correct selection. The guide may appear on the screen if the practitioner chose this visual aid. The guide displays no affective content to keep it separate from the other emotion labels in the application.
The visual guide is animated with its mouth moving. The guide's verbal prompts were given flat affect because the bear, the displayed guide, represents more than one emotional state, while the rest of the interface is paired directly to a single emotion; for consistency, the guide's speech carries no affect.
The dwarf faces help the child match the appropriate doll. The face carries the doll's outward, visually expressive features. Visual matching helps children pair the doll showing the same emotion with the content they recognize in the video clip.
Another visual feature is a pop-up window overlaying the video clip. The pop-up is a very short video of someone expressing one of the four emotions. This feature was not implemented in this version of the software but was included in the interface as a placeholder for a later addition.
The purple background color emphasizes the important content of the video in the center while being neither too bright, which might distract the child, nor too dark, which might confuse the child by seeming like a negative reinforcement. The child interface was implemented after many revisions based on suggestions from professionals in the field.
Figure 16: Static Images of Video Clips (angry, happy, surprise, and sad)
Source
Animation, in general, minimizes the background in a scene so that the focal point is usually the main character. Animation also exaggerates the features of emotional expression. Disney and Pixar spend great effort representing the quality of emotion in every pixel of an expression. Animators catalogue expression and movement for posture and facial expression, particularly for eye expression (Thomas 95). Pixar, for example, has a room of mirrors where animators can mimic the postures and expressions of the emotion they are trying to depict as they animate a character.
Included with the animated expressions are realistic human expressions from television child programming, such as Blues Clues. Though these clips represent a small sample of the available programs, it was hard to find a variety of clips for each emotion; footage for the happy emotion, by contrast, was abundant in both non-animated and animated material (of 518 total clips, 57% happy, 16% angry, 15% sad, and 12% surprised; see the appendix for a list of sources).
The clips were not professionally edited, though in special cases they were rendered using Media 100 or Adobe Premiere. The challenge was to crop the video segment to capture the whole audio track for a scene while keeping the visual content focused on the salient expression. In normal interaction, words blend with other words, and cutting them mid-phrase leaves a disturbing audible artifact. People who previewed the clips to validate the footage commented on the audio cuts, so time was spent recapturing segments to reduce those awkward cuts as much as possible.
Table: Low, Medium, and High Complexity Criteria for clips
Format
MPEG-1 was the format chosen because of its compression rate and decoding compatibility across various applications, and mainly because the Java JMF version included that format in its API. MPEG-1 encodes video using intra-coded frames (I frames), which carry still-image data, along with predicted frames (P frames) and bidirectionally predicted frames (B frames), in a coding pattern such as IBBPBBPBBPBB (MPEG 97).
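As an illustration of the JMF dependency, a minimal playback sketch for a local MPEG-1 clip might look like the following; the file path is an assumption, and in ASQ the player's visual component would be placed in the child-screen frame rather than left headless.

    import javax.media.Manager;
    import javax.media.MediaLocator;
    import javax.media.Player;

    // Minimal JMF sketch: create and start a player for a local MPEG-1 clip.
    // The clip path is illustrative only.
    public class ClipPlayback {
        public static void main(String[] args) throws Exception {
            MediaLocator clip = new MediaLocator("file:clips/happy01.mpg");
            Player player = Manager.createPlayer(clip); // JMF decodes the MPEG-1 stream
            player.start();                             // realizes, prefetches, then plays
        }
    }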
Evaluation by Practitioners
Video clips for the target audience -- children between three and five years of age -- were collected. After the animation and the children's television programming clips were digitized, professionals in the field of autism reviewed a videotape sampler of these clips. The responses from the viewers validated the decision to include animation and to keep the clips short, under thirty seconds. Children's programming, from shows such as Baby Songs and Blues Clues, received positive appraisal, and clips from these shows were added to the collection to show real children expressing emotion.
What surprised most of the viewers was the way the clustered sets of emotions affected them as they viewed the sampler. For instance, the cluster of happy clips elated the viewers; they commented on feeling happy after viewing only a three-minute segment of twenty different happy clips edited together. Likewise, the angry and sad clusters swayed viewers toward feeling those emotions. Their feedback helped identify clips containing complex emotions; these were labeled complex or eliminated from the collection because they did not illustrate a basic representation of an emotional expression.
In the meeting, the idea of incorporating a play-mode into the application design was suggested. This was similar to the initial idea of creating a doll-driven system. Those who recommended this approach were curious whether children would prefer a play mode to the behavioral mode. From a design standpoint, including both a video-driven system and a doll-driven system would have required re-designing the application interaction to switch back and forth between the two modes. This was difficult to resolve, particularly because statistics were gathered based on the child recognizing an emotion, and capturing the interaction between the child and doll is easiest with one interaction approach. The doll-driven concept has thus become part of the future work ideas.
Operational Modes
Plush Toy Interface
Video segments initiate ASQ interaction, but to play in this environment, plush doll dwarves are employed. ASQ uses four interactive dolls that represent the basic emotions: angry, happy, surprised, and sad.
ASQ, being an interactive system, also helps in the development of joint-attention. Joint-attention, as mentioned earlier, is a social skill for pointing, sharing, and showing. It incorporates eye-gaze and turn taking with another person. Some autistic children are just not interested in eye contact, and thus rarely initiate it. They prefer to play by themselves, often with inanimate objects, but may not like the loneliness of playing alone. ASQ can help by having different dolls act like playmates.
The dolls may be set up to offer helpful cues during the child session. Each doll either vocalizes its emotion, jiggles internally, or lights up its hatband to cue the child to make the correct selection. After the clip has played, the appropriate doll activates one of the cues set up in the configuration. Though these cues were implemented in the doll hardware, children found them confusing, and they were not used in the pilot at Dan Marino: the child could not easily attend to both the doll cues and the screen cues at the same time. Doll cues may be added in more advanced levels of interaction with ASQ, after the child shows success with screen cues.
Applied behavior analysis (ABA) uses operant conditioning to shape behavior with reinforcement and prompting methods. Given a stimulus, the respondent is to behave in a trained way. The behavior is modified by continually reinforcing the correct behavior while shaping other behaviors toward the expected result. For example, discrete-trial training procedures derived from strict principles of behavior analysis and modification typically address singular developmental skills until some mastery has been observed. This process includes repeated trials over a specific amount of time. In each trial, a practitioner presents an antecedent cue (discriminative stimulus) and, depending on the child's ensuing response, presents a specific consequential event either to reinforce the response or to prompt the child, in order to increase the probability that the targeted skill will be exhibited on subsequent trials (Lubin 89). Although highly effective when administered by humans, this method is expensive and labor-intensive due to low teacher-to-student ratios. ASQ attempts to offset the time demands on the practitioner with a complementary tool.
ASQ implements operant conditioning in its ABA mode. A guide's verbal response or a reinforcement clip rewards the child's correct behavior. The guide also responds to incorrect doll selections with repeated prompts, stating which doll was selected and re-requesting the desired response. Additionally, different screen cues offer matching aids for shaping the behavior. The child can either directly pattern-match -- the picture of the dwarf's face on the screen to the dwarf doll, or a screen icon and word to the icon and word on the doll's armband -- or use standardized intervention tools. All the cues -- dwarf's face, icon, and word -- help the child generalize the emotion to different representations. They assist the child in identifying one representation and associating it with the expression played in the video clip. As the child's performance increases, these shaping aids can be removed, leaving only the video clip stimuli.
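The ABA-mode control flow can be summarized as a loop over discrete trials. The following Java sketch is a simplified paraphrase of that logic, with stub helper methods standing in for the real clip, guide, and doll-interface code; it is illustrative, not the ASQ implementation.

    // Simplified sketch of one discrete trial: antecedent clip, wait for a
    // doll response, then reinforce or prompt. Helper methods are stubs.
    public class DiscreteTrialSketch {
        boolean reinforcementClipEnabled = true;

        void runTrial(String targetEmotion) {
            playStimulusClip(targetEmotion);              // antecedent stimulus
            while (true) {
                String doll = waitForDollTouch(5000);     // wait up to the cue interval
                if (doll == null) {
                    presentNextCue(targetEmotion);        // escalate screen/doll/guide cues
                } else if (doll.equals(targetEmotion)) {
                    guideSay("That's good, that's " + doll);
                    if (reinforcementClipEnabled) playReinforcementClip();
                    return;                               // trial complete
                } else {
                    guideSay("That's " + doll + ", match " + targetEmotion);
                }
            }
        }

        // Stubs for illustration only.
        void playStimulusClip(String emotion) { System.out.println("clip: " + emotion); }
        String waitForDollTouch(int timeoutMs) { return "happy"; }
        void presentNextCue(String emotion) { System.out.println("cue: " + emotion); }
        void guideSay(String phrase) { System.out.println("guide: " + phrase); }
        void playReinforcementClip() { System.out.println("reward clip"); }

        public static void main(String[] args) {
            new DiscreteTrialSketch().runTrial("happy");
        }
    }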
Application Development
The hardware interfaces the dolls to the system through infrared communication. Dolls are embedded with iRX 2.1 boards. The iRX is a circuit card measuring 1.25" × 3" with an RS-232 serial port, a visible light-emitting diode (LED), an infrared LED, and an infrared detector. An 8-bit microcontroller, the PIC16F84 made by Microchip Technology, Inc., controls the board (Poor 99). The iRX 2.1 uses five of the programmable integrated circuit (PIC) input-output (I/O) ports; the remaining eight ports are used by the applications that control the doll features: toy-detection switch, recorded affective sound box, haptic internal vibration motor, and hatband LED lights.
Each toy has a unique ID recognized by the system. The system sends codes to the doll during each session for custom responses based on the configured parameters. The system continually polls the toys to identify a doll selection from the child's touch on the touch switch over a set period of time, and the dolls continually check for data from the system to activate the appropriate features based on the configured cues.
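A sketch of the system-side polling might look like the following, assuming the era's Java Communications (javax.comm) API for the serial port that the infrared receiver box is attached to; the port name, baud rate, and byte-level protocol shown are assumptions, not ASQ constants.

    import java.io.InputStream;
    import javax.comm.CommPortIdentifier;
    import javax.comm.SerialPort;

    // Illustrative polling loop for doll codes arriving over the serial port
    // connected to the infrared receiver box.
    public class DollPoller {
        public static void main(String[] args) throws Exception {
            CommPortIdentifier id = CommPortIdentifier.getPortIdentifier("COM1");
            SerialPort port = (SerialPort) id.open("ASQ", 2000);
            port.setSerialPortParams(9600, SerialPort.DATABITS_8,
                    SerialPort.STOPBITS_1, SerialPort.PARITY_NONE);
            InputStream in = port.getInputStream();
            while (true) {
                if (in.available() > 0) {
                    int code = in.read();                 // e.g. 'H', 'S', 'A', or 'Z'
                    System.out.println("doll touched: " + (char) code);
                }
                Thread.sleep(250);                        // poll roughly every 250 ms
            }
        }
    }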
As stated, the system is designed with a great deal of flexibility, so the setup can be customized for each child and each session. The system is also extendable: custom clips tailored for the child can be digitized, loaded into the database on the hard drive, and retrieved randomly along with the others in the system.
The Java programming language was chosen to develop the application because of its portability, rapid prototyping capability, and media-rich packages. Designing the system behavior was challenging because of the desired built-in flexibility. The application uses a multi-threaded environment with two application programming interface (API) architectures, the JDK and JMF. This, coupled with the transfer of interface information synchronized to the system clock, made debugging a complex and daunting task even for minor modifications.
The system is subdivided into five primary task functions, illustrated in figure 18. The main program manager is the application's puppet master: it manages the different functions in the application. Being multithreaded, it keeps track of the system interaction, associates it with the media, and syncs it to the system clock while polling the serial port for hardware interaction. The database is the main repository for data entered or selected for interaction.
The system continually updates its cached arrays based on the response interactions from the serial communication and the cue interactions set up in the configuration. Each 250 ms of interaction time is recorded and stored in memory until the system is exited.
After the application is executed, the JDBC-ODBC bridge is established between the database and the Java source code. This connects the front-end to the back-end and manages the data passed between long-term and short-term storage. Using SQL SELECT statements, the application collects data and writes it to an array or to the respective database tables. At start-up, the application calls the database and requests the profile names.
There are five database tables queried by the system. ASQ executes under JDK 1.1.x from an MS-DOS prompt window in a Win32 environment running Windows 95. The application instantiates a session frame and a Java database class. The database (Microsoft Access) is queried through JDBC-ODBC, the bridge between the Java source code and the database. The names of all existing profiles are selected and their addresses are loaded into a table accessed by the application from the system's memory. When the practitioner creates a new profile, a window for the new session frame is instantiated so that profile data can be entered: profile name, deficit, and age. When the Done button is clicked, that new entry is added to the application table and later stored in the database for that new child profile. If the practitioner chooses an existing profile, the Success History frame is instantiated and displayed in the window on the screen. Originally, this frame was to include statistics on the child's performance to date. The frame still exists as part of the interface, but the fields are blank; a different method of data gathering was implemented rather than having the application calculate an aggregate performance rating for the child over all sessions.
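A minimal sketch of the profile lookup through the JDBC-ODBC bridge follows; the DSN name and the Profile table and column names are illustrative guesses rather than the actual ASQ schema.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    // Sketch of querying profile names through the JDBC-ODBC bridge to the
    // Microsoft Access database. DSN, table, and column names are assumed.
    public class ProfileLookup {
        public static void main(String[] args) throws Exception {
            Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");       // load the bridge driver
            Connection con = DriverManager.getConnection("jdbc:odbc:asq");
            Statement stmt = con.createStatement();
            ResultSet rs = stmt.executeQuery("SELECT name FROM Profile");
            while (rs.next()) {
                System.out.println(rs.getString("name"));        // load into the profile table
            }
            rs.close();
            stmt.close();
            con.close();
        }
    }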
A configuration frame is instantiated when the Configuration button is clicked. The new frame displays the flexible set-up options. In the clip configuration section of this frame, video clips are listed and sorted by the system through Symantec's application programming interface (API) package. The same frame holds the interface options. Clicking the radio buttons activates application interface features: guide responses and screen displays for the six cues. When the Done button is selected, all the configuration parameters are cached and dynamically accessed by the application based on the system clock associated with the selected features. After a session, the parameters remain in memory in an array until the application is exited. The array data is then written to a comma separated value (CSV) file along with the interaction responses recorded during the session.
Clicking the Statistics button instantiates the frame where statistics were going to be stored. The original design was to have statistics computed by the system and presented in the Success History frame: the application would collect data for each profile, calculate the statistical measures, and keep a running sum of these measures to be shown in this frame as aggregate statistics for the child. With the expansion of the data-gathering task, a different method of statistical presentation was chosen, which changed the data structure. The data is now exported to a CSV file to be read into a spreadsheet program, such as Microsoft Excel. This design changed in the last stage of system development, when it was decided that each session's data should be preserved and that it would be best to collect more data on the child session interactions. As stated earlier, the new approach made the initial design of the data structure obsolete, and these fields in the frame are blank. The screen currently displays no information based on the child's performance.
Export of the data is called from the menu bar, under File > Export Data, where the practitioner is prompted to give the path and file name for two files: one for the interaction values and the other for the interface options set up by the practitioner in the configuration frame. Data is written to the CSV files at the time export is selected from the menu bar, covering all interactions from the previous export to the present one. The system array is cleared with each export and system shutdown. It is important that the data collected for each session be exported after the session to keep the data separate from session to session and from child to child. When the application is exited, the data for parameters, video frames, and interaction intervals (in milliseconds) stops streaming and is written to the comma separated file.
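The export step amounts to flushing the cached rows to a text file, one comma separated line per recorded trial. The sketch below is a minimal illustration under that assumption; the method and file handling are not from the ASQ source.

    import java.io.FileWriter;
    import java.io.PrintWriter;
    import java.util.List;

    // Illustrative export of cached trial rows to a comma separated value file.
    public class CsvExport {
        static void export(String path, List<String[]> trialRows) throws Exception {
            PrintWriter out = new PrintWriter(new FileWriter(path));
            for (String[] row : trialRows) {
                out.println(String.join(",", row));   // one line per recorded trial
            }
            out.close();
        }
    }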
Three threads run together and share the system processing. The main thread controls the frames (windows); a secondary thread controls the serial communication for detecting doll interaction; and another controls the data passed to the application array for the CSV files and manages upcoming media elements for the interface. The two secondary threads run on their own clocks: the serial thread runs at 250 ms and the data thread at 150 ms. These threads are handled by methods written in the main program.
The main thread controls a JMF panel that plays the video clips in the center of the frame. The secondary thread managing the data sets up the next frames and waits in the background to be called by the main program. For example, data continues to be stored in the array, with addresses accessed from the database and passed back and forth, while serial communication for doll detection and doll activation is handled by the other thread.
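The threading arrangement described above can be sketched as two background loops running on their own clocks while the main thread handles the windows; the loop bodies here are placeholders, and the sketch uses current Java syntax for brevity.

    // Sketch of the secondary threads: one polls the serial port about every
    // 250 ms, the other updates the data cache and upcoming media every 150 ms.
    public class WorkerThreads {
        public static void main(String[] args) {
            Thread serial = new Thread(() -> {
                while (true) {
                    // pollSerialPort();          // detect doll touches
                    sleep(250);
                }
            });
            Thread data = new Thread(() -> {
                while (true) {
                    // recordInteraction();       // append to the session array
                    // prepareNextMedia();        // stage upcoming interface elements
                    sleep(150);
                }
            });
            serial.start();
            data.start();   // the main thread continues to manage the frames
        }

        static void sleep(long ms) {
            try { Thread.sleep(ms); } catch (InterruptedException e) { }
        }
    }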
JMF dictates the timing of the video frame and other screen interface elements; the application has to wait until a clip plays completely before performing another task. Garbage handling became a major problem in early development because of the rapid growth in virtual memory taken up by the interface elements. Each time a clip played, Java's garbage collector was called to clear all memory except the array data and interface components. Initially, either not all of the garbage was freed or no collection occurred at all. With the help of developers on the JMF development team, the code was rewritten based on their suggestions.
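The clean-up step can be sketched with a JMF ControllerListener that releases the player and hints a collection pass when a clip reaches its end; this mirrors the workaround described above but is a simplified illustration, not the ASQ source.

    import javax.media.ControllerEvent;
    import javax.media.ControllerListener;
    import javax.media.EndOfMediaEvent;
    import javax.media.Player;

    // Sketch: when a stimulus clip ends, stop the player, release its media
    // resources, and hint the garbage collector to reclaim clip memory.
    public class ClipCleanupListener implements ControllerListener {
        private final Player player;

        public ClipCleanupListener(Player player) {
            this.player = player;
        }

        public void controllerUpdate(ControllerEvent event) {
            if (event instanceof EndOfMediaEvent) {
                player.stop();
                player.deallocate();   // free prefetched buffers and decoders
                System.gc();           // request a collection pass
            }
        }
    }

The listener would be registered with the player's addControllerListener method before the clip is started.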
The touch switch illustrated is copper tape formed into a switch separated by Velcro and cloth. When the switch is touched, the copper tape makes contact on each side, causing an interrupt in the component software that signals the doll was chosen. The touch switch pad was made with cloth so that it was soft and could not be felt when the doll was touched or lightly squeezed. The touch pad is located in the tummy of the doll, underneath the doll's shirt.
An iRX board controls the component applications for each doll feature selected in the interface options. The dolls are each configured with a touch switch, speaker, pager motor, and a band of four LED lights. These components are stuffed inside the doll, and the wires from each component run through the doll's back and into its backpack. The black box inside contains the wiring for each component connected to the battery-powered iRX board. All the hardware fits into each doll's backpack, which provides a storage container that maintains the plush interface of the doll.
The hatband for each doll contains an array of LED lights, infrared receivers, and infrared emitters. Each set is threaded on its own wire for ground and power. Each wire strand is insulated with plastic casing to shield it from the other wires and hot-glued for extra protection. The three threaded strands -- one for LED lights, one for transmitters, and one for receivers -- are woven together, strategically separated and offset from each other. The woven strands form a U-shape that fits on the doll's head and was sewn into place. Four emitters were selected to extend the range of the wireless communication. This was effective, though it reduced the communication distance between the system's receiving device and the dolls from nine feet to three feet. The change in distance did not affect the interaction because of the dolls' close location to the system's receiver box.
Figure 22: Doll's Recording Unit
The affective sound for each doll was recorded onto a voice-recording hardware chip with a quarter-watt output. A single twenty-second audio clip was recorded for each doll. The component is threaded through the back of the doll's head, and the speaker is placed in the doll stuffing around the doll's mouth.
Inside each doll is a pager vibration motor, located in the doll's nose, which causes the whole doll to vibrate when activated. Originally, the idea was to have the doll visibly jiggle and move on a flat surface, but the power and size of the motor required exceeded the capabilities of the doll's power source. The chosen motor offers a good haptic sensation that can be felt anywhere on the doll when activated.
The dolls are mounted on a table with recliner boards and adhered to the table with Velcro. The Velcro prevents the recliner boards from sliding when a child pushes the selected doll. The recliner holds the dolls upright for the child to see them easily from a chair and pick them up to play with. The ease of placement on the recliner allows the dolls to be arranged differently on the table or for one or more to be removed from a session.
The data communication interface between the dolls and the system uses infrared to wirelessly transmit signals in both directions. The dolls' tetherless design allows them to be played with and arranged in various positions on the table during child sessions.
The dolls receive codes from the system and activate subprograms stored on the iRX board; these programs run the different doll features. For example, the codes sent to a doll could be, for happy, either sound, light, or vibrate (HS, HL, or HV); for sad, either sound, light, or vibrate (SS, SL, or SV); and so on, according to the interface options set up by the practitioner. Dolls send codes to the system to indicate that they are present or that a touch on the touch switch has been detected: happy (H), sad (S), angry (A), or surprised (Z).
Doll programs run a loop to detect whether one of two things takes place: a signal has been received, or the doll has been chosen. When the doll receives a code, its program determines what feature was requested and the doll performs that action. When a doll is touched, the touch switch interrupt shifts from high to low and a code is sent to the system, causing the interface to respond based on the code received. The loop runs continually, acting whenever either of these code signals occurs.
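On the system side, sending a cue to a doll reduces to writing a two-letter code to the serial output stream for the infrared transmitter, combining the emotion letter with the feature letter documented above; the helper below is a sketch under that assumption, not the ASQ code.

    import java.io.OutputStream;

    // Illustrative cue sender: emotion 'H', 'S', 'A', or 'Z' combined with
    // feature 'S' (sound), 'L' (lights), or 'V' (vibration), e.g. "HS".
    public class DollCueSender {
        static void sendCue(OutputStream serialOut, char emotion, char feature)
                throws Exception {
            String code = "" + emotion + feature;
            serialOut.write(code.getBytes());   // transmitted to the doll over infrared
            serialOut.flush();
        }
    }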
Figure 26: Doll's Communication Device
The doll program looks for the ID from the system based on the touch switch interrupt. The system continues to perform this test until either the identification signal is detected or a cue to the doll is sent, activating one of its cue indicators. The infrared receiver has an interrupt handler in the software for detecting transitions and responds accordingly.
Andrew Lippman originally suggested toys for the interface to engage the child. Using toys as the physical interface opened research opportunities to investigate serial communication between multiple objects and one system receiver. Existing toys, such as the ActiMates dolls, can interface with the computer, but are not capable of recognizing more than one doll at a time. The interaction through this interface, as well as the communication among the multiple input devices, explored a novel way of interacting with a computer.
Apparatus
Four toy dolls representing primary emotions were the child's input devices to the application. Each toy doll was loosely positioned on a reclining board adhered to the table with Velcro pads. The dolls were mounted on the reclining boards facing the wireless communication box and the child-monitor, which allowed the children to see the entire doll in front of them. The dolls could be picked up easily from their stands, but were intended to remain on the stand and to be selected when the child pressed the belt-buckle of the chosen doll.
The goal was to see whether children could correctly match the emotion presented on the child-screen to the emotion represented by each doll. For experimental control, the same dolls were used with each child. The automated training was arranged to teach children to "match" four different emotion expressions: happy, sad, angry, and surprised. A standard discrete-trial training procedure with the automated application was used. Subjects sat facing the child-screen, which exhibited specific emotional expressions under appropriate contexts within the child's immediate visual field. A video clip played for between 1 and 30 seconds, displaying a scene in which a character on the screen expressed an emotion. The screen then 'froze' on the emotional expression and waited for the child to touch the doll with the matching emotional expression (correct doll). After a pre-set time elapsed, a specific sequence of visual prompts displayed on the computer monitor and auditory prompts played through the computer speakers.
If the child touched the doll with the corresponding emotional expression (correct doll), the online guide audibly stated "Good, that's <correct emotion selected>," and an optional playful clip started on the child-screen. The application then displayed another clip with emotional content randomly pulled from the application.
If the child does not select a doll, or if he selects the incorrect (non-matching) doll, the online guide provides a verbal prompt: "Match <correct emotion>" for no doll selection, or "That's <incorrect emotion>, match <correct emotion>" for an incorrect doll selection. The system waits for a time configured by the practitioner and repeats the prompts until the child selects the correct doll. An optional replay of the clip can be set up before the session, in which case the application replays the same clip and proceeds with the specified order of prompts configured in the set-up. If the child still fails to select the correct doll, the practitioner assists the child by repeating the verbal prompt and providing a physical prompt, e.g., pointing to the correct doll. If the child still does not touch the correct doll after the physical prompt, physical assistance is given to ensure that the child touches the correct doll.
A decision to collect details of each child's interaction changed the data structure planned for the original profile statistics, and with it how the system treats the data. The revised collection method monitors the child's interaction in milliseconds using the system clock. The interaction parameters set up by the practitioner and the randomly retrieved video clip for a trial are written to an array stored by the system. The system records the time, in milliseconds, at which the child touches a doll; for each set of screen aids for one cue, and for the video clip shown to the child, the system records each doll the child selects and when it was selected. All these data points are written to an array table for each video clip trial. When a session is complete, all the array values, one set per trial, are exportable using the export function in the application. These values are written to a comma separated value file listing the trial interaction values, which can be viewed as rows in a spreadsheet program. With these values, performance ratings can be computed manually. It was thought that the independent trial data would contain more information than a to-date summary of performance ratings generated by the system.
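The per-trial record described above can be pictured as a small data structure whose fields become one CSV row; the field names below are inferred from this description and are not the actual ASQ array layout.

    // Illustrative per-trial record; field names are guesses based on the
    // description above, not the actual ASQ data layout.
    public class TrialRecord {
        String clipFilename;       // randomly retrieved stimulus clip
        String targetEmotion;      // emotion the child is prompted to match
        String cueConfiguration;   // cues active for the trial
        long clipStartMillis;      // system-clock time the clip began
        String[] dollsTouched;     // dolls touched during the trial, in order
        long[] touchMillis;        // time of each of those touches

        String toCsvRow() {
            StringBuilder sb = new StringBuilder();
            sb.append(clipFilename).append(',').append(targetEmotion)
              .append(',').append(cueConfiguration).append(',').append(clipStartMillis);
            for (int i = 0; i < dollsTouched.length; i++) {
                sb.append(',').append(dollsTouched[i]).append(',').append(touchMillis[i]);
            }
            return sb.toString();
        }
    }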
Below are the formulas used to compute the measures. Measures of Response Rate (Ri) are used to track the number of training trials (opportunities during each session as affected by subject compliance).
Ri = re / Ti

where
re = number of incorrect matching doll responses
Ti = elapsed time under that specific session
Fluency with known physical constraints over response rate is: