Virtual PAT: A Virtual Personal Aerobics Trainer

James Davis

jdavis@media.mit.edu




Publications On This Work

Virtual PAT: A Virtual Personal Aerobics Trainer
James W. Davis and Aaron F. Bobick
MIT Media Lab Technical Report #436, 1997


Summary

A prototype system for implementing a virtual Personal Aerobics Trainer (PAT) is presented. Unlike workout video tapes or TV exercise shows, this system allows the user to create and personalize an aerobics session to meet the user's needs and desires. Various media technologies and computer vision algorithms are used to make the character more interactive by enabling it to watch and talk to the user (instead of just the user watching the TV).


Brief Synopsis

In this work we discuss the design and implementation of a prototype virtual Personal Aerobics Trainer (PAT). This system creates a personalized aerobics session for the user and displays the resulting instruction on a TV screen. The user chooses the moves (and how long to perform each), the music, and the instructor for the workout. Depending on the mood of the user, the choice of instructor could make a large difference in the workout. For instance, if the user were tired and required strong motivation during the workout, a brash Army Drill Sergeant would make a great instructor. Along those lines, the prototype system offers an Army Drill Sergeant character as the instructor. The session specified by the user is then generated automatically and begins when the user enters the area in front of the TV screen.
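The session choices described above (moves with durations, music, and an instructor) can be pictured as a small data structure. The following is only an illustrative sketch, not the system's actual representation: the names `Move`, `Session`, `duration_s`, and the music file name are all hypothetical, and durations are assumed to be in seconds.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Move:
    # Hypothetical record for one aerobic move in the session.
    name: str
    duration_s: int  # assumed unit: seconds


@dataclass
class Session:
    # Hypothetical container for the user's choices.
    instructor: str
    music: str
    moves: List[Move] = field(default_factory=list)

    def total_duration_s(self) -> int:
        # Total workout length is the sum of the move durations.
        return sum(m.duration_s for m in self.moves)


# Example session; the move names and music file are made up for illustration.
session = Session(
    instructor="Army Drill Sergeant",
    music="march.wav",
    moves=[Move("jumping jack", 60), Move("arm circles", 45)],
)
```

Calling `session.total_duration_s()` on this example gives the overall workout length in seconds.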

The TV display is composed of three windows. The leftmost window shows the instructor as he performs the moves. The top-right window gives feedback to the user. The bottom-right window acts as a mirror so that users can also watch themselves as they perform the moves.

In addition, the user periodically receives audio-visual feedback from the virtual instructor on how he/she is currently doing. To accomplish this, we place video cameras in the room and use real-time computer vision techniques to recognize the aerobic movements of the user. Using the output of the vision system, the instructor then responds accordingly (e.g. ``good job!'' if the vision system recognizes that the user is performing the aerobic move correctly). Unlike many other sensing technologies, this vision-based approach does not require the user to wear any special devices or be tethered to machines with bundles of wires. This makes the experience more natural and desirable.
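The feedback step described above can be sketched as a simple comparison between the move the instructor is currently leading and the move reported by the vision system. This is a minimal sketch under stated assumptions, not the system's actual logic: the function `choose_feedback`, the phrase lists, and the use of `None` for "no move recognized" are hypothetical, and the negative phrase is invented for illustration (only ``good job!'' comes from the text).

```python
import random
from typing import Optional

# Positive phrase taken from the text; the negative phrase is a made-up example.
POSITIVE = ["good job!"]
NEGATIVE = ["get moving!"]


def choose_feedback(expected_move: str,
                    recognized_move: Optional[str],
                    rng: random.Random = random.Random(0)) -> str:
    # Assumption: the vision system reports the name of the recognized move,
    # or None when no known move is recognized.
    if recognized_move == expected_move:
        return rng.choice(POSITIVE)
    return rng.choice(NEGATIVE)
```

For example, `choose_feedback("jumping jack", "jumping jack")` selects a positive phrase, while passing `None` as the recognized move selects a negative one.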

The underlying motivation for building such a system is that many forms of media that pretend to be interactive are in fact deaf, dumb, and blind. For example, many of the aerobics workout videos that one can buy or rent present an instructor who blindly doles out verbal reinforcement (e.g. ``very good!'') whether or not a person is doing the moves (or is even in the room!). It would be a substantial improvement if the TV simply knew whether or not a person was moving in front of it. A feeling of awareness would then be associated with the system. And because watching the same exercise videos grows repetitive, this ``programmable'' system sustains the user's interest by allowing the design of specialized workouts (e.g. exercising only the upper body).

This system moves beyond the highly un-interactive media forms of video tapes and TV shows by having the system watch the user (instead of just the user watching the TV). We feel that many future systems will be more interactive and less passive, and that such systems may one day be commonplace in the home.



Look at one cycle of a jumping jack (Quicktime). Loop this movie to see how multiple jumping jacks would appear in the system (the actual system achieves smoother motion at a better frame rate, though).


See and hear a positive feedback clip (Quicktime with audio). Check out another clip.

See and hear a negative feedback clip (Quicktime with audio). Check out another clip.



