Specific Research Methodologies
Here, we focus on the former and attempt to define the complexity of the environment in terms of the relationship between visual motion cues and self motor commands, using our soccer-playing robots.
Self body and Static Environment: The self body and the static environment can be defined as the observable parts whose changes in the image plane can be directly correlated with the self motor commands (e.g., looking at your hand while moving it voluntarily, or observing the optical flow of the environment when changing your gaze). Theoretically, discriminating between the "self body" and the "static environment" is a hard problem, because the definition of "static" is relative and depends on the choice of the base coordinate system, which in turn depends on the context of the given task. Usually, we assume the natural orientation of gravity, which provides the ground coordinate system.
Passive agents: Passive agents move or stop as a result of the actions of the self or of other agents. A ball is a typical example. As long as they are stationary, they can be categorized as part of the static environment. When they are in motion, however, their correlation with the motor commands is not as simple as that of the self body or the static environment.
Active (other) agents: Active other agents do not have a simple and straightforward relationship with the self motions. In the early stage, they are treated as noise or disturbance because they have no direct visual correlation with the self motor commands. Later, they can be found to have more complicated, higher-level correlations (coordination, competition, and others). The complexity increases drastically.
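The three categories above are distinguished by how strongly the image motion of a region correlates with the self motor commands. The following is a minimal sketch of that idea, not the method used on the robots: it assumes each region's image motion and the motor command are available as one-dimensional time series, and the function names, the Pearson correlation measure, and the threshold of 0.8 are all hypothetical choices made for illustration.

```python
import numpy as np


def motion_command_correlation(flow, command):
    """Absolute Pearson correlation between a region's image motion
    and the self motor command (both 1-D time series)."""
    flow = np.asarray(flow, dtype=float)
    command = np.asarray(command, dtype=float)
    # A constant signal has no defined correlation; treat it as uncorrelated.
    if flow.std() == 0.0 or command.std() == 0.0:
        return 0.0
    return abs(float(np.corrcoef(flow, command)[0, 1]))


def categorize_region(flow, command, corr_threshold=0.8):
    """Coarse split (hypothetical threshold): motion explained by the
    motor command versus motion that is not."""
    if motion_command_correlation(flow, command) >= corr_threshold:
        return "self body or static environment"
    return "passive or active agent"


# Synthetic data, assumed for illustration only.
t = np.linspace(0.0, 2.0 * np.pi, 200)
command = np.sin(t)                      # e.g. a gaze-turning command
background_flow = -0.9 * command         # flow induced purely by self motion
ball_flow = background_flow + 2.0 * np.cos(3.0 * t)  # plus the ball's own motion

print(categorize_region(background_flow, command))
print(categorize_region(ball_flow, command))
```

In this sketch a stationary ball would produce the same flow as the background and so fall into the first category, which matches the remark that stationary passive agents can be treated as static environment. Separating passive from active agents would need more than a single correlation value and is left out here.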
As the complexity of the environment increases, the internal structure of the robot should become higher-level and more complex so that various intelligent behaviors can emerge. We show, with real robot experiments, one such structure that copes with the complexity of agent-environment interactions, and we discuss future issues.
DIRECT activity recognition (after Gibson) is an example of the integration of bottom-up and top-down strategies. In this paradigm, measurements are intertwined with the recognition task so that delineation among visual processes is non-existent.
In my recent research on motion estimation, a framework for learning and estimating temporal models of motion has proven effective in dealing with complex problems such as leg and arm tracking under self-occlusion and under variations in the execution, performer, and viewpoint of activities.
The framework consists of a learning stage in which appearance motion trajectories are computed and then converted into a representation that can be used for direct activity recognition in image sequences.
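As a rough illustration of learning temporal models from trajectories and then matching new observations against them, the sketch below averages time-normalized example trajectories into one template per activity and recognizes a new trajectory by its nearest template. This is an assumed stand-in, not the framework described above: the scalar trajectories, the activity labels, the resampling length, and the function names are all hypothetical.

```python
import numpy as np


def resample(trajectory, length=33):
    """Linearly resample a 1-D trajectory to a fixed number of samples,
    which removes differences in execution speed."""
    trajectory = np.asarray(trajectory, dtype=float)
    src = np.linspace(0.0, 1.0, len(trajectory))
    dst = np.linspace(0.0, 1.0, length)
    return np.interp(dst, src, trajectory)


def learn_templates(examples, length=33):
    """Learning stage: average the resampled examples of each activity."""
    return {
        label: np.mean([resample(t, length) for t in trajectories], axis=0)
        for label, trajectories in examples.items()
    }


def recognize(trajectory, templates):
    """Recognition: label of the template closest to the whole trajectory."""
    length = len(next(iter(templates.values())))
    query = resample(trajectory, length)
    return min(templates, key=lambda label: np.linalg.norm(query - templates[label]))


# Hypothetical training data: a limb coordinate that rises or falls,
# performed at two different speeds (20 and 40 frames).
examples = {
    "raise": [np.linspace(0.0, 1.0, 20), np.linspace(0.0, 1.0, 40)],
    "lower": [np.linspace(1.0, 0.0, 20), np.linspace(1.0, 0.0, 40)],
}
templates = learn_templates(examples)

print(recognize(np.linspace(0.0, 1.0, 27), templates))
# At the middle frame both templates have the same value, so a single
# instantaneous measurement cannot tell the two activities apart.
print(np.isclose(templates["raise"][16], templates["lower"][16]))
```

The last line shows why a whole trajectory can be easier to interpret than a single frame: the two templates coincide at the midpoint yet are clearly different as temporal patterns.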
In the meeting I will discuss the following issues:
(1) Are instantaneous measurements sufficient for recognition?
I will propose that a temporal framework is far more appropriate and economical, since the ambiguity of instantaneous measurements calls into question the feasibility of effective recognition.
(2) What is DIRECT activity recognition? What are the pros and cons?
DIRECT activity recognition is an approach by which changes in image sequences are immediately interpreted using a priori learned activities.
(3) Activity invariants: what are they?
Defining activity invariants under spatio-temporal transformations is critical to recognition. What are the spatial and temporal "fingerprints" of activities?
(4) How do spatial and temporal context affect activity recognition?
Context, both spatial and temporal, can be detrimental to the interpretation of activities. How can context be represented and used?