A stochastic model of human visual attention with a dynamic Bayesian network - Akisato Kimura, Derek Pang, Tatsuto Takeuchi, Kouji Miyazato, Kunio Kashino and Junji Yamato

Introduction and Motivation for the Problem

Humans have a very useful mechanism of visual attention which allows them to focus only on the areas of interest in the visual field. Simulating this in robots would be a significant step forward for some applications in computer and robot vision search. It would serve as a pre-selection mechanism, since it yields the areas which are likely to contain the objects of interest. Attention in humans is believed to be controlled by the following two mechanisms -

1) A reflexive visual focus based on the saliency attributes. "The saliency of an object is the state or quality by which it stands out relative to its neighbours. Saliency detection is considered to be a key attentional mechanism that enables organisms to focus their limited perceptual and cognitive resources on the most pertinent subset of the available sensory data." [1]

2) A voluntary choice of focus of attention in a task dependent manner, e.g. the gorilla video, where we focussed more on the players even though the gorilla was definitely salient.

Attention is generally simulated using one or a combination of these approaches. This paper deals with the problem of coming up with a suitable model for visual attention.

The Approach Used

According to a prevalent approach (feature integration theory), several primary visual features (e.g. colour, orientation) are first processed and then integrated into a saliency map.

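
The integration step of feature integration theory can be illustrated with a minimal sketch. It assumes each feature channel has already produced a 2D response map, rescales every map to [0, 1] and averages them pixel by pixel. The function names, the equal weighting and the min-max normalisation are assumptions made for illustration only, not the scheme used in the paper.

```python
def normalize(feature_map):
    """Rescale a 2D feature map (list of rows) to the range [0, 1]."""
    flat = [v for row in feature_map for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant map carries no contrast, so it contributes nothing.
        return [[0.0 for _ in row] for row in feature_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in feature_map]


def integrate(feature_maps):
    """Average the normalized feature maps pixel by pixel (equal weights assumed)."""
    normed = [normalize(m) for m in feature_maps]
    rows, cols = len(normed[0]), len(normed[0][0])
    return [
        [sum(m[r][c] for m in normed) / len(normed) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 2x3 responses of two feature channels.
colour = [[0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
orientation = [[0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
saliency = integrate([colour, orientation])
```

In this toy input the pixel that stands out in both channels receives the highest saliency, while the pixel that stands out in only one channel receives half of it.
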
Another claim (signal detection theory) is that the elements in the visual field are represented as independent random variables. This is supported by the observation that when told to search for an object inclined at 45 degrees, our eyes never wander to the distractor in an easy search, but they may do so in a hard search. A combination of these two ideas is used to obtain a stochastic saliency map where each pixel is a random variable. Also, to account for the task dependent nature of attention, the paper takes the eye movement patterns into account.

The Visual Attention Model

Since we want to simulate human attention, the only input to the model should be the various frames of the video. To take the intention into account, we also use a Hidden Markov Model layer to represent eye movement patterns. The flow diagram of the model is the following -

[Figure: flow diagram of the model]

The proposed model for visual attention comprises the following layers -

Saliency Map - The Itti-Koch saliency model is used to extract (deterministic) saliency maps. The implementation includes various feature channels sensitive to color contrast, temporal luminance flicker, luminance contrast, orientations etc. The map also gives more weight to the saliency around the central region of the video. The model used to generate the map is as follows -

[Figure: model used to generate the saliency map]

Stochastic Saliency Map - To generate this map, which is the one actually used for predicting eye movements, we take the saliency maps generated above, associate a probability function with them and also take the temporal changes into account.

Eye Movement Patterns - In this model, two possible states for eye movements are considered -
1) A passive state, where one tends to stay around one particular position to capture relevant information
2) An active state, where one moves focus around the scene.

Eye Focussing Density Maps - These maps, computed from the above data, represent the probability of eye movements through the video.

Conclusion

The problem of modelling visual attention is a challenging one and there are many ways to approach it. This paper proposes a new method to predict the likelihood of human attention on various regions by combining the saliency features with the eye movement patterns, and the results obtained are an improvement over the previous ones. The challenge is to further improve these and/or realize real time attention models which perform on real time videos instead of video frames.

References

[1] Wikipedia - Salience (Neuroscience)

The images included have been taken from the paper which is being summarized.

- Shubham Tulsiani
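
As a supplementary illustration of the Eye Movement Patterns layer described above, the passive and active states can be sketched as a two-state Markov chain. This covers only the state-transition part of a Hidden Markov Model; the part linking states to observed gaze positions is omitted. The transition probabilities and function names below are hypothetical values chosen for illustration and are not taken from the paper.

```python
# Hypothetical transition probabilities between the two eye movement states.
TRANSITION = {
    "passive": {"passive": 0.9, "active": 0.1},
    "active": {"passive": 0.4, "active": 0.6},
}


def step(belief):
    """Propagate a probability distribution over states by one video frame."""
    return {
        nxt: sum(belief[cur] * TRANSITION[cur][nxt] for cur in TRANSITION)
        for nxt in TRANSITION
    }


def propagate(belief, frames):
    """Apply the one-frame update repeatedly for the given number of frames."""
    for _ in range(frames):
        belief = step(belief)
    return belief


# Start in the passive state with certainty and look two frames ahead.
start = {"passive": 1.0, "active": 0.0}
after_two = propagate(start, 2)
```

With these assumed numbers, a viewer who starts in the passive state is still most likely to be passive two frames later, which matches the intuition that fixations tend to persist.
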