Paper Review: Estimating Scene Typicality from Human Ratings and Image Features
Krista A. Ehinger, Jianxiong Xiao, Antonio Torralba, Aude Oliva
Introduction to Cognitive Sciences (SE 367)
Bhuwan Dhingra (Y8167)

Introduction - The paper under review, [1], studies prototype theory as applied to natural scenes. Scenes are visual entities that characterize a given place or location, such as an airport, a living room, or a forest. Prototype theory claims that such scenes are represented in humans in the form of prototypes. The goals of this study were, first, to determine the prototypical examples of some well-defined scene categories, and second, to see whether these "typical" examples are classified more accurately by machine learning algorithms.

Prototype Theory - Most of the cognitive psychology community today agrees on the prototype theory of representation, according to which a category is represented in the human brain by a prototypical element of that category. This idea stems from the observation that some instances of a class are more typical of it than others. For example, while both crows and penguins are birds, most of us would agree that a crow is a much more typical example of a bird than a penguin. Similarly, if a person is asked to imagine a tree, instead of thinking of a particular tree they usually think of a central tendency, a prototypical tree.
It is not necessary that the person has actually encountered this central tendency; it is usually formed by a process of abstracting the essentials of a category from the instances of it that have been encountered. Furthermore, the response time for recognizing a category has been shown to be shorter for more "typical" examples of the category [2]. (Figure obtained from http://www.jephelan.com/coglab/class2.pdf.) Prototype theory was proposed in the 1970s by Eleanor Rosch and others [2]. It was quite a deviation from the existing theories of the time, most of which defined categories in terms of properties that an object must satisfy in order to belong. Prototype theory is not limited to objects: scenes (such as an airport scene), scripts (the prototype of an alphabet), and concepts in general (the concept of a game) are also represented by prototypes in the brain.

Scene Categories - Like objects, scenes can be classified into different semantic and functional groups/categories, and some scenes are better examples of their category than others [1]. Prototype theory was first extended to scenes by Tversky and Hemenway (1983) [8], who argued that scenes have a categorical structure just as objects do. In the present work, a database of 706 scene categories, each containing at least 22 images, was formed from the Scene UNderstanding (SUN) database [3].
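As an illustration, the filtering step that keeps only categories with at least 22 images might look like the sketch below. The data structure and function name are hypothetical, not the authors' actual pipeline.

```python
# Sketch of the dataset-filtering step: keep only scene categories
# that contain at least a minimum number of images.
# (Hypothetical data layout; not the authors' actual code.)

def filter_categories(category_images, min_images=22):
    """category_images: {category_name: [image ids]}.
    Returns the subset of categories with >= min_images images."""
    return {cat: imgs for cat, imgs in category_images.items()
            if len(imgs) >= min_images}

# Toy example with three categories.
db = {
    "airport":     [f"air_{i}" for i in range(30)],
    "living_room": [f"lr_{i}" for i in range(25)],
    "moon_base":   [f"mb_{i}" for i in range(3)],   # too few images
}
kept = filter_categories(db)
print(sorted(kept))  # ['airport', 'living_room']
```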
To obtain a measure of the typicality of these images, an online experiment was conducted on Amazon's Mechanical Turk, where workers are paid to perform small tasks. Workers were shown a randomly selected subset of the images belonging to a category and asked to select the most typical image. In this way each image was assigned a typicality rating out of 5, based on the number of times it was selected as the most typical image. Several measures were taken to ensure the validity of the tests, including dropping the data obtained from workers with ambiguous choices. (Figure: the most typical examples of some of the larger categories, obtained from [1].) Even if the images were selected at random, some would be rated more typical than others simply by chance. Hence a good measure of the reliability of the experiment is to compare the ratings obtained above to those that would have been obtained had the images been selected at random. (Figure: typicality rating of the most typical image in a category vs. number of images in the category, obtained from [1].) The graph clearly shows that the ratings obtained in the experiment are in general much higher than those produced by random selection. This indicates that the participants were following some sort of pattern while rating, such as matching the images against their prototype representations [1]. Hence the experiment was able to extract the most typical images of most categories.

Scene Classification using Global Features - Classification of scenes into different semantic categories is a well-researched problem in computer vision. Xiao et al.
proposed an all-features kernel for scene classification [3], which combines some of the more popular approaches:
- the GIST descriptor, Oliva and Torralba [4]
- dense SIFT features, Lazebnik et al. [5]
- HOG features, Dalal and Triggs [6]
- the self-similarity descriptor (SSIM), Shechtman and Irani [7]
The all-features classifier proposed by Xiao et al. uses the above features and many more (e.g. color histograms, and representations of texture and scene regions) to perform classification. Only the 397 categories containing at least 100 images were used for training and testing, to ensure that the results were reliable. The all-features classifier had an overall accuracy of 38% on this dataset [1]. The next step of the experiment was to study whether there is any correlation between the typicality scores obtained from the human ratings above and the classification performed by the classifiers.

Results - To study whether more typical images are classified better, the images were divided into four quartiles by typicality score. A one-versus-all Support Vector Machine (SVM) was used in [1]. (Figure: classification accuracy for the four quartile groups, obtained from [1].) The most typical images were classified with an accuracy of about 50%, while the least typical images had an accuracy of only about 23%. Another measure of the ease with which classification is done is the confidence of a decision.
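The quartile analysis described above can be sketched in a few lines: bin the test images into four groups by typicality score, then compute per-group accuracy. This is an illustrative reconstruction with toy data, not the authors' code or results.

```python
# Sketch of the quartile analysis: sort images by typicality score,
# split into four equal-sized bins, and report accuracy per bin.
# (Toy data below; not results from the paper.)

def quartile_accuracy(typicality, correct):
    """typicality: list of scores; correct: parallel list of booleans.
    Returns classification accuracy for each quartile, lowest first."""
    order = sorted(range(len(typicality)), key=lambda i: typicality[i])
    n = len(order)
    accuracies = []
    for q in range(4):
        idx = order[q * n // 4:(q + 1) * n // 4]
        accuracies.append(sum(correct[i] for i in idx) / len(idx))
    return accuracies

# Toy data: more typical images are classified correctly more often.
scores  = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
correct = [False, False, False, True, False, True, True, True]
print(quartile_accuracy(scores, correct))  # [0.0, 0.5, 0.5, 1.0]
```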
A confidence near 1 indicates that the classifier believes the image matches its assigned category well, whereas a confidence near -1 indicates the opposite. (Figure: confidence versus typicality for correct and incorrect decisions, obtained from [1].) While the confidence score of correct decisions increases with increasing typicality, it remains more or less constant for incorrect decisions.

Conclusions - The first part of the experiment shows that, given a set of images belonging to the same semantic category, different people tend to choose the same image as the most typical one. The obvious explanation is that they follow a similar procedure in identifying the typical image, which we can hypothesize to be a subconscious matching against a prototype representation in the brain. Hence prototype theory, which is well established in the domains of objects, faces, and abstract patterns, can be extended to natural scenes as well. It is also possible to extract the more typical images from a dataset through such a human-rating process. The second part of the experiment shows that the images rated more typical by human observers are more likely to be classified correctly by computer vision algorithms using global image features.
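The confidence comparison mentioned above can be sketched as follows: given each decision's signed confidence (roughly in [-1, 1], as for an SVM decision value) and whether it was correct, compare the mean confidence of the two groups. The function name and data are illustrative assumptions, not the paper's code.

```python
# Sketch: split decisions into correct and incorrect groups and compare
# their mean signed confidence values. (Toy data, not the paper's results.)

def mean_confidence_by_outcome(confidences, correct):
    """Return (mean confidence of correct decisions,
               mean confidence of incorrect decisions)."""
    right = [c for c, ok in zip(confidences, correct) if ok]
    wrong = [c for c, ok in zip(confidences, correct) if not ok]
    mean = lambda xs: sum(xs) / len(xs) if xs else float("nan")
    return mean(right), mean(wrong)

# Toy data mirroring the qualitative pattern reported in [1]:
# correct decisions tend to have positive confidence, incorrect ones negative.
conf    = [0.9, 0.7, 0.4, -0.2, -0.3, -0.1]
correct = [True, True, True, False, False, False]
print(mean_confidence_by_outcome(conf, correct))
```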
While it is difficult to conclude that human observers use these very features to identify typicality, the results indicate that more typical scenes contain more of the visual features typical of their category.

References -
[1] Ehinger, K. A., Xiao, J., Torralba, A., & Oliva, A. (2011). Estimating scene typicality from human ratings and image features. CogSci, 2011.
[2] Rosch, E. (1973). Natural categories. Cognitive Psychology, 4, 328-350.
[3] Xiao, J., Hays, J., Ehinger, K., Oliva, A., & Torralba, A. (2010). SUN database: Large-scale scene recognition from abbey to zoo. CVPR, 2010.
[4] Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision, 42, 145-175.
[5] Lazebnik, S., Schmid, C., & Ponce, J. (2006). Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2169-2178.
[6] Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, 886-893.
[7] Shechtman, E., & Irani, M. (2007). Matching local self-similarities across images and videos. In Proc. IEEE Conf. Computer Vision and Pattern Recognition.
[8] Tversky, B., & Hemenway, K. (1983). Categories of environmental scenes. Cognitive Psychology, 15, 121-149.

Internet Sources -
http://www.pigeon.psy.tufts.edu/avc/huber/prototype.htm
http://en.wikipedia.org/wiki/Prototype_theory