Cross Modal Object Recognition is Viewpoint Independent
Final Project Report
Submitted by: Hemangini Parmar (Y8214)

Abstract

This report argues against previous claims that cross-modal object recognition is viewpoint dependent. The recognition accuracy of the participants did not change substantially when objects were presented from different viewpoints cross-modally, whereas accuracy dropped markedly in the within-modal case. However, the second hypothesis put forward in the parent work [1], that cross-modal object recognition is mediated by a higher-level representation, could not be confirmed in this study.

Methodology

The experiment was conducted on 20 female undergraduates residing in Girls Hostel 1 at IIT Kanpur. Each participant was shown four sets of four objects in conditions selected randomly from the sixteen possible (modality, rotation) combinations shown in Table 1 (a small illustrative sketch of this counterbalancing is given after the description of the experimental setup). The random selection ensured that the participants did not form a fixed sequence of the four objects in their minds. The rotation was 180 degrees clockwise about the stated axis.

Objects: The objects were approximately 3-5 cm x 2 cm x 2 cm in dimension and were made out of LEGO bricks. They were sixteen in number, one for each of the sixteen possible combinations, so that a participant's memory was not reinforced by repetition of objects. The objects were made very similar, to avoid giving any distinctive visual or haptic cues to the participants, though mirror images were avoided. The parent paper, by contrast, had used 48 objects, each made out of six component wooden blocks measuring 1.6 cm x 3.6 cm x 2.2 cm; the resulting objects were each 9.5 cm high, with the remaining dimensions varying with the arrangement of the constituent blocks. The objects used in the present experiment were labelled alphabetically (Fig 1) and, as in the parent paper, each object had a small dot defining its unrotated orientation. Pilot testing confirmed that this dot was not noticed by the participants, as its colour was dark enough to blend into the background colour of the object (black).

Fig 1: The sixteen objects used in the experiment

Table 1: The modality and rotation combinations in which the 16 objects were given to the participants (one object per cell)

Modality        Unrotated   Rotated about X   Rotated about Y   Rotated about Z
Visual-Visual
Visual-Haptic
Haptic-Visual
Haptic-Haptic

Experimental Setup: The participant sat on a stool placed in front of a table (Fig 2), with enough space between her eyes and the table to allow her to view the object from all sides in the visual mode. An additional setup (Fig 2b) was permanently in place (removed in Fig 2a to better show the visual setup); it was high enough to block the object completely from the participant's view, yet close enough to let her touch the object comfortably from all sides.
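The counterbalancing described in the Methodology above can be sketched in a few lines of code. This is a minimal illustration, not part of the original experiment: the condition labels mirror Table 1, and the object labels A-P are hypothetical stand-ins for the objects shown in Fig 1.

import itertools
import random

# The four modality pairings and four rotation conditions of Table 1.
MODALITIES = ["Visual-Visual", "Visual-Haptic", "Haptic-Visual", "Haptic-Haptic"]
ROTATIONS = ["Unrotated", "Rotated about X", "Rotated about Y", "Rotated about Z"]

def assign_conditions(object_labels, seed=None):
    """Pair each object with one (modality, rotation) cell, in random order,
    so that no participant sees the objects in a predictable sequence."""
    cells = list(itertools.product(MODALITIES, ROTATIONS))  # 4 x 4 = 16 cells
    if len(object_labels) != len(cells):
        raise ValueError("expected exactly one object per (modality, rotation) cell")
    rng = random.Random(seed)
    rng.shuffle(cells)
    return list(zip(object_labels, cells))

# Hypothetical object labels A-P standing in for the objects of Fig 1.
schedule = assign_conditions([chr(ord("A") + i) for i in range(16)])
for obj, (modality, rotation) in schedule[:4]:
    print(obj, modality, rotation)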
As in the parent paper, the visual mode prohibited the participant from touching the object or standing up and moving around it, although she was allowed to move her head to look at the object from all sides. In the haptic mode, the participant had to keep the orientation of the object unchanged, i.e. the way it was given to her, but was free to feel the object from all sides. The time limits were the same as in the parent paper: 15 seconds for visual observation and 30 seconds for haptic observation [1].

Method: The sixteen possible combinations (Table 1) were presented to the participants in random order, to avoid the formation of any pattern in their minds. The 'learning level' consisted of showing four objects, identified by numbers (1-4), either visually or haptically to the participant for the respective time durations per trial, making a total of four trials (4 x 4 objects). The 'recognition level' involved showing the same four objects in any four of the sixteen combinations and asking the participant to identify each object by its allotted number. In this way the sixteen objects were recognized across the sixteen possible combinations, and the observation accuracy was analysed (see Results).

Fig 2a: Visual setup used in the experiment
Fig 2b: Haptic setup used in the experiment

Results

The average observation accuracy (%) was plotted against modality for both the rotated and the unrotated cases (Fig 3). Accuracy was only marginally affected by rotation in the cross-modal case, while it decreased substantially in the within-modal case. These results agree with those of the parent paper, supporting the hypothesis that cross-modal object recognition is viewpoint independent.

Fig 3: Average observation accuracy (%) versus modality for both rotated and unrotated cases

As in the parent paper, the OSIQ questionnaire (Fig 4) was used to assess the mental representations formed by the participants while recognizing the objects. 'OSIQ consists of two scales: an object imagery scale assesses preferences for representing and processing colorful, pictorial, and high-resolution images of objects, and a spatial imagery scale assesses preferences for representing and processing schematic images, spatial relations amongst objects, and spatial transformations' [2]. The resulting scores were correlated with the average observation accuracy for the within-modal and cross-modal cases under both rotated and unrotated conditions, converted to a [0, 1] scale as shown in Fig 5. The correlation coefficient (r) and the probability of obtaining a correlation as large as the observed value by random chance when the true correlation is zero (p) are given in Table 2 below. Each dot in Fig 5 denotes (observation accuracy, imagery score) for one participant (six dots in total, for the six participants who were asked to fill in the questionnaire).
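As a rough sketch of this analysis step, the per-condition accuracies can be rescaled to [0, 1] and correlated with the OSIQ scores using Pearson's r and its two-tailed p-value, which is the (r, p) pair reported in Table 2. The numbers below are hypothetical placeholders, not the experimental data.

from statistics import mean
from scipy.stats import pearsonr

# Hypothetical per-participant data for one condition (e.g. cross-modal,
# unrotated): recognition accuracy (%) and the OSIQ spatial-imagery score.
accuracy_pct = [75, 100, 50, 75, 100, 75]        # six participants, placeholder values
spatial_scores = [3.2, 4.1, 2.8, 3.5, 4.4, 3.0]  # placeholder OSIQ scale means

# Rescale accuracy from percent to the [0, 1] range used in Fig 5.
accuracy = [a / 100 for a in accuracy_pct]

# Pearson correlation coefficient and two-tailed p-value, as in Table 2.
r, p = pearsonr(accuracy, spatial_scores)
print(f"mean accuracy = {mean(accuracy):.2f}, r = {r:.4f}, p = {p:.4f}")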
There is an acceptable correlation between the cross-modal unrotated case and the spatial imagery score (r = 0.5387), as expected, but there is a discrepancy in the correlation between the cross-modal rotated case and the spatial imagery score.

Table 2: Correlation coefficients (r) and probabilities (p) between imagery scores (SS: spatial, OS: object) and observation accuracy for the cross-modal (CM) and within-modal (WM) cases, under rotated (R) and unrotated (UR) conditions

             CM-R                 CM-UR               WM-R                 WM-UR
SS (r, p)    (-0.1161, 0.8042)    (0.5387, 0.2121)    (-0.3702, 0.4137)    (0.2929, 0.5238)
OS (r, p)    (-0.3112, 0.4970)    (0.3699, 0.4141)    (0.0320, 0.9458)     (0.6679, 0.1011)

Fig 4: OSIQ questionnaire used in the experiment (courtesy [2])
Fig 5a: Correlation between spatial imagery scores and average observation accuracy for each participant; four graphs covering both modalities (within, cross) and orientations (R, UR)
Fig 5b: Correlation between object imagery scores and average observation accuracy for each participant; four graphs covering both modalities (within, cross) and orientations (R, UR)

Discussion

This experiment was able to validate the hypothesis that cross-modal object recognition is viewpoint independent. However, no substantial evidence was collected in support of the second hypothesis, that cross-modal object recognition is mediated by a higher-level representation. As Table 2 shows, an acceptable correlation (r = 0.5387) was observed between the spatial imagery score and the cross-modal unrotated case, consistent with the possibility that cross-modal object recognition is mediated by an abstract, higher-level spatial representation.
However, the unexpected correlation coefficient (r = -0.1161) for the cross-modal rotated case is anomalous. Possible reasons for this discrepancy are the small number of participants who took the questionnaire (six), the abstractness of the questionnaire itself, and the sensitivity of the correlation statistic to noise. Functional neuroimaging studies have observed overlap between the brain regions involved in visual and haptic shape processing. However, the locus of the 'higher-level modality and viewpoint independent representation' [1] has not yet been located, leaving the second hypothesis unresolved.

Acknowledgement

This report would not have been possible without the kind support and help of many individuals, and I would like to extend my sincere thanks to all of them. First of all, I am thankful to Professor Amitabha Mukerjee, Department of Computer Science and Engineering, IIT Kanpur, for giving me the opportunity to gain this indispensable experience and for constantly encouraging and monitoring me at every step. This project would not have been feasible without his inspiration and support. I would also like to thank the staff at the CSE Laboratory for providing me with the LEGO bricks.

References

[1] Lacey S, Peters A, Sathian K (2007): Cross-Modal Object Recognition Is Viewpoint-Independent. PLoS ONE 2(9): e890. doi:10.1371/journal.pone.0000890
[2] Blajenkova O, Kozhevnikov M, Motes M (2006): Object-Spatial Imagery: A New Self-Report Imagery Questionnaire. Appl. Cognit. Psychol. 20: 239-263
[3] Riesenhuber M, Poggio T (1999): Hierarchical Models of Object Recognition in Cortex. Nature Neuroscience 2(11): 1019-1025
[4] Lacey S, Campbell C (2006): Mental Representation in Visual/Haptic Cross-Modal Memory: Evidence from Interference Effects. Q J Exp Psychol 59: 361-376