Learning Grammatical Gender in an Artificial Language Based on Hindi

S. S. Roy Burman

A course project for SE367: Introduction to Cognitive Science

A principal component of our comprehension of a language and our ability to use it is the knowledge of the grammatical categorisations in the language. Grammatical gender, found in many languages, is one such categorisation, traditionally thought to be rather arbitrarily defined. This study follows the methodology of a previous English language-based study to investigate the contribution of distributional and phonological cues to the acquisition of grammatical gender. The participants were taught an artificial language composed of pronounceable Hindi pseudo-words with gender-like classes using a bimodal learning technique. The study demonstrates that the learning of the gender-like classes was influenced by distributional and phonological regularities. This is in accordance with the earlier study, indicating that similar cues for gender categorisation may exist across different languages.

Introduction

Certain distinctions in a language, such as the noun-verb distinction, are fairly easy for a user of the language to acquire. However, grammatical gender has traditionally been thought to be more difficult to acquire. This is especially true for learners whose native language does not have gender categorisation of nouns. So the question arises: how does one recognise and remember the gender of such diverse nouns?

Traditionally thought to be an arbitrary categorisation in most languages, grammatical gender acquisition remains open to debate. Recent evidence suggests that in certain European languages, gender marking of nouns may not be as arbitrary as was once believed. One of the most obvious sources of gender information is the natural gender of a noun, but not all nouns have natural genders. Also, certain semantic categories seem to be linked to a particular gender.
For example, 70% of the nouns depicting alcoholic drinks are masculine in German (Corbett, 1991).

Further, distributional cues such as co-occurrence with gender-marked articles may be an important source of grammatical category information (Dahan, Swingley, Tanenhaus, & Magnuson, 2000). For example, in French, le ballon (the ball) is distinguished in gender from la chemise (the shirt) by the gender-marked definite article. Associating the noun with the article might help a French speaker remember the gender of the noun.

Another interesting cue which has been suggested to help in grammatical gender acquisition is phonological similarity (Brooks, Braine, Catalano, & Brody, 1993). Similar-sounding words usually have the same gender in a language. In German, the suffix -lein imparts neuter gender to the noun, even overriding the natural gender of the referent.

A study to evaluate the grammatical cues and their contribution to gender acquisition was done by Mirkovic, Forrest & Gaskell (2011). They constructed an artificial language containing certain semantic regularities and associated grammatical gender-like categories with the nouns. They used two determiners as distributional cues, whose co-occurrence with the nouns was taught and tested. The nouns themselves had one of four suffixes, two for each gender. This formed the basis of the phonological cues. The cues were provided probabilistically, and the results indicated that such probabilistic and regular cues helped the participants learn the artificial language and the associated gender-like categories. The methods and results of that study are discussed in detail throughout this article. The present study follows similar lines, using Hindi pseudo-words instead of the English pseudo-words used in the original study.
Experiment Design

The basic experiment was designed on the framework adopted by Mirkovic, Forrest & Gaskell, with certain variations to account for its use with a Hindi-based language. The language was constructed using pronounceable Hindi pseudo-words which held no meaning in Hindi, as checked in Google Translate. Two verbs were used per gender-like class; the classes have been called masculine and feminine for the sake of simplicity. The verbs were two-syllable words with the masculine verbs ending in maintain phonological regularity. The phonological cues were provided by ending the feminine nouns in or ो , whereas the masculine nouns had no specific ending. The nouns were either two- or three-syllable words. To maintain a Hindi-like grammatical construct, ‘[Noun] [verb] !’ was used. Every noun was paired with one of the two verbs marking its gender, and this association was retained throughout the study.

The participants learned the artificial language using a cross-modal learning technique developed by Breitenstein et al. (2007). At different stages of learning the artificial language, the participants were asked to select the verb form which matched the noun. The participants could choose either of the two masculine verb forms or either of the two feminine verb forms. This assessment tested the acquisition of gender through association with a gender-marked verb.

A second test to assess the acquisition of gender was done using a generalisation set. This set had four consistent noun-verb pairs and four inconsistent ones. The accuracy of verb selection for consistent versus inconsistent pairs was used to assess the contribution of phonological and distributional cues to gender acquisition.

Except for the Hindi-specific construct used, the rest of the experimental design was similar to the one used by Mirkovic, Forrest & Gaskell. Also, instead of one determiner per class as they used, I increased the complexity of memorisation by introducing two similar-sounding verbs per class.
However, verb selection only assessed the acquisition of the correct gender, not the exact verb form.

44 everyday objects, animals and persons were selected as the training set. The generalisation set consisted of 8 other nouns. The complete list of words may be found here.

Gender Verb forms Noun ending to “Masculine” (none) “Feminine” व ो
Table 1: Properties of the two gender-like classes

Fig. 1: Examples of the picture stimuli

Method

Participants

Five employees of the mess of Hall of Residence 1, IIT Kanpur were the participants of the study. Three of them finished all three days of the study, and the results reflect their performance. They were not given any monetary incentive to perform. All three were native Hindi speakers from Uttar Pradesh, and the medium of instruction during their schooling was Hindi.

Stimuli

The training set comprised 44 items, divided into the two categories. They represented commonly found objects, animals and persons. Animals were divided equally between the two genders. Natural gender was kept consistent wherever applicable. All pictures were selected from the Microsoft Office online database. Unambiguous images with minimal background were chosen. The feminine nouns ended in or ो , whereas the masculine nouns had no specific ending. They were paired with their respective gender-marked verbs as given in Table 1. Two examples of the nouns used are given in Fig. 1.

The generalisation set consisted of eight additional items, four for each gender. Two items in each gender had noun-verb associations consistent with those in the training set. The other two had inconsistent noun-verb pairings. A correct pairing would be like: ! whereas an inconsistent pairing would be like: ओव व !.

Word-Picture Matching

Word-picture matching tasks consisted of putting a tick against the question number if the participant thought that the sentence played on the audio corresponded to the picture stimulus shown, and a cross otherwise.
Every auditory stimulus was played six times in a day, four times paired with the correct image and twice with an incorrect image. The participants were informed of this ratio before starting the task. Full sentences were used for training. There was no time constraint, and the participants were given no feedback on their performance during or after the assessment. This was the bimodal learning paradigm adapted by Mirkovic, Forrest & Gaskell, based on the cross-modal learning technique developed by Breitenstein et al. The task was also performed for the generalisation set at the end of the experiment.

Verb Selection

In the verb selection task, the participants were asked to tick which verb group a noun belonged to ( / versus व / ). The stimulus was only auditory, with the participants hearing the noun only. Verb selection was done only once per noun. A verb selection task for the generalisation set was done at the end as well.

Schedule of Tasks

The whole experiment was conducted over three days. The primary objective was for the participants to learn the language well; the finer aspects of their learning were then assessed. The verb selection task was conducted differently from the way it was conducted by Mirkovic, Forrest & Gaskell. The reference point was chosen to be a partially trained state rather than a pre-training state, as it was felt that participants could not understand the task before training, not yet having been introduced to the language. A partially trained state therefore offered a more meaningful baseline for comparison.
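The presentation scheme described under Word-Picture Matching (each sentence heard six times a day, four times with its own picture and twice with another) can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used to prepare the stimuli: the item labels, the function name `build_daily_trials`, the random choice of mismatching picture and the shuffling are all assumptions.

```python
import random


def build_daily_trials(items, n_match=4, n_mismatch=2, seed=0):
    """Return a shuffled list of (audio_item, picture_item, is_match) trials.

    Hypothetical helper: the 4:2 ratio comes from the text, everything
    else (random mismatch picture, shuffling, seed) is assumed.
    """
    rng = random.Random(seed)
    trials = []
    for item in items:
        # Matched presentations: the sentence is played with its own picture.
        trials.extend((item, item, True) for _ in range(n_match))
        # Mismatched presentations: the picture is drawn from the other items.
        others = [other for other in items if other != item]
        trials.extend((item, rng.choice(others), False) for _ in range(n_mismatch))
    rng.shuffle(trials)
    return trials


# Placeholder labels standing in for the 44 training nouns (hypothetical names).
training_items = [f"item_{i:02d}" for i in range(1, 45)]
trials = build_daily_trials(training_items)

print(len(trials))                                   # 264 trials per day
print(sum(is_match for _, _, is_match in trials))    # 176 matched trials
```

Under these assumptions a single day of training amounts to 264 sentence-picture trials, two thirds of which are correct pairings.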
Day   Task
1     Word-picture Matching
      Verb Selection (Post partial training)
2     Word-picture Matching
3     Word-picture Matching
      Verb Selection (Post training)
      Word-picture Matching (on Generalisation set)
      Verb Selection (on Generalisation set)
Table 2: Schedule of Tasks

Results and Discussion

The experiment was started with five participants, but only three completed all tasks; the data presented here reflect the performance of those three only. Because the participants had been told that correct pairings were twice as numerous as incorrect ones, the Day 1 error was not expected to be around 50%: a participant who simply ticked every item would be assured of 66.67% accuracy. As the word-picture matching task progressed, the accuracy of the participants' guesses increased. On Day 3, the mean accuracy was 95.71%, up from 78.85% on Day 1. This indicated that, through the probabilistic association of the auditory and the visual stimuli over a period of three days, the participants had picked up the word-picture associations. Semantic annotation of the newly acquired words was checked informally by asking the participants for the Hindi synonym of each word. The assessment was qualitative, and the participants got almost all words correct, with minor pronunciation mistakes.

Figure 2: Performance on Word-Picture Matching Task on the Training Set (incorrect answers in %, Day 1 to Day 3)

Figure 3: Performance on Verb Selection Task on the Training Set (incorrect answers in %, post partial training versus post training)

The verb selection task yielded results as expected. In two of the three participants, the error in associating the noun with the correct verb group halved after two further rounds of training, while in one participant the figure remained the same. This indicates that, post training, distributional cues are easier to recognise.
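The always-tick baseline mentioned in the word-picture matching results above follows directly from the 4:2 presentation ratio, and the reported accuracies can be converted to error rates (the quantity plotted as incorrect answers in %). The short Python sketch below works through that arithmetic; the function and variable names are illustrative assumptions, and only the ratio and the two mean accuracies are taken from the text.

```python
from fractions import Fraction


def always_tick_accuracy(n_match=4, n_mismatch=2):
    """Accuracy of a participant who ticks every trial (hypothetical helper)."""
    return Fraction(n_match, n_match + n_mismatch)


baseline = always_tick_accuracy()
print(f"{float(baseline) * 100:.2f}%")   # 66.67%

# Mean accuracies reported in the text, in percent.
day1_accuracy = 78.85
day3_accuracy = 95.71

# Corresponding error rates, in percent.
print(f"{100 - day1_accuracy:.2f}")      # 21.15
print(f"{100 - day3_accuracy:.2f}")      # 4.29
```

On these figures the Day 1 accuracy already sits above what uniform ticking would give, and the error falls from about 21% to about 4% by Day 3.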
As the baseline has been taken to be a partially trained state, the baseline error rate is not very high. This indicates that, even without knowing that verb selection would be a task and hence without preparing for it, the participants learned the gender-marked verbs.

The next set of tests, done on the generalisation set, yielded interesting results. There was no clear difference between the word-picture matching accuracy for the consistent and the inconsistent items in the set as reported by Mirkovic, Forrest & Gaskell. However, the importance of phonological cues could be assessed from the fact that the error rate of verb selection for consistent items was 25%, whereas that for inconsistent items was 66.67%. This indicates that the participants had learned to recognise phonological cues that would hint at the gender of the noun.

Figure 4: Performance of Verb Selection Task on the Generalisation set (consistent versus inconsistent items).

The results suggest that the participants began to relate the items to two classes based on the distributional cues (as assessed by the verb selection done on the training set) as well as on the phonological cues (as assessed by the verb selection done on the generalisation set). Hence, participants look for regularities in the grammatical structure to arrive at such a gender-based distinction.

Also, the recognition of the cues may not be an explicit task. When the aim of the experiment and all the cues were disclosed to the participants after the experiment, they could not relate their performance on the generalisation set to the phonological cues. This indicates that some of the processes involved in recognising grammatical gender may be implicit.

All in all, the findings support the work of Mirkovic, Forrest & Gaskell and indicate that certain consistencies may influence the acquisition of gender in a language, even in non-European languages such as Hindi.

Acknowledgement

The author is grateful to Dr.
Achla Raina, Professor, Department of Humanities and Social Sciences, IIT Kanpur, for the advice she gave on modifying the original experiment.

References

Breitenstein, C., Zwitserlood, P., Vries, M. de, Feldhues, C., Knecht, S., & Dobel, C. (2007). Five days versus a lifetime: Intense associative vocabulary training generates lexically integrated words. Restorative Neurology and Neuroscience, 25, 493-500.

Brooks, P. J., Braine, M. D. S., Catalano, L., & Brody, R. E. (1993). Acquisition of gender-like noun subclasses in an artificial language: the contribution of phonological markers to learning. Journal of Memory and Language, 32, 76-95.

Corbett, G. G. (1991). Gender. Cambridge, UK: Cambridge University Press.

Dahan, D., Swingley, D., Tanenhaus, M. K., & Magnuson, J. S. (2000). Linguistic gender and spoken-word recognition in French. Journal of Memory and Language, 42, 465-480.

Mirkovic, J., Forrest, S., & Gaskell, M. G. (2011). Semantic Regularities in Grammatical Categories: Learning Grammatical Gender in an Artificial Language. Proceedings of the 33rd Annual Conference of the Cognitive Science Society.

