SE367 Project Proposal

Vidur Kumar (Y8560)

"Development of structure in artificial communication"

Introduction and Importance :

Languages essentially consist of a finite number of basic units which, when combined in particular ways, yield a practically unbounded set of complex units that convey meaning. Languages develop to facilitate communication; the combinatorial (structural) aspect follows, both to support organized recall of complex language units (making the language easier to learn) and to make it easier to associate those units with their semantic content. [Ref 2]

Thus an instance (sentence) of the language not only communicates meaning, but also conveys information about the language's own structure.

The development of language structure has been a focus of research for many years, with the aim of understanding the factors that drive the emergence of structure and the extent to which the growth of language systems can be modeled.

The original hypothesis in the field (Hockett, 1960) stated that combinatorial structure developed as a consequence of the growing semantic space that a language has to cover. However, it has been shown [Ref 1] that combinatorial structure emerges even without any requirement of semantic communication, simply by virtue of iterative learning.

The proposed project aims to study the effect of imposing semantic content on an artificial language system, by observing the development of combinatorial structure under iterative learning.

Hypothesis :

Associating semantic content with the artificial language will facilitate better recall and make the language easier to learn.

Methodology :

Generating an artificial language: the language would be symbolic (written) in nature and composed of completely synthesized designs (possible example below). A set of 20-30 such symbols would be created for the experiments.

[Alternatively, the artificial language could be a phonological one, involving clicks and beat-boxing sounds.]

Semantic content to be associated: the semantic content would take the form of shapes, coloured or plain (for example, square or circle), and actions between these shapes (for example, A touching B or A going through B).

Experiment 1 :

{Non-semantic iterative learning task}

Subjects would be separated into sets of 5 or 6 (each representing a chain of generations for the iterative learning task).

The first member of each set would be shown each of a set of 20 symbols for 2 seconds and asked to reproduce it. This is the learning phase, which is repeated 2 times.
After every learning phase, the participant would be asked to reproduce all 20 symbols without being shown any of them. This is the recall phase.
The results of the last recall phase of one participant would form the set used for the learning phase of the next participant, and so on.
This is repeated for every 'chain-set' of subjects.

Analysis :

The difference between the characteristics of the input symbols and of the output symbols of each participant would indicate the ease of recall and the learnability of the language (a smaller difference indicating easier recall).
Commonality in the characteristics of the output symbols of each generation would indicate the emergence of combinatorial structure in the language through the iterative learning process.
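
One possible way to quantify the input/output difference is sketched below. It is only an illustration under stated assumptions: each drawn symbol is assumed to have been coded by hand as a string of stroke labels (a hypothetical coding scheme, not part of the proposal), each output symbol is assumed to have already been matched to the input symbol it reproduces, and the measure used is a normalized edit distance, which is one choice among several.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def transmission_error(inputs, outputs):
    """Mean normalized edit distance between paired input/output symbols.

    Assumes inputs[k] and outputs[k] are stroke-code strings for the same
    symbol (the pairing itself is an assumption of this sketch).
    """
    if len(inputs) != len(outputs) or not inputs:
        raise ValueError("need two non-empty lists of equal length")
    total = 0.0
    for seen, drawn in zip(inputs, outputs):
        longest = max(len(seen), len(drawn))
        total += levenshtein(seen, drawn) / longest if longest else 0.0
    return total / len(inputs)
```

A value of 0 would mean that every symbol was reproduced exactly, and values closer to 1 would mean that the output shares little with the input. Computing this per participant gives one error value per generation of a chain.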

Experiment 2 :

{Semantic iterative learning task}

Subjects would be separated into sets of 5 or 6 (each representing a chain of generations for the iterative learning task).

3 symbols will be associated with 'shapes', 3 with 'colours' and 3 with specific 'actions', giving a sample set of 3 x 3 x 3 = 27 complex symbol units, each associated distinctly with an animation of two objects in some action. The order of the symbols associated with every animation will be randomized.
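
The construction of the 27 units can be sketched as follows. This is a minimal illustration, not the final stimulus set: the third shape, all colour names, the third action and the placeholder symbol names S1 to S9 are assumptions made only for the example.

```python
import itertools
import random

# Hypothetical labels; only "square", "circle", "touching" and
# "going through" come from the examples in the proposal.
SHAPES = ["square", "circle", "triangle"]
COLOURS = ["red", "green", "blue"]
ACTIONS = ["touching", "going through", "circling"]


def build_lexicon(seed=0):
    """Map each of the 27 animations to a randomly ordered symbol triple."""
    rng = random.Random(seed)
    symbols = [f"S{i}" for i in range(1, 10)]  # placeholders for 9 drawn symbols
    rng.shuffle(symbols)
    shape_symbol = dict(zip(SHAPES, symbols[0:3]))
    colour_symbol = dict(zip(COLOURS, symbols[3:6]))
    action_symbol = dict(zip(ACTIONS, symbols[6:9]))

    lexicon = {}
    for shape, colour, action in itertools.product(SHAPES, COLOURS, ACTIONS):
        parts = [shape_symbol[shape], colour_symbol[colour], action_symbol[action]]
        rng.shuffle(parts)  # symbol order is randomized for every animation
        lexicon[(shape, colour, action)] = tuple(parts)
    return lexicon
```

Because every animation uses a different combination of three symbols, the 27 units stay distinct even though their internal order is shuffled.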

Sets of 10 animations would be chosen from the above, such that each set includes every shape, every colour and every action at least twice. There would be 6 or 7 such test sets (A to F/G).
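
One simple way to draw such test sets is rejection sampling: pick 10 animations at random and keep the set only if the coverage condition holds. The sketch below assumes hypothetical feature labels and is only one of several ways the sets could be built.

```python
import itertools
import random
from collections import Counter

# Hypothetical labels, used only for illustration.
SHAPES = ["square", "circle", "triangle"]
COLOURS = ["red", "green", "blue"]
ACTIONS = ["touching", "going through", "circling"]
MEANINGS = list(itertools.product(SHAPES, COLOURS, ACTIONS))  # 27 animations


def covers(test_set, min_count=2):
    """True if every shape, colour and action occurs at least min_count times."""
    counts = Counter()
    for meaning in test_set:
        for dimension, value in enumerate(meaning):
            counts[(dimension, value)] += 1
    required = [(dimension, value)
                for dimension, values in enumerate((SHAPES, COLOURS, ACTIONS))
                for value in values]
    return all(counts[key] >= min_count for key in required)


def sample_test_set(rng, size=10, max_tries=10_000):
    """Draw random sets of distinct animations until one satisfies covers()."""
    for _ in range(max_tries):
        candidate = rng.sample(MEANINGS, size)
        if covers(candidate):
            return candidate
    raise RuntimeError("no covering test set found; relax the constraint")


def make_test_sets(n_sets=7, seed=0):
    """Return a dict of test sets named A, B, C, ... (at most 7 here)."""
    rng = random.Random(seed)
    return {name: sample_test_set(rng) for name in "ABCDEFG"[:n_sets]}
```

Since there are only 27 animations in total, 6 or 7 sets of 10 will necessarily share some animations with each other.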

The first member would be trained on a given test set (A): for every element in the test set, the participant is shown the animation and the associated symbols, and then asked to recall the symbols. This is the learning phase, which is repeated 2 times.
After the learning phase, the participant would be asked to assign symbols to animations from another test set (B). This is the recall phase.
The symbols assigned by a participant to test set (B) would then form the learning set for the next participant.
Test set (C) would then be used for the recall phase of the 2nd participant, test set (D) for the 3rd participant, and so on until the last participant of the chain.
This is repeated for every chain-set of participants.
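
The assignment of test sets along a chain can be written out as a small schedule, which also shows why a chain of 5 needs sets A to F and a chain of 6 needs sets A to G. The function name and the tuple layout below are illustrative assumptions.

```python
from string import ascii_uppercase


def chain_schedule(n_participants):
    """Return (participant, learning_set, recall_set) for one chain.

    Participant 1 learns set A with the original symbols; every later
    participant learns the set that the previous participant labelled
    during recall, and is then tested on the next unused set.
    """
    if not 1 <= n_participants < len(ascii_uppercase):
        raise ValueError("n_participants must be between 1 and 25")
    schedule = []
    for i in range(n_participants):
        schedule.append((i + 1, ascii_uppercase[i], ascii_uppercase[i + 1]))
    return schedule
```

For example, chain_schedule(5) ends with participant 5 learning set E and recalling on set F, so 6 test sets are used in total.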

Analysis :

The difference between the characteristics of the input symbols and of the output symbols of each participant would indicate the ease of recall and the learnability of the language (a smaller difference indicating easier recall).
Commonality in the characteristics of the output symbols of each generation would indicate the emergence of combinatorial structure in the language through the iterative learning process.
The difference between these measures and the results of Experiment 1 would indicate the effect of imposing semantic content on the iterative learning of an artificial language.

Requirements :

20-25 participants for the above experiments, with uncompromised memory ability (and non-eidetic memories as well :) )

Expected Results :

The fidelity of transmission of the language across generations will be higher in Experiment 2 than in Experiment 1.

References :

[Ref 1] "Cultural emergence of combinatorial structure in an artificial whistled language" CogSci 2011; Tessa Verhoef, Simon Kirby & Carol Padden.
[Ref 2] "Language evolution: consensus and controversies" TRENDS in Cognitive Sciences Vol.7 No.7 July 2003; Morten H. Christiansen and Simon Kirby.
[Ref 3] "Cumulative cultural evolution in the laboratory: An experimental approach to the origins of structure in human language" PNAS August 5, 2008 vol. 105 no. 31 10681–10686; Simon Kirby, Hannah Cornish, and Kenny Smith.
[Ref 4] "The Emergence of Linguistic Structure: An overview of the Iterated Learning Model"; Simon Kirby and James Hurford (2002) In A. Cangelosi & D. Parisi (Eds.), Simulating the evolution of language (pp. 121-148). Springer Verlag New York.