==Motor: gestures / language==

@article{ozcaliskan-goldinMeadow-10_sex-differences-in-gesture-presage-language,
  title={{Sex differences in language first appear in gesture}},
  author={Ozcaliskan, Seyda and Goldin-Meadow, Susan},
  journal={Developmental Science},
  volume={13},
  number={5},
  pages={752--760},
  issn={1363-755X},
  year={2010},
  publisher={Wiley-Blackwell},
  annote = {

For some years now, Özçalışkan has been studying meaningful gestures by
children; in a 2005 paper, she and Goldin-Meadow showed that gesture-word
combinations (e.g. pointing at a cookie while saying "eat") precede multi-word
utterances by several months.

Here the same experimental setup (longitudinal observations) is extended to
sex differences.  It is known that, on average, infant boys produce 2-word
constructions some 3 months later than girls.

This longitudinal study videotaped 40 children (22 girls, 18 boys) at home
every 4 months between 14 and 34 months (90 min x 6 sessions per child).  By
analyzing the videos, it was found that boys are also later in G+S
combinations, again by about 3 months; e.g. at 14 months, the average girl has
a gestural vocabulary of 65 tokens vs. 40 tokens for the average boy.  By 18
months, boys reach 74 tokens on average, while girls are at 106.  

While this is interesting, it is not really very surprising.  Regarding the
methodology, however: given the wide variability in the ages at which children
acquire language, it is not clear that sample bias was really avoided in a
study based on forty children altogether.  The high SDs on the data reflect
this variation. 

---
Abstract: 
Children differ in how quickly they reach linguistic milestones. Boys
typically produce their first multi-word sentences later than girls do. We
ask here whether there are sex differences in children’s gestures that
precede, and presage, these sex differences in speech. To explore this
question, we observed 22 girls and 18 boys every 4 months as they progressed
from one-word speech to multi-word speech. We found that boys not only
produced speech + speech (S+S) combinations (‘drink juice’) 3 months later
than girls, but they also produced gesture + speech (G+S) combinations
expressing the same types of semantic relations (‘eat’ + point at cookie) 3
months later than girls. Because G+S combinations are produced earlier than
S+S combinations, children’s gestures provide the first sign that boys are
likely to lag behind girls in the onset of sentence constructions.

unicode: Şeyda Özçalışkan

}}


==Perception: Objects==

@article{porway-yaoB-zhuSC-08_learning-compositional-models-for-object-categories, 
title={Learning compositional models for object categories from small sample sets},
author={Porway, J. and Yao, B. and Zhu, S.C.},
journal={Object categorization: computer and human vision perspectives},
  year={2008},
  annote = {

Abstract: 

In this chapter we present a method for learning a compositional model in a
minimax entropy framework for modeling object categories with large
intra-class variance. The model we learn incorporates the flexibility of a
stochastic context free grammar (SCFG) to account for the variation in object
structure with the neighborhood constraints of a Markov random field (MRF) to
enforce spatial context. We learn the model through a generalized minimax
entropy framework that accounts for the dynamic structure of the
hierarchical model. We first learn the SCFG parameters using the frequencies
of object parts, then pursue spatial relations in order of greatest
information gain. The learned model can generalize from a small set of
training samples (n < 100) to generate a combinatorially large number of
novel instances using stochastic sampling. This process is similar to
"recognition-by-components", a theory that postulates that biological vision
systems recognize objects as composed from a dictionary of commonly appearing
3D structures. This paper provides one possible implementation of this
theory. To verify our learning method and model performance, we present plots
of KL divergence minimization as the algorithm proceeds, and show realistic
samples drawn from the model. We also show the model accurately predicting
missing or undetected parts for top-down recognition along with preliminary
results showing that the model can learn a large space of category
appearances from a very small (n < 15) number of training samples. Finally,
we discuss a compositional boosting algorithm for inference and show
examples using it for object recognition.
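
A minimal sketch of the frequency step as I read it: relative-frequency
estimates of SCFG production probabilities from counts of observed part
configurations.  The part names below are hypothetical; this is not the
authors' code.

# toy sketch: P(lhs -> rhs) estimated by relative frequency of part configurations
from collections import Counter, defaultdict

# hypothetical parse fragments observed in training images, as (lhs, rhs) pairs
observed_rules = [
    ("clock", ("frame", "hands", "numerals")),
    ("clock", ("frame", "hands")),
    ("clock", ("frame", "hands", "numerals")),
]

counts = Counter(observed_rules)
lhs_totals = defaultdict(int)
for (lhs, rhs), c in counts.items():
    lhs_totals[lhs] += c

# P(lhs -> rhs) = count(lhs -> rhs) / count(lhs)
probs = {rule: c / lhs_totals[rule[0]] for rule, c in counts.items()}
for rule, p in sorted(probs.items()):
    print(rule, round(p, 2))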

}}



@inproceedings{kulkarni-berg-11cvpr_baby-talk-image-descriptions,
  title={Baby talk: Understanding and generating simple image descriptions},
  author={Kulkarni, G. and Premraj, V. and Dhar, S. and Li, S. and Choi, Y. and Berg, A.C. and Berg, T.L.},
  booktitle={Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on},
  pages={1601--1608},
  year={2011},
  annote = {

We posit that visually descriptive language offers computer vision
researchers both information about the world, and information about how
people describe the world. The potential benefit from this source is made
more significant due to the enormous amount of language data easily available
today. We present a system to automatically generate natural language
descriptions from images that exploits both statistics gleaned from parsing
large quantities of text data and recognition algorithms from computer
vision. The system is very effective at producing relevant sentences for
images. It also generates descriptions that are notably more true to the
specific image content than previous work.

}}


==Perception: Action==

@article{georgeon-ritter-11_intrinsically-motivated-schema-for-emergent-cognition,
author = {Olivier Georgeon and Frank Ritter},
title = {An Intrinsically-Motivated Schema Mechanism 
    to Model and Simulate Emergent Cognition},
journal = {Cognitive Systems Research},
annote = {

Considers an agent, Ernest, that moves in a maze-like environment.  It uses
intrinsic motivation (maximizing built-in utility functions) and learns
hierarchical action "schemas" in a Piagetian, constructivist manner.  The
simulator code is available. 
?project? 

simulator: 
http://liris.cnrs.fr/ideal/doc/GeorgeonO2011-emergent-cognition.pdf
http://e-ernest.blogspot.com/

abstract:

We introduce an approach to simulate the early mechanisms of emergent
cognition based on theories of enactive cognition and on constructivist
epistemology. The agent has intrinsic motivations implemented as inborn
proclivities that drive the agent in a proactive way. Following these drives,
the agent autonomously learns regularities afforded by the environment, and
hierarchical sequences of behaviors adapted to these regularities. The agent
represents its current situation in terms of perceived affordances that
develop through the agent’s experience. This situational representation works
as an emerging situation awareness that is grounded in the agent’s
interaction with its environment and that in turn generates expectations and
activates adapted behaviors.  Through its activity and these aspects of
behavior (behavioral proclivity, situation awareness, and hierarchical
sequential learning), the agent starts to exhibit emergent sensibility,
intrinsic motivation, and autonomous learning. Following theories of
cognitive development, we argue that this initial autonomous mechanism
provides a basis for implementing autonomously developing cognitive systems.

}}


@inproceedings{swift-kautz-12_multimodal-corpus-language-action,
  title={A multimodal corpus for integrated language and action},
  author={Swift, M. and Ferguson, G. and Galescu, L. and Chu, Y. and Harman, C. and Jung, H. and Perera, I. and Song, Y.C. and Allen, J. and Kautz, H.},
  booktitle={Proc. of the Int. Workshop on MultiModal Corpora for Machine Learning},
  year={2012},
  annote = {

make tea : 12 subjects x 3 episodes
	[e.g. take a tea bag from the cupboard - "tea bag" can refer to the
	individual teabag, or to the teabag box]

used RFID emitters on objects - but they didn't have enough sensitivity. 

make sandwiches
snack bar activity

--> want to label the data

abstract:
We describe a corpus for research on learning everyday tasks in natural
environments using the combination of natural language description and rich
sensor data that we have collected for the CAET (Cognitive Assistant for
Everyday Tasks) project. We have collected audio, video, Kinect RGB-Depth
video and RFID object-touch data while participants demonstrate how to make a
cup of tea. The raw data are augmented with gold-standard annotations for the
language representation and the actions performed. We augment activity
observation with natural language instruction to assist in task learning.

}}


@inProceedings{tokunaga-iidra-10_bilingual-multimodal-corpora-referring-expressions,
title={Construction of bilingual multimodal corpora of referring expressions in collaborative problem solving},
author={Tokunaga, T. and Iida, R. and Yasuhara, M. and Terai, A. and Morris, D. and Belz, A.},
year={2010},
annote = {

Over the last decade, with a growing recognition that referring expressions
frequently appear in collaborative task dialogues (Clark and Wilkes-Gibbs,
1986; Heeman and Hirst, 1995), a number of corpora have been constructed to
study the nature of their use. This tendency also reflects the recognition
that this area yields both challenging research topics as well as promising
applications such as human-robot interaction (Foster et al., 2008; Kruijff et
al., 2010).

The COCONUT corpus (Di Eugenio et al., 2000) was collected from
keyboard-dialogs between two participants, who worked together on a simple
2-D design task, buying and arranging furniture for two rooms. The COCONUT
corpus is limited in annotations which describe symbolic object information
such as object intrinsic attributes and location in discrete co-ordinates. As
an initial work of constructing a corpus for collaborative tasks, the COCONUT
corpus can be characterised as having a rather simple domain as well as
limited annotation.

The QUAKE corpus (Byron, 2005) and its successor, the SCARE corpus (Stoia et
al., 2008) deal with a more complex domain, where two participants
collaboratively play a treasure hunting game in a 3-D virtual world. Despite
the complexity of the domain, the participants were only allowed limited
actions, e.g. moving a step forward, pushing a button, etc.

As a part of the JAST project, the Joint Construction Task (JCT) corpus was
created based on dialogues in which two participants constructed a puzzle
(Foster et al., 2008). The setting of the experiment is quite similar to ours
except that both participants have even roles. Since our main concern is
referring expressions, we believe our asymmetric setting elicits more
referring expressions than the symmetric setting of the JCT corpus.

In contrast to these previous corpora, our corpora record a wide range of
information useful for analysis of human reference behaviour in situated
dialogue. While the domain of our corpora is simple compared to the QUAKE and
SCARE corpora, we allowed a comparatively large flexibility in the actions
necessary for achieving the goal shape (i.e. flipping, turning and moving of
puzzle pieces at different degrees), relative to the complexity of the
domain. Providing this relatively larger freedom of actions to the
participants together with the recording of detailed information allows for
research into new aspects of referring expressions.

As for a multilingual aspect, all the above corpora are English. There have
been several recent attempts at collecting multilingual corpora in situated
domains. For instance, (Gargett et al., 2010) collected German and English
corpora in the same setting. Their domain is similar to the QUAKE corpus. Van
der Sluis et al. (2009) aim at a comparative study of referring expressions
between English and Japanese. Their domain is still static at the moment. Our
corpora aim at dealing with the dynamic nature of situated dialogues between
very different languages, English and Japanese.

}}


@article{loucks-sommerville-12_recognizing-human-actions-4-10mo,
  title={Developmental changes in the discrimination of dynamic human actions in infancy},
  author={Loucks, J. and Sommerville, J.A.},
  journal={Developmental Science},
  year={2012},
  annote = {

how does an infant figure out the intentionality of an agent doing some
action?  

Even for a simple action like reaching for and grasping a toy, 5-6 mo infants  
attend to multiple different properties: which toy the actor selects, the
particular grasp used, the spatial trajectory of the reach, how fast the
reach is executed, etc.  [Woodward (1998)]

has 4-mo and 10-mo infants watch an actor in four situations: 
   - move a toy across a table [habituation]
   - change how the hand contacts the toy [featural change]
   - change spatial aspects (more global) - such as pose of head or body, or
	   arm trajectory [configurational]
   - change temporal aspects (speed)

though configurational changes were quantitatively larger shifts in the
image, adults paid more attention to featural change, indicating that they
are sensitive to functional aspects of the situation.  [Loucks and Baldwin
(2009)]

Loucks, J., & Baldwin, D. (2009). Sources of information for
discriminating dynamic human actions. Cognition, 111 (1),
84–97.

how do we develop such sensitivity to functionally important aspects?  
     One possibility is that infants become increasingly sensitive to both
     sources of information, but continue to gain sensitivity to featural
     information when sensitivity to configural information levels
     off. Another possibility is that infants begin with relatively broad
     sensitivity to both sources, but lose sensitivity to configural
     information while maintaining sensitivity to featural information.

it is known that by 5-6 months, infants gaze longer at the goal of a motion
than the motion per se.  thus, the particular toy being picked up is more
important than the relative position, or the trajectory [configurational].
[Woodward group 2002-2009]

Woodward, A.L. (2009). Infants’ grasp of others’ intentions.
Current Directions in Psychological Science, 18 (1), 53–57.

The main result is that at 4 months, infants are more sensitive to config and
temporal changes than to featural, but by 10 months, their response to
configurational or temporal change is about the same as habituation, while
looking time for featural is significantly higher. 

In the discussion, they suggest that the motor acquisition and maturation in
the intervening months may be a source for heightened awareness of the
functional (featural) aspects of grasp, and its relation to object shape. 

Subsequent work: 10-month-olds’ understanding of the functional consequences
of the precision grasp is correlated with their ability to perform precision
grasps themselves (Loucks & Sommerville, in press).

(Loucks & Sommerville, in press).
    The role of motor experience in understanding action function: the case
    of the precision grasp. Child Development.

--Abstract--

Recent evidence suggests that adults selectively attend to features of
action, such as how a hand contacts an object, and less to configural
properties of action, such as spatial trajectory, when observing human
actions. The current research investigated whether this bias develops in
infancy. We utilized a habituation paradigm to assess 4-month-old and
10-month-old infants’ discrimination of action based on featural, configural,
and temporal sources of action information. 

Younger infants were able to discriminate changes to all three sources of
information, but older infants were only able to reliably discriminate
changes to featural information. These results highlight a previously unknown
aspect of early action processing, and suggest that action perception may
undergo a developmental process akin to perceptual narrowing.

}}



@article{rakison-krogh-12_causal-action-facilitates-causal-perception-5mo,
  title={Does causal action facilitate causal perception in infants younger than 6 months of age?},
  author={Rakison, D.H. and Krogh, L.},
  journal={Developmental Science},
  year={2012},
  annote = {

Same perceptual input, but differing situations of causality (stickiness) are
simulated through a clever use of Velcro.  Children around 4.5 months who have
had this causal-action experience are able to perceive distinctions based on
stickiness.  
Q. Is this learning laws of physics ("velcro sticks") or of the mysterious
substance called "causality"?  

abstract: 
Previous research has established that infants are unable to perceive
causality until 6¼ months of age.  The current experiments examined whether
infants’ ability to engage in causal action could facilitate causal
perception prior to this age. In Experiment 1, 4.5-month-olds were randomly
assigned to engage in causal action experience via Velcro sticky mittens or
not engage in causal action because they wore non-sticky mittens. Both groups
were then tested in the visual habituation paradigm to assess their causal
perception. Infants who engaged in causal action – but not those without this
causal action experience – perceived the habituation events as
causal. Experiment 2 used a similar design to establish that 4.5-month-olds
are unable to generalize their own causal action to causality observed in
dissimilar objects. 

These data are the first to demonstrate that infants under 6 months of age
can perceive causality, and have implications for the mechanisms underlying
the development of causal perception.

}}


@article{cannon-woodward-11_action-production-influences-attention-12mo,
  title={Action production influences 12-month-old infants’ attention to others’ actions},
  author={Cannon, E.N. and Woodward, A.L. and Gredeb{\"a}ck, G. and von Hofsten, C. and Turek, C.},
  journal={Developmental Science},
  year={2011},
  annote = {

Priming is observed from what they do to what they see, in 12-month-olds
(gaze-tracking data).  If they did the same task earlier (behavior first),
their gaze shifts were reliably quicker (mean latency drops from 71 msec to 0
msec).

action anticipation: 
Falck-Ytter et al. (2006): 
three balls are grasped by a person and deposited in a bucket; alternatively,
the balls move on their own. 
Both adults and 12-mo-olds gaze-shift reliably to the goal only when the
action is performed by a human. 

},
  abstract = {

Recent work implicates a link between action control systems and action
understanding. In this study, we investigated the role of the motor system in
the development of visual anticipation of others’ actions. Twelve-month-olds
engaged in behavioral and observation tasks. Containment activity, infants’
spontaneous engagement in producing containment actions; and gaze latency,
how quickly they shifted gaze to the goal object of another’s containment
actions, were measured. Findings revealed a positive relationship: infants
who received the behavior task first evidenced a strong correlation between
their own actions and their subsequent gaze latency of another’s
actions. Learning over the course of trials was not evident. These findings
demonstrate a direct influence of the motor system on online visual attention
to others’ actions early in development.

}}


@article{wang-mori-09pami_action-recognition-by-semilatent-topic-models,
   author={Yang Wang and Mori, G.}, 
   journal={Pattern Analysis and Machine Intelligence, IEEE Transactions on}, 
   title={Human Action Recognition by Semilatent Topic Models}, 
   year={2009}, 
   month={oct. }, 
   volume={31}, 
   number={10}, 
   pages={1762 -1774}, 
   doi={10.1109/TPAMI.2009.43}, 
   annote = {

machine learning approach applying ideas from document information retrieval
to video data. 

--abstract--
We propose two new models for human action recognition from video 
sequences using topic models. Video sequences are represented by a novel
“bag-of-words” representation, where each frame corresponds to a “word”. Our
models differ from previous latent topic models for visual recognition in two
major aspects: first of all, the latent topics in our models directly
correspond to class labels; secondly, some of the latent variables in
previous topic models become observed in our case. Our models have several
advantages over other latent topic models used in visual recognition. First
of all, the training is much easier due to the decoupling of the model
parameters. Secondly, it alleviates the issue of how to choose the
appropriate number of latent topics. Thirdly, it achieves much better
performance by utilizing the information provided by the class labels in the
training set. We present action classification results on five different
datasets. Our results are either comparable to, or significantly better than,
previously published results on these datasets.
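
My own toy sketch of the "each frame is a word" representation (not the
authors' code; the codebook and descriptors below are random stand-ins):
every per-frame descriptor is assigned to its nearest codeword, so a video
becomes a histogram over codewords, which is what the topic model consumes.

import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(20, 64))   # 20 hypothetical codewords, 64-d descriptors
frames = rng.normal(size=(300, 64))    # one video as 300 per-frame descriptors

# nearest codeword per frame = that frame's "word"
dists = np.linalg.norm(frames[:, None, :] - codebook[None, :, :], axis=2)
words = dists.argmin(axis=1)

# bag-of-words histogram for the whole video (the topic model's input)
hist = np.bincount(words, minlength=len(codebook))
print(hist)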

}}



@article{anderson-chiu-11_temporal-dynamics-language-vision,
  title={On the temporal dynamics of language-mediated vision and vision-mediated language},
  author={Anderson, S.E. and Chiu, E. and Huette, S. and Spivey, M.J.},
  journal={Acta psychologica},
  volume={137},
  number={2},
  pages={181--189},
  year={2011},
  publisher={Elsevier},
  abstract = {

Recent converging evidence suggests that language and vision interact
immediately in non-trivial ways, although the exact nature of this
interaction is still unclear. Not only does linguistic information influence
visual perception in real-time, but visual information also influences
language comprehension in real-time.  For example, in visual search tasks,
incremental spoken delivery of the target features (e.g., “Is there a red
vertical?”) can increase the efficiency of conjunction search because only
one feature is heard at a time.  Moreover, in spoken word recognition tasks,
the visual presence of an object whose name is similar to the word being
spoken (e.g., a candle present when instructed to “pick up the candy”) can
alter the process of comprehension. Dense sampling methods, such as
eye-tracking and reach-tracking, richly illustrate the nature of this
interaction, providing a semi-continuous measure of the temporal dynamics of
individual behavioral responses. We review a variety of studies that
demonstrate how these methods are particularly promising in further
elucidating the dynamic competition that takes place between underlying
linguistic and visual representations in multimodal contexts, and we conclude
with a discussion of the consequences that these findings have for theories
of embodied cognition.

}}



@article{french-mareschal-11_connectionist-sequence-chunks,
  title={TRACX: a recognition-based connectionist framework for sequence segmentation and chunk extraction.},
  author={French, R.M. and Addyman, C. and Mareschal, D.},
  journal={Psychological review},
  volume={118},
  number={4},
  pages={614},
  year={2011},
  annote = {

computational simulation of temporal sequence chunking

--abstract--
Individuals of all ages extract structure from the sequences of patterns they
encounter in their environment, an ability that is at the very heart of
cognition. Exactly what underlies this ability has been the subject of much
debate over the years. A novel mechanism, implicit chunk recognition (ICR),
is proposed for sequence segmentation and chunk extraction. The mechanism
relies on the recognition of previously encountered subsequences (chunks) in
the input rather than on the prediction of upcoming items in the input
sequence. A connectionist autoassociator model of ICR, truncated recursive
autoassociative chunk extractor (TRACX), is presented in which chunks are
extracted by means of truncated recursion. The performance and robustness of
the model is demonstrated in a series of 9 simulations of empirical data,
covering a wide range of phenomena from the infant statistical learning and
adult implicit learning literatures, as well as 2 simulations demonstrating
the model’s ability to generalize to new input and to develop internal
representations whose structure reflects that of the items in the input
sequence. TRACX outperforms PARSER (Perruchet & Vinter, 1998) and the simple
recurrent network (SRN, Cleeremans & McClelland, 1991) in matching human
sequence segmentation on existing data. A new study is presented exploring
8-month-olds’ use of backward transitional probabilities to segment auditory
sequences.

}}


==Language : Grammar learning==

@article{waterfall-sandbank-10_computational-language-acquisition,
  title={An empirical generative framework for computational modeling of language acquisition},
  author={Waterfall, H.R. and Sandbank, B. and Onnis, L. and Edelman, S.},
  journal={Journal of child language},
  volume={37},
  pages={671--703},
  year={2010},
  publisher={Cambridge Univ Press},
  annote = {

CHILDES: corpus of utterances by children learning various languages, and also
that of caregivers. 

Adopts an unsupervised grammar-learning system called ConText to see whether
a grammar for English can be acquired from this corpus in an unsupervised
manner. 

ConText, a much simpler algorithm developed in response to ADIOS,
operates directly on the distributional statistics of the corpus and
characterizes words and phrases by the local linguistic contexts in which
they appeared.
In ConText, the distributional statistics of a word or a sequence of words
(w) are determined by the surrounding words (i.e. local context). The width
of this local context, L, is a user-specified parameter, set in most of our
experiments to be two words on either side of w. To calculate the distributional
statistics of w, ConText constructs its left and right context
vectors.
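
A rough sketch of these left/right context vectors as I understand them (my
reconstruction on a toy corpus with L = 2; not the actual ConText
implementation):

from collections import Counter

corpus = "the dog ate the bone and the dog chased the cat".split()
vocab = sorted(set(corpus))
L = 2  # context width on each side of the target

def context_vectors(w):
    left, right = Counter(), Counter()
    for i, tok in enumerate(corpus):
        if tok == w:
            left.update(corpus[max(0, i - L):i])
            right.update(corpus[i + 1:i + 1 + L])
    # fixed-order count vectors over the vocabulary
    return [left[v] for v in vocab], [right[v] for v in vocab]

print(vocab)
print(context_vectors("dog"))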

[Distributional statistics] has been instrumental for the automatic
acquisition of syntactic categories 
(Redington, Chater & Finch, 1998), the grouping of nouns into semantic
categories (Pereira, Tishby & Lee, 1993), unsupervised parsing (Clark,
2001; Klein & Manning, 2002) and text classification (Baker & McCallum,
1998).

ConText forms word and phrase categories : 

E15 → bowl | refrigerator | oven | house | mirror | country | corner | sky |
	basket | living room | kitchen | barn | bath tub | snow | closet |
	carriage | world | box | bag | bedroom | car | sink | air | water |
	movie | forest | sand | drawer
E32 → eat | drink
E68 → warm | hot | cold
E104 → hold on | listen

Here E32 and E104 - both verb categories - seem to also incorporate some
semantic aspects. 

},
  abstract = {

This paper reports progress in developing a computer model of
language acquisition in the form of (1) a generative grammar that is (2)
algorithmically learnable from realistic corpus data, (3) viable in its
large-scale quantitative performance and (4) psychologically real. First,
we describe new algorithmic methods for unsupervised learning of
generative grammars from raw CHILDES data and give an account of the
generative performance of the acquired grammars. Next, we summarize
findings from recent longitudinal and experimental work that suggests
how certain statistically prominent structural properties of child-directed
speech may facilitate language acquisition. We then present a series of
new analyses of CHILDES data indicating that the desired properties
are indeed present in realistic child-directed speech corpora. Finally,
we suggest how our computational results, behavioral findings,
and corpus-based insights can be integrated into a next-generation
model aimed at meeting the four requirements of our modeling
framework.

the approaches to grammar acquisition that are of most interest to us are
those that work in a completely unsupervised fashion on completely
unannotated corpora – that is, algorithms that start with no explicit
knowledge of potential structures and no data beyond the raw text or
transcribed speech.  Most existing algorithms for grammar induction have not
been designed or tested for operation that is realistic in that sense
(e.g. the highly successful algorithm of Klein and Manning (2002) learns
structures from data annotated for part of speech information). A most
notable exception in this respect is the Unsupervised Data-Oriented Parsing
(U-DOP) algorithm developed by
Bod (2009). The DOP approach uses the tree-substitution grammar formalism,
representing the structure of a novel sentence in terms of probabilistically
weighted structural analogies to trees gleaned from a training corpus. In the
unsupervised version, these trees are obtained by simply listing all the
possible binary tree descriptions of sentences in the training corpus. As
reported by Bod (2009), the U-DOP algorithm performs well in the task of
learning a grammar from CHILDES data annotated with part of speech
information, as assessed by comparing the structures it induces to those from
a hand-annotated gold-standard syntactic parse of the corpus (its performance
on raw CHILDES data is somewhat lower).

}}


@conference{borensztajn-09_neural-theory-grammar-acquisition,
  title={The hierarchical prediction network: towards a neural theory of grammar acquisition},
  author={Borensztajn, G. and Zuidema, W. and Bod, R.},
  booktitle={Proc. of the 31th Annual Meeting of the Cognitive Science Society},
  year={2009},
  annote = {

discussion 09: 

neocortex :
six-layered structure - vertical columns - replicated throughout

Hawkins: memory-prediction framework:
information is stored in hierarchical fashion; top levels are more invariant;
input is processed in a bottom-up fashion, but expectation is top-down.

topology among the cells --> leads to grammar.

input node layer : interacts with the world.  e.g. each node is a word
      compressor node: connected to one or two nodes below it
      substitution space: represent the data in some n-dim virtual space
      production: ordered slots --> fire --> attach to the virtual space or
	      compressor nodes

Extends normal neural network structures by allowing a substitution operation
between the nodes.

--abstract--

We develop an approach to automatically identify the most probable multi-word
constructions used in children’s utterances, given syntactically annotated
utterances from the Brown corpus of CHILDES. The found constructions cover
many interesting linguistic phenomena from the language acquisition
literature and show a progression from very concrete toward abstract
constructions. We show quantitatively that for all children of the Brown
corpus grammatical abstraction, defined as the relative number of variable
slots in the productive units of their grammar, increases globally with age.

}}


@article{singh-reznick-12_infant-word-segmentation-longitudinal-8mo,
  title={Infant word segmentation and childhood vocabulary development: a longitudinal analysis},
  author={Singh, L. and Steven Reznick, J. and Xuehua, L.},
  journal={Developmental Science},
  year={2012},
  abstract = {

Infants begin to segment novel words from speech by 7.5 months, demonstrating
an ability to track, encode and retrieve words in the context of larger
units. Although it is presumed that word recognition at this stage is a
prerequisite to constructing a vocabulary, the continuity between these
stages of development has not yet been empirically demonstrated. ...

Results [of 2 expts] demonstrated a strong degree of association between
infant word segmentation abilities at 7 months and productive vocabulary size
at 24 months. In addition, outcome groups, as defined by median vocabulary
size and growth trajectories at 24 months, showed distinct word segmentation
abilities as infants. These findings provide the first prospective evidence
supporting the predictive validity of infant word segmentation tasks and
suggest that they are indeed associated with mature word knowledge.

}}


@article{junge-Kooijman-12_rapid-word-recognition-at-10mo,
  title={Rapid recognition at 10 months as a predictor of language development},
  author={Junge, C. and Kooijman, V. and Hagoort, P. and Cutler, A.},
  journal={Developmental Science},
  year={2012},
  abstract = {

Infants’ ability to recognize words in continuous speech is vital for
building a vocabulary. We here examined the amount and type of exposure
needed for 10-month-olds to recognize words. Infants first heard a word,
either embedded within an utterance or in isolation, then recognition was
assessed by comparing event-related potentials to this word versus a word
that they had not heard directly before. Although all 10-month-olds showed
recognition responses to words first heard in isolation, not all infants
showed such responses to words they had first heard within an
utterance. Those that did succeed in the latter, harder, task, however,
understood more words and utterances when re-tested at 12 months, and
understood more words and produced more words at 24 months, compared with
those who had shown no such recognition response at 10 months. The ability to
rapidly recognize the words in continuous utterances is clearly linked to
future language development.

}}



@article{kaminski-schulz-12_how-dogs-know-when-addressed,
  title={How dogs know when communication is intended for them},
  author={Kaminski, J. and Schulz, L. and Tomasello, M.},
  journal={Developmental Science},
  year={2012},
  abstract = {

Domestic dogs comprehend human gestural communication in a way that other
animal species do not. But little is known about the specific cues they use
to determine when human communication is intended for them. In a series of
four studies, we confronted both adult dogs and young dog puppies with object
choice tasks in which a human indicated one of two opaque cups by either
pointing to it or gazing at it. We varied whether the communicator made eye
contact with the dog in association with the gesture (or whether her back was
turned or her eyes were directed at another recipient) and whether the
communicator called the dog’s name (or the name of another
recipient). 

Results demonstrated the importance of eye contact in human–dog
communication, and, to a lesser extent, the calling of the dog’s name – with
no difference between adult dogs and young puppies – which are precisely the
communicative cues used by human infants for identifying communicative
intent. Unlike human children, however, dogs did not seem to comprehend the
human’s communicative gesture when it was directed to another human, perhaps
because dogs view all human communicative acts as directives for the
recipient.

}}



@article{hochmann-etal-11_consonants-help-word-recog-vowels-structure-12mo,
  title={Consonants and vowels: different roles in early language acquisition},
  author={Hochmann, J.R. and Benavides-Varela, S. and Nespor, M. and Mehler, J.},
  journal={Developmental Science},
  year={2011},
  abstract = {

Language acquisition involves both acquiring a set of words (i.e. the
lexicon) and learning the rules that combine them to form sentences
(i.e. syntax). Here, we show that consonants are mainly involved in word
processing, whereas vowels are favored for extracting and generalizing
structural relations. 

We demonstrate that such a division of labor between consonants and vowels
plays a role in language acquisition. In two very similar experimental
paradigms, we show that 12-month-old infants rely more on the consonantal
tier when identifying words (Experiment 1), but are better at extracting and
generalizing repetition-based structures over the vocalic tier (Experiment
2). These results indicate that infants are able to exploit the functional
differences between consonants and vowels at an age when they start acquiring
the lexicon, and suggest that basic speech categories are assigned to
different learning mechanisms that sustain early language acquisition.

Infants are able to use statistical information, such as dips in transition
probabilities (TPs) between syllables to identify word boundaries in a
continuous speech stream (Saffran, Aslin & Newport, 1996).
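
Side note: a toy sketch of the TP computation (illustrative only; made-up
syllable stream).  TP(a -> b) = count(a, b) / count(a), and word boundaries
are posited where the TP dips:

from collections import Counter

syllables = "go la bu pa do ti go la bu go la bu pa do ti".split()
pair_counts = Counter(zip(syllables, syllables[1:]))
first_counts = Counter(syllables[:-1])

def tp(a, b):
    # TP(a -> b) = P(next syllable is b | current syllable is a)
    return pair_counts[(a, b)] / first_counts[a]

tps = [tp(a, b) for a, b in zip(syllables, syllables[1:])]
print([round(p, 2) for p in tps])  # low values mark likely word boundaries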

}}


@article{butler-patterson-12_semantic-effects-on-past-tense-inflection,
  title={In search of meaning: Semantic effects on past-tense inflection},
  author={Butler, R. and Patterson, K. and Woollams, A.M.},
  year={2012},
  journal = {Quarterly Journal of Experimental Psychology},
  volume={65},
  number={8},
  annote = {

Within single-mechanism connectionist models of inflectional morphology,
generating the past-tense form of a verb depends upon the interaction of
semantic and phonological representations, with semantic information being
particularly important for irregular or exception verbs. We assessed this
hypothesis in two experiments requiring normal speakers to produce the past
tense from a verb stem that takes a regular or exceptional past
tense. Experiment 1 revealed significant latency advantages for high- over
low-imageability words for both regular verbs (e.g., “lunged” faster than
“loved”) and exception items (e.g., “drank” faster than “dealt”); but
critically, this effect was significantly larger for exceptions than for
regulars. Experiment 2 employed a semantic priming paradigm where
participants inflected verb stems (e.g., sit) preceded by related (e.g.,
chair) or unrelated primes (e.g., jug) and revealed a priming effect in
accuracy that was confined to the exception items. Our results are consistent
with predictions from single-mechanism connectionist models of inflectional
morphology and converge with findings from neurological patients and studies
of reading aloud.

}}


@inproceedings{kim-mooney-12_unsupervised-pcfg-induction-grounded-language,
  title={Unsupervised PCFG Induction for Grounded Language Learning with Highly Ambiguous Supervision},
  author={Kim, J. and Mooney, R.J.},
  booktitle={Proceedings of the Conference on Empirical Methods in Natural Language Processing and Natural Language Learning, EMNLP-CoNLL},
  volume={12},
  year={2012},		
  annote = {

“Grounded” language learning employs training data in the form of sentences
paired with relevant but ambiguous perceptual contexts.  Börschinger et
al. (2011) introduced an approach to grounded language learning based on
unsupervised PCFG induction. Their approach works well when each sentence
potentially refers to one of a small set of possible meanings, such as in the
sportscasting task. However, it does not scale to problems with a large set
of potential meanings for each sentence, such as the navigation instruction
following task studied by Chen and Mooney (2011). This paper presents an
enhancement of the PCFG approach that scales to such problems with
highly-ambiguous supervision.  Experimental results on the navigation task
demonstrate the effectiveness of our approach.

}}



==Language : Word learning / grounding==

@article{matuszek-zettlemoyer-12_language-perception-grounded-attribute-learning,
  title={A Joint Model of Language and Perception for Grounded Attribute Learning},
  author={Cynthia Matuszek and FitzGerald, N. and Zettlemoyer, L. and Bo, L. and Fox, D.},
  journal={Arxiv preprint arXiv:1206.6423},
  year={2012},
  annote = {

Cynthia Matuszek: Learning Novel Attributes from Combined
  Language and Perception

learn cognition from observation

show some objects, have people describe

"can you describe these objects to us"

try to parse descriptions to obtain semantic grounding. 

mainly colour and shape. 

mechanical turk data collection: 
have people describe
incentive for minimizing description lengths

15 people (8 male, 7 female)

"this is a green object" --> lambda x green(x)

orange ball - combination of colour and shape. 

doesn't know that "orange" is class colour - tries both. 

input problems: someone says:
"this is a fake lettuce, don't eat it"

--abstract--

As robots become more ubiquitous and capable of performing complex tasks, the
importance of enabling untrained users to interact with them has increased.
In response, unconstrained natural-language interaction with robots has
emerged as a significant research area. We discuss the problem of parsing
natural language commands to actions and control structures that can be
readily implemented in a robot execution system. Our approach learns a parser
based on example pairs of English commands and corresponding control language
expressions. We evaluate this approach in the context of following route
instructions through an indoor environment, and demonstrate that our system
can learn to translate English commands into sequences of desired actions,
while correctly capturing the semantic intent of statements involving complex
control structures. The procedural nature of our formal representation allows
a robot to interpret route instructions online while moving through a
previously unknown environment.

}}


@article{muncer-knight-12_bigram-trough-syllable-effect-in-lexical-decision,
  title={The bigram trough hypothesis and the syllable number effect in lexical decision},
  author={Muncer, S.J. and Knight, D.C.},
  year={2012},
  journal = {Quarterly Journal of Experimental Psychology},
  volume={65},
  number={8},
  annote = {

There has been an increasing volume of evidence supporting the role of the
syllable in various word processing tasks. It has, however, been suggested
that syllable effects may be caused by orthographic redundancy. In
particular, it has been proposed that the presence of bigram troughs at
syllable boundaries cause what are seen as syllable effects. We investigated
the bigram trough hypothesis as an explanation of the number of syllables
effect for lexical decision in five-letter words and nonwords from the
British Lexicon Project. The number of syllables made a significant
contribution to prediction of lexical decision times along with word
frequency and orthographic similarity. The presence of a bigram trough did
not. For nonwords, the number of syllables made a significant contribution to
prediction of lexical decision times only for nonwords with relatively long
decision times. The presence of a bigram trough made no contribution. The
evidence presented suggests that the bigram trough cannot be an explanation
of the syllable number effect in lexical decision. A comparison of the
results from words and nonwords is interpreted as providing some support for
dual-route models of reading.

}}


@inproceedings{chen-12_fast-lexicon-learning-for-grounded-language-acquisition,
  title={Fast online lexicon learning for grounded language acquisition},
  author={Chen, D.L.},
  booktitle={Proc. of the Annual Meetings of the Association for Computational Linguistics (ACL)},
  year={2012},
  abstract = {
Learning a semantic lexicon is often an important first step in building a
system that learns to interpret the meaning of natural language.  It is
especially important in language grounding where the training data usually
consist of language paired with an ambiguous perceptual context. Recent work
by Chen and Mooney (2011) introduced a lexicon learning method that deals
with ambiguous relational data by taking intersections of graphs. While the
algorithm produced good lexicons for the task of learning to interpret
navigation instructions, it only works in batch settings and does not scale
well to large datasets. In this paper we introduce a new online algorithm
that is an order of magnitude faster and surpasses the state-of-the-art
results. We show that by changing the grammar of the formal meaning
representation language and training on additional data collected from
Amazon’s Mechanical Turk we can further improve the results. We also include
experimental results on a Chinese translation of the training data to
demonstrate the generality of our approach.

}}



@article{mather-plunkett12_role-of-novelty-in-early-word-learning,
  title={The role of novelty in early word learning},
  author={Mather, E. and Plunkett, K.},
  journal={Cognitive Science},
  year={2012},
  publisher={Wiley Online Library},
  annote = {

22-month-old babies know that a new, unfamiliar word is more likely to be
associated with a novel object.  

abstract:
What mechanism implements the mutual exclusivity bias to map novel labels to
objects without names? Prominent theoretical accounts of mutual exclusivity
(e.g., Markman, 1989, 1990) propose that infants are guided by their
knowledge of object names. However, the mutual exclusivity constraint could
be implemented via monitoring of object novelty (see Merriman, Marazita, &
Jarvis, 1995). We sought to discriminate between these contrasting
explanations across two preferential looking experiments with
22-month-olds. In Experiment 1, infants viewed three objects: one name-known,
two name-unknown. Of the two name-unknown objects, one was novel, and the
other had been previously familiarized. The infants responded to hearing a
novel label by increasing attention only to the novel, name-unknown
object. In a second experiment in which the name-known object was absent, a
novel label increased infants’ attention to a novel object beyond baseline
preference for novelty. The experiments provide clear evidence for a
novelty-based mechanism. However, differences in the time course of
disambiguation across experiments suggest that novelty processing may be
influenced by contextual factors.

}}


@article{mani-mills-12_vowels-in-early-words,
  title={Vowels in early words: an event-related potential study},
  author={Mani, N. and Mills, D.L. and Plunkett, K.},
  journal={Developmental Science},
  year={2012},
  publisher={Wiley Online Library},
  abstract = {
Previous behavioural research suggests that infants possess phonologically
detailed representations of the vowels and consonants in familiar
words. These tasks examine infants’ sensitivity to mispronunciations of a
target label in the presence of a target and distracter image. Sensitivity to
the mispronunciation may, therefore, be contaminated by the degree of
mismatch between the distracter label and the heard mispronounced
label. Event-related potential (ERP) studies allow investigation of infants’
sensitivity to the relationship between a heard label (correct or
mispronounced) and the referent alone using single picture trials. ERPs also
provide information about the timing of lexico-phonological activation in
infant word recognition. The current study examined 14-month-olds’
sensitivity to vowel mispronunciations of familiar words using ERP data from
single picture trials. Infants were presented with familiar images followed
by a correct pronunciation of its label, a vowel mispronunciation or a
phonologically unrelated non-word. The results support and extend previous
behavioural findings that 14-month-olds are sensitive to mispronunciations of
the vowels in familiar words using an ERP task. 

We suggest that the presence of pictorial context reinforces infants’
sensitivity to mispronunciations of words, and that mispronunciation
sensitivity may rely on infants accessing the cross-modal associations
between word forms and their meanings.

}}


@article{caza-knott-12_pragmatic-bootstrapping-neural-network-vocabulary-acquisition,
  title={Pragmatic bootstrapping: a neural network model of vocabulary acquisition},
  author={Caza, G.A. and Knott, A.},
  journal={Language Learning and Development},
  volume={8},
  number={2},
  pages={113--135},
  year={2012},
  publisher={Taylor \& Francis},
  annote = {

learns from single word input: 
   A final difference in our data representation is that it associates
   individual words with individual concepts rather than associating
   multiword utterances with groups of concepts (as, e.g., in Siskind, 1996;
   Yu & Ballard, 2007). Our streams differ in their granularity by isolating
   individual concepts and single-word utterances. For simplicity, we assume
   that the child pays special attention to certain emphasized words and
   these emphasized words are the ones that appear in our utterance stream.
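
Toy sketch of the kind of word-concept pairing this implies (my gloss, not
the authors' neural network): a simple co-occurrence count between
single-word utterances and single concepts, with each word's current best
guess being its most frequent concept.

from collections import defaultdict

pairs = [("ball", "BALL"), ("ball", "DOG"), ("dog", "DOG"),
         ("ball", "BALL"), ("dog", "DOG"), ("cup", "CUP")]

assoc = defaultdict(lambda: defaultdict(int))
for word, concept in pairs:
    assoc[word][concept] += 1

# current best guess for each word = its most frequently co-occurring concept
for word, concepts in assoc.items():
    print(word, max(concepts, key=concepts.get))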

}}


@article{greco-carrea-11_grounding-symbols-no-composition-without-discrimination,
  title={Grounding compositional symbols: no composition without discrimination},
  author={Greco, A. and Carrea, E.},
  journal={Cognitive Processing},
  pages={1--12},
  year={2011},
  publisher={Springer},
  annote = {

The classical computational conception of meaning has been challenged by the
idea that symbols must be grounded on sensorimotor processes. A difficult
question arises from the fact that grounding representations cannot be
symbolic themselves but, in order to support compositionality, should work as
primitives. This implies that they should be precisely identifiable and
strictly connected with discriminable perceptual features. Ideally, each
representation should correspond to a single discriminable feature. The
present study was aimed at exploring whether feature discrimination is a
fundamental requisite for grounding compositional symbols. We studied this
problem by using Integral stimuli, composed of two interacting and not
separable features. Such stimuli were selected in Experiment 1 as pictures
whose component features are easily or barely discriminable (Separable or
Integral) on the basis of psychological distance metrics (City-block or
Euclidean) computed from similarity judgments. In Experiment 2, either each
feature was associated with one word of a two-word expression, or the whole
stimulus with a single word. In Experiment 3, the procedure was reversed and
words or expressions were associated with whole pictures or separate
features. Results support the hypothesis that single words are best grounded
by Integral stimuli and composite expressions by Separable stimuli, where a
strict association of single words with discriminated features is possible.

}}



@article{battaglia-borensztajn-bod-12_structured-cognition-rats-to-language,
  title={Structured cognition and neural systems: From rats to language},
  author={Battaglia, F.P. and Borensztajn, G. and Bod, R.},
  journal={Neuroscience \& Biobehavioral Reviews},
  year={2012},
  publisher={Elsevier},
  annote = {

Very interesting ideas suggesting that learning grammars is a generalization
of an ability to parse the input into hierarchies, such as sub-events in an
action, regions of a painting, or phrases in sentences. 

They suggest an approach based on learning all sub-trees of a parse rather
than just the bottom-level structure, called Data-Oriented Parsing (DOP). 
[code available. ??project??]
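
A minimal sketch of what "all sub-trees" means here (my gloss of the DOP
idea, not the authors' code): every internal node of a parse tree may either
be expanded into its children or cut off and left as a frontier label, and
the fragments rooted at all nodes are collected.

def fragments(tree):
    """tree = (label, children), children a tuple (empty for words)."""
    label, children = tree
    if not children:
        return [(label, ())]
    child_options = []
    for child in children:
        # either cut the child (keep only its label) or use any of its fragments
        opts = [(child[0], ())] + fragments(child)
        child_options.append(list(dict.fromkeys(opts)))
    results = []
    def combine(i, chosen):
        if i == len(child_options):
            results.append((label, tuple(chosen)))
        else:
            for opt in child_options[i]:
                combine(i + 1, chosen + [opt])
    combine(0, [])
    return results

def all_fragments(tree):
    # fragments rooted at every node of the tree
    out = list(fragments(tree))
    for child in tree[1]:
        out.extend(all_fragments(child))
    return out

s = ("S", (("NP", (("she", ()),)),
           ("VP", (("saw", ()), ("NP", (("it", ()),))))))
print(len(all_fragments(s)))   # 13 fragments for this small tree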

--abstract--

Much of animal and human cognition is compositional in nature: higher order,
complex representations are formed by (rule-governed) combination of more
primitive representations. We review here some of the evidence for
compositionality in perception and memory, motivating an approach that takes
ideas and techniques from computational linguistics to model aspects of
structural representation in cognition. We summarize some recent developments
in our work that, on the one hand, use algorithms from computational
linguistics to model memory consolidation and the formation of semantic
memory, and on the other hand use insights from the neurobiology of memory to
develop a neurally inspired model of syntactic parsing that improves over
existing (not cognitively motivated) models in computational
linguistics. These two theoretical studies highlight interesting analogies
between language acquisition, semantic memory and memory consolidation, and
suggest possible neural mechanisms, implemented in computational algorithms
that may underlie memory consolidation.

}}



==Humour==

@article{marinkovic-baldwin-11_right-hemisphere-joke-appreciation,
  title={Right hemisphere has the last laugh: neural dynamics of joke appreciation},
  author={Marinkovic, K. and Baldwin, S. and Courtney, M.G. and Witzel, T. and Dale, A.M. and Halgren, E.},
  journal={Cognitive, Affective, \& Behavioral Neuroscience},
  volume={11},
  number={1},
  pages={113--130},
  year={2011},
  publisher={Springer},
  annote = {

the neural processes in understanding humour...

abstract: 
Understanding a joke relies on semantic, mnemonic, inferential, and emotional
contributions from multiple brain areas. Anatomically constrained
magnetoencephalography (aMEG) combining high-density whole-head MEG with
anatomical magnetic resonance imaging allowed us to estimate where the
humor-specific brain activations occur and to understand their temporal
sequence. Punch lines provided either funny, not funny (semantically
congruent), or nonsensical (incongruent) replies to joke questions. Healthy
subjects rated them as being funny or not funny. As expected, incongruous
endings evoke the largest N400m in left-dominant temporo-prefrontal areas,
due to integration difficulty. In contrast, funny punch lines evoke the
smallest N400m during this initial lexical–semantic stage, consistent with
their primed “surface congruity” with the setup question. In line with its
sensitivity to ambiguity, the anteromedial prefrontal cortex may contribute
to the subsequent “second take” processing, which, for jokes, presumably
reflects detection of a clever “twist” contained in the funny punch
lines. Joke-selective activity simultaneously emerges in the right prefrontal
cortex, which may lead an extended bilateral temporo-frontal network in
establishing the distant unexpected creative coherence between the punch line
and the setup. This progression from an initially promising but misleading
integration from left fronto-temporal associations, to medial prefrontal
ambiguity evaluation and right prefrontal reprocessing, may reflect the
essential tension and resolution underlying humor.

}}


==Tacit knowledge==

@article{bargh-schwader-12trics_automaticity-in-cognition,
  title={Automaticity in social-cognitive processes},
  author={Bargh, J.A. and Schwader, K.L. and Hailey, S.E. and Dyer, R.L. and Boothby, E.J.},
  journal={Trends in cognitive sciences},
  year={2012},
  annote = {

    Automaticity has emerged as a broad phenomenon over the past few years.
    What 30 years ago [1] was the claim that some social-perceptual processes
    (e.g. impression formation and stereotyping) may have efficient and
    unintentional components operating outside conscious awareness has now
    become a staple in explaining almost all psychological phenomena.

two classes of automaticity: 

‘preconscious': generated from effortless sensory or perceptual activity and
    then serve as implicit, unappreciated inputs into conscious and
    deliberate processes.  e.g. behavioral contagion or conformity effects
    triggered by the perception of others’ behavior and immediate impressions
    of others based on their facial features or expressions alone, 
    also others driven by automatic sensory perception and the perception of
    internal states as in embodied cognition and emotional influences,
    including emotional influences on moral judgment. 
    A major development over the past decade and especially the past 5 years
    has been the inclusion of motivational and goal pursuit processes into
    this category of preconsciously automatic processes. Research has shown
    that goal pursuits can become activated (primed) by relevant situational
    features; they then operate outside of conscious awareness and guidance. 

‘goal-dependent’ or ‘postconscious’ : consequences of prior conscious and
    intentional thought, such as unconscious components in consciously
    intended decision-making processes and those that support one's conscious
    commitment to a relationship partner. (see [1]; also [5]). 

--from [1]--
[J.A. Bargh Conditional automaticity: varieties of automatic influence on
social perception and cognition, J. Uleman, J.A. Bargh (Eds.), Unintended
Thought, Guilford (1989), pp. 3–51 ]

Bargh p. 5: 

the thesis that a given cognitive process is either automatic or
controlled is incorrect.       This assumption results in faulty
conclusions... 

an automatic process is taken to be 
   - unintentional, 
   - effortless, 
   - autonomous, 
   - involuntary,
   - occurring outside conscious awareness.  
and anything that meets one or two of these criteria is taken to be
automatic. 

However, attention, awareness, intention and control do not necessarily occur
together in an all-or-none fashion. 

Studies of automaticity in impression formation and social judgment have shown
subjects engaging in task-relevant processing very efficiently, even when
attentional resources are scarce.  Because these routinized modes of thought
are relatively independent of conscious attention, they are automatic or
effortless.  But subjects are following explicit instructions to form an
impression or make the judgment, so the processing is not unintentional. 

Many processing effects that are unintentional may depend on conscious or
attentional processing ... e.g. upon perception of the attitude object, trait
categorization of behavioural information, and most category-priming
demonstrations.  

Processes previously believed to be prototypic examples of automaticity --
a) activation of a word's meaning during reading; b) semantic priming and
spreading activation; c) the Stroop color-word interference effect;
d) well-practiced visual target detection -- have all been shown to require
some intentional resources (i.e., they are not completely effortless).  [Dark,
Johnston, Myles-Worsley & Farah 1985]

--abstract--
Over the past several years, the concept of automaticity of higher cognitive
processes has permeated nearly all domains of psychological research. In this
review, we highlight insights arising from studies in decision-making, moral
judgments, close relationships, emotional processes, face perception and
social judgment, motivation and goal pursuit, conformity and behavioral
contagion, embodied cognition, and the emergence of higher-level automatic
processes in early childhood. Taken together, recent work in these domains
demonstrates that automaticity does not result exclusively from a process of
skill acquisition (in which a process always begins as a conscious and
deliberate one, becoming capable of automatic operation only with frequent
use) – there are evolved substrates and early childhood learning mechanisms
involved as well.

}}


==Embodiment==

@article{pezzulo-barsalou-cangelosi-11_mechanics-of-embodiment-computational,
  title={The mechanics of embodiment: a dialog on embodiment and computational modeling},
  author={Pezzulo, G. and Barsalou, L.W. and Cangelosi, A. and Fischer, M.H. and McRae, K. and Spivey, M.J.},
  journal={Frontiers in psychology},
  volume={2},
  year={2011},
  publisher={Frontiers Media SA}


Abstract
Embodied theories are increasingly challenging traditional views of cognition
by arguing that conceptual representations that constitute our knowledge are
grounded in sensory and motor experiences, and processed at this sensorimotor
level, rather than being represented and processed abstractly in an amodal
conceptual system. Given the established empirical foundation, and the
relatively underspecified theories to date, many researchers are extremely
interested in embodied cognition but are clamoring for more mechanistic
implementations. What is needed at this stage is a push toward explicit
computational models that implement sensorimotor grounding as intrinsic to
cognitive processes. In this article, six authors from varying backgrounds
and approaches address issues concerning the construction of embodied
computational models, and illustrate what they view as the critical current
and next steps toward mechanistic theories of embodiment. The first part has
the form of a dialog between two fictional characters: Ernest, the
“experimenter,” and Mary, the “computational modeler.” The dialog consists of
an interactive sequence of questions, requests for clarification, challenges,
and (tentative) answers, and touches the most important aspects of grounded
theories that should inform computational modeling and, conversely, the
impact that computational modeling could have on embodied theories. The
second part of the article discusses the most important open challenges for
embodied computational modeling.

}}


@article{maouene-smith-08_body-parts-and-early-verbs,
  title={Body Parts and Early-Learned Verbs},
  author={Maouene, J. and Hidaka, S. and Smith, L.B.},
  journal={Cognitive Science},
  volume={32},
  number={7},
  pages={1200--1216},
  year={2008},
  publisher={Wiley Online Library}

early verbs are correlated with body parts.
body maps - proportional to brain areas devoted to them (homunculus maps).

At 21 months, verbs involving actions of the mouth and lip are 47% of the
	“meanings” of all verbs known at this age. 

Growth in verb meanings from 22 to 24 months overwhelmingly (86% of all new
meanings) concerns actions by the limbs.

The predominant region of growth after this point is in verbs that
specifically involve the hands, counting for 58% of new meanings from 24 to
26 months and 59% of all new meanings from 26 to 30 months.  

At 30 months, verbs labelling actions involving hands and arms dominate all
verb meanings, accounting for 51% of all verbs in children’s total productive
vocabulary at 30 months.  Together, these body maps provide a developmental
picture of verb learning that is strongly organized by the body’s morphology.

earlier version: [maouene-06icdl_body-parts-early-verb-acquisition]

}}


==Gaze / Attention ==

@inproceedings{kuriyama-tokunaga-11_gaze-matching-of-referring-expressions-in-collaborative-problem-solving,
title={Gaze matching of referring expressions in collaborative problem solving},
author={Kuriyama, N. and Terai, A. and Yasuhara, M. and Tokunaga, T. and Yamagishi, K. and Kusumi, T.},
booktitle={Proceedings of International Workshop on Dual Eye Tracking in CSCW (DUET 2011)},
year={2011}
annote = {

subjects A and B collaborate in tasks.  e.g. A asks B: "Put the big triangle
next to the square".  Is B looking at the same part of the screen as A?
Among pairs who manage to do the task better, this overlap is higher.
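
[A minimal sketch of how a gaze-matching rate between two region-coded gaze
streams could be computed, in the spirit of the analysis sketched above; the
region labels, sampling rate, and lag range are my own simplifications, not
the authors' procedure.]

    import numpy as np

    rng = np.random.default_rng(3)
    regions = ["triangle", "square", "circle", "background"]

    # hypothetical region-coded gaze streams for speaker A and listener B,
    # sampled at the same rate over one trial
    gaze_a = rng.choice(regions, size=600)
    gaze_b = np.roll(gaze_a, 30)              # B roughly follows A with a lag
    noise = rng.random(600) < 0.3             # plus some unrelated looking
    gaze_b[noise] = rng.choice(regions, size=noise.sum())

    def matching_rate(a, b, lag):
        """Fraction of samples where B, shifted by `lag`, looks where A looks."""
        if lag > 0:
            a, b = a[:-lag], b[lag:]
        elif lag < 0:
            a, b = a[-lag:], b[:lag]
        return (a == b).mean()

    # matching rate as a function of lag; the peak indicates how long after
    # A's gaze (or the onset of a referring expression) B's gaze lines up
    for lag in (0, 15, 30, 45):
        print(f"lag {lag:3d}: matching rate = {matching_rate(gaze_a, gaze_b, lag):.2f}")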

Abstract. Richardson and Dale (2005) showed that eye gaze matching between
speakers and listeners contributed to language comprehension. While their
study used a static image as a visual stimulus, and the speech and eye gaze
of speakers and that of listeners were recorded serially, we recorded speech
in synchronisation with eye gaze of both participants simultaneously in a
collaborative problem solving setting. The analysis of the collected data
revealed that the eye gaze matching rate is higher in successful pairs than
in unsuccessful pairs, and the peak of the matching rate comes at different
position from the onset of referring expressions depending on surface form of
the expressions.

}}


==Number sense / Math==

@article{siegler-fazio-12trics_fractions-numerical-development,
  title={Fractions: the new frontier for theories of numerical development},
  author={Siegler, R.S. and Fazio, L.K. and Bailey, D.H. and Zhou, X.},
  journal={Trends in Cognitive Sciences},
  year={2012},
  annote = {
January 2013, Vol. 17, No. 1, p. 13-19

our sense of magnitude is located in an area of the brain (intraparietal
sulcus, IPS).  fractions integrate a large degree of implicit, non-symbolic
knowledge (which figure has a higher proportion of blue dots) with symbolic
(is 2/3 greater than 3/4).  

this compact survey paper also covers educational aspects of how fractions
are learned and used.
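
[A worked instance of the symbolic comparison, as my own illustration:
2/3 = 8/12 < 9/12 = 3/4, i.e. 0.667 < 0.75, so 3/4 lies to the right of 2/3
on the number line.]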

--abstract--

Recent research on fractions has broadened and deepened theories of numerical
development. Learning about fractions requires children to recognize that
many properties of whole numbers are not true of numbers in general and also
to recognize that the one property that unites all real numbers is that they
possess magnitudes that can be ordered on number lines. The difficulty of
attaining this understanding makes the acquisition of knowledge
about fractions an important issue educationally, as well as
theoretically. This article examines the neural underpinnings of fraction
understanding, developmental and individual differences in that
understanding, and interventions that improve the understanding. Accurate
representation of fraction magnitudes emerges as crucial both to conceptual
understanding of fractions and to fraction arithmetic.

}}


@article{libertus-feigenson-11_approximate-number-sense-predicts-math-ability-3yo,
  title={Preschool acuity of the approximate number system correlates with school math ability},
  author={Libertus, M.E. and Feigenson, L. and Halberda, J.},
  journal={Developmental Science},
  year={2011},
  annote = {

children's rapid responses to images with questions such as "are there more
yellow dots than blue?" are measured.  There is a wide spread - tests done on
85 children aged 3-5 indicate variability in accuracy with s.d. (sigma) of
about 15-20%.  Response time and Weber fraction were also measured.

a statistical correlation is found between this numerical ability and early
math test scores on the Test of Early Math Ability (TEMA-3).  As reaction
times increase, there is a gentle downward slope in the TEMA scores.
However, I couldn't understand fig 2a, where, as accuracy increases, the
slope is downward as well.  This seems to contradict the claim in the text
that "faster RT and greater accuracy on the ANS acuity task are associated
with higher math ability."
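
[A minimal sketch of the kind of acuity-vs-math correlation the paper
reports, on hypothetical per-child measures of ANS accuracy, mean reaction
time, and TEMA-3 score; an illustration of the analysis idea only, not the
authors' actual data or pipeline.]

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_children = 85

    # hypothetical data: ANS accuracy (proportion correct), mean RT (ms),
    # and TEMA-3 math score, with a weak built-in relationship
    ans_accuracy = np.clip(rng.normal(0.75, 0.15, n_children), 0.4, 1.0)
    mean_rt = rng.normal(1800, 300, n_children)
    tema = 30 + 25 * ans_accuracy - 0.003 * mean_rt + rng.normal(0, 3, n_children)

    # simple Pearson correlations between acuity measures and math scores
    r_acc, p_acc = stats.pearsonr(ans_accuracy, tema)
    r_rt, p_rt = stats.pearsonr(mean_rt, tema)
    print(f"accuracy vs TEMA: r={r_acc:.2f}, p={p_acc:.3f}")
    print(f"RT vs TEMA:       r={r_rt:.2f}, p={p_rt:.3f}")
    # the reported pattern amounts to r_acc > 0 and r_rt < 0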
},
  abstract = {

Previous research shows a correlation between individual differences in
people’s school math abilities and the accuracy with which they rapidly and
nonverbally approximate how many items are in a scene. This finding is
surprising because the Approximate Number System (ANS) underlying numerical
estimation is shared with infants and with non-human animals who never
acquire formal mathematics. However, it remains unclear whether the link
between individual differences in math ability and the ANS depends on formal
mathematics instruction. Earlier studies demonstrating this link tested
participants only after they had received many years of mathematics
education, or assessed participants’ ANS acuity using tasks that required
additional symbolic or arithmetic processing similar to that required in
standardized math tests. To ask whether the ANS and math ability are linked
early in life, we measured the ANS acuity of 200 3- to 5-year-old children
using a task that did not also require symbol use or arithmetic
calculation. We also measured children’s math ability and vocabulary size
prior to the onset of formal math instruction. We found that children’s ANS
acuity correlated with their math ability, even when age and verbal skills
were controlled for. These findings provide evidence for a relationship
between the primitive sense of number and math ability starting early in
life.

}}


@article{agrillo-piffer-12_musicians-better-at-magnitude-estimation,
  title={Musicians outperform non-musicians in magnitude estimation: evidence of a common processing mechanism for time, space and numbers},
  author={Agrillo, C. and Piffer, L.},
  year={2012},
  journal = {Quarterly Journal of Experimental Psychology},
  volume = {65},
  number = {8},
  annote = {

It has been proposed that time, space, and numbers may be computed by a
common magnitude system. Even though several behavioural and neuroanatomical
studies have focused on this topic, the debate is still open. To date, nobody
has used the individual differences for one of these domains to investigate
the existence of a shared cognitive system. Musicians are known to outperform
nonmusicians in temporal discrimination tasks. We therefore observed
professional musicians and nonmusicians undertaking three different tasks:
temporal (participants were required to estimate which of two tones lasted
longer), spatial (which line was longer), and numerical discrimination (which
group of dots was more numerous). If time, space, and numbers are processed
by the same mechanism, it is expected that musicians will have a greater
ability, even in nontemporal dimensions. As expected, musicians were more
accurate with regard to temporal discrimination. They also gave better
performances in both the spatial and the numerical tasks, but only outside
the subitizing range. Our data are in accordance with the existence of a
common magnitude system. We suggest, however, that this mechanism may not
involve the whole numerical range.

SUBITIZE: able to estimate (a small) number without having to count

}}


==Belief / Categorization==

@article{baillargeon_10trics_false-belief-understanding,
  title={False-belief understanding in infants},
  author={Baillargeon, R. and Scott, R.M. and He, Z.},
  journal={Trends in Cognitive Sciences},
  volume={14},
  number={3},
  pages={110--118},
  issn={1364-6613},
  year={2010},
  annote = {

In a classic paper from 1983, Wimmer and Perner introduced the False-Belief
task: a toy is hidden in a green box in front of child A and agent B.  Then,
while agent B is out of the room, the toy is shifted to a yellow box.

Agent B re-enters the room, and child A is now asked: which box will agent B
search for the toy in?

All the 3–4-year-olds, and about half the 4–6-year-olds, invariably answer
"the yellow box" (the new location).  By age 9, the idea that agent B has a
false belief becomes available to most subjects.

Thus, children's ideas about beliefs held by other agents differ from those
of adults.

This finding has led to a vast literature.  Here, the idea is to probe false
belief not by explicit questions, but by children's looking patterns, as
tested in a violation-of-expectation (VOE) task.  Based on this, the authors
surmise that even 15-month-olds have some awareness that agent B may believe
the item to be in the green box.

--abstract--

At what age can children attribute false beliefs to others?  Traditionally,
investigations into this question have used elicited-response tasks in which
children are asked a direct question about an agent’s false belief. Results
from these tasks indicate that the ability to attribute false beliefs does
not emerge until about age 4. However, recent investigations using
spontaneous-response tasks suggest that this ability is present much
earlier. Here we review results from various spontaneous-response tasks that
suggest that infants in the second year of life can already attribute false
beliefs about location and identity as well as false perceptions. We also
consider alternative interpretations that have been offered for these
results, and discuss why elicited-response tasks are particularly difficult
for young children.

}}



@article{ell-ashby-12_unsupervised-category-w-feature-fusion,
  title={Unsupervised category learning with integral-dimension stimuli},
  author={Ell, S.W. and Ashby, F.G. and Hutchinson, S.},
  year={2012},
  journal = {Quarterly Journal of Experimental Psychology},
  volume = {65},
  number = {8},
  pages = {1537--1562},

  annote = {

How do we form categories? What features are used, which are ignored? 

--abstract--
Despite the recent surge in research on unsupervised category learning,
relatively little research has focused on constrained tasks in which the goal
is to learn predefined stimulus clusters [as opp to unconstrained] in the
absence of feedback. The few studies that have addressed this issue have
focused almost exclusively on stimuli for which it is relatively easy to
attend selectively to the component dimensions (i.e., separable
dimensions). In the present study, we investigated the ability of
participants to learn categories constructed from stimuli for which it is
difficult, if not impossible, to attend selectively to the component
dimensions (i.e., integral dimensions). 

The experiments demonstrate that individuals are capable of learning
categories constructed from the integral dimensions of brightness and
saturation, but this ability is generally limited to category structures
requiring selective attention to brightness. As might be expected with
integral dimensions, participants were often able to integrate brightness and
saturation information in the absence of feedback — an ability not observed in
previous studies with separable dimensions. Even so, there was a bias to
weight brightness more heavily than saturation in the categorization process,
suggesting a weak form of selective attention to brightness. These data
present an important challenge for the development of models of unsupervised
category learning.
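
[A minimal sketch of unsupervised learning of two predefined clusters over
(brightness, saturation) stimuli, with dimension weights crudely standing in
for selective attention; the stimulus values, clustering method, and weights
are my own inventions, not the authors' model or procedure.]

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(1)

    # two predefined stimulus clusters that differ mainly in brightness
    # (dimension 0) and overlap heavily in saturation (dimension 1)
    cluster_a = rng.normal([0.3, 0.5], [0.05, 0.20], size=(50, 2))
    cluster_b = rng.normal([0.7, 0.5], [0.05, 0.20], size=(50, 2))
    stimuli = np.vstack([cluster_a, cluster_b])
    true_labels = np.array([0] * 50 + [1] * 50)

    def unsupervised_accuracy(weights):
        """Cluster without feedback after weighting the two dimensions."""
        pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(stimuli * weights)
        acc = (pred == true_labels).mean()
        return max(acc, 1 - acc)   # cluster labels are arbitrary

    # equal attention vs. attention biased toward brightness (dimension 0)
    print("equal weights:      ", unsupervised_accuracy(np.array([1.0, 1.0])))
    print("brightness-weighted:", unsupervised_accuracy(np.array([2.0, 0.5])))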

}}


==Spatial Cognition==

@article{avraamides-galati-denis-12_spatial-info-updating-from-narratives,
  doi = {10.1080/17470218.2012.712147},
  author = {Marios N. Avraamides and Alexia Galati and Francesca Pazzagli and
	Chiara Meneghetti and Michel Denis},
  title = {Encoding and updating spatial information presented in narratives},
  journal = {Quarterly Journal of Experimental Psychology},
  annote = {

subjects read a description that placed them in a square space (e.g. a hotel
lobby), and were told about objects placed at the corners and centres around them
(e.g. [in the hotel lobby] the swimming pool could be seen in the front, the
reception to the left, the elevators to the right, and the lobby entrance at
the back).  Details are described (e.g., “the painting depicts a scene
from the ancient Greek mythology with the 12 gods from mount Olympus. You
stare at the painting for a while thinking that its colours do not match well
with those of the courtroom”). 

Then the subjects read about the protagonist turning to face these objects.
Finally, they were asked to turn themselves and to judge where the objects
lay with respect to the protagonist.  In one experiment they were asked to
turn in the direction the character had rotated, and in another in the
opposite direction, but their embodied pose had no effect.  Most subjects
relied on the initial encoding.

abstract:
Four experiments investigated whether directional spatial relations encoded
by reading narratives are updated following described protagonist
rotations. Participants memorized locations of objects described in short
stories that placed them, as the protagonist, in remote settings. After
reading a description that the protagonist rotated to the left or the right
of the initial orientation, participants made judgements about object
relations in the described environment (Experiment 1). Before making these
judgments, participants were instructed to physically rotate to match
(Experiment 2) or mismatch (Experiment 4) the protagonist's described
rotation, and in Experiments 3 and 4 to also visualize the changed relations
following rotation. 

Participants' performance suggested that they relied on
the initial representation they constructed during encoding rather than on
the updated protagonist-to-object relations. Participants' physical movement
to match the described rotation and additional visualization instructions did
not facilitate updating through a sensorimotor process. In these respects,
updating spatial relations in situation models constructed from narratives
differs from updating in perceptually experienced environments.

}}


@inproceedings{matuszek-herbst-zettlemoyer-12_parsing-commands-to-robot,
  title =	 {Learning to parse natural language commands to a robot
                  control system},
  author =	 {Matuszek, C. and Herbst, E. and Zettlemoyer, L. and Fox,
                  D.},
  booktitle =	 {Proc. of the 13th Int’l Symposium on Experimental Robotics
                  (ISER)},
  year =	 {2012},
  abstract =	 { 

As robots become more ubiquitous and capable of performing complex tasks, the
importance of enabling untrained users to interact with them has increased.
In response, unconstrained natural-language interaction with robots has
emerged as a significant research area. We discuss the problem of parsing
natural language commands to actions and control structures that can be
readily implemented in a robot execution system. Our approach learns a parser
based on example pairs of English commands and corresponding control language
expressions. We evaluate this approach in the context of following route
instructions through an indoor environment, and demonstrate that our system
can learn to translate English commands into sequences of desired actions,
while correctly capturing the semantic intent of statements involving complex
control structures. The procedural nature of our formal representation allows
a robot to interpret route instructions online while moving through a
previously unknown environment.

}}


==Neuroscience==

@article{kravitz-saleem-12trics_ventral-visual-pathway-object-recog,
  title={The ventral visual pathway: an expanded neural framework for the processing of object quality},
  author={Kravitz, D.J. and Saleem, K.S. and Baker, C.I. and Ungerleider, L.G. and Mishkin, M.},
  journal={Trends in Cognitive Sciences},
  year={2012},
  publisher={Elsevier}
  annote = {
January 2013, Vol. 17, No. 1, p.26-49

Since the original characterization of the ventral visual pathway, our
knowledge of its neuroanatomy, functional properties, and extrinsic targets
has grown considerably.  Here we synthesize this recent evidence and propose
that the ventral pathway is best understood as a recurrent occipito-temporal
network containing neural representations of object quality both utilized and
constrained by at least six distinct cortical and subcortical systems. Each
system serves its own specialized behavioral, cognitive, or affective
function, collectively providing the raison d'être for the ventral visual
pathway. This expanded framework contrasts with the depiction of the ventral
visual pathway as a largely serial staged hierarchy culminating in singular
object representations and more parsimoniously incorporates attentional,
contextual, and feedback effects.

Fig. 2b
At least six distinct pathways emanate from the occipitotemporal network. 
1. occipitotemporo-neostriatal pathway (black lines) originates from every
   region in the network and supports visually-dependent habit formation and
   skill learning. 
2. a projection targeting the ventral striatum (nucleus accumbens), which
   supports the assignment of stimulus valence.
3. occipitotemporoamygdaloid pathway supports the processing of
   emotional stimuli. 
4. occipitotemporo-medial temporal pathway targets the perirhinal and
   entorhinal cortices as well as the hippocampus and supports long-term
   object and object-context memory.
5. occipitotemporo-orbitofrontal pathway : reward processing
6. occipitotemporo-ventrolateral prefrontal pathway: 
   object working memory

}}


@article{caggiano-fogassi-rizzolatti-11_view-based-action-recog-motor-neurons,
title={View-based encoding of actions in mirror neurons of area F5 in macaque premotor cortex},
author={Caggiano, V. and Fogassi, L. and Rizzolatti, G. and Pomper, J.K. and Thier, P. and Giese, M.A. and Casile, A.},
journal={Current Biology},
year={2011},
annote = {

[neuroscience models of action recognition, based on the intriguing discovery
of "mirror neurons".  These neurons fire when the person performs the action
themselves, but also when they see the action being done by others.

This paper proposes that mirror neurons are view-sensitive - they respond
only to a given view.
]

}}



@article{mcnealy-mazziotta-11_neural-language-learning,
  title={Age and experience shape developmental changes in the neural basis of language-related learning},
  author={McNealy, K. and Mazziotta, J.C. and Dapretto, M.},
  journal={Developmental Science},
  year={2011},
  annote = {

... neural underpinnings of language learning

abstract: 
One hundred and fifty-six participants, ranging from age 5 to adulthood,
underwent functional magnetic resonance imaging (fMRI) while listening to
three novel streams of continuous speech, which contained either strong
statistical regularities, strong statistical regularities and speech cues, or
weak statistical regularities providing minimal cues to word boundaries.
Only the 5- to 10-year-old children displayed significant signal increases
for the stream with low statistical regularities, suggesting an age-related
decrease in sensitivity to more subtle statistical cues. Further, in a sample
of 78 10-year-olds, we examined the impact of proficiency in a second
language and level of pubertal development on learning-related signal
increases, showing that the brain regions involved in language learning are
influenced by both experiential and maturational factors.

}}


@article{deshmukh-knierim-11_representation-spatial-entorhinal,
title={Representation of non-spatial and spatial information in the lateral entorhinal cortex},
author={Deshmukh, S.S. and Knierim, J.J.},
journal={Frontiers in behavioral neuroscience},
volume={5},
year={2011},
publisher={Frontiers Media SA},
annote = {

the role of the hippocampus in memory formation - integrates the what, when
and where.  e.g.  

	 at the dhaba on the lko highway [WHERE]
	 i saw two gunmen for protection [WHAT]
	 day before yesterday [WHEN]

place cell: responds to specific location 	 

[okeefe 78] : hippocampus as cognitive map 

The hippocampus is involved in episodic memory in humans, and possibly in
animals.  It is also involved in coding spatial location, which led to the
notion that the hippocampus provides a spatial framework to organize memory.

tetrode-based experiments on individual neurons in rat brain - cable back to
recorder.
hyperdrive - each screw carries a tetrode, which records at 4 points;
	listening to a set of neurons, one can disambiguate the outputs of
	single neurons [similar to triangulation] - ~70 micron range
		[may have 1K neurons in this range - but only 30 or so are
		seen - rest may be silent]
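
[To make the triangulation analogy concrete: a toy sketch of amplitude-based
spike sorting on a 4-channel tetrode, clustering spikes by their relative
amplitudes on each channel.  The channel profiles, noise levels, and spike
counts are invented for illustration; this is not the recording or sorting
pipeline used in the paper.]

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(2)

    # two hypothetical neurons at different distances from the 4 tetrode wires,
    # so each produces a characteristic amplitude profile across channels (uV)
    profile_a = np.array([120, 80, 40, 30])
    profile_b = np.array([35, 60, 110, 90])
    spikes = np.vstack([profile_a + rng.normal(0, 8, size=(200, 4)),
                        profile_b + rng.normal(0, 8, size=(150, 4))])

    # "triangulation": cluster spikes in the 4-D amplitude space to assign
    # each spike to a putative single neuron
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(spikes)
    for unit in (0, 1):
        count = int((labels == unit).sum())
        profile = spikes[labels == unit].mean(axis=0).round(1)
        print(f"unit {unit}: {count} spikes, mean amplitude profile {profile}")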

bsbe 12oct TALKS.t

--Abstract--
Some theories of memory propose that the hippocampus integrates the
individual items and events of experience within a contextual or spatial
framework. The hippocampus receives cortical input from two major pathways:
the medial entorhinal cortex (MEC) and the lateral entorhinal cortex
(LEC). During exploration in an open field, the firing fields of MEC grid
cells form a periodically repeating, triangular array. In contrast, LEC
neurons show little spatial selectivity, and it has been proposed that the
LEC may provide non-spatial input to the hippocampus. Here, we recorded MEC
and LEC neurons while rats explored an open field that contained discrete
objects. LEC cells fired selectively at locations relative to the objects,
whereas MEC cells were weakly influenced by the objects. These results
provide the first direct demonstration of a double dissociation between LEC
and MEC inputs to the hippocampus under conditions of exploration typically
used to study hippocampal place cells.

}}

==Cognition and evolution==



@article{csibra-gergely-11_natural-pedagogy-as-evolution,
  title={Natural pedagogy as evolutionary adaptation},
  author={Csibra, G. and Gergely, G.},
  journal={Philosophical Transactions of the Royal Society B: Biological Sciences},
  volume={366},
  number={1567},
  pages={1149--1157},
  year={2011},
  annote = {

very similar to [csibra-gergely-09_natural-pedagogy], which is a slightly
lighter treatment.  particularly see Hoppitt et al. 08, Lessons from animal
teaching.

abstract: 
We propose that the cognitive mechanisms that enable the transmission of
cultural knowledge by communication between individuals constitute a system
of ‘natural pedagogy’ in humans, and represent an evolutionary adaptation
along the hominin lineage. We discuss three kinds of arguments that support
this hypothesis. First, natural pedagogy is likely to be human-specific:
while social learning and communication are both widespread in non-human
animals, we know of no example of social learning by communication in any
other species apart from humans.  Second, natural pedagogy is universal:
despite the huge variability in child-rearing practices, all human cultures
rely on communication to transmit to novices a variety of different types of
cultural knowledge, including information about artefact kinds, conventional
behaviours, arbitrary referential symbols, cognitively opaque skills and
know-how embedded in means-end actions. Third, the data available on early
hominin technological culture are more compatible with the assumption that
natural pedagogy was an independently selected adaptive cognitive system than
considering it as a by-product of some other human-specific adaptation, such
as language. By providing a qualitatively new type of social learning
mechanism, natural pedagogy is not only the product but also one of the
sources of the rich cultural heritage of our species.

see also: 
hoppitt 08: Lessons from animal teaching

}}

@article{csibra-gergely-09_natural-pedagogy,
  title={Natural pedagogy},
  author={Csibra, G. and Gergely, G.},
  journal={Trends in cognitive sciences},
  volume={13},
  number={4},
  pages={148--153},
  year={2009},
}