Cognitive Modeling and Computational Linguistics (CMCL) 2018

CMCL 2018 will have both oral presentations and poster presentations. Posters will be presented during one of the SCiL poster sessions and during a short session after the oral CMCL presentations.

Friday, January 05, 2-3:30: SCiL/CMCL Joint Poster Session
Sunday, January 07, 9-12:00: CMCL Oral Presentations
Sunday, January 07, 12-12:30: CMCL Poster Presentations


All presentations will be held at Grand America Salt Lake City.

555 South Main Street, Salt Lake City, Utah 84111

CMCL Oral Presentations

Oral presentations will be 18 minutes long, followed by 7 minutes for questions.

Coreference and Focus in Reading Times
Evan Jaffe (The Ohio State University)
Cory Shain (The Ohio State University)
William Schuler (The Ohio State University)

This paper presents evidence of a linguistic focus effect on coreference resolution in broad-coverage human sentence processing. While previous work has explored the role of prominence in coreference resolution (Almor, 1999; Foraker and McElree, 2007), these studies use constructed stimuli with specific syntactic patterns (e.g. cleft constructions) which could have idiosyncratic frequency confounds. This paper explores the generalizability of this effect on coreference resolution in a broad-coverage analysis. In particular, the current work proposes several new estimators of prominence appropriate for broad-coverage sentence processing and evaluates them as predictors of reading behavior in the Natural Stories corpus (Futrell, Gibson, Tily, Vishnevetsky, Piantadosi, and Fedorenko, in prep), a collection of "constructed-natural" narratives read by a large number of subjects. Results show a strong facilitation effect for one of these predictors on exploratory data and confirm that it generalizes to held-out data. These results provide broad-coverage support for the hypothesis that coreference resolution is easier when the target entity is focused by discourse properties, resulting in faster reading times.

Predictive power of word surprisal for reading times is a linear function of language model quality
Adam Goodkind (Northwestern University)
Klinton Bicknell (Northwestern University)

Words with low probability in context take longer to read. This relationship has been quantified using information-theoretic surprisal, the amount of information a word conveys. Here, we compare surprisal estimates derived from a range of language models including n-gram models and state-of-the-art deep learning models. We show that the predictive power of surprisal for reading times improves as a tight linear function of the linguistic quality of the language model used to derive it. Further, the size of the surprisal effect is estimated consistently across all language models, pointing toward a lack of bias and striking robustness of surprisal estimates.
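As a minimal illustration of the quantity discussed in this abstract, surprisal is conventionally defined as the negative log probability of a word given its context. The sketch below assumes probabilities are supplied by some language model; the function name `surprisal_bits` is hypothetical and is not taken from the paper.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Return the surprisal, in bits, of a word with the given in-context probability.

    Hypothetical helper for illustration: surprisal = -log2 P(word | context).
    """
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return -math.log2(probability)


# A word predicted with probability 1/2 carries 1 bit;
# a word predicted with probability 1/8 carries 3 bits.
print(surprisal_bits(0.5))    # 1.0
print(surprisal_bits(0.125))  # 3.0
```

Lower-probability words receive higher surprisal, which is the quantity the abstract relates to reading times.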

Dynamic encoding of structural uncertainty in gradient symbols
Pyeong Whan Cho (Johns Hopkins University)
Matthew Goldrick (Northwestern University)
Richard L. Lewis (University of Michigan)
Paul Smolensky (Johns Hopkins University)

A key insight into language processing is the discovery of the relationship between processing difficulty and surprisal. We provide a mechanistic account of this effect, bridging symbolic and subsymbolic connectionist models. Gradient Symbolic Computation is a continuous-time, continuous-state stochastic dynamical systems framework that computes the representation of a discrete structure gradually. We apply this to incremental parsing and show it can dynamically encode and update structural uncertainty via the gradient activation of symbolic constituents. We show that in this model surprisal is closely related to the amount of change in the optimal activation state driven by a new word input.

Phonological (un)certainty weights lexical activation
Laura Gwilliams (New York University)
Tal Linzen (Johns Hopkins University)
David Poeppel (New York University)
Alec Marantz (New York University)

Spoken word recognition involves i) matching acoustic input to phonological categories (e.g. /b/, /p/), and ii) activating words consistent with those phonological categories. Here we test the hypothesis that activation of a lexical candidate is weighted both by certainty of phonological discretisation and word frequency. Neural responses were recorded from auditory cortex using magneto-encephalography, and modelled as a function of the size and relative activation of lexical candidates. Our findings indicate that towards the beginning of a word, the processing system weights lexical candidates by both phonological certainty and lexical frequency; later into the word, activation is weighted by frequency alone.

Predicting and Explaining Human Semantic Search in a Cognitive Model
Filip Miscevic (Indiana University, Bloomington)
Aida Nematzadeh (University of California, Berkeley)
Suzanne Stevenson (University of Toronto)

Recent work has attempted to characterize the structure of semantic memory and the search algorithms which, together, best approximate human patterns of search revealed in a semantic fluency task. However, these models vary in the degree of their cognitive plausibility and neglect the constraints that the incremental process of language acquisition places on the structure of semantic memory. We present a model that incrementally updates a semantic network with limited computational steps, and replicates patterns found in human semantic fluency using a random walk. We also show that both structural and semantic features are requisite for replicating human performance patterns.

Modeling bilingual word associations as connected monolingual networks
Yevgen Matusevych (University of Toronto)
Amir Ardalan Kalantari Dehaghi (University of Toronto)
Suzanne Stevenson (University of Toronto)

Word associations are a common tool in research on the mental lexicon. Bilinguals tend to produce different associations in their non-native language than monolinguals do, and three mechanisms have been proposed for this difference: relying on native associations (through translation), on collocational patterns, and on phonological similarity between words. We show that the observed difference is significant, and present a computational model of bilingual word associations, implemented as a semantic network with a retrieval mechanism. Our model predicts bilingual responses better than monolingual baselines. Its success is mainly explained by translation; collocational and phonological associations do not improve the model.

CMCL Poster Presentations

Distributional and Dependency-based Word Embeddings have Complementary Roles in Decoding Brain Activity
Samira Abnar (University of Tehran)
Max Mijnheer (University of Amsterdam)
Rasyan Ahmed (University of Amsterdam)
Willem Zuidema (University of Amsterdam)

We evaluate different word embeddings on their usefulness for predicting the neural activation patterns associated with concrete nouns. Our goal is to assess the cognitive plausibility of these models, and understand how we can improve the methods for interpreting brain imaging data. We show that neural word embeddings outperform experiential word representations. Interestingly, the error patterns of these models are markedly different. This may support the idea that the brain uses different systems for processing different kinds of words. We suggest that taking the relative strengths of different embedding models into account will lead to better models.

Uniform Information Density (UID) Effects on Syntactic Choice in Hindi and English [Extended Abstract]
Ayush Jain (Indian Institute of Technology Delhi)
Vishal Singh (Indian Institute of Technology Delhi)
Sumeet Agarwal (Indian Institute of Technology Delhi)
Rajakrishnan Rajkumar (Indian Institute of Technology Delhi)

In this work, we investigate the extent to which syntactic choice is influenced by the drive to minimize the variance of information across the linguistic signal, as predicted by the UID hypothesis. We propose multiple measures to capture the uniform spread of information over entire sentences. Subsequently, we incorporate these measures in machine learning models that aim to distinguish between naturally occurring corpus sentences and their grammatical variants. Our results indicate that for this task, our UID measures are not a significant factor in the case of Hindi and have a very small impact for English.
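One simple way to quantify how uniformly information is spread over a sentence is the variance of its per-word surprisal values. The sketch below is only an assumed illustration of that idea; the abstract does not specify the authors' measures, and the function name `uid_variance` and the toy surprisal values are hypothetical.

```python
from statistics import pvariance
from typing import Sequence


def uid_variance(word_surprisals: Sequence[float]) -> float:
    """Population variance of per-word surprisal for one sentence.

    Hypothetical measure for illustration: lower values mean information
    is spread more evenly across the words of the sentence.
    """
    if not word_surprisals:
        raise ValueError("sentence must contain at least one word")
    return pvariance(word_surprisals)


# Toy per-word surprisal values (in bits) for two variants of a sentence.
even_variant = [2.0, 2.0, 2.0]
uneven_variant = [1.0, 3.0]

print(uid_variance(even_variant))    # 0.0
print(uid_variance(uneven_variant))  # 1.0
```

Under this assumed measure, a variant with lower variance would count as the more uniform of two grammatical alternatives.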

Exactly two things to learn from modeling scope ambiguity resolution: Developmental continuity and numeral semantics
K.J. Savinelli (University of California, Irvine)
Greg Scontras (University of California, Irvine)
Lisa Pearl (University of California, Irvine)

Behavioral data suggest that both children and adults struggle to access the inverse interpretation of scopally ambiguous utterances in certain contexts. To determine whether the causes of both child and adult difficulty are similar, we extend an existing computational model of children’s scope ambiguity resolution in context. We find that the same utterance-disambiguation mechanism is active in both children and adults, supporting the theory of developmental continuity. Moreover, because adult behavior requires an exact semantics for numerals, we also provide empirical support for this theory of linguistic representation.