Jabberwocky, the Beast that Tames the Beast?

Here is a short blog version (without slides) of a talk I gave at the recent Jabberwocky workshop hosted jointly by UMass Amherst and the University of Bucharest (thank you Camelia Bleotu and Deborah Foucault for a great initiative!). The ideas in this talk were rather non-standard and, I suspect, rather unpopular, but the concept was interesting and it was a great group of people to interact with. Unfortunately, the time zone and weekend timing of the workshop did not allow me to participate as fully as I would have liked, so I am airing those ideas here on this blog in case someone is interested.

Jabberwocky sentences are syntactically well-formed sentences with nonsense content words, like this one I just made up: She didn’t glorph their lividar.

If you are a syntactician, the nonce words here are a clever way to eliminate the effect of real lexical items and conceptual content, and to zero in on the combinatorial processes which underlie sentential structure and generativity. The very fact that we can make these sentences seems to show that this aspect of language is distinct and modularizable away from the Lexicon per se. It is good to be able to abstract away from contentful LIs in a variety of methodologies, because controlling for frequency, semantic prediction, association etc. can be hard. From the point of view of the syntactician, Jabberwocky sentences seem to offer a way of surgically removing the messy bits and targeting pure syntax.
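
(As an aside, the recipe is easy to mechanize. Here is a minimal Python sketch, with invented frames and nonce stems of my own rather than any standard stimulus-generation pipeline, of how one might build Jabberwocky sentences: keep the function words and inflectional endings of a real sentence frame, and fill the content-word slots with phonotactically legal nonce forms.)

import random

# Sentence frames keep the function words and inflections;
# content-word slots are marked with {noun} / {verb}.
FRAMES = [
    "She didn't {verb} their {noun}",
    "The {noun} was {verb}ing near the {noun}",
]

# Phonotactically legal nonce stems (invented for illustration).
NONCE = {
    "{noun}": ["lividar", "dax", "wug", "blicket"],
    "{verb}": ["glorph", "pilk", "zib", "trom"],
}

def jabberwocky(frame: str) -> str:
    """Replace each content slot in the frame with a random nonce stem."""
    words = []
    for token in frame.split():
        for slot, stems in NONCE.items():
            if slot in token:
                token = token.replace(slot, random.choice(stems))
        words.append(token)
    return " ".join(words)

print(jabberwocky(random.choice(FRAMES)))
# e.g. "She didn't pilk their blicket"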

So the lexicon is hard, but in modern Chomskyan views of grammar, the Lexicon is also the boring bit, where memorized chunks exist but where no generative processes reside. This is taken to extremes in the Distributed Morphology tradition, where roots are devoid even of the syntactic information that would tell you how to insert them in a sentence. The formal semanticists tend to concur: in that tradition we prove theorems of the form ‘Snow is white’ is TRUE iff snow is white (Davidson 1967), where the contentful lexical items are simply repeated in the metalanguage, languishing there for someone else (presumably NOT the formal semanticist) to elucidate.

However, there are some reasons to be a little suspicious of the possibility of excising the LI in a clean modular fashion.

Jabberwocky and fMRI

Fedorenko et al. (2010) developed a localizer task to help identify regions of interest (ROIs) for linguistic experiments using fMRI. They used four conditions:

1. Sentences (the Sentences condition)

2. Scrambled sentences (the Word-list condition)

3. Jabberwocky sentences (the Jabberwocky condition)

4. Scrambled Jabberwocky sentences (the Nonwords condition)

The Sentences > Nonwords contrast showed the language regions. The Word-list and Jabberwocky conditions both showed intermediate activation of the sentence regions but could not be reliably distinguished from each other. Words > Nonwords and Jabberwocky > Nonwords showed ‘inconsistent and variable results across subjects’. This is disappointing if we think that Jabberwocky sentences should show the brain doing its pure syntactic thing.
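
(To make the contrast logic concrete: here is a toy numpy sketch, entirely synthetic and not Fedorenko et al.’s actual analysis pipeline, of how a subject-specific functional ROI gets defined by thresholding the Sentences > Nonwords contrast, and why the two intermediate conditions are then hard to tell apart.)

import numpy as np

rng = np.random.default_rng(0)
n_voxels = 1000

# Pretend the first 100 voxels are genuinely language-responsive.
lang = np.zeros(n_voxels, dtype=bool)
lang[:100] = True

def betas(lang_mean, other_mean=0.2):
    """Toy per-voxel GLM estimates for one condition."""
    return np.where(lang, lang_mean, other_mean) + rng.normal(0, 0.3, n_voxels)

beta = {
    "sentences":   betas(2.0),   # strongest response in language voxels
    "words":       betas(1.1),   # intermediate
    "jabberwocky": betas(1.0),   # intermediate, close to words
    "nonwords":    betas(0.4),   # weakest
}

# Subject-specific functional ROI: top 10% of voxels on Sentences > Nonwords.
contrast = beta["sentences"] - beta["nonwords"]
roi = contrast >= np.percentile(contrast, 90)

print(roi.sum(), "ROI voxels,", (roi & lang).sum(), "truly language-responsive")
print("mean beta in ROI  words:", beta["words"][roi].mean().round(2),
      " jabberwocky:", beta["jabberwocky"][roi].mean().round(2))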

Jabberwocky Sentences and Neural Oscillations

There has been recent work in neurolinguistics exploring the idea that the processing of hierarchical linguistic structure is correlated with the synchronization of brain rhythms in various frequency bands. Kaufeld et al. (2020) recorded electroencephalography (EEG) while 29 adult native speakers (22 women, 7 men) listened to naturally spoken Dutch sentences, Jabberwocky controls with morphemes and sentential prosody, word lists with lexical content but no phrase structure, and backward acoustically matched controls.

I quote: “Mutual information (MI) analysis revealed sensitivity to linguistic content: MI was highest for sentences at the phrasal (0.8–1.1 Hz) and lexical (1.9–2.8 Hz) timescales, suggesting that the delta-band is modulated by lexically driven combinatorial processing beyond prosody, and that linguistic content (i.e., structure and meaning) organizes neural oscillations beyond the timescale and rhythmicity of the stimulus.”

The Jabberwocky sentences, on the other hand, were no different on this measure from the word lists with lexical content but no phrase structure.
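
(For readers who want the measure spelled out: here is a schematic Python sketch of band-limited mutual information between a speech envelope and a brain signal. Everything here is synthetic, and the crude histogram MI estimator stands in for the more sophisticated estimator used in the actual study; only the band definitions come from the quote above.)

import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

fs = 100  # Hz, toy sampling rate

def bandpass(x, lo, hi, fs):
    """Zero-phase band-pass filter."""
    sos = butter(4, [lo, hi], btype="band", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def mi_hist(x, y, bins=16):
    """Crude histogram estimate of mutual information, in bits."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

# Toy signals: a quasi-rhythmic speech amplitude envelope, and an 'EEG'
# channel that partially tracks it plus noise. Entirely synthetic.
rng = np.random.default_rng(0)
t = np.arange(0, 120, 1 / fs)
envelope = np.abs(np.sin(2 * np.pi * 1.0 * t)) + 0.1 * rng.standard_normal(len(t))
eeg = 0.5 * envelope + rng.standard_normal(len(t))

# Compare stimulus and response within the two timescales from the study.
for name, (lo, hi) in {"phrasal": (0.8, 1.1), "lexical": (1.9, 2.8)}.items():
    env_band = np.abs(hilbert(bandpass(envelope, lo, hi, fs)))
    eeg_band = np.abs(hilbert(bandpass(eeg, lo, hi, fs)))
    print(name, "band MI:", round(mi_hist(env_band, eeg_band), 3), "bits")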

One reaction to this kind of disappointing result is to say that syntax is just not really modularizable in the way we thought. This seems to be the position of Blank and Fedorenko (2020) and Mahowald et al. (2022), essentially embracing work in Construction Grammar (Goldberg 1995, Goldberg and Jackendoff 2004).

These authors are also quick to point out that we don’t ‘need syntax’ to understand complex sentences most of the time, since lexical content and real-world knowledge do the job for us. These ‘fake debates’ present us, I think, with a false set of analytic options. Grammar is not all constructions, with lots of rich lexical content interacting with statistical properties of the world; nor is it Super Syntax the heavy lifter (oh wow, recursion), with lexical items a relic of fuzziness that syntax can mold and structure for its creative purposes.

My own interpretation is to say that syntax exists (and is a cool weird thing), but that it needs to feed off content for the whole engine to get rolling. This means that our job as linguists requires us to understand (at least) two things:

(1) What are lexical meanings?

(2) How do they integrate with syntax in a compositional and combinatorial way?

So, we should use Jabberwocky sentences not to erase the lexical item, but as a way of trying to understand it better. All words are nonce before we know them.

The Real Beast: Jabbo Sapiens

This talk is a plea to use Jabberwocky sentences and nonce words to help us understand not the comprehensible residue, but the things they are replacing: content words themselves! These little monsters, these Jabbo Sapiens, turn out to pose loads of hard problems for compositional semantics and for understanding how we communicate with other minds.

One might argue, with Donald Davidson, that building truth theorems is already hard, and good enough, and that it really is not the immediate job of the formal semanticist to elucidate the meanings of the individual lexical concepts snow and white.

The problem with the meanings of open-class lexical items is twofold:

(i) they are conceptually polysemous while still being atomic with respect to how they function within the system, and

(ii) they undergo productive compositional processes with each other.

The latter point shows that understanding their behaviour is an important component of understanding the central properties of the human language system and its powers of productive meaning generation.

The psycholinguistics literature is very clear in showing us that there is a hub, or unity, to the lemma, with a localized point of access. This point of lexical access seems to be in the middle temporal gyrus (MTG) and is independent of whether the sensory input is visual or auditory (Indefrey and Levelt 2004, Hickok and Poeppel 2007, Friederici 2012). Activation in this area can also be tracked using MEG and fMRI. Based on both neurolinguistic and behavioural evidence, we have strong support for the existence of the lemma, which is the lexeme family underlying a symbol and all of its inflectional forms. Specifically, we know that lemma frequency as a whole (not the frequency of individual forms) modulates effects in the 300–450 ms time window in the MTG (Solomyak and Marantz 2010).

This literature is important because it shows that there is a lemma hub for all inflectional forms of the ‘same’ lexeme. But what constitutes ‘sameness’ in this sense? While in practice it is not always easy to decide whether a pair of meanings associated with a form are homonyms or polysemic variants, or what leads learners/speakers to classify them as such, the evidence now seems clear that we can distinguish between cases where there must be two ‘lexical entries’ and cases where there must be one. The cases where we have clear evidence for one lexical entry involve lemmas which characteristically embrace a large number of polysemic variants. Thus, polysemy is sharply distinguished in terms of cognitive consequences from homonymy, or genuine ambiguity, in which two distinct lemmas happen to share the same form. Polysemous readings are bunched together for the purposes of priming. Polysemous meanings are facilitatory in word recognition, while genuine homonyms are inhibitory and cause slowdowns in processing because more alternatives remain active (Rodd et al. 2002 (lexical decision), Beretta et al. 2005 (MEG)).

Jabbo Sapiens and Polysemy in Acquisition

How does a learner decide to group forms heard under the same umbrella lemma, the ‘same lexical entry’ if you will? Both the typological evidence and evidence from developing semantic competence in children show that polysemy is natural and ubiquitous. Novel word learning in children shows generalization across polysemous senses, even when the denotational percepts are quite different (Snedeker and Srinivasan 2014). Children also distinguish clearly between homonymy and polysemy at an early age, before they pass any tests of metalinguistic competence, showing that the difference cannot be metalinguistic, as claimed by Fodor and Lepore (2002) (Srinivasan and Snedeker 2011). Moreover, certain types of polysemy seem not to be idiosyncratically memorized, but are plausibly part of a pervasive conceptual system underlying all languages. For example, the container/containee polysemy was found by Srinivasan et al. (2019) across 14 different languages (see also Zhu and Malt 2014 for crosslinguistic evidence).

The take-home point of this blog post is the following. Current formal semantic and morphosyntactic models fall short in explaining how the symbolic primes of the open-class system are compositionally integrated into sentences. Lexical items are usually relegated to a sealed-off no-man’s-land that is somebody else’s business. But how the two domains interact in practice is never made explicit, and it turns out to be both HARD and IMPORTANT.

References

Asher, N. (2011). Lexical Meaning in Context: A Web of Words. Cambridge: Cambridge University Press.

Beretta, A., R. Fiorentino, and D. Poeppel (2005). The effects of homonymy and polysemy on lexical access: an MEG study. Cognitive Brain Research 24, 57–65.

Blank, I. and E. Fedorenko (2020). No evidence for differences among language regions in their temporal receptive windows. NeuroImage 219, 116925.

Fedorenko, E., P.-J. Hsieh, A. Nieto-Castañón, S. Whitfield-Gabrieli, and N. Kanwisher (2010). New method for fMRI investigations of language: Defining ROIs functionally in individual subjects. Journal of Neurophysiology 104, 1177–1194.

Fodor, J. and E. Lepore (2002). The emptiness of the lexicon: Reflections on Pustejovsky. In The Compositionality Papers, pp. 89–119. Oxford University Press.

Friederici, A. (2012). The cortical language circuit: from auditory perception to sentence comprehension. Trends in Cognitive Sciences 16(5), 262–268.

Goldberg, A. (1995). Constructions: A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.

Goldberg, A. and R. Jackendoff (2004). The English resultative as a family of constructions. Language 80, 532–568.

Hickok, G. and D. Poeppel (2007). The cortical organization of speech processing. Nature Reviews Neuroscience 8(5), 393–402.

Indefrey, P. and W. J. Levelt (2004). The spatial and temporal signatures of word production components. Cognition 92(1-2), 101–144.

Kaufeld, G., H. R. Bosker, P. M. Alday, A. S. Meyer, and A. E. Martin (2020). Structure and meaning “entrain” neural oscillations: a timescale-specific hierarchy. Journal of Neuroscience 40(49), 9467–9475.

Leminen, A., E. Smolka, J. A. Duñabeitia, and C. Pliatsikas (2018). Morphological processing in the brain: the good (inflection), the bad (derivation) and the ugly (compounding). Cortex 116, 4–44.

Srinivasan, M., C. Berner, and H. Rabagliati (2019). Children use polysemy to structure new word meanings. Journal of Experimental Psychology: General 148(5), 926–942.

Marslen-Wilson, W. T., M. Ford, L. Older, and X. Zhou (1996). The combinatorial lexicon: Priming derivational affixes. Proceedings of the Eighteenth Annual Conference of the Cognitive Science Society 18, 223–227.

Marslen-Wilson, W. and L. K. Tyler (2007). Morphology, language and the brain: the decompositional substrate for language comprehension. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences 362(1481), 823–836.

Pustejovsky, J. (1995). The Generative Lexicon. Cambridge, MA: MIT Press.

Rodd, J., G. Gaskell, and W. Marslen-Wilson (2002). Making sense of semantic ambiguity: semantic competition in lexical access. Journal of Memory and Language 46, 245–266.

Sahin, N. T., S. Pinker, S. Cash, D. Schomer, and E. Halgren (2009). Sequential processing of lexical, grammatical, and phonological information within Broca’s area. Science 326(5951), 445–449.

Snedeker, J. and M. Srinivasan (2014). Polysemy and the taxonomic constraint: children’s representation of words that label multiple kinds. Language Learning and Development 10(2), 97–128.

Solomyak, O. and A. Marantz (2010). Evidence for early morphological decomposition in visual word recognition. Journal of Cognitive Neuroscience 22(9), 2042–2057.

Srinivasan, M. and J. Snedeker (2011). Judging a book by its cover and its contents: The representation of polysemous and homophonous meanings in four-year-old children. Cognitive Psychology 62(4), 245–272.

Whiting, C., Y. Shtyrov, and W. Marslen-Wilson (2014). Real-time functional architecture of visual word recognition. Journal of Cognitive Neuroscience 27(2), 246–265.

Zhu, H. and B. Malt (2014). Cross-linguistic evidence for cognitive foundations of polysemy. Cognitive Science 36.

2 thoughts on “Jabberwocky, the Beast that Tames the Beast?”

  1. A short point about Fedorenko et al. 2010 (and much of the work that’s based on it):

    If you’re someone who thinks that there’s no “morphological (de)composition” as distinct from “syntactic (de)composition” (a position shared by Distributed Morphology, Nanosyntax, and others), then their findings are a non sequitur, I think. That’s because all four of the conditions in their blocked design crucially implicate syntax per se. Sentences with actual items obviously implicate syntax; and so do sentences with jabberwocky items. But recognizing whether something is a “word” or a “nonword” often involves attempting so-called morphological decomposition of that item. E.g. to check whether MALVITE, an item from Fedorenko et al.’s 2010 experiment 2, is a “word”, one must presumably check whether there is a monomorphemic entry of that shape, but also whether decomposing into MALV + ITE yields anything intelligible. And if the only cognitive engine for this kind of (de)composition is syntax, then all four conditions they are subtracting from one another involve syntax. They are subtracting syntax from syntax. No wonder they don’t find pure syntax anywhere!

    Note that I’m not asking you to accept that there’s no “morphological (de)composition” as distinct from “syntactic (de)composition”; I’m only pointing out that *if* one holds this position, then the Fedorenko et al. results about jabberwocky are completely uninformative about the issue they purport to investigate.

    (This emerges from conversations I’ve had with Athulya Aravind, who nevertheless should not be held responsible for any mischaracterizations I’ve made here.)


    • Yeah, I basically agree that their claims with respect to syntax (i.e. that it is Not A Thing) are highly tendentious. There are lots of other reasons why the conditions they used should not be expected to selectively show syntax. First of all, the word-list version and the scrambled nonwords version also contain syntactically relevant function words scrambled in, like pronouns and auxiliaries and prepositions! If the brain reacts to even seeing these syntax-relevant words, then syntax is always being triggered a little bit by these conditions. Also, the jabberwocky sentences are likely to provoke some attempt at semantic content inference, and a check in the lexicon for something the word could be a misspelling of, for instance. But it’s still true that, with respect to my point above concerning the lexicon, one would expect the scrambled real sentences and the intact jabberwocky sentences to be different! But they were not. The only time you get clear results is when both lexicon and syntax are engaged. This does not mean that there is no difference between lexicon and syntax, just that (i) simplistic localization in terms of Broca’s area is probably wrong and (ii) you can’t really engage the syntax machinery unless it has some real content to feed off. The latter is my real point in this post. I don’t think it is the same point that Fedorenko has been trying to make recently, e.g. in Blank and Fedorenko 2020.

