Naming, Pronouns, and Sports Journalism

At some point early on in my son’s life (I think he was about 3), I suddenly realised with a horrified shock that I had entirely forgotten to use any actual pronouns in his presence — OMG, of course he needs pronouns! How on earth is he going to learn proper language if I don’t ever use pronouns in front of him? — In my defense, I think I had implicitly intuited that somehow pronouns were deeply complicated, and that his young, developing language capacities were not yet ready for those mysteries; that even innocuous sentences like “I am going to wash my hands now. Do you need to wash your hands, baby?’’ presented deep philosophical problems of authorship, conversation and ego, leading to shifting reference that went way beyond the point-and-shoot index-finger labelling strategy that had been the foundation of our mutual world exploration up until that point. As a retreat from the philosophical abyss I sensed in the yawning maw of the pronoun, I had resorted to referring to myself with the referentially rigid name (in the world of my son) Mamma. (“Mamma’s going to wash her hands now. Does Vidar need to wash hands?’’) That is, up until the aforementioned moment of horrified realisation, when I knew I had to finally put aside such childish things and introduce him to the proper functioning of I, you, he, she, they and us.

Now, many years later1, I continue to be impressed by the flexibility and utility of pronouns, as well as the mysteries of I and you, in contrast with the clunky, wordy literal-mindedness of actual names. But here, I put aside the alluring shiftiness of the pronouns I and you to concentrate on the underappreciated ordinariness of he, she, it and they, which perform important functions as foot soldiers of discourse coherence every day of our lives. To illustrate, imagine a narrative in which I wish to describe the activities of a hypothetical creature, known to us both as Thilo, who is preparing for a summer gathering:

  • (1) “Thilo took a quick shower as the sky darkened in the west, and he briefly thought about what shirt he would wear to dinner.’’  

Both instances of the pronoun he in the previous sentence are unambiguous in this context: they clearly refer to Thilo. Moreover, the pronouns here are highly preferable to simply re-using the name Thilo:

  • (2) “Thilo took a quick shower as the sky darkened in the west, and Thilo briefly thought about what shirt Thilo would wear to dinner.’’ 

which would be awkward and ugly, and would grate on the ear of any English speaker2. The pronoun solution in (1) is the perfect design choice here for a smooth-sounding, transparently comprehensible output. But pronouns are not always perfect either, since the presence of more than one male animate participant in the discourse can lead to uncertainty.

  • (3) “Thilo called up Magnus on the phone as the sky darkened in the west, and he cast doubt on the wisdom of driving into town in the middle of a snow storm, even though it was his birthday.’’

Since Thilo and Magnus are both presumably male (in a language and culture where certain names are associated with certain pronouns by default), it is very unclear who the worrier is in this case (the caller or the callee), and, more interestingly, whose birthday it is anyway.

So in avoiding the unimaginative and hamfisted repetition of proper names, which guarantees comprehension at the expense of art, we rely instead on the reassuringly open-ended place-holder of the pronoun, whose information content is so reduced that repetition is innocuous, but with an increased risk of misunderstanding if our interlocutor is not perfectly attuned to our thoughts. An utterance like the one in (3), in fact, likely occurs in a context where it is perfectly well known to both conversants that it is e.g. Magnus’ birthday (not Thilo’s), and the speaker even has the possibility of subtly finessing the syntactic context to nail down the identity of the `worrier’ by making the second clause non-finite, in which case its subject is unambiguously interpreted as Thilo:

(3’)  “Thilo called up Magnus on the phone as the sky darkened in the west, to cast doubt on the wisdom of driving into town in the middle of a snow storm, even though it was his birthday.’’

So with suitable reliance on a background of shared information, and a little bit of care and creativity in the selection of syntactic constructions, crisis can be averted in most natural language situations where individuals are introduced and tracked across event descriptions. In synchronous, face-to-face auditory communication we have additional help: we often know our conversants, and we share a pool of information grounded in context and mutual knowledge. We also have the extra tools of intonation and gesture to clarify instinctively what we intend. For example, consider the discourse in (4), which uses the potentially ambiguous pronouns `he’ and `him’, which we somehow have to distribute between the named protagonists to figure out who did what to whom.

  • (4) “Thilo pushed Magnus, and then he kicked him.’’

If we utter the sentence neutrally, with a slight leaning on the word `kicked’ (4’), then Thilo is the consistent aggressor here; if we stress both the pronouns and destress `kicked’ (4’’), then we understand that Magnus is the kicker and retaliator, and Thilo is the one being kicked (stress is marked in capitals below).

(4’) “Thilo pushed Magnus, and then he KICKED him.’’  (Thilo = kicker)

(4’’) “Thilo pushed Magnus, and then HE kicked HIM.’’  (Magnus = kicker)

Human users of language are constantly (and instinctively) juggling these different pressures on communication, fine-tuning their output for clarity and efficiency according to the specific toolbox afforded by the words and grammar of the language they are communicating in, combined with the knowledge they have of their interlocutors’ background interests and commitments. It is easy to forget that the lowly pronoun, when it is not out there fighting on the frontlines of the culture wars, is the unassuming, almost contentless linkage device that makes discourses coherent, and that relies for its deployment and comprehension on the language user’s complex inferential abilities and knowledge. Tracking pronoun use in human language is one of the most fascinating windows onto human communication and creativity you can find.

When it comes to the written (published) word, the situation becomes more complex for the communicator, because one can no longer exploit the shared communicative space of a single spatial and temporal context and a single well-understood interlocutor. The published word is asynchronous communication: the producer of the message and its recipient do not operate simultaneously, but often across great distances of time (and indeed space). In addition, the audience for these published written words is not a single recipient whose background knowledge and beliefs are in principle knowable, but a huge, unpredictable audience of readers, united only by the fact that they wish to read the same text. The writer, in addition, by virtue of earning their living from said activity, is under pressure to create a text that is both comprehensible and pleasant (preferably even engaging and exciting) in conveying their message.

All this is perfectly obvious to both writers and linguists, but I spell it out in gory detail now because I have been bemused and befuddled by my recent experiences in newspaper reading, where I find that the writers in sports journalism (yes, that’s the only part of the newspaper I actually read) have been developing creative and enterprising techniques for avoiding pronouns and names altogether, leading to tracts of surreal creativity and, for the uninitiated, almost complete opacity, such as:3

  • (5) “The game between Ipswich Town and Newcastle United got off to a slow start, with no score until the hosts put away a well deserved penalty in stoppage time at the end of the first half. The Tractor Boys are now facing relegation, as the League Cup winners scored twice in the second half to run away with the win.’’

Ok, see what I did there? Instead of repeating Ipswich Town and Newcastle United a couple of times each in the course of two sentences, and avoiding the pronoun they, which could be confusingly ambiguous, the author (me) has sprinkled the text with epithets. These are definite descriptive noun phrases that clearly denote a particular individual or group, either because they are conventionalized nicknames or because they name a property uniquely held by one of the participants in the story. It is a clever way of referring to that participant one more time, unambiguously, without using the standard name label. Devices like the former and the latter are a subtype of this technique, where the property that singles out the referent we intend is simply the property of being the first mentioned, or the second mentioned, of the participants in the mix. But football writers never resort to anything so formal, stilted or obvious as `the former’ or `the latter’. No, that would be cheating. Why make that pedestrian choice, when instead you could use an epithet constructed from a property uniquely distinguishing your participants (in this case Ipswich Town vs. Newcastle United), using obscure in-group knowledge that only an actual football fan will parse transparently?

For full disclosure, I will admit that I have returned to live in England after twenty years of living abroad in Norway, where I absolutely didn’t watch any English football at all (Wait, Ipswich Town is in the Premier League?!), and I had no idea that Ipswich Town were known by the affectionate nickname The Tractor Boys (huh?). I also concede that I should know in advance of a match which team is the home team and which is the away team. But geez! I am catching up after the weekend on a lot of games and I don’t remember off the top of my head whether Ipswich were playing home to Newcastle, or vice versa. Who are the damn hosts? I don’t know, Mr Sports Journalist. You lost me at `Tractor Boys’ and then the `hosts’ thing did nothing to clear it up. Finally, if one were paying attention, as all football aficionados surely were (because the League Cup is a Big Deal), one should know that Newcastle United won it, and are therefore the holders by the time this particular game gets played. So that one is easy, right? Not always. Here are a couple of actual quotes from an online blog reporting on the Women’s Euros final played this summer (Guardian Minute by Minute report, Women’s Euros Final, 27th July 2025):

  • (6) Every time England come forward there seems to be a million Spanish players in defence, that is how good they are at taking up space and pressing the defending champions.   …….
  • (7) Spain are looking to hit England right back and Bronze gets a yellow card for a challenge on Carmona. The world champions have a free kick now.

Texts (6) and (7) occur in quick succession during the course of the Guardian online blog of the match. In (6), the defending champions refers to England, because they won the Euros the last time the tournament was held, back in 2022. In (7), the world champions refers to Spain, because they won the Women’s World Cup in 2023. Unless you are really, really paying attention, or a total women’s football nerd who is not skimming at all, this could slow you down, because the two epithets look so superficially similar. In (6), the epithet the defending champions removes the need for them or a repetition of England. In (7), Spain would be a reasonable and totally natural-sounding choice, so the use of the epithet seems somewhat gratuitous. But epithets aren’t just an avoidance strategy; they also have positive value in conveying information in their own right. Let us go back to Thilo and Magnus for a second. Suppose I say:

  • (8) “Thilo talked to Magnus on the phone yesterday and the bastard didn’t even wish him happy birthday.’’

So the question is, who had the birthday and who is the bastard? Suppose I am talking to you and we both know it was Magnus’ birthday (and I know you know, etc.); then I also know that you will infer that I mean Thilo with the epithet the bastard, even if you don’t possess the information or belief that he is a bastard, or even that I think so. But you do know that I think so after my utterance, because you accommodate that information to make the discourse coherent. So an epithet like this, even if its first and main job is to signal reference to a particular entity in our mental world, also has a secondary function of smuggling in some extra information `on the side’4. In the case of (8), the extra information my hearer gets is my critical stance towards Thilo. In (6), the extra information (or reminder) the reader gets is that England are the defending Euro champs, and so we should feel some extra drama in their fight to win back-to-back finals. In (7), the smuggled-in information is that Spain are the world champions, adding to the drama again because we are reminded that these ladies are the best in the world and will be so hard to beat! So provided reference is successful, the use of the epithet contributes to the semantic richness of the concepts at play in unfolding the story for an audience.

I suspect this is another part of the reason that epithets are so staggeringly prevalent in football journalism. Imagine the sheer tedium of having to write a blow by blow account taking place over the course of a whole 90 minutes, where you have to find more and more creative ways of referring to Ipswich Town and Newcastle while negotiating the constant threat of ambiguity in the pronoun `they’, all the while avoiding the dull, thudding, unimaginative proper name repetition that would be the only obvious alternative5. The nerdy epithet: a somewhat marginal device in face-to-face discourse; unlikely saviour of exciting sports journalism in the written genre. Extra bonus points the nerdier and more creative the journalist can be, and extra extra bonus points for constructing texts that are obscure and off-putting to all but the true tribe who understand all the code required to tell who did what to whom. I’m clearly going to need to up my football game. At least before the new season starts….

  1. In case anyone is worried, I can assure the tenderhearted reader that my son did eventually acquire English perfectly with a full repertoire of pronouns at his disposal (Good save, Mamma!) ↩︎
  2. Linguists debate whether (2) is genuinely ungrammatical in English or just so truly aesthetically awful that no native speaker would ever voluntarily produce it. ↩︎
  3. The following is pure invention, a fake text inspired by my reading experiences. ↩︎
  4. Linguists call this `not-at-issue content’. ↩︎
  5. Again, I think that face to face, people talking to each other find ways of using intonation and de-stressing to mitigate the effects of both repetition and ambiguity, but the sports writer cannot call on these devices. Writing good copy is hard! ↩︎

The Lonely Londoners

One of the bonuses attached to living in England for long periods again is that I get to hang out with the Ramchand side of my family, in the form of the clan that emanates from my brother Michael and his wife Karena and has so far produced four daughters and two granddaughters. So it was appropriate that my first `show’ since moving to take up my job at Oxford was a trek into London to see a performance of The Lonely Londoners at the Kiln Theatre in Kilburn with the family in tow. Especially appropriate since The Lonely Londoners is a dramatization of the book of the same name by Sam Selvon, a Trinidadian writer, written during, and about, West Indian immigration to London in the Windrush era. In fact, our whole family had read the book and knew it well, having also had the privilege of meeting Selvon himself, a friend of my father’s (Kenneth Ramchand, the renowned literary critic whose PhD in West Indian literature in English from the University of Edinburgh was the original pioneering work in the field). For this reason, the Ramchand clan was buzzing with anticipation and expectations for the spectacle we had signed up for. The prognostications were good. And so were the reviews.

The reason for turning our experience into a blog post is the connection to language: the author’s central use of the vernacular in his work, and our own family’s native competence in Trinidadian dialect speech, are main players in this reaction piece. The show was great on many measures: the individual performances; the staging, which combined the words of our beloved text with music, drama and body movement; the way in which the narratives from the different notionally independent short stories in the book were combined to form an organic whole that tells an overarching story. All of these qualities conspired to create an enjoyable afternoon for the Ramchands in London.

We were disappointed, however, to realise early on that there were no actual Trinidadians in the cast who could have been tasked with performing Sam Selvon’s words in their original vernacular musicality. One of the upshots of this missed opportunity is the transformation of what we as a family had experienced as an (at times elegiac) but mostly richly comedic novel into something more seriously dramatic. We recognised certain memorably comedic incidents, moments and characters being played out on the stage, but the result simply wasn’t that funny. We had chuckled and giggled through the book, but now we found ourselves sitting with earnest and engaged attention, without smiles and snorts. I asked around to see whether my co-watchers had had the same reaction: “The book was much funnier,” said my brother. We decided after some discussion that the humour must have rested on the subtle phrasing and timing of the original Trinidadian dialogue, and not on the events in and of themselves, and that threw me into musing about the effects of the oral vernacular and its relation to writing, and how much can get lost when you take something off the written page without having really heard its music in your mind’s ear (if there can be a mind’s eye, then there is surely also a mind’s ear).

As linguists, we are well aware that when the speaker of a language reads written words off a page, they also mentally rehearse the sounds and articulatory gestures that would correspond to producing that text orally, although they do it in a highly temporally compressed fashion, since reading a written text is very much faster than pronouncing it (see Leinenger 2014 for an overview). As a particularly striking effect of this, researchers have noted that readers of a purely written text are actually slowed down, in relative terms, by tongue-twisting words and phonology, even though they are not literally reading out loud (McCutchen and Perfetti 1982). Returning to Selvon’s writing, the non-Trinidadian reader would notice only minor deviations from standard English expression in the written text, and those that exist are mostly found in the explicit quoted dialogue between the characters in the stories. Nevertheless, the text as a whole swims in the rhythms of Trinidadian speech, for those who are attuned and primed to hear it as it was conceived, in their own mind’s ear. This point was made to me early on by my father about Selvon’s work (and West Indian writing in general), but this was the first time I had experienced the effects of it so strongly— in the absence that was created by a performance that had not been tuned in to those frequencies that only we could hear. (The review linked to above does not mention, or notice, the lack of Trinidadian actors in the cast.)

Humour is subtle and complex, and does not easily cross the translation barrier. The effects we were missing were based on expectations of rhythm, and stress, and beat, word choice and timing. Selvon’s prose allowed the flow of thought and emotion of the characters to be transparent to us, and this, combined with Selvon’s ability to draw characters whom we recognized, whose motivations we could easily interpret, produced a much richer landscape of effects (including, in this case, humour, and in some instances also pathos) over and above the simple drama of the factual incidents in their unfolding. Luckily there is much drama and interest in these stories that survives the gutting of the particular oral language medium, and it would have been interesting to have a conversation with the many British non-Trinidadians who were in the theatre that afternoon, to gauge their response to the performance.

I read The Lonely Londoners as a teenager, and it was interesting to revisit the stories and their historical context now as an older person (female, hyper-educated, myself a piece of flotsam resulting from various successive waves of colonialism and diaspora). One of the things I missed the first time around, and which came out strongly in this production, was the extent to which the Windrush men from the Caribbean largely failed to operate in solidarity with the West Indian women of their generation. The camaraderie is strong within the men— this band of brothers with hopes, and dreams, and agency, co-travelers on this path in an alien society that both needs them and rejects their humanity. They share a complex relationship to this society, in which pride, ambition, curiosity and the forging of an identity are all driving motivations. But for them, the women from their own islands who might potentially understand them and support them best are not conceived of as Fellow Travellers in any sense. It was striking to me, in watching this performance, how these women were objects bound up in the larger questions of `leaving or staying’, `keeping, or giving up for something newer’, and are ultimately let down by not being included, even notionally, as co-participants in the same struggle as the men in the stories. (Do we not have hopes, and dreams and agency too? And when you cut us, do we not bleed?) This theme must also be present in Selvon’s original work, even while he simultaneously pours his expressive genius into the male friendships that occupy the bulk of the book. It came out to me more forcefully seeing the performance this time around, and I would like to go back to the book to see how much of that perception comes from the changes in my own mind-eyes and mind-ears, how much came from the artistic direction in this particular production, and how much was sitting there in the work all along, just waiting to be seen by the right reader.

A couple of weeks later, I went on a walking tour in London whose theme was The Clash, and the musical influences on the band from the part of the capital they lived in, an area rich in history from the point of view of the Windrush era in particular. Ladbroke Grove, Notting Hill. I saw the pubs where the earliest punk gigs took place, and the building housing the original Mangrove restaurant that was the focus of the famous police harassment case and eventual Old Bailey trial (the topic of Steve McQueen’s BBC docudrama Mangrove in 2020). As an addendum to my Lonely Londoners play experience, it was very appropriate. I was reminded of how deep these cultural connections run in the capital city of this country— how the reggae music of the West Indians seeped into the punk aesthetic of the London scene, and how, in the best case, creativity and the revolutionary spirit can make allies transcending race and upbringing. From the carnival that flourished here, to the food and the influences on London speech, the lonely Londoners of Selvon’s tales eventually became part of the warp and weft of London. So that when Joe Strummer sings the refrain “London Calling” in 1979, in the now iconic punk ode to London, he does so in a musical idiom intertwined with the rhythms of those original immigrants in the form of reggae and ska. The London that `calls’ contains their voices as well.

LLMs, Generative Grammar, and Why We Need Theory More Than Ever

Are there certain core beliefs of generative grammar that are fatally undermined by the recent successes of LLMs and the unsupervised learning that trains them? Do LLMs then constitute a rival (and superior) `theory’ that can and should take over now from (all) previous theories in pushing the science forward? This short article was commissioned as a response to a target article by Cristiano Chesi in the Italian Journal of Linguistics called `Is it the end of (generative) linguistics as we know it?’. In my response, I argue that the answer to both these questions is No. On the positive side, I make an urgent case for maintaining theory at the centre of the new era of linguistic science, and for generative grammar to expand its energies into theorizing the link between competence and various aspects of performance, in order to shore up its claims to explanatory adequacy.

There’s a Dumpster Fire at the End of the Information Superhighway

In the early nineties, when millions of computers were connected to form the internet, a digital revolution was ushered in, promising the democratization of access to information and leading to metaphors such as the `information superhighway’ for the newly connected interwebs. I was a young graduate student at the time, and one of the exciting things about the internet for me was not in fact the analogy to driving, or rapid information transfer, but the experience of it as a messy, unregulated, anarchic door-opening, almost like the multiverse or a piece of untidy knotted string, which you could follow in your own idiosyncratic way. I know, very GenX of me.

Even though we did not know exactly how it would play out long term, I think we all sensed at the time that our way of storing, accessing and searching for information had changed irrevocably from that moment on. I think we are facing a similar informational tipping point again now, but not perhaps in the way that many folks are imagining. Nobody wants to go back to index cards in physical libraries, but the question is whether our methods for searching for information are now going to make another quantum leap of improvement, made faster and more efficient via posing questions to an interactive chatbot. This is precisely what Big Tech is telling us we need, and they are currently fighting each other to be first to roll out the next generation of search-technology chatbots (Bing vs. Bard). These applications are fed by massive natural language models (from OpenAI) which, because of the trillions of words they are trained on, can generate plausible grammatical responses to our queries and potentially summarize information from that huge pool of networked text. When it comes to pure search functionality, though, there are good reasons to believe that the ways in which people actually search, and the way information-search interacts dynamically and dialectically with other cognitive goals such as learning and exploration, will not all be equally well served by the `ask a bot’ model of communicative interaction. (See this article by Emily Bender and Chirag Shah for a discussion of the issue: https://dl.acm.org/doi/pdf/10.1145/3498366.3505816.)

But I darkly suspect that helping people search for information is not some purely `selfless’ goal that these tech companies are pursuing in the name of progress. In other words, Bing and Bard are not the only uses that OpenAI’s technology is going to be put to. The developers of the natural language models that make ChatGPT possible will sell that technology to others, and it will be modifiable beyond the `constrained’, guardrailed version that underpins ChatGPT itself.

There is no doubt that ChatGPT, the interactive content generator, has taken the world by storm since its launch in November 2022, and its ability to produce plausible and seemingly helpful text has been massively impressive. There’s been hype; there’s been backlash. There have been hard questions asked and performance flaws exposed, leading to fixes and improved guardrails. In this blog post I will summarize some of the major worries that have been aired, and then go on to emphasize what I take to be the most serious threat the technology poses if it is not regulated now. Some of these worries have already appeared on social media and in published sources, which I will try to indicate as I proceed. But the bottom line is going to be a version of my very own dystopian worry, and involves the experiment of thinking consequentially about what will happen to information itself when more and more content-carrying text is handed over to artificial intelligence and dissociated from actual minds. Call it the Semanticist’s Take.

The Robots Are Coming Worry

So maybe you think I am going to go for the chatbots-will-become-sentient-and-try-to-destroy-us worry (think HAL, or the latest behaviour of Bing). Or the more gentle sci-fi version where we potentially embrace new forms of sentience and come to understand and welcome them in our shared cognitive future. But both these scenarios are just a form of the HYPE. No! These chatbots understand nothing. They scrape content produced by actual minds, and regurgitate it in statistically acceptable-sounding forms. They have no notion of truth or `aboutness’, let alone emotion. The fact that they seem to is due to echoes from all the human texts they have consumed, and testimony to our own human response mechanisms, which impute content and feeling, and make the assumption of `another mind’ when faced with language produced for us.

The March of Capitalism Worry

There is an actual real worry here, namely that real Bad Actors (humans working for capitalist organizations who are trying to earn money for their shareholders) will use this technology to continue taking over the world, in the form of controlling and curating creative content (text, images, tunes) and relegating actual humans to poorly paid monitors and editors with no job security or health insurance. But that is more a continuation of our present capitalist dystopian political reality than science fiction woo-hoo.

The Bias and Toxic Content Worry

Here’s another concern that has rightly made the rounds. Because it is cannibalized from human content, chatbot output will repeat and recycle all the racist, misogynistic, toxic and otherwise questionable biases of the humans who created that content. Huge amounts of resources will have to be spent regulating these AI technologies if they are to come equipped with `guardrails’. Even scarier is the thought that many of the purchasers of this technology will not equip their use of it with guardrails. The fact is that this technology has not been created by public funds or non-profit universities, or even governments who are answerable in principle to an electorate. No, these applications have been created by, and are owned by, private companies whose only aim is to make money from them.

Here is Timnit Gebru on why big tech cannot be trusted to regulate itself.

Also Gary Marcus recommending the pause button.

Is It Time to Hit the Pause Button on AI?

ChatGPT Inherently Does Not Know What Information Is

As a semanticist, I regularly have to think about meaning and what it means for something to have meaning. Formal semanticists ground their theories of meaning in some kind of model of reality: to give a theory of meaning in language you cannot simply redescribe it in terms of language itself; there needs to be a reckoning, a final `reality’ check in terms of, well, Reality (or at least the thing(s) we humans apprehend as Reality). Actually, the way I like to think about it is more along the lines of Mark Dingemanse’s statement that language is the best mind-to-mind interface that we have. The important next step is to realise that language does that by anchoring itself in consequences for the shared reality that the two human minds are embedded in. There is an aboutness to language that is a crucial design feature, and theory of mind is one of the cognitive abilities that humans need to decode it. You need to know/assume/understand/trust that there is another mind there apprehending the same common reality as you are, and labeling it in similar ways.
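
To make the grounding idea concrete, here is a minimal sketch (entirely my own toy illustration, reusing the hypothetical Thilo and Magnus from the first post; the model and its extensions are invented): a predication is checked for truth against a model of entities and predicate extensions, not against more text.

```python
# Toy model-theoretic check: truth is computed against a model of reality,
# however small, rather than by recycling other sentences. All values invented.

model = {
    "bastard": {"thilo"},          # the extension of `bastard' in this toy model
    "birthday_haver": {"magnus"},  # who has a birthday today, in this model
}

def holds(predicate, entity):
    """Truth in the model: does the entity fall within the predicate's extension?"""
    return entity in model.get(predicate, set())

print(holds("bastard", "thilo"))   # True in this model
print(holds("bastard", "magnus"))  # False: the model, not more words, decides
```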

Take ChatGPT now. ChatGPT has no Theory of Mind (Gary Marcus again, on testing ChatGPT), and it has no notion of any kind of reality or `aboutness’ to what it is generating. This means that it does not actually understand anything. It has no connection to truth. All it is doing is scraping content in the form of text and generating plausible natural language sentences from its training material. It repeats and recycles, but does not genuinely infer (it is bad at math and reasoning). It also cannot, as a matter of principle, distinguish a fact from a non-fact. It produces false citations and false data unnecessarily and gratuitously, although it most often repeats correct things if that is where its most statistically likely sentence is coming from.

Emily Bender, also a linguist, has been a tireless campaigner against the breathless hype over large language models, since even before the launch of ChatGPT in November. Read her viral article about Stochastic Parrots here.

Ok, so one could imagine building an interactive search engine that was instructed only to summarize, and where, in addition, all the sources were vetted and verified information. However, the technology as we see it now seems to hallucinate content even when it could not possibly have grabbed it from somewhere unreliable. It is unclear to me why the technology does this, or whether it can be fixed. Is it to do with a built-in feature that tells it not to repeat verbatim because of plagiarism risk, or is it due to the kinds of information compression and decompression algorithms being used? Hallucinated content from chatbots means that even if you tell the search engine to only search a particular list of reputable sources, it could still give you erroneous information.

It is apparent to me, and every serious scientist, that we would never use ChatGPT as our search engine for anything we need to find out in our academic field. It is moreover not clear to me, at any rate, that I need my search interface to be in this impressively plausible linguistic form at all. I do not necessarily think, in other words, that universities and libraries should be racing to use, modify, or invent their own versions of Bing or Bard to search scientifically curated content. We know that developing a natural language model on this scale is extremely expensive. The reality is more likely to be that once it has been developed by Microsoft, they will sell it to everyone else, and we will feel that we need it so much that we will rush to buy it.

Who is going to buy the technology? And what are they going to use it for in the future? It is already being used by some companies to generate content for articles in online magazines (leading famously to retractions, when the content was not sufficiently overseen by a human), and by all kinds of folks to write summaries for meetings and presentations etc. It will also no doubt be used to produce advertising texts and disinformation texts, which will run rampant over the internet. We already have a problem with disinformation and unverifiability on the internet, and these problems will increase exponentially, since the present technology is much more believable and also, crucially, automatizable. Not only will the content so produced not be verified, it will be increasingly non-verifiable, since these very helpful chatbots will be the ones you turn to to find out whether the sources check out. As we have seen, ChatGPT regularly and authoritatively spits out totally made-up citations.

One can fondly imagine that some other tech bros will invent software that will detect whether something has been written by AI or not, but it will be a moving target, with so many different versions out there, and next-generation versions that can cleverly outwit the automated checkers in a spiralling arms race of ever-increasing subtlety. That way lies madness.

As more and more people use this technology to generate content, whether with the best of intentions or the worst (and we would be naïve to assume that Microsoft are not going to sell their new toy to anyone who is willing to pay for it), I predict that in the next few years the information highway is going to be more and more littered with content that has been created by artificial intelligence (think plastic as a tempting environmental analogy).

The problem is that this is simply not information any more.

 It is faux-information.

It is content which bears some sort of causal relationship to information, but where the relationship is indirect and untrustworthy.   

What is going to happen when the information superhighway is contaminated with about 5 percent of faux-information? What about when it is 10 percent? 50 percent? What is going to happen when half of the content that ChatGPT is scraping its `information’ from is itself AI generated scraped content? Will the hallucinations start to snowball?
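Here is a toy calculation of how that snowballing might go (entirely my own invented model, with made-up parameters, not a published result): suppose a fraction p of the content pool is replaced by AI-generated material each year, and that AI output inherits the error rate of the pool it scrapes from, plus a fixed hallucination rate h.

```python
# Toy faux-information model (invented parameters, not real data): each year a
# fraction p of the pool is AI-generated, and AI output recycles the pool's
# current error rate and adds its own hallucination rate h on top.

def faux_share(years, p=0.3, h=0.1, start=0.0):
    share = start  # fraction of the pool that is faux-information
    for _ in range(years):
        ai_error = min(1.0, share + h)          # AI inherits pool errors, adds its own
        share = (1 - p) * share + p * ai_error  # new pool: surviving content + AI content
        yield share

for year, s in enumerate(faux_share(10), start=1):
    print(f"year {year}: {s:.0%} faux-information")
# The share compounds: each generation of AI content scrapes from a pool that is
# already more contaminated than the last.
```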

Here’s my prediction. We will lose the small window we have at the moment for governments to regulate, and in five years’ time (maybe more, maybe less) the internet superhighway will be more like something out of a Mad Max movie than a place where you can find information about how to fix your fridge yourself.

AI will have consumed itself and destroyed the very notion of information.

(Well, at least the idea that you can find information on the internet.)

So the problem is NOT: how can we get this great new thing for ourselves and adapt it so that it does the good stuff and not the bad stuff? The problem is what happens when this thing is let out of the box. In five or ten years’ time, how will we be able to distinguish the content from the faux-content from the faux-faux-content, using search applications that also have no idea?

For those of us who watched the dumpster fire that consumed Twitter a couple of months ago, this is going to be similar, and for similar reasons: a wilful lack of regulation, now exacerbated by automated plagiarism generators. Bigger. Maybe slower to unfold. And we are sleepwalking into it.

There is hope for the preservation and advance of human knowledge, at least if publicly funded universities and research institutions band together now to safeguard the content that they currently house (physically and digitally) in the form of libraries. There are two aspects to this: (i) we need to keep making principled decisions about what we allow to be searchable, and (ii) we need to create our own versions of search engines for searching that content. We should not make the mistake of trying to use OpenAI technology to do this, because plausible linguistic interaction or essay-writing ability is not what we need here. We just need slightly better functionality than current indexing systems, otherwise we will lose out to the bots. No need for plausible human interactive language, but a much simpler ability wherein the search interface simply repeats core findings verbatim and shows us the actual citation. Creating this kind of search engine (owned publicly and not by Google or Microsoft) would be far less resource-intensive than employing large language models. And arguably more scientifically useful.
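
To make that design concrete, here is a minimal sketch (my own toy, not a real system) of citation-preserving verbatim retrieval: an inverted index over a curated corpus that returns exact passages together with their sources, rather than generated text. The two indexed passages simply paraphrase the reading-research findings cited earlier in this blog.

```python
# Minimal sketch of non-AI, citation-preserving search (a toy, not a real system).
from collections import defaultdict

# Toy curated corpus: passages paraphrase findings mentioned earlier in this blog.
corpus = {
    "Leinenger 2014": "Readers phonologically recode written words during silent reading.",
    "McCutchen & Perfetti 1982": "Tongue-twisting sentences slow down silent reading.",
}

# Build an inverted index from words to the sources whose passages contain them.
index = defaultdict(set)
for source, passage in corpus.items():
    for word in passage.lower().split():
        index[word.strip(".,")].add(source)

def search(query):
    """Return (citation, verbatim passage) pairs matching every query term."""
    terms = [t.strip(".,").lower() for t in query.split()]
    hits = set(corpus) if terms else set()
    for t in terms:
        hits &= index.get(t, set())
    return [(source, corpus[source]) for source in sorted(hits)]

# The result is the actual passage plus its actual citation: nothing is generated.
print(search("silent reading"))
```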

We need to build good search engines that are NOT Artificial Intelligence machines,  but computer data sifters and organizers designed to aid intelligent human agents.   These search applications need to be publicly owned and maintained and open access.

The only people who are going to have the will or motivation to do this are the public universities, and we may need to work together. Everyone else is compromised (or drinking the Kool-Aid), including many of the world’s governments.

Now I know you are all probably thinking that I am a paranoid, overreacting GenXer who is just yearning for a return to the internet of the nineties. Like every other middle-aged person before me, I am being negative about change, and the past was always golden. ChatGPT is great! We can use it for good.

I really really really hope you guys are right.


Foundations of Extended Projections

We at CASTLFish at UiT were recently thrilled to host a workshop on the Foundations of Extended Projections on October 27-28, 2022. Due to funding cuts, and cuts even to the places where one can apply for basic research funding, it appears that CASTLFish will be very, very poor from now on. To spend the little stash of cash we had left (which must be handed back by the end of 2022), we thought it would be appropriate to hold a conference on extended projection, since Peter Svenonius and I have a fairly well-cited collaborative paper on the topic, and lots of strong opinions!

So two days of fun and stimulation were had by all!  Hour-long talks, lots of discussion, and lots of late night conversations. Just what we like up here in the Arctic as the days are drawing in. We also like controversy, and new radical ideas, and thinking from first principles. We got lots of that as well!

The programme can be found here, where you can read the authors’ own abstracts.


A small in-person workshop with people who are interested in thinking about the same general issue from a variety of different perspectives is a great model for stimulating and productive discussion. The notion of the functional sequence and cartography has been of great theoretical interest over the last couple of decades, although detailed questions of descriptive cartography have not in themselves created much of a buzz. The main interest has been generated by the more controversial positions on the fine-grainedness and universality of the hierarchy of projections. At one extreme, ardent cartographers embrace a highly articulated and specific order of functional projections which forms an innate template for all speakers of human language. At the other extreme, distaste for overly specific representational innateness and universality claims leads syntacticians to essentially discard the whole subfield and concentrate on other topics like Agree, Merge, Locality, or Labeling.

However, in my view, questions of Agree or Merge cannot be usefully discussed if the representational primes of the system are not agreed on. Thus cartography in the mundane sense of just figuring out what the categories and labels active in a particular grammatical system are is an important component of any computational or descriptive claim about that system. This `boring’ descriptive work is often left undone because both camps seem to assume that whatever they have in their list-of-categories box is universal (whether coarse-grained or fine-grained). If it is universal, then the individual syntactician does not need to figure it out on a language-by-language basis; they can just take it off their chosen ideological shelf. But if Ramchand and Svenonius (2014) and Wiltschko (2018) are right, then we cannot in fact take those details for granted.

One of the outcomes of this small workshop was an emerging consensus that the language-particular details are non-trivial, and that the lexical vs. functional distinction, as well as the existence of a particular functional item, must be argued for on language-particular grounds, without the help of a universal semantic template. This is because the notional categories themselves cannot be defined in a non-circular fashion without diacritics for `zone’ (Pietraszko, Ramchand, Tsai): cause, possibility and inception are notional categories that exist at many levels, and in certain languages many verbs can be used both functionally and lexically (Aboh, Pietraszko). I myself argued that conceptual, essential content as enshrined in the lexical symbol (located in declarative memory) is architecturally distinguished in every language from the referential and instantiational information in which it is clothed. This abstract distinction cuts across many of the notional semantic labels that are in common use within cartographic templates.

Other outcomes of the workshop were the beginnings of an investigation into crosslinguistic variation in the kinds of verbs that allow ECM, and whether this can be handled by notions of size or truncation (Wurmbrand). Diercks and Tang presented detailed descriptive work investigating the representation of information structure in Bantu and Chinese respectively. Their proposed solutions convinced me that, with respect to Focus and Givenness, the connection to functional items in the hierarchy of projections is far from obvious. Diercks asked us to believe in countercyclic Merge, which instead prompted a very productive discussion about alternatives. Paul argued that the proposed FOFC language universal really is undermined by very basic constructions in Chinese, and that arguments putting those constructions aside do not work.

Another major feature of the workshop was the willingness of the participants to think from first principles, in fresh ways, about the foundational questions in this domain. I have been growing weary of large conferences where researchers present their work in an environment closely tied to the job market, to the demonstration of professional skills and talents, to competition for air time, and to hyper-sensitivity towards market forces and what is currently trendy in our field. We unfortunately inhabit an academic space where this has become a necessary feature of professional meetings: the narrowing of jobs and resources, and the commodification of academia, have led to a hypercompetitive environment, and to lots of stress and burnout. Our small workshop was a refreshing change from that other kind of conference, and one which all of our attendees appreciated, across a widely diverse speaker group ranging from the well-seasoned to early-career researchers. Many of our speakers expressed the idea that they were going to `say something controversial or crazy’, or `try something new’ (Wurmbrand, Diercks, Pietraszko). Adger told us about his new mereological foundations for phrase structure, as an alternative to the set-based metaphor. The mereological algebra, he argued, was better suited to the part/whole relationships we build through hierarchies. Svenonius showed how we could model the hierarchical orderings of the extended projection, with all its gaps and repetitions and language-specific detail, using a finite state machine (a toy version of the general idea is sketched below). Zhang speculated about what would happen if we countenanced the existence of functional items without category.
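
For the curious, here is a minimal sketch of that finite-state idea (entirely my own toy illustration, not Svenonius’s actual formalism; the spine C > T > Asp > v > V is just a placeholder): a machine that accepts top-down orderings of functional heads, permitting gaps and repetitions but no reversals.

```python
# Toy finite state machine over a hypothetical clausal spine (highest head first).
# States are positions in the hierarchy; a licit extended projection may skip
# heads (gaps) or repeat one (loops), but may never climb back up the spine.

HIERARCHY = ["C", "T", "Asp", "v", "V"]  # placeholder spine, not a theoretical claim

def is_licit_projection(heads):
    """Accept a top-down sequence of heads iff it is monotonically non-ascending
    in the hierarchy and bottoms out in the lexical head V."""
    last = -1  # index in HIERARCHY of the most recent head seen
    for head in heads:
        if head not in HIERARCHY:
            return False
        i = HIERARCHY.index(head)
        if i < last:      # moving back up the spine: illicit
            return False
        last = i          # staying put (repetition) or jumping down (gap) is fine
    return bool(heads) and heads[-1] == "V"

assert is_licit_projection(["C", "T", "v", "V"])       # gap: Asp is skipped
assert is_licit_projection(["T", "Asp", "Asp", "V"])   # repetition of Asp
assert not is_licit_projection(["T", "C", "v", "V"])   # C above T reversed: illicit
```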

All in all, I feel grateful that CASTL had the luxury and privilege to host such an event, and to pay for all accepted papers to attend. We, and our own students, could witness linguists describing, explaining, arguing, and generally doing what they do best: trying to figure stuff out.

SALT32 in Mexico City: Some Thoughts on Linguistic Diversity

Flying in to Mexico City for one of the world’s most exciting annual formal semantics conferences (Semantics and Linguistic Theory, SALT32)— it does not get much better than this, especially after more than two years of digital conference participation. I was not disappointed. The conference delegates stayed mostly in Coyoacan, one of the older suburbs on the edge of the city, with a rich cultural history. It was a wonderful, lively place, with lots of restaurants and bars, and one felt completely safe walking around the neighbourhoods both in daylight and darkness. The weather was warm, with the occasional dramatic thunderstorm (see Sant and Ramchand’s poster on occasional here). The food and drink were wonderful, and the people were warm and friendly. Many, many thanks to the organizers at El Colegio de México for moving heaven and earth to get this to work so well in a hybrid format, and for being such gracious hosts.

As far as I know, this was the first time that SALT was held outside of the United States, and the first time in a country where the local language was not English. It was therefore appropriate that the conference played host to a special SALTED workshop on Prestige English as an Object and Meta Language. The invited speakers were Enoch Aboh, Donka Farkas, Carol Rose Little and Andrés Saab. We can all acknowledge that English has emerged as the dominant language for dissemination in our field, and that there are indeed some advantages to having a common language of science. However, the situation does present extra hurdles for linguists whose native language is not English: they have to write and present in a language that is not simply a transparent conduit to thought, but whose comprehension and production is an `extra thing to do’.

When it comes to the choice of object language, all would also agree that more diversity in the object languages being studied semantically is something we should work towards. Diversity in object language has certainly increased over the past few decades, but the situation is still rather skewed. Enoch Aboh’s position was that we as linguists need to work harder to train native speaker linguists in the understudied languages of the world. Especially when it comes to semantics, there are nuances and insights that are simply not available to the non-native speaker, and we desperately need a more diverse set of linguists to be working on a more diverse set of languages. There are also challenges in teaching formal semantics to students whose native languages are not English, because of the lack of teaching materials for semantics in those languages. This is true even for Spanish: Andrés Saab and Carol Rose Little both discussed their recent experiences in teaching beginning formal semantics, and the pedagogical tools that were simply not available to them. While I am here, I will note that in response to this challenge, Andrés Saab and Fernando Carranza have come up with a textbook on formal semantics in Spanish, which you can download from this lingbuzz link: https://ling.auf.net/lingbuzz/005205

We certainly need more textbooks in languages other than English. Michel DeGraff, in the context of Haitian Creole, has pointed to research showing that children learn formal topics like mathematics much better in their own native creole than in the formal French of normal school instruction. If we want to train new generations of formal semanticists who can contribute to sorely needed crosslinguistic research, we need to start by diversifying the language of the teaching tools available in this area. Donka Farkas raised the important point that even in English-language settings, formal semantics instruction would benefit from a diversification of the languages chosen to exemplify the theory. There is enough work around these days to do so in nearly all domains. She gave some examples, but most of us can think of a few, and the field would benefit a lot from pooling resources on this.

With respect to our current conference, SALT32, we can take a look at the spread of languages chosen as the object language for formal semantic study. In the counts below, the first number represents talks in which no substantial data is introduced from a language other than English, and the second number is the count of talks in which data was presented for analysis from at least one non-English language.

Main Session Talks, English vs. Other: 7 vs 8

Short Talks, English vs. Other: 21 vs 14

So we see that, at least with respect to the object language, we are currently hovering at around 60 percent English focus (the quick computation below makes this precise). I note in passing that this is still a better diversity level than what I found at ELM last month (Experiments in Linguistic Meaning).
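
A quick sanity check of that figure, just arithmetic over the talk counts reported above:

```python
# Back-of-the-envelope check of the "about 60 percent" English-focus figure.
english_main, other_main = 7, 8      # main session talks
english_short, other_short = 21, 14  # short talks

total = english_main + other_main + english_short + other_short
english = english_main + english_short
print(f"{english}/{total} = {english / total:.0%} of talks were English-focused")
# 28/50 = 56%, i.e. roughly the 60 percent noted above
```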

The other languages in evidence as object language: Spanish, Russian, German, Dutch, Japanese, Mandarin, Cantonese, Italian, Hindi/Urdu, Finnish, Djambarruyngu, Uzbek, Farsi, Amahuana, A’ingae, ASL, French Sign Language, Italian Sign Language and Sign Language of the Netherlands. Ch’ol also showed up in Little’s invited talk, and Andrés Saab’s invited talk focused on Romance.

With respect to the meta language, it will not surprise my readers to know that all of the talks were given in English. One short talk, in addition, was presented in parallel in sign (https://osf.io/wxn56/), and most of the recorded poster presentation videos had captioning in English for the hearing impaired. While there happened to be no hearing-impaired attendees in the in-person audience, there were many in the audience for whom English was a second or third language. It struck me that at a conference taking place in Mexico, subtitles in Spanish for all talks would have been a relatively inexpensive thing to provide, given current technology. In talking with the organizers, it was pointed out to me that local students, while pretty good at English (better than I am at Spanish), still struggle with fast in-person speech in English in many contexts. It would massively facilitate uptake of this highly technical formal content if there were subtitles in Spanish (or even in English) for in-person talks. It also seems like it should be an option for researchers to present in their native language, especially in this case Spanish, and simply lay on English subtitles for the English-speaking participants who happen to be Spanish-deaf. After all, keeping English as the language of science in publication does not need to mean monolinguality; it is also compatible with multilingualism in broader settings. It seems to me that allowing deviations from the norm whereby everybody is forced to wield an awkward English at the same time as presenting their new research would have the advantage of allowing non-native English speakers to feel more relaxed and expressive, and also the advantage of undermining the monolithic status of English and a kind of experienced monolingualism. The current situation also somehow contributes to the impression that English is the clear, logical, rational language of science, while other people’s languages are just there to be studied.

So, people, what do we think? Shouldn’t we allow non-English presentations at SALT, NELS, GLOW and WCCFL? After all, if Eurovision can do it…….

Here is my picture of super delicious taco sauces as a metaphor for linguistic diversity

Experiments in Linguistic Meaning ELM2

As the 2nd edition of Experiments in Linguistic Meaning wraps up, it is worthwhile thinking about the future of the forum. What research strands and issues were prominent in this second edition of the conference, and what do we want from it in the future? Will there be future ELMs, and if so, what will their remit and focus be?

First of all, thanks to Anna Papafragou and Florian Schwarz for the initiative, and to all the folks at UPenn for hosting one of the very first in-person (hybrid) conferences in ages. Hybrid is more work than a digital and an in-person conference combined, but ELM2 was committed to making it work. There were also many, committed to the idea of this new themed conference, who made the trip from across the US and even from Europe to attend in person. An equal number participated virtually. I for one thoroughly enjoyed myself: all the papers were interesting to me, and I had many stimulating and fun conversations over the course of the three days.

There were 97 papers on show at the conference, of which 21 were main session long talks. There was one panel on computational semantics (3 talks) and three additional invited speakers. The remaining 70 were short talks in parallel sessions. In terms of the topics covered, there was quite a spread, ranging from quantifier scope to sarcasm and expressive words to computational modeling (see here for a full list of presentations and abstracts). Having said that, there were some clear clusters reflecting certain centres of gravity for research attracted to this conference: a full third of all papers mentioned implicature, presupposition or context in their titles or keywords, and a further dozen or so made reference to discourse and/or logical connectives. This shows that, as at non-experimental specialist conferences in semantics, research seems to be most focused on intersentential meaning and inferencing. A further mini-area that was well represented at ELM was event cognition, telicity, causality and tense interpretation. This broad area had about 15 hits (I’m not complaining!), most likely due to Anna Papafragou’s indirect influence on submissions to this conference.

In terms of methodologies used, the dominant experiments were behavioural offline tasks, albeit spanning a wide range, from truth/felicity judgements to matching pictures with sentences to language production. In a handful of cases, people's behavioural measures were assessed against computational models. There were very few online measures (4 eye tracking papers, of which 3 were visual world and one was eye tracking while reading, one pupillometry study, and one EEG study). There was virtually no neurolinguistics, despite the N400 being the world's most famous evoked potential within EEG.

When it came to diversity, the languages under the experimental spotlight were extremely restricted. Apart from one or two studies taking another European language as their empirical ground (German, Spanish, Russian, Norwegian), and a couple looking at signed languages, the vast majority of the papers were based on data from English. Not only that, but the research questions and claims themselves were most often broad and universalistic, by which I mean that they did not depend in a deep way on the actual language being studied, as opposed to being crosslinguistic or comparative (the exception being the sign language papers, which explicitly engaged with the question of different modalities of expression). It seemed to me that there was a lower rate of language diversity here than in either the standard kind of semantics conference or the standard psycholinguistics conference (certainly the former).

So does the world need a conference on experiments in linguistic meaning? I think in principle the answer is Yes. Experiments are still a minority at specialist semantics/pragmatics conferences, and semantics/pragmatics is still in the minority at language processing conferences. It strikes me that there are probably many people interested in overarching questions pertaining to meaning and human cognition, and that we would all benefit from being able to share results and methodologies across paradigms. It is worth being explicit about the big picture questions that motivate the potential future ELM goer:

  1. Investigating how linguistically specific semantic categories match up to the categories of domain general concepts, or are constrained by other properties of mind/brain.
  2. Understanding the logic and flow of human reasoning in context.
  3. Modeling detailed human judgements of truth, felicity and message conveyed, by means of mathematical modeling or the training of neural nets, with a view to understanding the human judgements themselves.
  4. Understanding how semantics gets learned by both young humans and computers (again with a dominant interest in understanding the former).
  5. Investigating the  correlates of meaning and meaning composition in actual human brains.

The five categories above are neither exhaustive nor mutually exclusive, but represent a broad swathe of different kinds of research that do not always show up at the same conferences. The umbrella concern with meaning and meaning making is the major justification for having all of these kinds of papers presented under the same roof, allowing researchers in one of these speciality areas to benefit from the insights of the others, assuming that there are crossovers and synergies that are relevant here.

In order to make this work, I think the organizers of future ELMs need to continue their policy of inviting panels in specific areas, and invited speakers with varied kinds of expertise. As we researchers settle in under this particular umbrella, we need to get used to learning from adjacent methodologies and research questions when it comes to semantics. What we want is not just for various subsets of talks from other conferences to show up here year after year, but for the topic of meaning to grow interconnections across this web of research paradigms. The hope is also that the conference will get more diverse with respect to these parameters as it goes forward, and that we will begin to see the payoffs from getting insight into each other's work.

For me specifically, I would love to see ELM become a place where neurolinguistics also takes its seat at the table, and where crosslinguistic semantics is more systematically explored.

Experiments in Linguistic Meaning (ELM2), Day 1

Can Neural Nets do Meaning?

The pandemic has been hard on many of us. It has been a long time since I traveled to an in person conference or blogged about my experiences. The plan is to create a blog post for each of the three days, but let's see – I am a little out of practice. Today I concentrate on the invited panel on computational semantics. There were other talks in the main session today, but they will have to wait for another blog post.

The day started with a panel discussion on computational semantics. See the listing on the programme here. The three invited speakers, it turned out, had different research goals, which was interesting, and I wonder how representative that is of the field. The question I posed to the panel after their (all very) interesting talks was whether they considered themselves to be pursuing the goal of making the performance of computers on language related tasks better, because it would lead to better functionality in various applications, or whether they were interested in modeling meaning tasks computationally in order to understand the human mind better. Marie-Catherine de Marneffe said she was unequivocally in the former camp, Aaron White in the latter, while Ellie Pavlick was somewhere in transition: she started off more interested in the former kind of problem but has been getting increasingly interested in the latter.

De Marneffe was interested in getting computers to perform in a human like way with respect to judgements about speaker commitment to the truth of certain embedded propositions. As is well known, the new deep learning systems, trained on mountains of data (available for languages like English), end up doing stunningly well on standard benchmarks for performance. The speaker commitment judgement is no different: performance is strikingly good. The neural network is given simple parametric information about the lexical embedding verb (whether it is factive, or whether it lexically entails speaker commitment in principle), but it also gets exposed to the distributional data, since linguistic context, such as the presence of negation and other embeddings, is necessary to make the judgements in question. It turns out that these kinds of neural networks perform extremely well, for example, on neg raising contexts, generating human equivalent judgements for sentences like

I don’t think he is coming.

However, there are a few kinds of sentence where the neural networks fail spectacularly. These are instructive. Two examples from the talk are given below; in each, there is an embedded clause for which the speaker commitment judgement fails.

(1) I have made many staff plans in my life and I do not believe I am being boastful if I say that very few of them needed amendment.

(2) I was convinced that they would fetch up at the house, but it appears that I was mistaken.

De Marneffe pointed out these examples and speculated that the problem for the neural nets is pragmatics and/or real world knowledge. (2) is striking because even the smallest, most ignorant child would get this one right, so it seems to show that whatever the neural net is doing, it really is not doing anything remotely human like. Maybe having a real embodied life and connections to truth in the world is necessary to fix (2). But the problem with (1) seems to me to be not so much about pragmatics as about embedding and hierarchical structure, which the neural net simply is not tracking or using as part of its calculation. Personally, I think the `problem' with pragmatics, in terms of inferential strategies, is overstated. I am pretty sure you can teach neural nets some inferential algorithms, but compositional structure and real grounding for meaning both seem to be the real sticking points. We only see this, though, in cases where the linear distance cooccurrence data is uninformative about the actual meaning. It is sobering to notice how seldom those cases actually come up, and how often the simplistic heuristic delivers as a proxy for the more complex reality. How worried you are about the existence of these examples really depends on which of the two issues outlined above you are trying to solve.
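
The flavour of this kind of probing is easy to reproduce at home. Below is a minimal sketch, and emphatically not De Marneffe's actual system: it feeds a neg raising premise and example (2) to an off-the-shelf natural language inference model (roberta-large-mnli), treating the entailment probability as a rough proxy for a speaker commitment judgement.

```python
# Sketch: probing an off-the-shelf NLI model for speaker-commitment-like
# judgements. Not De Marneffe's system; the premise/hypothesis pairs are
# illustrative (the second pair is example (2) from the talk).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "roberta-large-mnli"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

pairs = [
    # Neg raising: humans read the premise as committing the speaker
    # to the hypothesis.
    ("I don't think he is coming.", "He is not coming."),
    # Commitment to the embedded clause is cancelled by later material.
    ("I was convinced that they would fetch up at the house, "
     "but it appears that I was mistaken.",
     "They fetched up at the house."),
]

for premise, hypothesis in pairs:
    inputs = tok(premise, hypothesis, return_tensors="pt")
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1).squeeze()
    labels = model.config.id2label  # 0: CONTRADICTION, 1: NEUTRAL, 2: ENTAILMENT
    scores = {labels[i]: round(float(p), 3) for i, p in enumerate(probs)}
    print(f"{premise!r} => {hypothesis!r}: {scores}")
```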

With regard to Being in the World, Ellie Pavlick presented her work on trying to teach meaning grounding to neural nets, as a way of probing whether training on the physical properties of events denoted by motion verbs would help them acquire the right behaviours and underlying representations. The evidence seems to be that modest gains in performance are indeed possible in certain domains based on this kind of training. But here one wonders whether we can follow up those gains in all other domains without fully recreating the learning environment of the child in all its gory and glorious detail. The reductio of this approach would be a situation where you require so much data and nuance that the training set would be impossible to construct short of birthing your own small human and nurturing it in the world for five years. As Ellie rightly pointed out in discussion, however, the great advantage and excitement of being able to program and manipulate these neural nets lies in the controlled experiments you can do on the information you feed them, and in the way you can selectively interrogate the representations of a successful model to try to come up with a decomposition of a complex effect, which might in the end be relevant to understanding the cognitive decomposition of the effect in humans.

Aaron White's talk was on an experiment in training a neural net to match acceptability ratings, leading to the induction of a type structure for different constructions. The basic model was a combinatory categorial grammar with standard basic types and modes of combination. The intermediate interchange format was vector space representations, which are flexible and do not require prejudging the syntax or the compositional semantics. The point of the training is to see what gets induced when you try to create a system that best predicts the behavioural data. The test case presented was clausal embedding, and peering under the hood afterwards, we can ask what kinds of `types' were assigned to clausal complements of different varieties and with different embedding verbs. The types induced for clausal complements were very varied and not always comprehensible. Some seemed to make sense if you were thinking in Inquisitive Semantics terms, but others were harder to motivate. In the end, it seems like the job of interpreting why the model came up with what it did is as hard as the original problem, and moreover bears an ill understood and equally complicated relationship to the original problem of how humans `do' meaning composition. There are a lot of details here that I clearly do not understand.
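
For readers who want the CCG backbone made concrete, here is a toy sketch of the `standard basic types and modes of combination' (mine, for illustration only; White's system replaces symbolic categories like these with learned vector space representations). A transitive verb of category (S\NP)/NP consumes its arguments in turn by forward and backward application.

```python
# Toy CCG sketch (not Aaron White's model): categories and application.
from dataclasses import dataclass

@dataclass(frozen=True)
class Cat:
    result: object = None   # category produced after combining
    arg: object = None      # category consumed
    slash: str = ""         # "/" seeks its argument rightward, "\\" leftward
    atom: str = ""          # atomic categories: "S", "NP", ...

    def __str__(self):
        return self.atom or f"({self.result}{self.slash}{self.arg})"

S, NP = Cat(atom="S"), Cat(atom="NP")
TV = Cat(result=Cat(result=S, arg=NP, slash="\\"), arg=NP, slash="/")  # (S\NP)/NP

def apply(left: Cat, right: Cat) -> Cat:
    """Forward application X/Y Y => X; backward application Y X\\Y => X."""
    if left.slash == "/" and left.arg == right:
        return left.result
    if right.slash == "\\" and right.arg == left:
        return right.result
    raise ValueError(f"cannot combine {left} with {right}")

# "Kim glorphed Sandy": NP  (S\NP)/NP  NP  =>  NP  (S\NP)  =>  S
vp = apply(TV, NP)
print(vp, apply(NP, vp))  # prints: (S\NP) S
```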

All in all, it was a fascinating panel, raising a lot of big picture issues in my own mind. But I come away with the suspicion that while BERT and his descendants are getting better and better at performing, their success is the equivalent of getting the answer 42 to the meaning of Life, the Universe and Everything. It still does not help if we don't know what exactly their version of the question was.

Jabberwocky, the Beast that Tames the Beast?

Here is a short blog version (without slides) of a talk I gave at the recent Jabberwocky workshop hosted jointly by UMass Amherst and the University of Bucharest (thank you Camelia Bleotu and Deborah Foucault for a great initiative!). The ideas in this talk were rather non-standard and I suspect rather unpopular, but the concept was interesting and it was a great group of people to potentially interact with. Unfortunately, the time zone and weekend timing of the workshop did not allow me to participate as fully as I would have liked, so I am airing those ideas here on this blog just in case someone is interested.

Jabberwocky sentences are syntactically well formed sentences with nonsense content words, like this one I just made up: She didn't glorph their lividar.

If you are a syntactician, the nonce words here are a clever way to eliminate the effect of real lexical items and conceptual content, and to zero in on the combinatorial processes which underlie sentential structure and generativity. The very fact that we can make these sentences seems to show that this aspect of language is distinct and modularizable away from the Lexicon per se. It is good to be able to abstract away from contentful lexical items (LIs) in a variety of methodologies, because controlling for frequency, semantic prediction, association etc. can be hard. From the point of view of the syntactician, Jabberwocky sentences seem to offer a way of surgically removing the messy bits and targeting pure syntax.
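
The recipe for such stimuli is mechanical enough to script: keep the functional skeleton, and swap each content word for a phonotactically plausible nonce stem. Here is a throwaway sketch; the nonce inventory and the template are invented for illustration.

```python
# Throwaway sketch: generate jabberwocky sentences by keeping the
# functional skeleton and filling POS slots with nonce content stems.
import random

NONCE_STEMS = {
    "N": ["lividar", "brop", "traskel", "mune"],
    "V": ["glorph", "strimble", "quade", "blick"],
    "A": ["slithy", "vorpal", "brillig", "manxome"],
}

# Functional skeleton with part-of-speech slots for content words.
TEMPLATE = "The {A} {N} didn't {V} their {N}s."

def jabberwocky(template: str = TEMPLATE) -> str:
    """Fill each POS slot with an independently drawn nonce stem."""
    result = ""
    for chunk in template.split("{"):
        if "}" in chunk:
            pos, rest = chunk.split("}", 1)
            result += random.choice(NONCE_STEMS[pos]) + rest
        else:
            result += chunk
    return result

print(jabberwocky())  # e.g. "The vorpal brop didn't glorph their munes."
```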

So the lexicon is hard, but in modern Chomskian views of grammar, the Lexicon is also the boring bit, where memorized chunks exist but where no generative processes reside. This is taken to extremes in the Distributed Morphology tradition, where roots are devoid even of the syntactic information that would tell you how to insert them in a sentence. The formal semanticists tend to concur: in that tradition we prove theorems of the form 'Snow is white' is TRUE iff snow is white (Davidson 1967), where the contentful lexical items are simply repeated in the metalanguage, languishing there for someone else (presumably NOT the formal semanticist) to elucidate.

However, there are some reasons to be a little suspicious of the possibility of excising the LI in a clean modular fashion.

Jabberwocky and fMRI

Fedorenko et al. (2010) develop a localizer task to help define regions of interest (ROIs) for linguistic experiments using fMRI. They use four conditions:

1. Sentences (the Sentences condition)

2. Scrambled sentences (the Word List condition)

3. Jabberwocky sentences (the Jabberwocky condition)

4. Scrambled Jabberwocky sentences (the Non-words condition)

Sentences > Non-words showed the language regions. Words and Jabberwocky both showed intermediate activation of the sentence regions, but could not be reliably distinguished from each other. Words > Non-words and Jabberwocky > Non-words showed 'inconsistent and variable results across subjects'. This is disappointing if we think that jabberwocky sentences should show the brain doing its pure syntactic thing.

Jabberwocky Sentences and Neural Oscillations

There has been recent work in neurolinguistics exploring the idea that the processing of hierarchical linguistic structure is correlated with the synchronization of brain rhythms in various frequency bands. Kaufeld et al. (2020) recorded EEG while 29 adult native speakers (22 women, 7 men) listened to naturally spoken Dutch sentences, jabberwocky controls with morphemes and sentential prosody, word lists with lexical content but no phrase structure, and backward acoustically matched controls.

I quote: "Mutual information (MI) analysis revealed sensitivity to linguistic content: MI was highest for sentences at the phrasal (0.8–1.1 Hz) and lexical (1.9–2.8 Hz) timescales, suggesting that the delta-band is modulated by lexically driven combinatorial processing beyond prosody, and that linguistic content (i.e., structure and meaning) organizes neural oscillations beyond the timescale and rhythmicity of the stimulus."

The jabberwocky sentences, on the other hand, were no different on this measure from the word lists with lexical content and no phrase structure.
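
For concreteness, the shape of such an analysis can be sketched in a few lines: band-limit both the speech envelope and the EEG to one of the quoted timescales, take amplitude envelopes, and estimate the mutual information between them. The sketch below follows that logic but is not Kaufeld et al.'s actual pipeline; it uses synthetic stand-in signals and my own parameter choices.

```python
# Sketch of a Kaufeld-style MI analysis at the "phrasal" (0.8-1.1 Hz)
# timescale. Synthetic signals stand in for real speech/EEG recordings.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert
from sklearn.feature_selection import mutual_info_regression

fs = 250                          # sampling rate in Hz (an assumption)
t = np.arange(0, 60, 1 / fs)      # one minute of signal
rng = np.random.default_rng(0)

# Stand-ins: a shared slow "phrasal" rhythm buried in noise.
phrase_rhythm = np.sin(2 * np.pi * 0.9 * t)
speech_envelope = phrase_rhythm + 0.5 * rng.standard_normal(t.size)
eeg = 0.8 * phrase_rhythm + rng.standard_normal(t.size)

def band_envelope(x, lo, hi, fs, order=4):
    """Zero-phase band-pass filter, then Hilbert amplitude envelope."""
    sos = butter(order, [lo, hi], btype="band", fs=fs, output="sos")
    return np.abs(hilbert(sosfiltfilt(sos, x)))

env_speech = band_envelope(speech_envelope, 0.8, 1.1, fs)
env_eeg = band_envelope(eeg, 0.8, 1.1, fs)

mi = mutual_info_regression(env_speech.reshape(-1, 1), env_eeg)[0]
print(f"MI at the phrasal timescale: {mi:.3f} nats")
```

With real data one would compare this number across the sentence, jabberwocky and word list conditions; the quoted result is that only the sentences push it up.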

One reaction to this kind of disappointing result is to say that syntax is just not really modularizable in the way we thought. This seems to be the position of Blank and Fedorenko (2020) and Mahowald et al. (2022), essentially embracing work in Construction Grammar (Goldberg 1995, Goldberg and Jackendoff 2004).

These authors are also quick to point out that we don't 'need syntax' to understand complex sentences most of the time, since lexical content and real world knowledge do the job for us. These 'fake debates' present us, I think, with a false set of analytic options. Grammar is not all constructions with lots of rich lexical content interacting with statistical properties of the world, nor is it Super Syntax the heavy lifter (oh wow, recursion) with lexical items a relic of fuzziness that syntax can mold and structure for its creative purposes.

My own interpretation is to say that syntax exists (and is a cool, weird thing), but that it needs to feed off content for the whole engine to get rolling. This means that our job as linguists requires us to understand (at least) two things:

(1) What are lexical meanings?

(2) How do they integrate with syntax in a compositional and combinatorial way?

So, we should use Jabberwocky sentences not to erase the lexical item, but as a way of trying to understand it better. All words are nonce before we know them.

The Real Beast: Jabbo Sapiens

This talk is a plea to use Jabberwocky sentences and nonce words to help us understand not the comprehensible residue, but the things they are replacing: content words themselves! These little monsters, these Jabbo Sapiens, turn out to pose loads of hard problems for compositional semantics and for understanding how we communicate with other minds.

One might argue, with Donald Davidson, that building truth theorems is already hard, and good enough, and that it really is not the immediate job of the formal semanticist to elucidate the meanings of the individual lexical concepts snow and white.

The problem with the meanings of open class lexical items is twofold:

(i) they are conceptually polysemous while still being atomic with respect to how they function within the system, and

(ii) they undergo productive compositional processes with each other.

The latter point shows that understanding their behaviour is an important component of understanding the central properties of the human language system and its powers of productive meaning generation.

The psycholinguistics literature is very clear in showing us that there is a hub, or unity, to the lemma, with a localized point of access. This point of lexical access seems to be in the middle temporal gyrus (MTG) and is independent of whether the sensory input is visual or auditory (Indefrey and Levelt 2004, Hickok and Poeppel 2007, Friederici 2012). Activation in this area can also be tracked using MEG and fMRI. Based on both neurolinguistic and behavioural evidence, we have strong support for the existence of the lemma, the lexeme family underlying a symbol and all of its inflectional forms. Specifically, we know that lemma frequency as a whole (not the frequency of individual forms) modulates effects in the 300–450 ms time window in the MTG (Solomyak and Marantz 2010).

This literature is important because it shows that there is a lemma hub for all inflectional forms of the 'same' lexeme. But what constitutes 'sameness' in this sense? While in practice it is not always easy to decide whether a pair of meanings associated with a form are homonyms or polysemic variants, or what leads learners/speakers to classify them as such, the evidence now seems clear that we can distinguish between cases where there must be two 'lexical entries' and cases where there must be one. The cases where we have clear evidence for one lexical entry involve lemmas which characteristically embrace a large number of polysemic variants. Thus, polysemy is sharply distinguished in terms of cognitive consequences from homonymy, or genuine ambiguity, in which two distinct lemmas happen to share the same form. Polysemous readings are bunched together for the purposes of priming. Polysemous meanings are facilitatory in word recognition, while genuine homonyms are inhibitory and cause slowdowns in processing because more alternatives remain active (Rodd et al. 2002 (lexical decision), Beretta et al. 2005 (MEG)).
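
That behavioural contrast is easy to picture as an analysis. Below is a schematic sketch with invented reaction times (the real designs in Rodd et al. 2002 are considerably more careful): polysemous sense-mates should speed lexical decision relative to a neutral baseline, while homonym mates should slow it down.

```python
# Schematic sketch of the facilitation/inhibition contrast in lexical
# decision. The RTs are invented for illustration; see Rodd et al. (2002)
# for the real design and materials.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical per-trial RTs in ms for three priming conditions.
baseline = rng.normal(600, 40, size=40)   # unrelated prime
polysemy = rng.normal(570, 40, size=40)   # sense-mate prime: facilitation
homonymy = rng.normal(640, 40, size=40)   # homonym prime: inhibition

for label, cond in [("polysemy", polysemy), ("homonymy", homonymy)]:
    tval, p = stats.ttest_ind(cond, baseline)
    direction = "faster" if cond.mean() < baseline.mean() else "slower"
    print(f"{label}: {cond.mean():.0f} ms vs baseline {baseline.mean():.0f} ms "
          f"({direction}), t = {tval:.2f}, p = {p:.3g}")
```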

Jabbo Sapiens and Polysemy in Acquisition

How does a learner decide to group forms heard under the same umbrella lemma, the 'same lexical entry' if you will? Both the typological evidence and the evidence from developing semantic competence in children show that polysemy is natural and ubiquitous. Novel word learning in children shows generalization across polysemous senses, even when the denotational percepts are quite different (Snedeker and Srinivasan 2014). Children also distinguish clearly between homonymy and polysemy at an early age, before they pass any tests of metalinguistic competence, showing that the difference cannot be metalinguistic, as claimed by Fodor and Lepore (2002) (Srinivasan and Snedeker 2011). Moreover, certain types of polysemy seem not to be idiosyncratically memorized, but are plausibly part of a pervasive conceptual system underlying all languages. For example, the container/containee polysemy was found by Srinivasan and Rabagliati (2019) across 14 different languages (see also Zhu and Malt 2014 for crosslinguistic evidence).

The take home point of this blog post is the following. Current formal semantic and morphosyntactic models fall short on explaining how the symbolic primes of the open class system are compositionally integrated into sentences. Lexical items are usually relegated to a sealed off no-man's land that is somebody else's business. But how the two domains interact in practice is never made explicit, and it turns out to be both HARD and IMPORTANT.

References

Asher, N. (2011). Lexical Meaning in Context: A Web of Words. Cambridge: Cambridge University Press.

Beretta, A., R. Fiorentino, and D. Poeppel (2005). The effects of homonymy and polysemy on lexical access: an MEG study. Cognitive Brain Research 24, 57–65.

Blank, I. and E. Fedorenko (2020). No evidence for differences among language regions in their temporal receptive windows. NeuroImage 219, 116925.

Fedorenko, E., P.-J. Hsieh, A. Nieto-Castañón, S. Whitfield-Gabrieli, and N. Kanwisher (2010). New method for fMRI investigations of language: Defining ROIs functionally in individual subjects. Journal of Neurophysiology 104, 1177–1194.

Fodor, J. and E. Lepore (2002). The emptiness of the lexicon: Reflections on Pustejovsky. In The Compositionality Papers, pp. 89–119. Oxford University Press.

Friederici, A. (2012). The cortical language circuit: from auditory perception to sentence comprehension. Trends in Cognitive Sciences 16 (5), 262–268.

Goldberg, A. (1995). Constructions: A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.

Goldberg, A. and R. Jackendoff (2004). The English resultative as a family of constructions. Language 80, 532–568.

Hickok, G. and D. Poeppel (2007). The cortical organization of speech processing. Nature Reviews Neuroscience 8(5), 393–402.

Indefrey, P. and W. J. Levelt (2004). The spatial and temporal signatures of word production components. Cognition 92(1-2), 101–144.

Kaufeld, G., H. R. Bosker, P. M. Alday, A. S. Meyer, and A. E. Martin (2020). Structure and meaning "entrain" neural oscillations: a timescale-specific hierarchy. Journal of Neuroscience 40(49), 9467–9475.

Leminen, A., E. Smolka, J. A. Duñabeitia, and C. Pliatsikas (2018). Morphological processing in the brain: the good (inflection), the bad (derivation) and the ugly (compounding). Cortex 116, 4–44.

Srinivasan, M., C. Berner, and H. Rabagliati (2019). Children use polysemy to structure new word meanings. Journal of Experimental Psychology: General 148(5), 926–942.

Marslen-Wilson, W. T., M. Ford, L. Older, and X. Zhou (1996). The combinatorial lexicon: Priming derivational affixes. Proceedings of the Eighteenth Annual Conference of the Cognitive Science Society 18, 223–227.

Marslen-Wilson, W. and L. K. Tyler (2007). Morphology, language and the brain: the decompositional substrate for language comprehension. Philosophical Transactions of the Royal Society B: Biological Sciences 362(1481), 823–836.

Pustejovsky, J. (1995). The Generative Lexicon. Cambridge, Ma.: MIT Press.

Rodd, J., G. Gaskell, and W. Marslen-Wilson (2002). Making sense of semantic ambiguity: semantic competition in lexical access. Journal of Memory and Language 46, 245–266.

Sahin, N. T., S. Pinker, S. Cash, D. Schomer, and E. Halgren (2009). Sequential processing of lexical, grammatical, and phonological information within Broca's area. Science 326(5951), 445–449.

Snedeker, J. and M. Srinivasan (2014). Polysemy and the taxonomic constraint: children’s representation of words that label multiple kinds. Language Learning and Development 10(2), 97–128.

Solomyak, O. and A. Marantz (2010). Evidence for early morphological decomposition in visual word recognition. Journal of Cognitive Neuroscience 22(9), 2042–2057.

Srinivasan, M. and J. Snedeker (2011). Judging a book by its cover and its contents: The representation of polysemous and homophonous meanings in four-year-old children. Cognitive Psychology 62 (4), 245 – 272.

Whiting, C., Y. Shtyrov, and W. Marslen-Wilson (2014). Real-time functional architecture of visual word recognition. Journal of Cognitive Neuroscience 27(2), 246–265.

Zhu, H. and B. Malt (2014). Cross-linguistic evidence for cognitive foundations of polysemy. Cognitive Science 36.

Academia During the Dark Ages

The term Dark Ages is associated with a period of time in medieval Europe sandwiched roughly between the fall of the Roman Empire and the Renaissance, in which ideas and growth were allegedly stultified by fear, superstition and religious dogma. I am reliably informed that the early historians who invented the term (Petrarch) overstated their case and exaggerated this narrative for effect, and that in fact lots of good things happened in e.g. 9th century England, but of all of that I am not personally in a position to judge. And in any case it is not the topic of the present blog post. Rather, I use the term Dark Ages to refer to the period that we currently find ourselves in— the 21st century, but more properly starting from the period roughly at the end of the 1990s— with respect to the realm of Academia. In fact, the term Dark Ages is actually a gesture towards positivity, because it implies that it will be followed by a period of enlightenment. I certainly hope that is the case, but if so, it does not seem to me that it is imminent. It is also not lost on me that the phrase Dark Ages conjures up impressions that are rather different (plague/pandemic notwithstanding) from the shiny, clean, big data, money driven, professionalized academic spaces that our universities are curating. But bear with me. It's a metaphor.

The impetus to put some thoughts down on paper came after I received a bulk email from my head of department asking for volunteers from the faculty to join working groups that would collaborate on the first draft of our university's long term strategy under the new leadership team that had just taken over. I must confess I was curious about the direction my university was going to go in under the new vice chancellor. I felt that under the previous leadership team, our university had moved in all kinds of bad directions ideologically: seeking to run the university more like a business, taking more and more decisions in a top-down fashion, in a more opaque and less accountable way, and blindly implementing reports and checks and increased paperwork to measure and assess every quantifiable aspect of success and failure. Maybe the new team would take a fresh approach! So I glanced at the strategy plan that was going to be fleshed out into this important document.

Brave New World?

From the government we inherit a remit to deliver on education and lifelong learning, research, public dissemination, relevance to society and work life, innovation and creation of value. My university in particular promises to pair this with an emphasis on the North and the Arctic (obviously), a commitment to open access science (you can't hate it), and a promise to use its multiplicity and heterogeneity of campuses, people and skills as an engine for problem solving and creating innovative solutions to the needs of society now and in the future (blah blah blah). If you trawl around the internet for university strategy documents you will find every university using the very same buzzwords, so it's hard to figure out what it all means in practical terms, especially the last one. It is also hard to actively disagree with the positive statements in such strategy documents— they all sound like such good things, don't they?

In interrogating myself about why I felt underwhelmed and discontented with the outline I read, I decided to force myself to articulate what I thought should be the central remit of a university, and what I thought was missing. Everyone is emphasizing relevance, innovation, vision, and how grounded their university is in their own community's particular needs and sources of expertise. It seems to be the zeitgeist. Maybe that is just the sign of good ideas taking off. And the language is so vigorous, forward thinking and optimistically engaged (albeit on the vague side). Maybe I am just being churlish in picking holes in the vision (`churlish' is just one of the kinder words that could be applied to me these days. I embrace it.). But as I continued to think about it, I came up with at least two major areas where I believe universities have a solemn duty to contribute, but which have completely disappeared from public discourse.

The bottom line is that universities are being conceived of, on the education side, as engines for creating workforces, and on the research side as crucibles for technological innovation. The University is seen as the tool of Capital, and it is funded to the extent that it fuels Growth and produces precisely the workforce that the holders of Capital need. Those making the strategy decisions at the top will tell you that of course this is what the students want too: they want jobs. Sure. That seems to be the bare minimum though. There are other things that the university in a mature democracy should be doing beyond that bare minimum. But these things have disappeared from strategy documents, and even from the strategic thinking of educators, and they are actively being eroded because of it. I give them an airing here.

Education: Critical Thinking and the Challenge of True Democracy

This is clearly relevant for society these days! Just not for getting you a job.

I think democracy is hard and it's fragile. But it's the only system worth having, and democracies need to work actively to maintain their health. Democracy and voting rely on a nation's communities having access to education and information, and a sense of responsibility about what is at stake. In most university programmes before the Dark Ages, any degree that was taught by good lecturers provided transferable skills of critical thinking and assessment of argumentation, and knowledge of how to go about reliably finding out whether something is true. In the modern age, we have our own special problems concerning how information is disseminated and checked, with the rise of propaganda and the difficulty of escaping bubbles. Our universities need to take the lead in giving people the skills to navigate the increasingly tricky business of finding reliable information sources, and also in recognizing bias, learning how to overcome emotion or prejudice in assessing arguments, etc. Our universities also need to take the lead in actively supporting the humanities (by which I mean not just not taking money away, but actually channeling money into them). Because every young person needs to understand the cultural and historical context of the world they will be living (and voting!) in, and they need to be exposed to other minds and voices through fiction, which develops empathy and helps transcend tribalism. (At MIT, where I did my undergraduate education, we had a humanities requirement. Everyone had to take a couple of humanities courses of their choice, regardless of whether they were majoring in Engineering, or Math, or Chemistry, or whatever. I think this should be built in to all university degrees.)

I am not so naïve as to think I will ever convince a country or a university to make this their major remit. But I sure would like to see it as a bullet point in a strategy document.

Education (Lifelong): Tools  for Understanding,  Personal Development and Satisfaction

If you get down to first principles, growth and the economy are not really the things we need as a society. We want everyone to have the necessities of life and the opportunity to live full, happy and fulfilled lives. The modern world is one where we humans have an increasing proportion of time to devote to leisure, because hard, time-consuming physical labour has been taken over by machines and tedious labour by computers.

How do we make ourselves happy?  

Education gives people tools and resources to keep learning and understanding their world. This leads to happiness. Curiosity-based learning and the acquisition of new skills lead to the appreciation of complex and satisfying forms of leisure, and help us be less bored and passive in our consumption of entertainment.

Education is not job training. Education is something that human minds thrive on and we do it not to promote growth or technological advance, but just because human minds love to be so engaged.

I would love to rethink the remit of modern universities around feeding curiosity and developing young people's skills and resources, with the goal of helping them find the thing that they are good at. I am guessing that then, whatever they end up specializing in, they will find a niche in society where they can do a job that contributes to society's goals and that makes them happy.

Research: Curiosity Based Research

When it comes to research, the primary emphasis should be on curiosity driven research. And there should be no competition for grants and funding.

Right now, senior academics as well as early career researchers are forced into an endless cycle of applying for grants and producing publication points. The grants nowadays are skewed towards societal `relevance' and impact, regardless of whether this fits in with the researcher's own set of scientific questions. At first, a decade or so ago, we just added an Impact paragraph to our applications, but now, increasingly, the whole research agenda must be rethought and new strands of inquiry invented just to get on the grant bandwagon. Primary basic research is not really respected unless it brings in big grant money, and big grant money increasingly depends on subjective decisions of relevance and coolness. The standards for grant applications get higher and higher. Most of these grant projects are worthy and interesting, but if the history of science tells us anything, it is that it is generally impossible to predict what new thoughts or ideas are going to lead to big advances in some body of knowledge. In the meantime, the lottery for what gets funded is driven by forces that are random at best and skewed in an overly superficial direction at worst. And most importantly, researchers' time is eaten up in this fruitless and soul destroying activity.

I read somewhere that if you took all the money that is spent on organizing the application system, the reviewing system, and the administration and reporting of grants across the academic world, you could just give EVERYONE a research grant and the resources to pursue a question, and save big bucks.

Well, I guess that's not going to happen. But the system is rotten and we all know it. It is built on competition and insecurity. Young and early career researchers are experiencing financial insecurity, stress and burnout from increasingly unrealistic expectations, with very little in terms of intellectual reward.

Climate Change and Man’s Relationship to the Planet

Given the scientific consensus on this, I am somewhat surprised that a university like the one I am at, which wants to be a leader on the North and the Arctic, does not explicitly come out and say that it wants to lead on helping to reverse the damage we have done to the planet and on mitigating the effects of the ongoing climate crisis, especially as some of the clearest first signs of polar melt etc. are Arctic issues. Maybe this will come in the details of the strategy document that I am not going to sign up to help write. But I am not holding my breath. It would probably be considered too political to state such a remit, although the scientists do seem to agree that these are basic facts, not opinions. I suspect that the present emphasis on local rootedness is directly connected to universities' non-engagement with issues of a global, universal nature. The world has become increasingly globalized. As long as universities shy away from the big hard questions and see their remit as providing growth, jobs and research grant money to their own local patch, they will not be the engine for critical pushback and change that we so desperately need.

But Some Things Have Got Better, Right?

Since the nineties, some things have improved in certain parts of the world. Rights for LGBT+ and trans folks have improved, and diversity in the higher echelons of power in terms of representation of women, people of colour, etc. has improved somewhat (I feel personally that the status of women in academia has stagnated since the nineties and has lagged behind other kinds of progress). Access to higher education has improved in many parts of the world. In many arenas, new, fresh and progressive voices are being heard for the first time above the drone of the wisdom of the perennially entitled. This is as it should be. As society changes and people who were not privileged from birth come to have access to education, so will there be changed discourse, re-evaluations and upheaval. This is also what universities are for. But I fear that these forces are being managed and de-toothed as we speak. And even access to privilege through education is being clawed back on two fronts— both by turning universities into job making factories, and by containing and demoralizing their employees, who strive to teach and think while getting grants, being relevant, and preventing their academic areas from getting the axe (is my course popular enough? Does this degree add value in terms of increasing the projected salary of those who take it?).

The `Dark Ages' in my long saga refers to our modern era, with its commodification of intellectual capital and the Control of Academe by those who currently control the economy.

This piece will no doubt read, to most of you who make it this far, as quixotic, irresponsibly naïve and deeply impractical. But I would remind you that as a GEN X-er (who are they, again?), I actually do remember a time when these things were explicitly talked about in educational circles. So these ideas had not yet vanished from the discourse when I was an undergraduate. And they are not inherently impractical either. I have watched the narrative shift, continuously and inexorably (just as the political narrative has shifted), to the extent that we have all been made to swallow as a basic premise the idea that anything other than the worklife relevance zeitgeist is untenable (just like the false belief that anything other than the free market and global capital is an untenable system— the two narratives are, btw, not unrelated).

Petrarch talked about the Dark Ages primarily in relation to the light that had come before, in the form of classical antiquity, not in relation to the Enlightenment to come. I don't know if there is any backlash on the horizon that could lead us out of the stranglehold that this package of ideas has on the world at the moment. I do not know what it would take to turn this particular boat around. I fear that things will have to get a lot, lot worse before they will be allowed to get better. But I am in the market for ideas!