There's a Dumpster Fire at the End of the Information Superhighway

In the early nineties, when millions of computers were connected to form the internet, the resulting digital revolution promised the democratization of access to information and gave us metaphors like the 'information superhighway' as synonyms for the newly connected interwebs. I was a young graduate student at the time, and for me the exciting thing about the internet was not in fact the analogy to driving, or rapid information transfer, but the experience of it as a messy, unregulated, anarchic door-opening, almost like the multiverse or a piece of untidy knotted string, which you could follow in your own idiosyncratic way. I know, very GenX of me.

Even though we did not know exactly how it would play out long term, I think we all sensed at the time that our way of storing, accessing and searching for information had changed irrevocably from that moment on. I think we are facing a similar informational tipping point now, but perhaps not in the way that many folks are imagining. Nobody wants to go back to index cards in physical libraries, but the question is whether our methods for searching for information are about to make another quantum leap of improvement, made faster and more efficient by posing questions to an interactive chatbot. This is precisely what Big Tech is telling us we need, and the companies are currently fighting each other to be first to roll out the next generation of search-technology chatbots (Bing vs. Bard). These applications are fed by massive natural language models (OpenAI's among them) which, because of the trillions of words they are trained on, can generate plausible grammatical responses to our queries and potentially summarize information from that huge pool of networked text. When it comes to pure search functionality, though, there are good reasons to believe that the ways in which people actually search, and the way information search interacts dynamically and dialectically with other cognitive goals such as learning and exploration, will not all be equally well served by the 'ask a bot' model of communicative interaction. (See this article by Emily Bender and Chirag Shah for a discussion of the issue: https://dl.acm.org/doi/pdf/10.1145/3498366.3505816 .)

But I darkly suspect that helping people to search for information is not just a 'selfless' goal that these tech companies are pursuing in the name of progress. In other words, Bing and Bard are not the only uses to which OpenAI's technology is going to be put. The developers of the natural language models that make ChatGPT possible will sell that technology to others, and it will be modifiable beyond the constrained, prettily guardrailed version that underpins ChatGPT itself.

There is no doubt that ChatGPT, the interactive content generator, has taken the world by storm since its launch in November 2022, and its ability to produce plausible and seemingly helpful text has been massively impressive. There's been hype; there's been backlash. There have been hard questions asked and performance flaws exposed, leading to fixes and improved guardrails. In this blog post I will summarize some of the major worries that have been aired and then go on to emphasize what I take to be the most serious threat that the technology poses if it is not regulated now. Some of these worries have already appeared on social media and in published sources, which I will try to indicate as I proceed. But the bottom line is going to be a version of my very own dystopian worry, and it involves the experiment of thinking consequentially about what will happen to information itself when more and more content-carrying text is handed over to artificial intelligence and dissociated from actual minds. Call it the Semanticist's Take.

The Robots Are Coming Worry

So maybe you think I am going to go for the chatbots-will-become-sentient-and-try-to-destroy-us worry (think HAL, or the latest behaviour of Bing). Or the gentler sci-fi version where we potentially embrace new forms of sentience and come to understand and welcome them in our shared cognitive future. But both these scenarios are just a form of the HYPE. No! These chatbots understand nothing. They scrape content produced by actual minds and regurgitate it in statistically acceptable-sounding forms. They have no notion of truth or 'aboutness', let alone emotion. The fact that they seem to is due to echoes from all the human texts they have consumed, and testimony to our own human response mechanisms, which impute content and feeling and make the assumption of 'another mind' when faced with language produced for us.

The March of Capitalism Worry

There is an actual, real worry here, namely that real Bad Actors (humans working for capitalist organizations trying to earn money for their shareholders) will use this technology to continue taking over the world, in the form of controlling and curating creative content (text, images, tunes) and relegating actual humans to poorly paid monitors and editors with no job security or health insurance. But that is more a continuation of our present capitalist dystopian political reality than science fiction woo hoo.

The Bias and Toxic Content Worry

Here's another concern that has rightly made the rounds. Because it is cannibalized from human content, chatbot output will repeat and recycle all the racist, misogynistic, toxic and otherwise questionable biases of the humans who created that content. Huge amounts of resources will have to be spent regulating these AI technologies if they are to come equipped with 'guardrails'. Even scarier is the thought that many of the purchasers of this technology will not equip their use of it with guardrails. The fact is that this technology has not been created by public funds, or non-profit universities, or even governments answerable in principle to an electorate. No, these applications have been created by and are owned by private companies whose only aim is to make money from them.

Here is Timnit Gebru on why big tech cannot be trusted to regulate itself.

Also Gary Marcus recommending the pause button.

Is It Time to Hit the Pause Button on AI?

ChatGPT Inherently Does Not Know What Information Is

As a semanticist, I regularly have to think about meaning and what it means for something to have meaning. Formal semanticists ground their theories of meaning in some kind of model of reality: to give a theory of meaning in language you cannot simply redescribe it in terms of language itself; there needs to be a reckoning, a final 'reality' check in terms of, well, Reality (or at least the thing(s) we humans apprehend as Reality). Actually, the way I like to think about it is more along the lines of Mark Dingemanse's statement that language is the best mind-to-mind interface that we have. The important next step is to realise that language does that by anchoring itself in consequences for the shared reality that the two human minds are embedded in. There is an aboutness to language that is a crucial design feature, and theory of mind is one of the cognitive abilities that humans need to decode it. You need to know/assume/understand/trust that there is another mind there apprehending the same common reality as you are, and labeling it in similar ways.

Take ChatGPT now. ChatGPT has no Theory of Mind (Gary Marcus again on testing ChatGPT), and it has no notion of any kind of reality or 'aboutness' to what it is generating. This means that it does not actually understand anything. It has no connection to truth. All it is doing is scraping content in the form of text and generating plausible natural language sentences from its training material. It repeats and recycles but does not genuinely infer (it is bad at math and reasoning). It also cannot, as a matter of principle, distinguish a fact from a non-fact. It produces false citations and false data unnecessarily and gratuitously, although it most often repeats correct things, if that is where its most statistically likely sentence happens to come from.

Emily Bender, also a linguist, has been a tireless campaigner against the breathless hype over large language models, even before the launch of ChatGPT in November. Read her viral article about Stochastic Parrots here

OK, so one could imagine building an interactive search engine that was instructed only to summarize, and where, in addition, all the sources were vetted and verified. However, the technology as we see it now seems to hallucinate content even when it could not possibly have grabbed it from somewhere unreliable. It is unclear to me why the technology does this, or whether it can be fixed. Is it to do with a built-in feature that tells it not to repeat verbatim because of plagiarism risk, or is it due to the kinds of information compression and decompression algorithms being used? Hallucinated content from chatbots means that even if you tell the search engine to search only a particular list of reputable sources, it could still give you erroneous information.

It is apparent to me, and to every serious scientist, that we would never use ChatGPT as our search engine for anything we need to find out in our academic field. It is moreover not clear to me, at any rate, that I need my search interface to be in this impressively plausible linguistic form at all. I do not necessarily think, in other words, that universities and libraries should be racing to use, modify, or invent their own versions of Bing or Bard to search scientifically curated content. We know that developing a natural language model on this scale is extremely expensive. The reality is more likely to be that once Microsoft has developed it, they will sell it to everyone else, and we will feel that we need it so much that we will rush to buy it.

Who is going to buy the technology? And what are they going to use it for in the future? It is already being used by some companies to generate content for articles in online magazines (leading famously to retractions when the content was not sufficiently overseen by a human), and by all kinds of folks to write summaries for meetings and presentations. It will also no doubt be used to produce advertising texts and disinformation texts, which will run rampant over the internet. We already have a problem with disinformation and unverifiability on the internet, and these problems will increase exponentially, since the present technology is much more believable and also, crucially, automatizable. Not only will the content so produced not be verified, it will also be increasingly non-verifiable, since these very helpful chatbots will be the ones you turn to to find out whether the sources check out. As we have seen, ChatGPT regularly and authoritatively spits out totally made-up citations.

One can imagine fondly that some other tech bros will invent software to detect whether something has been written by AI or not, but it will be a moving target, with so many different versions out there, and next-generation versions that can cleverly outwit the automated checkers in a spiralling arms race of ever-increasing subtlety. That way lies madness.

As more and more people use this technology to generate content, whether with the best of intentions or the worst (and we would be naïve to assume that Microsoft is not going to sell its new toy to anyone willing to pay for it), I predict that in the next few years the information superhighway is going to be more and more littered with content created by artificial intelligence (think of plastic as a tempting environmental analogy).

The problem is that this is simply not information any more.

 It is faux-information.

It is content which bears some sort of causal relationship to information, but where the relationship is indirect and untrustworthy.   

What is going to happen when the information superhighway is contaminated with about 5 percent faux-information? What about when it is 10 percent? 50 percent? What is going to happen when half of the content that ChatGPT is scraping its 'information' from is itself AI-generated scraped content? Will the hallucinations start to snowball?
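To make the thought experiment a little more concrete, here is a minimal toy simulation (the growth and hallucination rates are numbers I have simply made up; nothing here models any real system): each year some AI-generated text is added to the corpus, that new text inherits whatever contamination is already in the corpus it was scraped from, and it adds some freshly hallucinated content of its own.

```python
# Toy model of 'faux-information' accumulating on the web. All numbers are
# invented for illustration; this is a thought experiment, not a forecast.

GROWTH = 0.30              # AI-generated text added per year, as a share of the corpus
HALLUCINATION_RATE = 0.05  # share of newly generated text that is freshly unreliable

corpus, faux = 1.0, 0.0    # start: all human-written, no faux-information
for year in range(1, 11):
    new_text = GROWTH * corpus
    # New AI text inherits the corpus's current contamination level (because the
    # generators scrape the corpus itself) and adds its own fresh hallucinations.
    new_faux = new_text * (faux / corpus) + new_text * HALLUCINATION_RATE
    corpus += new_text
    faux += new_faux
    print(f"year {year:2d}: faux-information = {faux / corpus:.1%} of the corpus")
```

On these invented parameters the share of faux-information climbs by roughly a percentage point a year and never goes back down, because every generation of generators inherits the errors of the last; the numbers are arbitrary, but the one-way direction of travel is the point.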

Here's my prediction. We will lose the small window we currently have for governments to regulate, and in five years' time (maybe more, maybe less) the information superhighway will be more like something out of a Mad Max movie than a place where you can find information about how to fix your fridge yourself.

AI will have consumed itself and destroyed the very notion of information.

(Well, at least the idea that you can find information on the internet.)

So the problem is NOT: how can we get this great new thing for ourselves and adapt it so that it does the good stuff and not the bad stuff? The problem is what happens when this thing is let out of the box. In five or ten years' time, how will we be able to distinguish the content from the faux-content from the faux-faux-content, using search applications that also have no idea?

For those of us who watched the dumpster fire that consumed Twitter a couple of months ago, this is going to be similar, and for similar reasons: a wilful lack of regulation, now exacerbated by automated plagiarism generators. Bigger. Maybe slower to unfold. And we are sleepwalking into it.

There is hope for the preservation and advance of human knowledge, at least, if publicly funded universities and research institutions band together now to safeguard the content that they currently house (physically and digitally) in the form of libraries. There are two aspects to this: (i) we need to keep making principled decisions about what we allow to be searchable, and (ii) we need to create our own search engines for searching that content. We should not make the mistake of trying to use OpenAI technology to do this, because plausible linguistic interaction or essay-writing ability is not what we need here. We just need slightly better functionality than current indexing systems, otherwise we will lose out to the bots. No need for plausible human interactive language, but a much simpler ability whereby the search interface simply repeats core findings verbatim and shows us the actual citation. Creating this kind of search engine (owned publicly and not by Google or Microsoft) would be far less resource-intensive than employing large language models. And arguably more scientifically useful.

We need to build good search engines that are NOT Artificial Intelligence machines, but computer data sifters and organizers designed to aid intelligent human agents. These search applications need to be publicly owned, publicly maintained, and open access.
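To give a sense of how modest the machinery for that kind of 'data sifter' could be, here is a minimal sketch (my own toy illustration, not an existing library system): plain TF-IDF retrieval over a small curated set of passages, returning the best matches verbatim together with their citations. The example passages and the search function are invented purely for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# A curated, citable corpus: each entry is a verbatim passage plus its source.
# (Toy entries; a real system would index vetted library holdings.)
passages = [
    {"text": "Polysemous senses prime each other, while homonyms compete during lexical access.",
     "citation": "Rodd, Gaskell & Marslen-Wilson (2002), Journal of Memory and Language 46"},
    {"text": "Lemma frequency modulates activity in the middle temporal gyrus at 300-450 ms.",
     "citation": "Solomyak & Marantz (2010), Journal of Cognitive Neuroscience 22(9)"},
]

vectorizer = TfidfVectorizer(stop_words="english")
doc_matrix = vectorizer.fit_transform(p["text"] for p in passages)

def search(query: str, k: int = 3):
    """Return the top-k passages verbatim, each with its citation attached."""
    scores = cosine_similarity(vectorizer.transform([query]), doc_matrix)[0]
    ranked = sorted(zip(scores, passages), key=lambda pair: pair[0], reverse=True)
    return [(p["citation"], p["text"]) for score, p in ranked[:k] if score > 0]

for citation, text in search("polysemy and lexical access"):
    print(f'{citation}: "{text}"')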

The only people who are going to have the will or motivation to do this are the public universities, and we may need to work together. Everyone else is compromised (or drinking the Kool-Aid), including many of the world's governments.

Now I know you are all probably thinking that I am a paranoid, overreacting GenXer who is just yearning for a return to the internet of the nineties. Like every other middle-aged person before me, I am being negative about change, and the past was always golden. ChatGPT is great! We can use it for good.

I really really really hope you guys are right.


Foundations of Extended Projections

We at CASTLFish at UiT were recently thrilled to host a workshop on the Foundations of Extended Projections on October 27-28, 2022. Due to funding cuts, and cuts even to the places where one can apply for basic research funding, it appears that CASTLFish will be very, very poor from now on. To spend the little stash of cash we had left (which must be handed back by the end of 2022), we thought it would be appropriate to hold a conference on extended projection, since Peter Svenonius and I have a fairly well cited collaborative paper on the topic, and lots of strong opinions!

So two days of fun and stimulation were had by all!  Hour-long talks, lots of discussion, and lots of late night conversations. Just what we like up here in the Arctic as the days are drawing in. We also like controversy, and new radical ideas, and thinking from first principles. We got lots of that as well!

The programme can be found here, where you can read the authors' own abstracts.


A small in-person workshop with people who are interested in thinking about the same general issue from a variety of different perspectives is a great model for a stimulating and productive event. The notion of the functional sequence and cartography has been of great theoretical interest over the last couple of decades, although detailed questions of descriptive cartography have not in themselves created much of a buzz. The main interest has been generated by the more controversial positions on the fine-grainedness and universality of the hierarchy of projections. At one extreme, ardent cartographers embrace a highly articulated and specific order of functional projections which forms an innate template for all speakers of human language. At the other extreme, distaste for overly specific claims of representational innateness and universality leads syntacticians to essentially discard the whole subfield and concentrate on other topics like Agree, Merge, Locality, or Labeling.

However, in my view, questions of Agree or Merge cannot be usefully discussed if the representational primes of the system are not agreed on. Thus cartography in the mundane sense of just figuring out which categories and labels are active in a particular grammatical system is an important component of any computational or descriptive claim about that system. This 'boring' descriptive work is often left undone because both camps seem to assume that whatever they have in their list-of-categories box is universal (whether coarse-grained or fine-grained). If it is universal, then the individual syntactician does not need to figure it out on a language-by-language basis; they can just take it off their chosen ideological shelf. But if Ramchand and Svenonius (2014) and Wiltschko (2018) are right, then we cannot in fact take those details for granted.

One of the outcomes of this small workshop was an emerging consensus that the language-particular details are non-trivial, and that the lexical vs. functional distinction and the existence of a particular functional item must be argued for on language-particular grounds, without the help of a universal semantic template. This is because the notional categories themselves cannot be defined in a non-circular fashion without diacritics for 'zone' (Pietraszko, Ramchand, Tsai): cause, possibility and inception are notional categories that exist at many levels, and in certain languages many verbs can be used both functionally and lexically (Aboh, Pietraszko). I myself argued that conceptual, essential content as enshrined in the lexical symbol (located in declarative memory) is architecturally distinguished in every language from the referential and instantiational information in which it is clothed. This abstract distinction cuts across many of the notional semantic labels in common use within cartographic templates.

Other outcomes of the workshop included the beginnings of an investigation into crosslinguistic variation in the kinds of verbs that allow ECM, and whether this can be handled by notions of size, or truncation (Wurmbrand). Diercks and Tang presented detailed descriptive work investigating the representation of information structure in Bantu and Chinese respectively. Their proposed solutions convinced me that, with respect to Focus and Givenness, the connection to functional items in the hierarchy of projections is far from obvious. Diercks asked us to believe in countercyclic Merge, which instead prompted a very productive discussion about alternatives. Paul argued that the proposed FOFC language universal really is undermined by very basic constructions in Chinese, and that arguments putting those constructions aside do not work.

Another major feature of the workshop was the willingness of the participants to think from first principles in fresh ways about the foundational questions in this domain. I have been growing weary of large conferences where researchers present their work in an environment closely tied to the job market, to the demonstration of professional skills and talents, to competition for air time, and to hyper-sensitivity towards market forces and what is currently trendy in our field. We unfortunately inhabit an academic space where this has become a necessary feature of professional meetings: the narrowing of jobs and resources, and the commodification of academia, have led to a hypercompetitive environment, and to lots of stress and burnout. Our small workshop was a refreshing change from that other kind of conference, and one which all of our attendees appreciated, across a widely diverse group of speakers ranging from well-seasoned to early-career researchers. Many of our speakers expressed the idea that they were going to 'say something controversial or crazy', or 'try something new' (Wurmbrand, Diercks, Pietraszko). Adger told us about his new mereological foundations of phrase structure as an alternative to the set-based metaphor. The mereological algebra, he argued, is better suited to the part/whole relationships we build through hierarchies. Svenonius showed how we could model the hierarchical orderings of the extended projection, with all its gaps and repetitions and language-specific detail, using a finite state machine. Zhang speculated about what would happen if we countenanced the existence of functional items without category.

All in all, I feel grateful that CASTL had the luxury and privilege to host such an event, and to pay for the authors of all accepted papers to attend. We, and our own students, could witness linguists describing, explaining, arguing and generally doing what they do best: trying to figure stuff out.

SALT32 in Mexico City: Some Thoughts on Linguistic Diversity

Flying in to Mexico City for one of the world's most exciting annual formal semantics conferences (Semantics and Linguistic Theory, SALT32): it does not get much better than this, especially after more than two years of digital conference participation. I was not disappointed. The conference delegates stayed mostly in Coyoacan, one of the older suburbs on the edge of the city, with a rich cultural history. It was a wonderful, lively place, with lots of restaurants and bars, and one felt completely safe walking around the neighbourhoods both in daylight and darkness. The weather was warm, with the occasional dramatic thunderstorm (see Sant and Ramchand's poster on occasional here). The food and drink were wonderful, and the people were warm and friendly. Many, many thanks to the organizers at El Colegio de Mexico (El Colegio es conocimiento, ciencia y cultura) for moving heaven and earth to get this to work so well in a hybrid format and for being such gracious hosts.

As far as I know, this was the first time that SALT was held in a location outside of North America, and the first in a country where the local language was not English. It was therefore appropriate that the conference hosted a special SALTED workshop on Prestige English as an Object and Meta Language. The invited speakers were Enoch Aboh, Donka Farkas, Carol Rose Little and Andres Saab. We can all acknowledge that English has emerged as the dominant language for dissemination in our field, and that there are indeed some advantages to having a common language of science. However, the situation does present extra hurdles for linguists whose native language is not English: they have to write and present in a language that is not simply a transparent conduit to thought, but whose comprehension and production is an 'extra thing to do'.

When it comes to choice of object language, all would also agree that more diversity in the object languages being studied semantically is something we should work towards. Diversity in object language has certainly increased over the past few decades, but the situation is still rather skewed. Enoch Aboh's position was that we as linguists need to work harder to train native-speaker linguists in the understudied languages of the world. Especially when it comes to semantics, there are nuances and insights that are simply not available to the non-native speaker. We desperately need a more diverse set of linguists to be working on a more diverse set of languages in semantics. There are challenges in teaching formal semantics to students whose native languages are not English because of the lack of teaching materials for semantics in those languages. This is true even for Spanish: Andres Saab and Carol Rose Little both discussed their recent experiences teaching introductory formal semantics in Spanish, and the pedagogical tools that were simply not available to them. While I am here, I will note that in response to this challenge, Andres Saab and Fernando Carranza have come up with a textbook on formal semantics in Spanish, which you can download from this lingbuzz link: https://ling.auf.net/lingbuzz/005205

We certainly need more textbooks in languages other than English. Michel DeGraff, in the context of Haitian Creole, has pointed to research showing that children learn formal topics like mathematics much better in their own native creole than in the formal French of normal school instruction. If we want to train new generations of formal semanticists who can contribute to sorely needed crosslinguistic research, we need to start by diversifying the language of the teaching tools available in this area. Donka Farkas raised the important point that even in English-language settings, formal semantics instruction would benefit from a diversification of the languages chosen to exemplify the theory. There is enough work around these days to do so in nearly all domains. She gave some examples, but most of us can think of a few, and the field would benefit a lot from pooling resources on this.

With respect to our current conference, SALT32, we can take a look at the spread of languages chosen as the object language for formal semantic study. In the counts below, the first number represents talks where no substantial data is introduced from a language other than English, and the second is the number of talks where data was presented for analysis from at least one non-English language.

Main Session Talks, English vs. Other: 7 vs. 8

Short Talks, English vs. Other: 21 vs. 14

So we see that, at least with respect to the object language, we are currently hovering at just under 60 percent English-only focus (28 of the 50 talks). I note in passing that this is still a better diversity level than what I found at ELM last month (Experiments in Linguistic Meaning).

The other languages in evidence as object language: Spanish, Russian, German, Dutch, Japanese, Mandarin, Cantonese, Italian, Hindi/Urdu, Finnish, Djambarruyngu, Uzbek, Farsi, Amahuana, A'ingae, ASL, French Sign Language, Italian Sign Language and Sign Language of the Netherlands. Ch'ol also showed up in Little's invited talk, and Andres Saab's invited talk focused on Romance.

With respect to the meta language, it will not surprise my readers to know that all of the talks were given in English. One short talk, in addition, was presented in parallel in sign (https://osf.io/wxn56/), and most of the recorded poster presentation videos had captioning in English for the hearing impaired. While there happened to be no hearing-impaired attendees in the in-person audience, there were many in the audience for whom English was a second or third language. It struck me that at a conference taking place in Mexico, subtitles in Spanish for all talks would have been a relatively inexpensive thing to provide, given current technology. In talking with the organizers, it was pointed out to me that local students, while they are pretty good at English (better than I am at Spanish), still struggle with fast in-person speech in English in many contexts. It would massively facilitate uptake of this highly technical formal content if there were subtitles in Spanish (or even in English) for in-person talks. It also seems like it should be an option for researchers to present in their native language, especially in this case Spanish, and simply lay on English subtitles for the English-speaking participants who happen to be Spanish-deaf. After all, keeping English as the language of science in publication does not need to mean monolinguality; it is also compatible with multilingualism in broader settings. It seems to me that allowing deviations from the norm whereby everybody is forced to wield an awkward English at the same time as presenting their new research would have the advantage of letting non-native English speakers feel more relaxed and expressive, and also the advantage of undermining the monolith-ality of English and a kind of experienced monolingualism. The current situation also somehow contributes to the impression that English is the clear, logical, rational language of science, while other people's languages are just there to be studied.

So, people, what do we think? Shouldn't we allow non-English presentations at SALT, NELS, GLOW and WCCFL? After all, if Eurovision can do it…

Here is my picture of super delicious taco sauces as a metaphor for linguistic diversity

Experiments in Linguistic Meaning ELM2

As the second edition of Experiments in Linguistic Meaning wraps up, it is worth thinking about the future of the forum. What research strands and issues were prominent in the second edition of the conference, and what do we want from it in the future? Will there be future ELMs, and if so, what will their remit and focus be?

First of all, thanks to Anna Papafragou and Florian Schwarz for the initiative, and to all the folks at UPenn for hosting one of the very first in-person (hybrid) conferences in ages. Hybrid is more work than a digital and an in-person conference combined, but ELM2 was committed to making it work. Many people, also committed to the idea of this new themed conference, made the trip from across the US and even from Europe to attend in person. An equal number participated virtually. I for one thoroughly enjoyed myself: all the papers were interesting to me, and I had many stimulating and fun conversations over the course of the three days.

There were 97 papers on show at the conference, of which 21 were main-session long talks. There was one panel on computational semantics (3 talks) and three additional invited speakers. The remaining 70 were short talks in parallel sessions. In terms of the topics covered, there was quite a spread, ranging from quantifier scope to sarcasm and expressive words to computational modeling (see here for a full list of presentations and abstracts). Having said that, there were some clear clusters reflecting certain centres of gravity for the research attracted to this conference: a full third of all papers mentioned implicature, presupposition or context in their titles or keywords; a further dozen or so made reference to discourse and/or logical connectives. This showed that, as at non-experimental specialist conferences in semantics, research seems to be most focused on intersentential meaning and inferencing. A further mini-area that was well represented at ELM was event cognition, telicity, causality and tense interpretation. This broad area had about 15 hits (I'm not complaining!), most likely due to Anna Papafragou's indirect influence on submissions to this conference.

In terms of methodologies, the dominant experiments were behavioural, offline tasks, albeit a wide range of them, from truth/felicity judgements to matching pictures with sentences to language production. In a handful of cases, people's behavioural measures were assessed against computational models. There were very few online measures (four eye-tracking papers, of which three were visual world and one eye-tracking while reading, one pupillometry study, and one EEG study). There was virtually no neurolinguistics, despite the N400 being the world's most famous evoked potential within EEG.

When it came to diversity, the languages under the experimental spotlight were extremely restricted. Apart from one or two studies taking another European language as their empirical ground (German, Spanish, Russian, Norwegian), and a couple looking at signed languages, the vast majority of the papers were based on data from English. Not only that, the research questions and claims themselves were most often broad and universalistic, by which I mean they did not depend in a deep way on the actual language being studied, as opposed to being crosslinguistic or comparative (the exception was the sign language papers, which explicitly engaged with the question of different modalities of expression). It seemed to me that there was a lower rate of language diversity here than at either the standard kind of semantics conference or the standard psycholinguistics conference (certainly the former).

So does the world need a conference on experiments in linguistic meaning? I think in principle the answer is yes. Experiments are still a minority at specialist semantics/pragmatics conferences, and semantics/pragmatics is still in the minority at language processing conferences. It strikes me that there are probably many people interested in overarching questions pertaining to meaning and human cognition, and that we would all benefit from being able to share results and methodologies across paradigms. It is worth being explicit about the big-picture questions that motivate the potential future ELM-goer:

  1. Investigating how linguistically specific semantic categories match up to the categories of domain general concepts, or are constrained by other properties of mind/brain.
  2. Understanding the logic and flow of human reasoning in context.
  3. Modeling detailed human judgements of truth, felicity and message conveyed by means of mathematical modeling or the training of neural nets, with a view to understanding the former.
  4. Understanding how semantics gets learned by both young humans and computers (again with a dominant interest in understanding the former).
  5. Investigating the  correlates of meaning and meaning composition in actual human brains.

The five categories above are neither exhaustive nor mutually exclusive, but represent a broad swathe of different kinds of research that do not always show up at the same conferences. The umbrella concern with meaning and meaning-making is the major justification for having all of these kinds of papers given under the same roof, allowing researchers in one of these speciality areas to benefit from the insights of the others, assuming that there are crossovers and synergies that are relevant here.

In order to make this work, I think the organizers of future ELMs need to continue their policy of inviting panels in specific areas, and invited speakers with varied kinds of expertise. As we researchers get used to this particular umbrella, we need to get used to learning from adjacent methodologies and research questions when it comes to semantics. What we do not want is for various subsets of talks from other conferences simply to show up here year after year; rather, we want the topic of meaning to grow interconnections across this web of research paradigms. The hope is also that the conference will get more diverse with respect to these parameters as it goes forward, and that we begin to see the payoffs from getting insight into each other's work.

For me specifically, I would love to see ELM become a place where neurolinguistics also takes its seat at the table, and where crosslinguistic semantics is more systematically explored.

Experiments in Linguistic Meaning ELM2, Day 1

Can Neural Nets do Meaning?

The pandemic has been hard on many of us. It has been a long time since I traveled to an in-person conference, or blogged about my experiences. The plan is to create a blog post for each of the three days, but let's see; I am a little out of practice. Today I concentrate on the invited panel on computational semantics. There were other talks in the main session today, but they will have to wait for another blog post.

The day started with a panel discussion on computational semantics. See the listing on the programme here. The three invited speakers, it turned out, had different research goals, which was interesting, and I wonder how representative that is of the field. The question I posed to the panel after their (all very interesting) talks was whether they considered themselves to be pursuing the goal of making the performance of computers on language-related tasks better because it would lead to better functionality in various applications, or whether they were interested in modeling meaning tasks computationally in order to understand the human mind better. Marie-Catherine de Marneffe said she was unequivocally in the former camp, Aaron White in the latter, while Ellie Pavlick was somewhere in transition: she started off being more interested in the former kind of problem but is getting increasingly interested in the latter.

De Marneffe was interested in getting computers to perform in a human-like way with respect to judgements about speaker commitment to the truth of certain embedded propositions. As is well known, the new deep learning systems, trained on mountains of data (available for languages like English), end up doing stunningly well on standard performance benchmarks. The speaker commitment judgement is no different: performance is strikingly good. The neural network is given simple parametric information about the lexical embedding verb (whether it is factive, or whether it lexically entails speaker commitment in principle), but is also exposed to the distributional data, since linguistic context such as the presence of negation and other embeddings is necessary to make the judgements in question. It turns out that these kinds of neural networks perform extremely well, for example, on neg-raising contexts, generating human-equivalent judgements for sentences like

I don’t think he is coming.

However, there are a few kinds of sentences where the neural networks fail spectacularly. These are instructive. Two examples from the talk are given below, with the clause for which the speaker commitment judgement fails shown underlined.

(1) I have made many staff plans in my life and I do not believe I am being boastful if I say that very few of them needed amendment.

(2) I was convinced that they would fetch up at the house, but it appears that I was mistaken.

De Marneffe pointed out these examples and speculated that the problem for the neural nets is pragmatics and/or real-world knowledge. (2) is striking because even the smallest, most ignorant child would get this one right, so it seems to show that whatever the neural net is doing, it really is not doing anything remotely human-like. Maybe having a real embodied life and connections to truth in the world is necessary to fix (2). But the problem with (1) seems to me to be not so much about pragmatics as about embedding and hierarchical structure, which the neural net simply is not tracking or using as part of its calculation. Personally, I think the 'problem' with pragmatics, in terms of inferential strategies, is overstated. I am pretty sure you can teach neural nets some inferential algorithms, but compositional structure and real grounding for meaning both seem to be the real sticking points. We only see this, however, in cases where the linear-distance cooccurrence data is uninformative about the actual meaning. It is sobering to notice how seldom those cases actually come up, and how often the simplistic heuristic delivers as a proxy for the more complex reality. How worried you are about the existence of these examples really depends on which of the two goals outlined above you are trying to pursue.
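For readers who want to see what probing of this general kind looks like, here is a minimal sketch using an off-the-shelf natural language inference model as a rough stand-in for a speaker-commitment classifier. To be clear, this is not de Marneffe's system; NLI entailment is only a crude proxy for speaker commitment, and the checkpoint named is just a publicly available pretrained model, so treat the whole thing as illustrative.

```python
from transformers import pipeline

# Off-the-shelf NLI model as a crude proxy for speaker-commitment judgements.
# (Illustrative only; not the system discussed in the talk.)
nli = pipeline("text-classification", model="roberta-large-mnli")

examples = [
    # Neg-raising: a committed reading of "he is not coming" is expected.
    {"text": "I don't think he is coming.",
     "text_pair": "He is not coming."},
    # The hard case (2) above: "I was mistaken" reverses the commitment
    # signalled by "I was convinced".
    {"text": "I was convinced that they would fetch up at the house, "
             "but it appears that I was mistaken.",
     "text_pair": "They fetched up at the house."},
]

for ex in examples:
    result = nli(ex)[0]  # top label and its probability for this premise/hypothesis pair
    print(f"{ex['text_pair']!r}: {result['label']} ({result['score']:.2f})")
```

If the diagnosis above is right, a probe like this should sail through the first pair and stumble on the second; either way, the judgement is being read off surface co-occurrence patterns rather than off any grasp of what the speaker is actually committed to.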

With regard to Being in the World, Ellie Pavlick presented her work on trying to teach meaning grounding to neural nets, as a way of probing whether training on the physical properties of events denoted by motion verbs would help in acquiring the right behaviours and underlying representations. The evidence seems to be that modest gains in performance are indeed possible in certain domains based on this kind of training. But here one wonders whether we can follow up those gains in all other domains without fully recreating the learning environment of the child in all its gory and glorious detail. The reductio of this approach would be a situation where you require so much data and nuance that it would be impossible to construct short of birthing your own small human and nurturing it in the world for five years. As Ellie rightly pointed out in discussion, however, the great advantage and excitement of being able to program and manipulate these neural nets is the controlled experiments you can do on the information you feed them, and the way you can potentially selectively interrogate the representations of a successful model to try to come up with a decomposition of a complex effect, which might in the end be relevant to understanding the cognitive decomposition of the effect in humans.

Aaron White's talk was about an experiment in training a neural net to match acceptability ratings, leading to the induction of a type structure for different constructions. The basic model was a combinatory categorial grammar with standard basic types and modes of combination. The intermediate interchange format was vector space representations, which are flexible and do not require prejudging the syntax or the compositional semantics. The point of the training is to see what gets induced when you try to create a system that best predicts the behavioural data. The test case presented was clausal embedding, and, peering under the hood afterwards, we can ask what kinds of 'types' were assigned to clausal complements of different varieties, and with different embedding verbs. The types induced for clausal complements were very varied and not always comprehensible. Some seemed to make sense if you were thinking in Inquisitive Semantics terms, but others were harder to motivate. All in all, it seems like the job of interpreting why the model came up with what it did is as hard as the original problem, and moreover bears an ill-understood and equally complicated relationship to the original problem of how humans 'do' meaning composition. There are a lot of details here that I clearly do not understand.

All in all, it was a fascinating panel raising a lot of big-picture issues in my own mind. But I come away with the suspicion that while BERT and his descendants are getting better and better at performing, their success is like getting the answer 42 to the meaning of Life, the Universe and Everything. It still does not help if we don't know what exactly their version of the question was.

Jabberwocky, the Beast that Tames the Beast?

Here is a short blog version (without slides) of a talk I gave at the recent Jabberwocky workshop hosted jointly by UMass Amherst and the University of Bucharest (thank you Camelia Bleotu and Deborah Foucault for a great initiative!). The ideas in this talk were rather non-standard and, I suspect, rather unpopular, but the concept was interesting and it was a great group of people to interact with. Unfortunately, the time zone and weekend timing of the workshop did not allow me to participate as fully as I would have liked, so I am airing those ideas here on this blog just in case someone is interested.

Jabberwocky sentences are syntactically well-formed sentences with nonsense content words, like this one I just made up: She didn't glorph their lividar.

If you are a syntactician, the nonce words here are a clever way to eliminate the effect of real lexical items and conceptual content, and to zero in on the combinatorial processes which underlie sentential structure and generativity. The very fact that we can make these sentences seems to show that this aspect of language is distinct and modularizable away from the Lexicon per se. It is also good to be able to abstract away from contentful lexical items in a variety of methodologies, because controlling for frequency, semantic prediction, association etc. can be hard. From the point of view of the syntactician, Jabberwocky sentences seem to offer a way of surgically removing the messy bits and targeting pure syntax.
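As a small aside on how such stimuli get built, here is a minimal sketch of the recipe (a toy of my own, not a stimulus-generation tool from the literature): keep the function words and inflections of a template sentence and swap each content word for a phonotactically plausible nonce stem.

```python
import random

# Toy inventory of phonotactically plausible-ish building blocks for nonce stems.
ONSETS = ["gl", "br", "sp", "tr", "fl", "cr", "pl"]
NUCLEI = ["o", "i", "a", "e", "u"]
CODAS = ["rph", "nd", "lk", "sk", "mp", "v", "dar"]

def nonce_stem() -> str:
    """Build a pronounceable nonsense stem, in the spirit of 'glorph' or 'lividar'."""
    return random.choice(ONSETS) + random.choice(NUCLEI) + random.choice(CODAS)

def jabberwocky(template: str, content_slots: set[int]) -> str:
    """Replace the words at the given positions with nonce stems,
    re-attaching a final inflectional ending (-ing, -ed, -s) if the original had one."""
    words = template.split()
    for i in content_slots:
        suffix = next((s for s in ("ing", "ed", "s") if words[i].endswith(s)), "")
        words[i] = nonce_stem() + suffix
    return " ".join(words)

# "She didn't glorph their lividar": the function words and syntax survive,
# the content words do not.
print(jabberwocky("She didn't criticize their decision", content_slots={2, 4}))
```

Real experimental stimuli are of course vetted far more carefully (for phonotactics, lexical neighbours, and accidental real words), but the logic is the same: the grammatical skeleton survives and the content words are stripped out.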

So the lexicon is hard, but in modern Chomskian views of grammar, the Lexicon is also the boring bit, where memorized chunks exist but where no generative processes reside. This is taken to extremes in the Distributed Morphology tradition, where roots are devoid even of the syntactic information that would tell you how to insert them in a sentence. The formal semanticists tend to concur: in that tradition we prove theorems of the form 'Snow is white' is TRUE iff snow is white (Davidson 1967), where the contentful lexical items are simply repeated in the metalanguage, languishing there for someone else (presumably NOT the formal semanticist) to elucidate.

However, there are some reasons to be a little suspicious of the possibility of excising the LI in a clean modular fashion.

Jabberwocky and fMRI

Fedorenko et al. (2010) developed a localizer task to help define regions of interest (ROIs) for linguistic experiments using fMRI. They used four conditions:

1. Sentences (the Sentences condition)

2. Scrambled sentences (the Word List condition)

3. Jabberwocky sentences

4. Scrambled Jabberwocky sentences (the Non-words condition)

The Sentences > Non-words contrast showed the language regions. Word lists and Jabberwocky sentences both showed intermediate activation of the sentence regions but could not be reliably distinguished from each other. The Words > Non-words and Jabberwocky > Non-words contrasts showed 'inconsistent and variable results across subjects'. This is disappointing if we think that Jabberwocky sentences should show the brain doing its pure syntactic thing.

Jabberwocky Sentences and Neural Oscillations

There has been recent work in neurolinguistics exploring the idea that the processing of hierarchical linguistic structure is correlated with the synchronization of brain rhythms in various frequency bands. Kaufeld et al. (2020) recorded EEG while 29 adult native speakers (22 women, 7 men) listened to naturally spoken Dutch sentences, Jabberwocky controls with morphemes and sentential prosody, word lists with lexical content but no phrase structure, and backward acoustically matched controls.

I quote: "Mutual information (MI) analysis revealed sensitivity to linguistic content: MI was highest for sentences at the phrasal (0.8–1.1 Hz) and lexical (1.9–2.8 Hz) timescales, suggesting that the delta-band is modulated by lexically driven combinatorial processing beyond prosody, and that linguistic content (i.e., structure and meaning) organizes neural oscillations beyond the timescale and rhythmicity of the stimulus."

The jabberwocky sentences on the other hand were no different from word lists with lexical content and no phrase structure on this measure.

One reaction to this kind of disappointing result is to say that syntax is just not modularizable in the way we thought. This seems to be the position of Blank and Fedorenko (2020) and Mahowald et al. (2022), essentially embracing work in Construction Grammar (Goldberg 1995, Goldberg and Jackendoff 2004).

These authors are also quick to point out that we don't 'need syntax' to understand complex sentences most of the time, since lexical content and real-world knowledge do the job for us. These 'fake debates', I think, present us with a false set of analytic options. Grammar is not all constructions, with lots of rich lexical content interacting with statistical properties of the world; nor is it Super Syntax the heavy lifter (oh wow, recursion), with lexical items a relic of fuzziness that syntax can mold and structure for its creative purposes.

My own interpretation is to say that syntax exists (and is a cool, weird thing), but that it needs to feed off content for the whole engine to get rolling. This means that our job as linguists requires us to understand (at least) two things:

(1) What are lexical meanings?

(2) How do they integrate with syntax in a compositional and combinatorical way?

So, we should use Jabberwocky sentences not to erase the lexical item, but as a way of trying to understand it better. All words are nonce before we know them.

The Real Beast: Jabbo Sapiens

This talk is a plea to use Jabberwocky sentences and nonce words to help us understand not the comprehensible residue, but the things they are replacing: content words themselves! These little monsters, these Jabbo Sapiens, turn out to pose loads of hard problems for compositional semantics and for understanding how we communicate with other minds.

One might argue, with Donald Davidson, that building truth theorems is already hard, and good enough, and that it really is not the immediate job of the formal semanticist to elucidate the meanings of the individual lexical concepts snow and white.

The problem with the meanings of open-class lexical items is twofold:

(i) they are conceptually polysemous while still being atomic with respect to how they function within the system, and

(ii) they undergo productive compositional processes with each other.

The latter point shows that understanding their behaviour is an important component of understanding the central properties of the human language system and its powers of productive meaning generation.

The psycholinguistics literature is very clear in showing us that there is a hub, or unity, to the lemma, with a localized point of access. This point of lexical access seems to be in the middle temporal gyrus (MTG) and is independent of whether the sensory input is visual or auditory (Indefrey and Levelt 2004, Hickok and Poeppel 2007, Friederici 2012). Activation in this area can also be tracked using MEG and fMRI. Based on both neurolinguistic and behavioural evidence, we have strong support for the existence of the lemma, the lexeme family underlying a symbol and all of its inflectional forms. Specifically, we know that lemma frequency as a whole (not the frequency of individual forms) modulates effects in the 300–450 ms time window in the MTG (Solomyak and Marantz 2010).

This literature is important because it shows that there is a lemma hub for all inflectional forms of the 'same' lexeme. But what constitutes 'sameness' in this sense? While in practice it is not always easy to decide whether a pair of meanings associated with a form are homonyms or polysemic variants, or what leads learners/speakers to classify them as such, the evidence now seems clear that we can distinguish between cases where there must be two 'lexical entries' and cases where there must be one. The cases where we have clear evidence for one lexical entry involve lemmas which characteristically embrace a large number of polysemic variants. Thus, polysemy is sharply distinguished in terms of cognitive consequences from homonymy, or genuine ambiguity, in which two distinct lemmas happen to share the same form. Polysemous readings are bunched together for the purposes of priming. Polysemous meanings are facilitatory in word recognition, while genuine homonyms are inhibitory and cause slowdowns in processing because more alternatives remain active (Rodd et al. 2002 (lexical decision), Beretta et al. 2005 (MEG)).

Jabbo Sapiens and Polysemy in Acquisition

How does a learner decide to group forms heard under the same umbrella lemma, the 'same lexical entry' if you will? Both the typological evidence and evidence from developing semantic competence in children show that polysemy is natural and ubiquitous. Novel word learning in children shows generalization across polysemous senses, even when the denotational percepts are quite different (Snedeker and Srinivasan 2014). Children also distinguish clearly between homonymy and polysemy at an early age, before they pass any tests of metalinguistic competence, showing that the difference cannot be metalinguistic, as claimed by Fodor and Lepore (2002) (Srinivasan and Snedeker 2011). Moreover, certain types of polysemy seem not to be idiosyncratically memorized, but are plausibly part of a pervasive conceptual system underlying all languages. For example, the container/containee polysemy was found by Srinivasan and Rabagliati (2019) across 14 different languages (see also Zhu and Malt 2014 for crosslinguistic evidence).

The take-home point of this blog post is the following. Current formal semantic and morphosyntactic models fall short in explaining how the symbolic primes of the open-class system are compositionally integrated into sentences. Lexical items are usually relegated to a sealed-off no-man's-land that is somebody else's business. But how the two domains interact in practice is never made explicit, and it turns out to be both HARD and IMPORTANT.

References

Asher, N. (2011). Lexical Meaning in Context: A Web of Words. Cambridge: Cambridge University Press.

Beretta, A., R. Fiorentino, and D. Poeppel (2005). The effects of homonymy and polysemy on lexical access: an MEG study. Cognitive Brain Research 24, 57–65.

Blank, I. and E. Fedorenko (2020). No evidence for differences among language regions in their temporal receptive windows. NeuroImage 219, 116925.

Fedorenko, E., P.-J. Hsieh, A. Nieto-Castañón, S. Whitfield-Gabrieli, and N. Kanwisher (2010). New method for fMRI investigations of language: Defining ROIs functionally in individual subjects. Journal of Neurophysiology, 1177–1194.

Fodor, J. and E. Lepore (2002). The emptiness of the lexicon: Reflections on Pustejovsky. In The Compositionality Papers, pp. 89–119. Oxford University Press.

Friederici, A. (2012). The cortical language circuit: from auditory perception to sentence comprehension. Trends in Cognitive Sciences 16 (5), 262–268.

Goldberg, A. (1995). Constructions: A Construction Grammar Approach to Argument Structure. Chicago: University of Chicago Press.

Goldberg, A. and R. Jackendoff (2004). The English resultative as a family of constructions. Language 80, 532–568.

Hickok, G. and D. Poeppel (2007). The cortical organization of speech processing. Nature Reviews Neuroscience 8(5), 393–402.

Indefrey, P. and W. J. Levelt (2004). The spatial and temporal signatures of word production components. Cognition 92(1-2), 101–144.

Kaufeld, G., H. R. Bosker, P. M. Alday, A. S. Meyer, and A. E. Martin (2020). Structure and meaning "entrain" neural oscillations: a timescale-specific hierarchy. Journal of Neuroscience 40(49), 9467–9475.

Leminen, A., E. Smolka, J. A. Duñabeitia, and C. Pliatsikas (2018). Morphological processing in the brain: the good (inflection), the bad (derivation) and the ugly (compounding). Cortex 116, 4–44.

Mahesh Srinivasan, C. B. and H. Rabagliati (2019). Children use polysemy to structure new word meanings. Journal of Experimental Psychology: General 148(5), 926–942.

Marslen-Wilson, W. T., M. Ford, L. Older, and X. Zhou (1996). The combinatorial lexicon: Priming derivational affixes. Proceedings of the Eighteenth Annual Conference of the Cognitive Science Society 18, 223–227.

Marslen-Wilson, W. T. and Tyler (2007). Morphology, language and the brain: the decompositional substrate for language comprehension. Transactions of the Royal Society of London. Biological Sciences 1481 (362), 823–836.

Pustejovsky, J. (1995). The Generative Lexicon. Cambridge, Ma.: MIT Press.

Rodd, J., G. Gaskell, and W. Marslen-Wilson (2002). Making sense of semantic ambiguity: semantic competition in lexical access. Journal of Memory and Language 46, 245–266.

Sahin, N. T., S. Pinker, S. Cash, D. Schomer, and E. Halgren (2009). Sequential processing of lexical, grammatical, and phonological information within Broca's area. Science 5951 (326), 445–449.

Snedeker, J. and M. Srinivasan (2014). Polysemy and the taxonomic constraint: children’s representation of words that label multiple kinds. Language Learning and Development 10(2), 97–128.

Solomyak, O. and A. Marantz (2010). Evidence for early morphological decomposition in visual word recognition. Journal of Cognitive Neuroscience 22(9), 2042–2057.

Srinivasan, M. and J. Snedeker (2011). Judging a book by its cover and its contents: The representation of polysemous and homophonous meanings in four-year-old children. Cognitive Psychology 62 (4), 245 – 272.

Whiting, C., Y. Shtyrov, and W. Marslen-Wilson (2014). Real-time functional architecture of visual word recognition. Journal of Cognitive Neuroscience 27(2), 246–265.

Zhu, H. and B. Malt (2014). Cross-linguistic evidence for cognitive foundations of polysemy. Cognitive Science 36.

Academia During the Dark Ages

The term Dark Ages is associated with a period of medieval Europe sandwiched roughly between the fall of the Roman Empire and the Renaissance, in which ideas and growth were allegedly stultified by fear, superstition and religious dogma. I am reliably informed that the early historians who invented the term (Petrarch among them) overstated their case and exaggerated this narrative for effect, and that in fact lots of good things happened in, say, ninth-century England; but of all of that I am not personally in a position to judge. And in any case it is not the topic of the present blog post. Rather, I use the term Dark Ages to refer to the period that we currently find ourselves in (the twenty-first century, but more properly starting from roughly the end of the 1990s) with respect to the realm of Academia. In fact, the term Dark Ages is actually a gesture towards positivity, because it implies that it will be followed by a period of enlightenment. I certainly hope that is the case, but if so, it does not seem to me that it is imminent. It is also not lost on me that the phrase Dark Ages conjures up impressions rather different (plague/pandemic notwithstanding) from the shiny, clean, big-data, money-driven, professionalized academic spaces that our universities are curating. But bear with me. It's a metaphor.

The impetus to put some thoughts down on paper came after I received a bulk email from my head of department asking for volunteers from the faculty to join working groups that would collaborate on writing the first draft of our university's long-term strategy under its new leadership team, which had just taken over. I must confess I was curious about the direction my university was going to take under the new vice chancellor. I felt that under the previous leadership team our university had moved in all kinds of bad directions ideologically: seeking to run the university more like a business, taking more and more decisions in a top-down, opaque and unaccountable way, and blindly implementing reports and checks and ever more paperwork to measure and assess every quantifiable aspect of success and failure. Maybe the new team would take a fresh approach! So I glanced at the strategy plan that was going to be fleshed out into this important document.

Brave New World?

From the government we inherit a remit to deliver on education and lifelong learning, research, public dissemination, relevance to society and working life, innovation and the creation of value. My university in particular promises to pair this with an emphasis on the North and the Arctic (obviously), commits to open-access science (you can't hate it), and promises to use its multiplicity and heterogeneity of campuses, people and skills as an engine for problem solving and creating innovative solutions to the needs of society now and in the future (blah blah blah). If you trawl around the internet for university strategy documents you will find every university using the very same buzzwords, so it's hard to figure out what it all means in practical terms, especially the last one. It is also hard to actively disagree with the positive statements in such strategy documents; they all sound like such good things, don't they?

In interrogating myself about why I felt underwhelmed and discontented with the outline I read, I decided to force myself to articulate what I thought should be the central remit of a university, and what I thought was missing. Everyone is emphasizing relevance, innovation, vision, and how grounded their university is in their own community's particular needs and sources of expertise. It seems to be the zeitgeist. Maybe that is just the sign of good ideas taking off. And the language is so vigorous, forward-thinking and optimistically engaged (albeit on the vague side). Maybe I am just being churlish in picking holes in the vision (`churlish' is just one of the kinder words that could be applied to me these days. I embrace it.). But as I continued to think about it, I came up with at least two major areas where I believe universities have a solemn duty to contribute, but which have completely disappeared from public discourse.

The bottom line is that universities are being conceived of, on the education side, as engines for creating workforces, and on the research side as crucibles for technological innovation. The University is seen as the tool of Capital, and it is funded to the extent that it fuels Growth and supplies precisely the workforce that the holders of Capital need. Those making the strategy decisions at the top will tell you that of course this is what the students want too: they want jobs. Sure. That seems to be the bare minimum, though. There are other things that the university in a mature democracy should be doing beyond that bare minimum. But these things have disappeared from strategy documents, and even from the strategic thinking of educators, and they are actively being eroded because of it. I give them an airing here.

Education: Critical Thinking and the Challenge of True Democracy

This is clearly relevant for society these days! Just not for getting you a job.

I think democracy is hard and it's fragile. But it's the only system worth having, and democracies need to work actively to maintain their health. Democracy and voting rely on a nation's communities having access to education and information, and a sense of responsibility about what is at stake. In most university programmes before the Dark Ages, any degree taught by good lecturers provided transferable skills of critical thinking and assessment of argumentation, and knowledge of how to go about reliably finding out whether something is true. In the modern age, we have our own special problems concerning how information is disseminated and checked, the rise of propaganda, and the difficulty of escaping bubbles. Our universities need to take the lead in giving people the skills to navigate the increasingly tricky business of finding reliable information sources, and in recognizing bias, learning how to overcome emotion or prejudice in assessing arguments, etc. Our universities also need to take the lead in actively supporting the humanities (by which I mean not just not taking money away, but actually channeling money into them). Because every young person needs to understand the cultural and historical context of the world they will be living (and voting!) in, and they need to be exposed to other minds and voices through fiction, which develops empathy and helps transcend tribalism. (At MIT, where I did my undergraduate education, we had a humanities requirement. Everyone had to take a couple of humanities courses, of their choice, regardless of whether they were majoring in Engineering or Math or Chemistry or whatever. I think this should be built into all university degrees.)

I am not so naïve as to think I will ever convince a country or a university to make this their major remit. But I sure would like to see it as a bullet point in a strategy document.

Education (Lifelong): Tools  for Understanding,  Personal Development and Satisfaction

If you get down to first principles, growth and the economy are not really the things we need as a society. We want everyone to have the necessities of life and the opportunity to live full, happy and fulfilled lives. The modern world is one in which we humans have an increasing proportion of time to devote to leisure, because hard, time-consuming physical labour has been taken over by machines and tedious labour by computers.

How do we make ourselves happy?  

Education gives people tools and resources to keep learning and understanding their world. This leads to happiness. Curiosity-based learning and the acquisition of new skills lead to the appreciation of complex and satisfying forms of leisure, and help us be less bored and passive in our consumption of entertainment.

Education is not job training. Education is something that human minds thrive on and we do it not to promote growth or technological advance, but just because human minds love to be so engaged.

I would love to rethink the remit of modern universities based around feeding curiosity and developing young people’s skills and resources with the goal of helping them find the thing that they are good at. I am guessing that then, whatever they end up specializing in, they will find a niche in society where they can do a job that contributes to the society’s goals and that makes them happy.   

Research: Curiosity Based Research

When it comes to research, the primary emphasis should be on curiosity-driven research. And there should be no competition for grants and funding.

Right now, senior academics as well as early-career researchers are forced into an endless cycle of applying for grants and producing publication points. The grants nowadays are skewed towards societal `relevance' and impact, regardless of whether this fits in with the researcher's own set of scientific questions. At first, a decade or so ago, we just added an Impact paragraph to our applications, but now, increasingly, the whole research agenda must be rethought and new strands of inquiry invented just to get on the grant bandwagon. Primary basic research is not really respected unless it brings in big grant money, and big grant money increasingly depends on subjective judgements of relevance and coolness. The standards of grant applications get higher and higher. Most of these grants are worthy and interesting, and if the history of science tells us anything, it is that it is generally impossible to predict which new thoughts or ideas are going to lead to big advances in some body of knowledge. In the meantime, the lottery for what gets funded is driven by forces that are random at best and skewed in an overly superficial direction at worst. And most importantly, researchers' time is eaten up in this fruitless and soul-destroying activity.

I read somewhere that if you take all the money that is spent in organizing the application system, the reviewing system and the administration and reporting of grants across the academic world, you could just give EVERYONE a research grant and resources to pursue a question and save big bucks.

Well, I guess that's not going to happen. But the system is rotten and we all know it. It is built on competition and insecurity. Young and early-career researchers are experiencing financial insecurity, stress and burnout from increasingly unrealistic expectations, with very little in the way of intellectual reward.

Climate Change and Man’s Relationship to the Planet

Given the scientific consensus on this, I am sort of surprised that a university like mine, which wants to be a leader on the North and the Arctic, does not explicitly come out and say that it wants to lead on helping to reverse the damage we have done to the planet and on mitigating the effects of the ongoing climate crisis, especially as some of the clearest first signs, such as polar melt, are Arctic issues. Maybe this will come in the details of the strategy document that I am not going to sign up to help write. But I am not holding my breath. It would probably be considered too political to state such a remit, although the scientists do seem to agree that these are basic facts, not opinions. I suspect that the present emphasis on local rootedness is directly connected to universities' non-engagement with issues of a global, universal nature. The world has become increasingly globalized. As long as universities shy away from the big hard questions and see their remit as providing growth, jobs and research grant money to their own local patch, they will not be the engine for critical pushback and change that we so desperately need.

But Some Things Have Got Better, Right?

Since the nineties, some things have improved in certain parts of the world. Rights for LGBT+ and trans folks have improved, and diversity in the higher echelons of power, in terms of representation of women, people of colour, etc., has improved somewhat (though I feel personally that the status of women in academia has stagnated since the nineties and lagged behind other kinds of progress). Access to higher education has improved in many parts of the world. In many arenas, new, fresh and progressive voices are being heard for the first time above the drone of the wisdom of the perennially entitled. This is as it should be. As society changes and people who were not privileged from birth come to have access to education, so there will be changed discourse, re-evaluations and upheaval. This is also what universities are for. But I fear that these forces are being managed and de-toothed as we speak. And even access to privilege through education is being clawed back on two fronts: by turning universities into job-making factories, and by containing and demoralizing their employees, who strive to teach and think while getting grants, staying relevant, and preventing their academic areas from getting the axe (is my course popular enough? does this degree add value in terms of increasing the projected salary of those who take it?).

The `Dark Ages´ in my long saga refers to our modern era with its commodification of intellectual capital and the Control of Academe by those who currently control the economy.

This piece will no doubt read, to most of you who make it this far, as quixotic, irresponsibly naïve and deeply impractical. But I would remind you that as a Gen X-er (who are they, again?), I actually do remember a time when these things were explicitly talked about in educational circles. So these ideas had not yet vanished from the discourse when I was an undergraduate. And they are not inherently impractical either. I have watched the narrative shift, continuously and inexorably (just as the political narrative has shifted), to the extent that we have all been made to swallow as a basic premise the idea that anything other than the work-life-relevance zeitgeist is untenable (just like the false belief that anything other than the free market and global capital is an untenable system; the two narratives are, by the way, not unrelated).

Petrarch talked about the Dark Ages primarily in relation to the light that had come before, in the form of classical antiquity, not in relation to the Enlightenment to come. I don't know if there is any backlash on the horizon that could lead us out of the stranglehold that this package of ideas has on the world at the moment. I do not know what it would take to turn this particular boat around. I fear that things will have to get a lot, lot worse before they will be allowed to get better. But I am in the market for ideas!

Minimalism and the Syntax-Semantics Interface Part IV: Formal Semantics vs. I-Semantics

So far we have argued that the formal semanticists' use of an intermediate logical language (the semantic representation), as discussed in earlier posts, is widely considered by the field to be at the level of a computational theory in the sense of Marr (1982), and is not intended to encode internal mental representations of meaning in any psychologically real fashion.

So what understanding of the human mind, then, do we gain from a study of the syntax-semantics interface construed in this way? The whole enterprise is made more difficult by the fact that we are essentially attempting to solve an equation in three unknowns: we don't yet know what syntactic representations (their primes and primitive operations) actually look like, we don't know what those abstract, language-specific semantic representations look like, and we do not understand the principles of the mapping between the two, except that we know it must be systematic.

The history of generative grammar shows a multiplicity of different formal proposals concerning what syntactic representations actually look like, with no consensus currently in sight. And the history of formal semantics shows, in turn, that the mapping rules change drastically depending on the type of syntactic theory being interfaced with (cf. Lechner 2015; Partee 2014). The semantic representation language was taken over from formal logic systems, and it too has adapted slowly over time to form a better fit for the syntax (the particular kinds of syntax) that formal semanticists are mapping from.

As the history of syntactic theorizing has shown, there is always a choice between enriching the syntactic representation and enriching the mapping rules between it and the semantic representation language. Within generative grammars alone, at least two different trends can be distinguished. In more derivational and/or abstract syntactic theories, abstractness in the form of covert rules and implicit structures (the Logical Forms of classic GB syntax, but also empty categories and implicit or unpronounced structure) is motivated by generalizations over interpretations. Less abstract, more direct and monostratal syntactic representations (e.g. Categorial Grammar, Lexical Functional Grammar, Montague Grammar itself, and Head-driven Phrase Structure Grammar) form the input to mapping rules which in turn must be more flexible and rich in their output possibilities. It is easy to see that in this kind of situation the nature of the mapping rules, and of the intermediate representations built, can differ quite substantially from theory to theory.

The primes of the semantic representation language are also subject to variability from the pragmatics side. Every time a proposal is made about a pragmatic principle that can deliver the correct truth-conditional results from a more indeterminate semantic representation, this also forces the primes of the semantic representation to be adjusted (see, for example, the effect that Discourse Representation Theory had on the interpretation of the definite and indefinite articles in English). Every time a change occurs in one of these three areas, the details of the whole package shift slightly. The only thing that remains constant is the anchoring in truth conditions. We have to get there in the end, but if these are purely computational or instrumental theories, then we should not put too much stock in exactly how we get there, implementationally speaking.

Even compositionality, as a Fregean principle constraining the relationship between syntactic and semantic representations (see Heim and Kratzer 1998), can always be saved by employing the lambda calculus (Church 1936), a mathematical innovation which allows the decomposition of the logical representation of complex expressions into pieces (higher-order functions) that can match the constituents that syntax provides (whatever those turn out to be). So compositionality in this technical sense turns out not to be a criterion by which these theories can be distinguished from each other. Only if we believe that the convenient semantic representations we posit have some kind of cognitive or algorithmic reality, or at least that there is some cognitive reality to the boundaries being drawn between the different components, is the specific research area of the syntax-semantics interface distinguishable from formal semantics simpliciter.
In fact, most formal semanticists are unwilling to do any `neck baring', let alone `sticking out', in the area of psychological or cognitive prediction.
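As a minimal illustration of the kind of lambda-calculus decomposition at issue (this is just the textbook treatment of a quantified subject, my own choice of example rather than any particular theorist's analysis):

    ⟦every⟧ = λP.λQ.∀x[P(x) → Q(x)]
    ⟦every dog⟧ = λQ.∀x[dog(x) → Q(x)]
    ⟦every dog barks⟧ = ∀x[dog(x) → bark(x)]

The higher-order function assigned to `every' is chosen precisely so that successive functional applications can track whatever constituency the syntax happens to deliver (the determiner combining first with the noun, and the resulting phrase with the predicate), which is the sense in which the lambda calculus can always come to the rescue of compositionality.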

Unlike the formal semanticists proper, those of us working at the interface are interested in intermediate representations that we believe bear some organic relation to actual representations in actual minds. For many of us, the quest is to understand how the syntactic system of language discretizes and categorizes in order to create a workable symbolic tool for the mind, while still liaising with the brain’s general, language independent cognitive machinery.

Of special note here is a recent minority movement within formal semantics/philosophy of language towards exploring natural language ontology (Moltmann 2017, 2020). Moltmann in particular has argued that natural language ontology is an important domain within descriptive metaphysics (using the term from Strawson 1959), which is distinct from the kind of foundational metaphysics that the philosophical tradition tends to engage in, with its spare ontological commitments of Truth and Reference. I see natural language ontology as primarily interrogating our assumptions about the nature of the intermediate semantic representation that mediates between syntax and truth-evaluable representations, building its primes on the ontological commitments implicit in natural language(s) itself. As Fine (2017) argues, there is a case to be made that progress in foundational metaphysics relies on a close and nuanced understanding of the descriptive metaphysics involved in natural language ontologies. But even if that were not the case, it seems to me that the project of natural language ontology is crucial if we are to understand the compositional products of meaning and meaning-building in language, and the mechanisms by which they are embedded in our cognition and cognitive processing more generally. The spare and elegant axiomatization of semantic descriptions, anchored just in truth and reference to particulars, simply does not do justice to the contents, including the partial and incremental contents, that we see in language. Exploring natural language ontology in its own right, taking the internal evidence as primary, is a prerequisite to getting this kind of deeper understanding. Thus, even though we might think of Syn-SemE as a computational theory, we can still have the goal of developing a language of primitives on the semantic representation side that is more responsive to the implicit categorization found in natural language. Formal semantics took its initial language for the semantic representation from formal logics, but has also repurposed that representation over time to fit natural language better. The research area of natural language ontology takes that goal to its natural conclusion: it questions the basic ontology of these representations, and potentially moves the model closer to one that will eventually be more commensurate with cognitive and neurolinguistic theories.
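One standard illustration of the sort of ontological enrichment at stake (my choice of example: the familiar neo-Davidsonian treatment of adverbial modification, not anything specific to Moltmann's programme) is that an ontology which includes events lets an adverb contribute a simple predication over an event particular:

    ⟦Anna ran quickly⟧ = ∃e[run(e) ∧ agent(e, anna) ∧ quick(e)]

An ontology restricted to individuals and truth values handles the same fact far less gracefully, and the same moral arguably extends to degrees, kinds, tropes and the other categories that natural language ontology takes seriously.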

In turn, the patterns that emerge robustly from this kind of natural language investigation provide clues both to the nature of language itself and to the realities of the cognitive systems that it is embedded in. In Part I, I laid out three types of question for I-semantics: Type A questions concerning descriptive generalizations relating semantic systems and the cognitive systems they feed; Type B questions related to acquisition and cognitive development; and Type C questions concerning the feedback effects of having a language on the very cognitive systems that it subserves. I close this post with a number of examples of phenomena that I think count as instances of Type A generalizations. Note that the existence of these `universals' would be a surprising fact if general cognition were just one symmetric side of a listed form-meaning pairing. While there seem, for example, to be no deep generalizations concerning how syntactic primitives are mapped to externalized signals, there are candidates for universals in the mapping to I-semantics. I give some possible candidates in the following list:
(i) Without exception crosslinguistically, tense information is represented hierarchically in the syntax outside of causation in the verbal domain, and referential facts such as novelty or familiarity of reference are represented outside of size, colour and substance in the nominal domain (see Julien 2002).
(ii) All human languages make category distinctions within their lexical inventory, minimally N(oun) vs. V(erb) (Baker 2003), and we know that these kinds of syntactic category distinctions cannot be predicted from external facts about the world. But what is this a discretization of in our I-semantics of the world?
(iii) All human languages show an open-ended combinatorial ability of open-class items to build creative new meanings.
(iv) Semantic modes of combination can be classified minimally into selectional, modificational and quantificational relationships. In other words, even though there is no single semantic combinatoric nexus that will cover all the attested forms of semantic combination, there seems to be a restricted set of semantic nexus types that all languages use (see Higginbotham 1985; Jackendoff 2002), conditioned in systematic ways by syntax (see the schematic illustration after this list).
(v) Quantificational relationships in the semantics always correspond to a particular hierarchical format in the syntax, with the restrictor of the quantifier in combination with the operator, and the scope of the quantifier combined with that. This correlates with the semantic conservativity of all natural language quantifiers (Barwise and Cooper 1981, Lewis 1975), also illustrated after the list.
(vi) The semantics of scalar structure is tracked by linguistic formatives across the syntactic categories of N(oun), V(erb), A(djective) and P(reposition), in all the languages that have been studied.
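To make (iv) and (v) concrete, here is one schematic rendering in the usual textbook notation (the formulation is mine and purely illustrative):

    Selection (functional application): ⟦barks⟧(⟦Fido⟧) = bark(fido)
    Modification (predicate intersection): ⟦grey dog⟧ = λx.[grey(x) ∧ dog(x)]
    Quantification (restrictor plus scope): ⟦every⟧(⟦dog⟧)(⟦barks⟧) = ∀x[dog(x) → bark(x)]

    Conservativity: Q(A)(B) ⟺ Q(A)(A ∩ B), e.g. `Every dog barks' is true just in case `Every dog is a dog that barks' is true.

The conservativity schema is what underwrites (v): the second argument of a natural language determiner only ever matters to the extent that it overlaps with the restrictor.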


These are basic empirical generalizations, at a fairly abstract level, about how human languages compile meanings, and independently of the existence of the Minimalist Programme they are things that it seems to me it is the job of the theoretical linguist to pursue in some way. Thus, properly understood, the Minimalist Programme does carve out an interesting and important domain of inquiry, one that might legitimately be called the syntax-semantics interface (Syn-SemI).

Minimalism and the Syntax-Semantics Interface Part III: I-Semantics

There are two ways of thinking about the syntax-semantics interface. One involves grounding the meaning of sentences in descriptions of the world that make them true, and investigating the feeding relationship between syntactic representations and truthmakers in this sense. This is a descriptive enterprise, and gives rise to computational proposals involving language-specific abstract semantic representations mediating between the syntactic representation and truth conditions. We could call this Syn-SemE. This is not inconsistent with the strong minimalist thesis construed in the way I have argued it can be in Part II, but it is not directly pursuing it.

It is important to reiterate that the formal semanticists' use of an intermediate logical language (the semantic representation) is widely considered by the field to be at the level of a computational theory in the sense of Marr (1982), and is not intended to encode internal mental representations of meaning in any psychologically real fashion.

I suspect, however, that many people working at what they think of as the syntax-semantics interface think they are doing something different from straight-up formal semantics. Unlike the classical formal semanticist, some of them do not use the semantic representational node as merely an instrumental device to mediate between the syntax and truth conditions (seeking accuracy, efficiency and elegance in doing so, but no more). They are often also interested in exploring the relation between two systems of representation, both of which are internal, and in understanding how the syntactic system of language discretizes and categorizes in order to create a workable symbolic tool for the mind, while still liaising with the brain's general, language-independent cognitive machinery. (I think, for example, that this is what Jackendoff in his work means by Semantics.)

An internalized semantic representation system that is tightly coupled to linguistic representations could be called I-semantics, since it does not represent facts about the world, but is an aspect of the mature I-language system of an individual speaker/hearer (in Chomsky’s 1986 sense in Knowledge of Language).

Thus, the other way of looking at the syntax-semantics interface is in terms of seeking explanations for the way syntax turns out to be. For this particular set of research questions we are interested in humans' mental representations of meaning as the mutually determining factor interfacing with syntax. We could call this Syn-SemI. The pursuit of questions concerning this latter interface could be considered a direct pursuit of a strong minimalist agenda. It will inevitably feed off the results of Syn-SemE, as described above, but also off the results of cognitive science and of psycho- and neurolinguistics.

When it comes to explanatory influence, there is, I contend, a clear asymmetry between the two interfaces classically referred to in minimalist theorizing. The externalization of the system is highly variable and contingent, and known to preserve functionality across the auditory vs. gestural modalities. Externalization per se is arguably one of the design factors of language, but the exact mode of externalization demonstrably is not. On the other hand, the domain of generalized cognition that syntax is embedded within presumably does not and cannot vary from language to language. Our highly complex human systems of thought and categorization are the source domain for language, and language is an extension of our superior cognitive abilities in apprising and categorizing the world. The human mind is the crucible within which language evolved in the first place, and while one does not need to accept the idea that language `evolved for communication' of our thoughts about the world, it certainly plays a symbiotic role in our ability both to cognize and to represent our own thoughts to ourselves.

Importantly, human language is not just a big bag of conventionalized symbols triggered by episodic stimuli in the world; it differs from other living creatures' signalling systems in a number of striking ways, and shares properties with those systems in others. Arbitrariness of the sign and externalization of the signal can be found in systems throughout the natural world, from monkey calls to mating dances, from songbird tunes to pheromones. But these collections of arbitrary signs do not have a syntax, and they do not systematically require the detailed tracking of the perspective of other minds; the property of creative, open-ended composition of meaning is unique to humans (apparently).

It is important therefore to emphasize that mere sign-sign relationships (systematic combinatorics) are not sufficient to create the natural language `magic'. Birdsong has been shown to have some form of syntax, in the sense of brute combinatorics, but it does not have semanticity. By this I mean that the units that combine and relate to each other in systematic ways do not correspond to meaning units that also undergo composition in parallel. It is the combination of syntax and semanticity that creates open-ended meaning composition, which is the core innovation of the human species (cf. also Miyagawa et al. 2014). The recursive composition process that syntax affords is the feature that delivers creative meaning composition that can be coded and decoded reliably by human minds.

So in terms of deep properties of natural languages (NLs), I would argue that the interface with the internal systems of cognition is not completely parallel to the interface with sound perception and production. The former is the interface with the source domain for the content being represented, while the latter is the contingent mode of externalization. Intuitively, the point of language is not to produce sound (this, for example, might be the point of music); the point of language is the expression of internal patterns of thought (whether for oneself or for others).

If we are to probe the question of what particular aspects of the mature working grammar emerge based on independent features of human thought, then we need to ask a different kind of question about the nature of the representations that the mature syntactic competence traffics in. Broadly speaking, we could characterize these questions as follows:

A. Are there mutually determining relations in the way syntax maps to thought, either with respect to primitive categories or the relations between them?

B. Are these hardwired or developmentally guided? What is the scope of variation in what is learnable in this domain?

C. To what extent do the features and categories reified in the syntax feed back into general cognition and start having an effect on our abilities to think abstractly and creatively?

These questions are at the heart of what I take to be the field of syntax-semantics within a specifically minimalist agenda, and of morphosemantics too, if morphology is a kind of syntax. Crucially, since the minimalist programme is stated in terms of the mind/brain of the individual, the core questions here refer to internal representations of meaning. So the Syn-Sem interface here is actually a somewhat different research programme from the one formal semantics is classically engaged in, and is based on an understanding of what I-semantics looks like. Unlike with the intermediate representations of E-semantics, there is a right or wrong of the matter when it comes to postulating I-semantic representations. This already brings its own methodological differences and challenges. More on which in my next post.

Minimalism and the Syntax-Semantics Interface Part II: The Autonomy of Syntax

The question of semantics and its relationship to syntax is often confusingly entangled with questions concerning the autonomy of syntax, controversies around the nature and scope of Minimalism’s universalist claims, and further confounded by the many different implicit definitions of what semantics itself is taken to be.

At a descriptive level, language systems are complex, involving form-based facts as well as generalizations related to effects on truth conditions. The overall system is clearly modular. Some effects on truth conditions are not part of the syntax itself but of pragmatics and inferential processes. Some effects on acoustics/articulation have to do with phonetic implementational algorithms that are not part of the narrow linguistic computation. At least when it comes to inputs and outputs anchored in the external world, we can pose objectively clear questions. For sound, the externalized and objective acoustic reality is measurable and quantifiable, and we can ask questions about the gap between that measurable output/stimulus and the internal representations that generate it (or perceive it). Similarly, meanings of sentences can be anchored through detailed descriptions of the external realities that make them true. We can thus ask about the relationship (gap) between the linguistic representations produced/comprehended and the truth conditions associated with them in context. These interface questions are important questions of synchronic grammatical description. So for meaning specifically we can ask what part of the mature speaker's behaviour is controlled directly by their linguistic competence, and what part is controlled by their pragmatic competence. Our pragmatic competence is of course finely tuned to subserve language, but it also arguably operates in non-linguistic domains.

But I would argue that when it comes to the nature of the syntax-semantics interface specifically, it is much less clear what the question is about, since linguists seem to disagree on what is meant by these labels, and the labels do not have neutral, architecture- and ideology-independent definitions.

In a mature grammar, syntax as a computational system operates with its own relations and primitives which, by hypothesis, are not reducible to either semantics or phonology. This is the gist of the autonomy of syntax proposal put forward by Chomsky as early as 1957 in Syntactic Structures, and maintained explicitly by Chomsky in 1982. Adger (2017) gives the following articulation of the principle:

“. . . syntax as a computational system that interfaces with both semantics and phonology but whose functioning (that is the computations that are allowed by the system) is not affected by factors external to it” (Adger 2017).

In terms of the functioning of a particular grammar as part of a person's synchronic competence, it is relevant to ask what sort of system the combinatorial engine is, and what determines its functioning. For example, if we consider the interface with systems of externalization such as phonology, most would agree that syntax does not have rules like (1):

(1) Allow a syntactic unit which begins with a plosive to syntactically bind a syntactic unit that begins with a liquid, but not vice versa.

It appears that a kind of modularity operates between the phonological system, with its categories and relationships, and the syntactic system proper, which works in a way that is blind to the difference between a stop and a liquid. This makes sense given that modes of externalization can vary, and that externalizing via sign, for example, gives rise to human language, with all the distinctive characteristics we are trying to understand, just the same as vocalized externalization does. In each case, the internal representations have to be systematically externalized for uptake by others, and then decoded and interpreted reliably by those others' minds.

Do we have a parallel to (1) when it comes to semantics? Indeed, nobody believes that there should be a rule of syntax, like (2), that makes distinctions directly in terms of referential content.

(2) Perform syntactic operation R on a syntactic unit A, if it refers to a mammal in an actual context of use.

This would cause syntax to treat some of its nominal units the same way based on whether they ended up referring to a mammal in a particular utterance. It would mean that the name Fido would have to behave differently syntactically depending on whether it were the name for a dog or for my pet lizard. And it would force my dog to pattern with one of those Fidos and not with my lizard. It seems bizarre to think of a language with such syntactic rules.

This seems fully parallel to the non-rule of syntax making reference to phonological segments, and at first blush argues for a parallel autonomy of syntax from semantics. But is this the right, or equivalent, analogy? Phonological features are, after all, abstract, mentally represented generalizations over auditory percepts, not acoustic reference itself. It is important to note, then, that there are cases of syntactic rules that appear to make reference to `semantic', or interpretable, features, for example in languages that care about animacy for subject selection. Even in those cases, though, language seems to care about the abstract syntactic category rather than the actual denotational facts, or even conceptual category facts about what the culture conceives of as `animate' or not, so that mismatches are found between abstract linguistic classification and cognitive judgements.

Consider also the toy rule in (3), which makes reference to a [+Q] feature correlated with being interpreted as a question.

(3) Move the syntactic element in T to C if the latter bears the feature [+Q]

But this is not a counterexample to the autonomy of syntax from semantics, because [+Q] is a syntactic feature by hypothesis. And though it might systematically be translated via an abstract interrogative semantic representation, it is not exceptionlessly correlated with an actual questioning speech act in practice.

Pragmatics may intervene, as in (4a), so that the outcome of [+Q] being present in the structure is not actually a request for information; conversely, the request for information in (4b) does not in fact require the syntactic feature [+Q] that is responsible for the movement of an overt tensed element past the subject. (Although distinctive intonation may be present, as is well known, this is actually dissociable from overt question movement.)

(4) a. Is the pope catholic?

b. You broke my favourite vase?!

It is widely acknowledged that the actual form of utterances radically underdetermines their truth conditions. Aspects of context, the anchoring of indexical elements, the resolution of anaphoric dependencies and the conversational implicatures triggered by the particular discourse context are all required before truth conditions can be specified. Nevertheless, the syntactic representation does provide a foundational skeleton of meaning contribution, which is an important ingredient of the concrete meaning intended and apprised in context.

Because of their inter-subjectivity, truth conditions have seemed a convenient, plausible (and indeed necessary) way of grounding discussions about what sentences of a natural language `mean'. There is a well-established use of the term semantics to pick out facts of reference and truth in an external, non-linguistic domain. I will refer to this use of the term semantics as E-semantics. The use of the feature [+Q] in syntactic theorizing can never be replaced by E-semantic facts concerning requests for information. Obviously. But let us imagine that we can isolate a contribution to truthmaking that is always the `translation' of the syntactic feature [+Q], as a component of some kind of intermediate semantic representation. Syntax would still be autonomous in the sense that it manipulates syntax-specific units of syntactic representation; it is the translation algorithm that maps these syntactically active features onto something regular in a corresponding (intermediate, and still language-specific) semantic representation.

Very many features standardly assumed in syntactic representations are suggestively labeled with words that gesture towards a kind of interpretation (interpretable features), but which are fully paid-up syntactic club members in practice. Their status as [+interpretable] refers only to the fact that such a feature is in the domain of the translation function from the syntactic representation to whatever representational form the underspecified linguistic `meaning' occurs in.

I suspect that most working linguists and semanticists believe that there is an independent semantic representation which operates with different primes and primitives from syntax, and the question of the syntax-semantics interface is a question of how the primes and primitives of the one kind of representation translate into the other. The challenge here is to understand the systematicity of that relationship, which somehow must hold, if children are to acquire the ability to creatively generate meanings of complex utterances from component parts. Given the role of context, no direct mapping between form and E-semantics is possible. But if there were an intermediate semantic representation generated by a systematic translation algorithm, then the gap to E-semantics could be filled in by studying the systematic relationship between that semantic representation and actual truth conditions. (This latter is what I take to be the traditional understanding of the field of pragmatics.)

One standard understanding of the syntax-semantics interface, then, is the study of the translation algorithm that operates between the syntactic representation and this intermediate semantic representation; pragmatics is the study of the inferential processes that fill the gaps between the intermediate semantic representation and the fully precisified representations that can be paired up with truthmakers.
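Purely as a toy illustration of this division of labour (nothing here is anyone's actual proposal: the tree, the feature inventory and the string-based `semantic representation' are all invented for the example), the two steps might be sketched as follows.

    from dataclasses import dataclass
    from typing import Tuple

    # A toy syntactic representation: labelled binary-branching trees whose
    # labels may carry syntax-internal features such as [+Q].
    @dataclass
    class Node:
        label: str
        children: Tuple["Node", ...] = ()

    # Step 1: the translation algorithm. It maps syntactic structure onto an
    # underspecified intermediate semantic representation, systematically
    # translating the syntactic feature [+Q] as an interrogative operator `?'.
    def translate(node: Node) -> str:
        if not node.children:                      # lexical items
            return node.label + "'"
        parts = " ".join(translate(c) for c in node.children)
        if node.label.endswith("[+Q]"):
            return "?(" + parts + ")"
        return "(" + parts + ")"

    # Step 2: pragmatics. A separate inferential step takes the intermediate
    # representation plus context to a fully precisified interpretation,
    # and can override the default force associated with [+Q].
    def enrich(sem: str, context: dict) -> str:
        force = "rhetorical assertion" if context.get("answer_obvious") else "request for information"
        return sem + " [force: " + force + "]"

    tree = Node("C[+Q]", (Node("is"), Node("TP", (Node("the pope"), Node("catholic")))))
    sem = translate(tree)
    print(sem)                                     # ?(is' (the pope' catholic'))
    print(enrich(sem, {"answer_obvious": True}))   # pragmatics yields the (4a)-type reading

The point of the sketch is only the architecture: the translation step is systematic and sensitive to syntax-internal features like [+Q], while the pragmatic step consults context and can, as with (4a) above, deliver something other than a genuine request for information.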

What is the status of this intermediate semantic representation itself? If it is on the language side of things, then doesn't that mean that the strong minimalist thesis is false? If it is on the non-language side of things, then its abstractness, its variability from language to language (which I think is an undeniable empirical fact), and its mismatch with non-linguistic categories of meaning are difficult to account for.

Posing the question in this way, however, would be, in my opinion, a misapplication of the strong minimalist thesis. For the strong minimalist thesis is not intended to hold at the level of descriptive modularity in the mature speaker's system of competence, but rather refers to the role that other properties of the mind/brain play in how the final system emerges. It refers to how we understand the initial state of the language faculty, and what (if anything) we need to put in there to ensure that the language systems we describe have the properties that they do. The strong minimalist thesis is about explanatory modularity, about what is present innately in the human brain that makes language possible. It differs from earlier incarnations of Chomskian writings in postulating a larger role for independent properties of mind/brain (which also could be unique to us, and also possibly innate). The minimalist programme says it is more `minimal' to explain language properties through things we have to assume anyway about human minds than to invoke language-specific devices.

Barbara Partee, in one of her recent papers on the history of formal semantics within the generative paradigm, makes the same point in attempting to explain why Chomsky's own attitude towards formal semantics (of, e.g., the Montagovian type) has often been quite ambivalent:

“. . . it has seemed to me that it was partly a reaction to a perceived attack on the autonomy of syntax, even though syntax is descriptively autonomous in Montague grammar. But syntax is not explanatorily autonomous in Montague grammar, or in any formal semantics, and I do not see any rational basis for believing that it should be. The child learns syntax and semantics simultaneously, with undoubtedly a great deal of `innate knowledge' guiding the acquisition of both” (Partee 2014, p. 9).

If we take the strong minimalist thesis in the sense of explanatory autonomy, then it is still perfectly consistent with that thesis to assume a language-specific system of abstract semantic representations, correlated with syntactic forms, not identical to the way non-linguistic cognition is structured, but in turn interacting with it. In the mature system, the semantic representations and their internal vocabulary are in some sense hybrid representations, not the same as those provided by cognition more generally, because they have been constructed over the course of acquisition to interface with syntax in order to solve the particular problem of codification and creativity. To quote Partee (2014) again,

“. . . syntax should provide the relevant `part – whole’ structure for compositionality to work.”

One can work on problems of the syn-sem interface in this sense without that being inconsistent with, or contradictory to, the strong minimalist thesis. That is because it is perfectly possible that the final complexity of the system is emergent, based on some rather simple abstract initial ingredients, only a small part of which is unique to language itself. The way in which language is set up is heavily determined by the need to interact with non-linguistic cognition (among other things), and this constrains the concrete systems that emerge in practice. Still, I would say that someone working on the syntax-semantics interface in the sense of constructing a computational theory of the mature grammar is not actually pursuing the minimalist agenda directly, even though they might be sympathetic to it or have it in the back of their mind as an important project. The descriptive patterns of actual syntaxes, and generalizations about how they map to our agreed format for semantic representations, are however surely part of the data that will be important empirical ground for other families of theories exploring the question of the explanatory role of cognition in constraining the general form of natural languages. So the existence of the subfield of research exploring the syntax-semantics interface is neither a threat to the autonomy of syntax nor in contradiction with the strong minimalist agenda.

In Part III, I will explore a somewhat different approach to the syntax-semantics interface, which is more directly engaged with the other project of exploring the questions of explanatory modularity at the heart of the strong minimalist agenda.