Minimalism and the Syntax-Semantics Interface: Part IV Formal Semantics vs. I-Semantics

So far we have argued that the formal semanticists' use of an intermediate logical language (the semantic representation), as discussed in earlier posts, is widely considered by the field to be at the level of a computational theory in the sense of Marr (1982), and is not intended to encode internal mental representations of meaning in any psychologically real fashion.

So what understanding of the human mind, then, do we gain from a study of the syntax-semantics interface construed in this way? The whole enterprise is made more difficult by the fact that we are essentially attempting to solve an equation in three unknowns: we don't yet know what syntactic representations (their primes and primitive operations) actually look like, we don't know what those abstract, language-specific semantic representations look like, and we do not understand the principles of the mapping between the two, except that we know the mapping must be systematic.

The history of generative grammar shows that there is a multiplicity of different formal proposals concerning what syntactic representations actually look like, with no consensus currently in sight. And we can see from the history of formal semantics as well that the mapping rules change drastically depending on the type of syntactic theory being interfaced with (cf. Lechner 2015; Partee 2014). The semantic representation language was taken over from formal logic systems, and it too has been adapted slowly over time to form a better fit for the syntax (the particular kinds of syntax) that the formal semanticists are mapping from. As the history of syntactic theorizing has shown, there is always a choice between enriching the syntactic representation or enriching the mapping rules between it and the semantic representation language. Within generative grammars alone, at least two different trends can be distinguished. On the one hand, there are more derivational and/or abstract syntactic theories, whose abstractions in the form of covert rules and implicit structures (the Logical Forms of classic GB syntax, but also abstractness in the form of empty categories and implicit or unpronounced structure) are motivated by generalizations over interpretations. On the other hand, less abstract, more direct and monostratal syntactic representations (e.g. Categorial Grammars, Lexical Functional Grammar, Montague Grammar itself, and Head-driven Phrase Structure Grammar) form the input to mapping rules which in turn must be more flexible and rich in their output possibilities. It is easy to see that in this kind of situation, the nature of the mapping rules and the intermediate representations built can be quite different from each other.

The primes of the semantic representation language are also subject to variability from the pragmatics side. Every time a proposal is made about a pragmatic principle that can deliver the correct truth conditional results from a more indeterminate semantic representation, this too forces the primes of the semantic representation to be adjusted (e.g. see the effect that Discourse Representation Theory had on the interpretation of the definite and indefinite articles in English). Every time a change occurs in one of these three areas, the details of the whole package shift slightly. The only thing that remains constant is the anchoring in truth conditions. We have to get there in the end, but if these are purely computational or instrumental theories, then we should not put too much stock in exactly how we get there, implementationally speaking. Even compositionality, as a Fregean principle constraining the relationship between syntactic and semantic representations (see Heim and Kratzer 1998), can always be saved by employing the lambda calculus (Church 1936), a mathematical innovation which allows the decomposition of the logical representation of complex expressions into pieces (higher order functions) that can match the constituents that syntax provides, whatever those turn out to be (a toy illustration follows below). So compositionality in this technical sense turns out not to be the criterion according to which these theories can be distinguished from each other. Only if we believe that these convenient semantic representations that we posit have some kind of cognitive or algorithmic reality, or at least that there is some cognitive reality to the boundaries being drawn between the different components, is the specific research area of the syntax-semantics interface distinguishable from formal semantics simpliciter.
In fact, most formal semanticists are unwilling to do any `neck baring', let alone `sticking out', in the area of psychological or cognitive prediction.
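
To make the lambda calculus point concrete, here is a minimal sketch in Python. The toy model, the entity names, and the curried `every` are illustrative assumptions of my own, not any particular published analysis; the point is only that currying lets the semantic pieces line up with whatever constituents the syntax provides.

```python
# Toy extensional model: entities are strings, one-place predicates are sets of entities.
dogs    = {"fido", "rex"}
barkers = {"fido", "rex", "felix"}

# A curried, generalized-quantifier style meaning for "every": it combines first
# with the noun (its restrictor) and then with the VP (its scope), mirroring the
# syntactic constituents [ [every dog] barks ].
every = lambda restrictor: lambda scope: restrictor <= scope

every_dog       = every(dogs)         # the meaning of the DP "every dog"
every_dog_barks = every_dog(barkers)  # the meaning of the whole sentence

print(every_dog_barks)  # True in this toy model, since dogs is a subset of barkers
```

The same truth condition could of course be stated as one flat first-order formula; the currying is what allows it to be assembled constituent by constituent.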

Unlike the formal semanticists proper, those of us working at the interface are interested in intermediate representations that we believe bear some organic relation to actual representations in actual minds. For many of us, the quest is to understand how the syntactic system of language discretizes and categorizes in order to create a workable symbolic tool for the mind, while still liaising with the brain's general, language-independent cognitive machinery.

Of special note here is a recent minority movement within formal semantics/philosophy of language towards exploring natural language ontology (Moltmann 2017, 2020). Moltmann in particular has argued that natural language ontology is an important domain within descriptive metaphysics (using the term from Strawson 1959), which is distinct from the kind of foundational metaphysics that the philosophical tradition tends to engage in, with its spare ontological commitments of Truth and Reference. I see natural language ontology as primarily interrogating our assumptions about the nature of the intermediate Semantic Representation that mediates between syntax and truth-evaluable representations, building its primes on the ontological commitments implicit in natural language(s) itself. As Fine (2017) argues, there is a case to be made that progress in foundational metaphysics relies on a close and nuanced understanding of the descriptive metaphysics involved in natural language ontologies. But even if that were not the case, it seems to me that the project of natural language ontology is crucial if we are to understand the compositional products of meaning and meaning building in language and the mechanisms by which it is embedded in our cognition and cognitive processing more generally. The spare and elegant axiomatization of semantic descriptions anchored just in truth and reference to particulars simply does not do justice to the content, and the partial and incremental contents, that we see in language. Exploring natural language ontology in its own right, taking the internal evidence as primary, is a prerequisite to getting this kind of deeper understanding. Thus, even though we might think of the syn-semE as a computational theory, we can still have the goal of developing a language of primitives on the Semantic Representation side that is more responsive to the implicit categorization found in natural language. Formal semantics took its initial language for the semantic representation from formal logics, but has also repurposed that representation over time to fit natural language better. The research area of natural language ontology takes that goal to its natural conclusion: it questions the basic ontology of these representations, and potentially moves the model closer to one that will eventually be more commensurate with cognitive and neurolinguistic theories.

In turn, the patterns that emerge robustly from this kind of natural language investigation provide clues both to the nature of language itself and to the realities of the cognitive systems that it is embedded in. In Part I, I laid out three types of question for I-semantics: Type A questions concerning descriptive generalizations relating semantic systems and the cognitive system they feed; Type B questions related to acquisition and cognitive development; and Type C questions concerning the feedback effects of having a language on the very cognitive systems that it subserves. I close this post with a number of examples of phenomena that I think count as instances of Type A generalizations. Note that the existence of these `universals' would be a surprising fact if general cognition were just one symmetric side of a listed form-meaning pairing. While there seem, for example, to be no deep generalizations concerning how syntactic primitives are mapped to externalized signals, there are candidates for universals in the mapping to I-semantics. I give some possible candidates in the following list:
(i) Without exception crosslinguistically, tense information is represented hierarchically in the syntax outside of causation in the verbal domain, and referential facts such as novelty or familiarity of reference are represented outside of size, colour and substance in the nominal domain (see Julien 2002).
(ii) All human languages make category distinctions within their lexical inventory, minimally N(oun) vs. V(erb) (Baker 2003), and we know that these kinds of syntactic category distinctions cannot be predicted from external facts about the world. But what is this a discretization of in our I-semantics of the world?
(iii) All human languages show open-ended combinatorial ability of open-class items to build creative new meanings.
(iv) Semantic modes of combination can be classified minimally into selectional, modificational and quantificational relationships. In other words, even though there is no single semantic combinatoric nexus that will cover all the attested forms of semantic combination, there seems to be a restricted set of semantic nexus types that all languages use (see Higginbotham 1985; Jackendoff 2002), conditioned in systematic ways by syntax.
(v) Quantificational relationships in the semantics always correspond to a particular hierarchical format in the syntax, with the restrictor of the quantifier in combination with the operator, and the scope of the quantifier combined with that. This correlates with the semantic conservativity of all natural language quantifiers (Barwise and Cooper 1981; Lewis 1975); see the sketch after this list.
(vi) The semantics of scalar structure is tracked by linguistic formatives across the syntactic categories of N(oun), V(erb), A(djective) and P(reposition), in all the languages that have been studied.
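
To illustrate what the conservativity mentioned in (v) amounts to, here is a small self-contained Python check over a toy finite model. The universe, the entity names, and the particular determiner functions are my own illustrative assumptions; the test simply encodes the standard definition that Q is conservative iff Q(A, B) holds exactly when Q(A, A ∩ B) does (so `every dog barks' is equivalent to `every dog is a dog that barks').

```python
from itertools import chain, combinations

def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    s = list(universe)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

# Toy generalized determiners: functions from (restrictor A, scope B) to bool.
every = lambda A, B: A <= B
some  = lambda A, B: len(A & B) > 0
no    = lambda A, B: len(A & B) == 0
most  = lambda A, B: len(A & B) > len(A - B)
# A made-up non-conservative "determiner", for contrast:
# true iff the scope set contains something outside the restrictor.
nonconservative = lambda A, B: len(B - A) > 0

def is_conservative(Q, universe):
    """Q is conservative iff Q(A, B) == Q(A, A & B) for all A, B over the universe."""
    return all(Q(A, B) == Q(A, A & B)
               for A in subsets(universe) for B in subsets(universe))

universe = {"fido", "rex", "felix", "tweety"}
for name, Q in [("every", every), ("some", some), ("no", no),
                ("most", most), ("non-conservative", nonconservative)]:
    print(name, is_conservative(Q, universe))
```

Running this prints True for the natural-language-style determiners and False for the made-up one, which is the kind of asymmetry that the generalization in (v) points to.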


These are basic empirical generalizations, at a fairly abstract level, about how human languages compile meanings, and, independent of the existence of the Minimalist Programme, these are things that it seems to me it is the job of the theoretical linguist to pursue in some way. Thus, properly understood, the Minimalist Programme does carve out an interesting and important domain of inquiry, one that might legitimately be called the syntax-semantics interface (Syn-SemI).

14 thoughts on “Minimalism and the Syntax-Semantics Interface: Part IV Formal Semantics vs. I-Semantics”

  1. I want to note that, from the perspective of this syntactician, the list of formalisms you provided under “less abstract, more direct and monostratal syntactic representations” (Categorial Grammars, Lexical Functional Grammar, Montague Grammar itself, and Head-driven Phrase Structure Grammar) is very much a mixed bag, in particular as it concerns actual syntactic phenomena beyond constituency (by which I mean things like case and agreement; more generally, phenomena that are hierarchical and structure-dependent but are dissociable from any semantic concept or principle that has been proposed by anyone to date).

    While I don’t agree with some of the foundational assumptions of LFG and HPSG, and I even think some of these can be concretely argued against, nobody in their right mind would say that either LFG or HPSG are nonstarters as theories of syntactic phenomena proper. We might (and I would) quibble about the way they capture various things in that general space, but they can do it. This is emphatically not the case for Categorial Grammar and Montague Grammar. To date there hasn’t been even a serious attempt (that I am aware of) to confront some of the more complex case/agreement systems out there within these formalisms, and if that’s indeed so, I suspect it’s no accident.

    Now, you, or anybody else, is entirely within their rights to say, e.g., “For me, case and agreement are not the core phenomena; I want to start with XYZ instead, and we’ll get to case and agreement at some future time.” There is of course no one right way to heuristically approach a massive puzzle, and nobody yet has a working theory of everything. But insofar as we want to constrain the three unknowns that we’re trying to solve for (and I agree that we cannot fix any of them yet; there is yet far too much we don’t know; but that doesn’t mean we know nothing, or cannot constrain the relevant variables at all), I think we can probably toss out Categorial Grammar and Montague Grammar as possible solutions for the “syntax” member of that triad.

    • I would never dream of saying that case and agreement are not important aspects of syntactic systems! My list of alternative syntactic frameworks was more in the interest of covering the historical record than endorsing or distinguishing between those paradigms. I am fully convinced that neither then, nor now, are we in possession of the correct computational theory of syntactic competence. I also don’t think that getting better and better descriptive coverage of certain phenomena within one particular framework means that the framework itself is correct, or more correct. It is quite hard to prove that a particular framework CANNOT account for a particular set of facts in principle, only that it doesn’t without this or that modification. Also, it is inconsistent to both believe that frameworks are computational theories and to argue for a particular one as opposed to another when they are provably computationally equivalent (as has been argued, for example, for representational vs. derivational implementations). But even when theories are (or can be easily supplemented and modified to be) computationally equivalent, it is still useful to have different theories out there in the conversation. This is because (and here I take my lessons from pure mathematics) translating a puzzle or phenomenon by homomorphism into a different space can often allow the scientist to see that puzzle in a different way, and sometimes even find a solution where none was apparent before. But the point in the post above was more modest: given that there is still a lot of uncertainty about what even the correct computational theory of syntax should look like, questions of the syntax-semantics interface become methodologically more complicated, and must crucially involve dialectic and open conversations among the interested parties (given that none of us individually can do all the bits).

    • Can you elaborate a bit on what specific issues in morphosyntax would be challenging for CG? The theoretical core of CG, the thing that distinguishes it from all its competitors and brings about all its strengths and weaknesses, is flexible constituency + transparent semantics. Neither one is obviously at odds with morphosyntax. I concur that the CG literature doesn’t say much about it (side note: it is interesting how the success of CCG in NLP has led to more work on less commonly studied languages like Turkish), but that does not entail that CG and its modern iterations cannot say anything about it.

      • @Thomas: It’s precisely because semantics is decidedly *not* transparent that the syntactic representation over which case & agreement must be computed is very far from some straightforward transduction of semantics. Take, for example, the distribution of accusative case in Sakha. Correctly characterizing its distribution requires reference to how many other noun phrases whose case is not lexical were present in the same phase as the putative accusative noun phrase when the latter was in its highest A-position (note: neither its thematic position nor its surface position after scrambling/topicalization/etc. is what’s relevant).

      • @Omer (I hope this shows up in the right place; his reply to me doesn’t have a reply button (too many levels of embedding, I assume (meta joke))):

        The properties you describe can be encoded in CFGs, so they can be encoded in CG. And I think it can be done without abandoning the core assumptions of CG, so this isn’t a case of what can be done “in the limit”, as you put it. Phases can be related to specific type configurations, and intermediate A-positions are already used for binding in Steedman’s textbook, I believe.

        That said, my understanding of (C)CG as a theory (in comparison to (C)CG as a formalism) is cursory at best. I would love to see this and other phenomena compiled into an explicit challenge to the (C)CG community; the replies should be very informative.

      • As I said, I don’t particularly care what “can be encoded in a CFG” as such. If the CFG has a gazillion categories (and by a gazillion, I don’t mean several dozen, like cartography, or even a few hundred, like some instantiations of Nanosyntax, but literally 10-to-the-power-of-take-your-pick), then – as I noted in the other comment thread – the simple fact that there is a CFG that can capture these facts doesn’t hold much weight, in my view.

      • Aren’t you applying your own argument inconsistently? If I understand your earlier remarks correctly, you want to disregard the power of encodings and focus on what is natural in a formalism, yet you get hung up on how the solution would be encoded rather than whether it is natural given the foundational assumptions of CG, i.e. flexible constituency and semantic transparency.

        This only makes sense if you define naturalness in terms of encodings, which hasn’t worked well historically. Whenever somebody tries to turn their own aesthetic preferences into something measurable and systematic, we get scientific dead ends like the idea that SPE rule length = naturalness. CG could just as well use separate constraints, type hierarchies, and a lot of other stuff to keep the number of basic categories small. Doesn’t really change anything about the fundamental ideas, it just gives you a more succinct encoding.

      • @Thomas: I don’t think I am. You said the factors relevant to the distribution of accusative in Sakha (which is but one example, but let’s stick with it) “can be encoded in CFGs.” I think I’m being consistent when I say that isn’t all that important. Your last comment seems to add something new to the mix, in saying that Categorial Grammar could add “separate constraints, type hierarchies, and a lot of other stuff” to its formalism, such that the rules and constraints involved in capturing the Sakha pattern would have a succinct format – of the kind that the child would have any hope of acquiring, and of the kind that could capture the variation we do and do not see across languages. (These are the advantages that I think the Baker-Vinokurova account can claim.) If this is true, then yes, I would withdraw this particular empirical pattern as an exemplar of what’s wrong with Categorial Grammar. But is it? And would the succinct rules & constraints that Categorial Grammar would add be anything but an importation of Marantz’s dependent-case mechanism wholesale? (See my comment on the other “sub-thread” here, about theories contorting themselves in ways that betray their most basic design principles.)

      • @Omer: If this is about scientific tastes, then there isn’t much of a debate to be had. I read your original claim about CG as more authoritative, though: “nobody in their right mind would say that either LFG or HPSG are nonstarters as theories of syntactic phenomena proper. […] This is emphatically not the case for Categorial Grammar and Montague Grammar.”

        As for learnability and typology, we’ve been disagreeing on that for a long time and it’s probably why our views diverge here. You say “the format of rules and constraints matters, if only for language acquisition and typological predictions”. I hear that a lot, but it’s never backed up by concrete examples. Just one example of a learning algorithm that depends on whether the grammar uses constraints, features, categories, treelets, automata, or flying spaghetti monsters. One typological generalization that can only be succinctly expressed via feature checking. Nothing, nada. There are meaningful differences between formalisms, but they’re completely orthogonal to these encoding choices. Every learning algorithm under the sun builds on meta reasoning that is about the structure of the learning space, not the encoding format of the individual grammars. And that’s because the latter is not required for the former. Even spaces that are defined via specific encodings, e.g. automata for the 0-reversible string languages, do not require the grammars to actually be automata. Once we understand the structure of the space, the encoding can be anything. Encodings are methodological crutches, not ontological commitments.

        But as I said, we probably won’t ever reach common ground on that, and depending on which position one takes, adding constraints to CG is either a significant departure or just adding a convenient API that makes certain techniques easier to use.

        @Gillian: The back-and-forth dynamic you describe is exactly why I think it’s good that not everybody resides in my happy lala kumbaya land of live-and-let-live where we want as many perspectives as possible, not a single ring to rule them all. Even though I believe linguists’ obsession with encodings to be greatly misguided, it is at least productively misguided.

    • I also feel that Omer is applying his arguments inconsistently, although I can’t exactly put my finger on it, because `betraying their design principles’ seems not clearly defined. Sure, CG did not bother to think about certain things while it was busily engaged in a conversation with syntacticians about other things (in this case, relevantly, the division of labour between the syntactic toolbox and the translation algorithm that gets you to interpretation). This is not to say that CG could not form the core of a theory which did have add-ons to deal with agreement, as @Thomas suggests. The question is whether those add-ons amount to capitulation to the Other Person’s idea, or are merely an acknowledgement of the reality of another domain of the grammar that needs to be built in, and indeed maybe an implicit acceptance of the reality of the generalization that has been uncovered by the Other Theory. I grew up in the Stanford grad school department of linguistics, and was continually exposed to syntactic discussions/arguments of the form: `Look at this phenomenon and how pretty it looks in my theory; this other theory simply cannot handle it!’ This was followed by the Other Theory coming up with a way of handling it by adding something that the first person did not think of doing, because somehow one’s OWN theory is always full of potential and creative possibilities, while the Other Person’s theory is rigid and wrong-headed from the start and doesn’t have a chance. This conviction seems to go both ways and never seemed to me to be based in reality. The only difference in my attitude between then and now is that in the old days I was frustrated by the repetition of recreating the same generalization in two (let’s face it, pretty equivalent) notations and architectures, instead of going out there and solving new problems. Now, I have a better appreciation of the value of intertranslation, of meta-discussion, and of the fact that the toolbox was not hegemonic and that people actually were reading each other’s work.

      • I think the two of you are confusing inconsistency of argumentation with the fact that fidelity to design principles is sometimes ineffable and/or dependent on one’s scientific tastes. The former would be a logical flaw; the latter just implies subjectivity. I agree that neither is ideal; nevertheless, there is a big difference between the two.

        And, more generally, I hope I have made the point that the format of rules and constraints matters, if only for language acquisition and typological predictions, and so computational equivalence at the limit is not of primary importance to someone interested in those two factors.

        As you may know, Gillian, I am reasonably literate in both HPSG and LFG, for precisely the kinds of reasons you mentioned earlier – namely, that not every paper depends on buying the totality of its framework’s assumptions wholesale, and no one theoretical framework has a monopoly on where the good ideas will come from. That is very different, though, from the claim that there’s no wrong or right in theoretical frameworks and formalisms, or that the choice doesn’t have consequences of the kind cognitive scientists should be deeply invested in.

  2. I don’t think provable equivalence at the limit is all that important, not even for a computational theory. That may seem like an odd position to take, so let me elaborate a little.

    In our neck of the scientific woods (and indeed, in most kinds of inquiry outside of mathematics), we are seldom in a position to refute anything. Usually, the closest we get to refutation is forcing the opposing theory to contort itself in ways that betray its most basic design principles. (E.g. when in response to data showing that topics can both precede and follow foci, a theoretical framework based on a strict and total ordering of projections reacts by adding a second TopP projection on the other side of FocP.)

    So, for example, I don’t think it’s particularly important that HPSG/LFG can capture the data that motivate A-movement analyses of raising and ECM by brutalizing the notion of “argument” until it is no longer recognizable. Yes, computationally speaking, it works; but it’s not completely clear to me why we should care.

    Is this an affront to proper Marr(ian)ism? I’m not sure it is. For one thing, Marr’s cash register doesn’t need to acquire the rules of addition. For all of Chomsky’s SMT bluster, you know I’m on the record that, qua a successful theory of syntax, that is a pipe dream. And if your take (from the previous posts) is right that the SMT is about the initial state only, then you too are certainly in for there needing to be a good amount of language-specific, and even syntactic, acquisition.

    The question is, does it make sense to talk about the computational format of a rule or constraint independent of the algorithmic level? At least when it comes to linguistic cognition, I think the answer is a clear yes. As many have pointed out before, Marr’s levels do not map cleanly onto the cognitive science of linguistics (for several different reasons), but, to the extent that they do, the implementational level is wetware, the algorithmic level is the procedural implementation of production and comprehension (a.k.a. parsing), and the computational level is, well, the computational characterization of linguistic competence. I don’t think it makes sense, then, to characterize language acquisition relative to the algorithmic level. (Or, more accurately, it doesn’t make sense to characterize language acquisition relative to the algorithmic level alone.) And so language acquisition should be characterized (at least partially) in relation to the computational level. And so the format of rules and constraints matters, even if they end up computationally equivalent in the addition-in-base-10-vs.-addition-in-base-2 sense.

    It’s for this reason that I emphatically disagree with you, and maintain that one can take GB/Minimalism, HPSG, and LFG to be computational theories, and to be provably equivalent at the limit, and nevertheless argue in favor of one and against the other.

    (Btw, thanks so much for writing these posts, it’s super thought provoking and I’m very much enjoying them, as well as the discussion with you.)

    • You’re welcome! And thank you for being so willing to engage; this is very enjoyable.

      On the point of provable equivalence, I still feel that most generative syntacticians are guilty of having their cake and eating it too. In other words, they insist that they cannot be held to account by, or be responsible for, the results coming out of many production and comprehension studies, on the grounds of, well, `Duh. Marr. This is a computational theory. And by the way, competence, not performance.’ But at the same time, they argue that their theory is better than the computationally equivalent competitor because of greater explanatoriness and psychological plausibility. But one should not get to do that if one is not willing to be precise about exactly what kind of predictions one’s theory does make for production planning and comprehension. ALL data is performance, including grammaticality judgements. These behaviours being collected by psycholinguists are also evidence for the internal system that generates them (in combination with other performance systems). But the syntactician does not get to claim credit for being more explanatory unless they agree to make predictions, which I find most people are reluctant to do in any concrete way. They are also in many cases reluctant to engage with and take on board the data from these other domains. My point is that you don’t get to have it both ways.

      Now, I agree there is a space where you can claim that your theory is computational but still argue that it is a more useful and explanatory one because some aspects of its architecture or primitives mirror what we think we know about a more psychologically plausible implementation. For example, one could claim that the part-whole structure of the theory maps onto the part-whole structure of the psychologically correct implementation, or that primitives and processes in the computational theory pick out the things that are also natural classes of phenomena in the mental computation. But as far as I see it, nobody is being quite this explicit, because (i) explicitness makes one responsible for those predictions in a more direct way, and (ii) it requires one to pay attention to a huge literature which, by parity, should then affect one’s theoretical choices. It seems that most people are just not willing to go down that road. Fair enough! But then you don’t get to make claims of explanatory superiority.

      My personal preference would be for syntacticians and psycholinguists to be in tighter dialogue, so that such correspondences can be tested and debunked and retooled in a dialectical fashion. But it won’t happen if we stick to the cage we have built for ourselves out of Marr. A cage, incidentally, that he himself did not sit inside of.

      • I agree with almost all of what you say here. My only dissent is that there are other ways to argue for and against the particular format of a computational theory, besides how seamlessly it maps onto psychological reality (which, to be clear, is also a valid way to do so). I mentioned this in an earlier comment, but I’ll repeat it here: acquisition and typology. If, as is the case in HPSG, the notion of “argument” has been weakened to the point that it no longer has identifiable necessary and sufficient semantic characteristics, this is a meaningful obstacle to language acquisition. If (C)CG can mimic certain things that (post-)minimalist syntax does “because those things can be encoded as CFGs”, but there are myriad other things that could be so encoded and nevertheless do not occur, this is a meaningful obstacle to correctly modeling variation and typology.

        In my opinion – and I suspect this is the only place we differ, really – these things are currently more productive than conversing with psychology, only because the state of play in most (not all, but most) of psychology right now is so theoretically barren. You know my current organizational affiliation, so you know that I’m not speaking out of sheer ignorance here: I’d wager that I get exposed to a wider breadth of work coming out of psychology departments than your run-of-the-mill theoretical syntactician does. With the exception of Gallistel and the handful of people who subscribe to his views, current neuropsychology has no proposal for how the brain stores symbolic information. Let that sink in.

        Now, I’d be happy to be proven wrong about all of that, and so your call to action in these posts is obviously a force for good. I think “let a thousand flowers bloom” is a terrible approach when it comes to theory, but it’s certainly a good approach when it comes to methodology!
