• Last time, I looked at Derrida’s Gift of Death to understand the logic of sacrifice there.  Briefly, the decision to do one thing involves sacrificing all of the other things one could do.  So when I choose to feed this cat, I sacrifice all the other cats.  My ethics are impeccable, but the decision to prefer one cat over all the others is one that cannot ultimately be justified.  This is the lesson Derrida takes from Kierkegaard’s Abraham.  I then suggested that Derrida thinks a similar logic works in language, with evidence from passages where he suggests that speaking here and now in a certain language (French, in his case and examples) involves not speaking in other ways and other languages.  As he says in Grammatology, the justification of a particular discourse is only possible on historical grounds, not absolute ones.

    What does any of this have to do with language models?  A viable chatbot does a lot more than next-token prediction.  I’ve talked a lot about the various normative decisions that go into making models work – everything from de-toxifying training data to all of the efforts (of which RLHF is perhaps the best-known) to massage the outputs into something a person would find palatable.  The models also make a significant break with the English language in that they operate on word tokens, not words: the very architecture of the model involves a strategic process of winnowing the range of iterability (for more: one, two, three).  Here I want to look at something different, something analogous to the sense of “decision” in Derrida.
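
    To make the token point concrete, here is a toy sketch in Python.  The vocabulary and the greedy longest-match rule are illustrative assumptions on my part – real systems learn sub-word vocabularies (typically via byte-pair encoding) from corpus statistics – but the upshot is the same: the model’s basic unit is neither the word nor the letter.

```python
# Toy sub-word tokenizer: an illustration only, not any real model's vocabulary
# or algorithm.  It shows how a word gets split into pieces ("tokens") that need
# not line up with words or morphemes.

TOY_VOCAB = {"de", "construc", "tion", "differ", "ance"}

def tokenize(word: str, vocab=TOY_VOCAB) -> list[str]:
    """Greedily take the longest vocabulary piece available at each position."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # try the longest candidate first
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:                               # nothing matches: fall back to the character
            pieces.append(word[i])
            i += 1
    return pieces

print(tokenize("deconstruction"))  # ['de', 'construc', 'tion']
print(tokenize("differance"))      # ['differ', 'ance']
```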

    (more…)
  • There’s starting to be a good bit of productive “continental” work on Large Language Models (LLMs) like ChatGPT.  In particular, there’s emerging work that takes on LLMs from the point of view of language.  I’ve said a lot about the usefulness of Derrida for understanding LLMs, generally through the lens of Derrida’s discussion of Platonism.  For skeptics, there’s now also a new paper by David Gunkel that makes a succinct case using Derrida’s différance.  For those who prefer structuralism to post-structuralism, there’s Leif Weatherby’s Language Machines (Weatherby dismisses Derrida’s utility; I offer the outlines of a response here).  For those who prefer Wittgenstein, Lydia Liu has some really interesting work and evidence of a direct influence of Wittgenstein on the development of language computation at Cambridge.  Here I want to continue the general exploration by taking it in a direction that I’m pretty sure is new, the way that Derrida understands decision and sacrificial logic.  The setup is a little long, and goes by way of the Binding of Isaac.  So bear with me.

    In the relatively late Gift of Death (1992), Derrida responds to Kierkegaard’s telling of the binding of Isaac.  To recall, in the Biblical story, God “tests” Abraham by instructing him to take his only son Isaac and sacrifice him at the top of Mount Moriah.  Abraham complies without question; an angel intervenes at the last moment to save Isaac.  Abraham passes the test and is promised offspring “as numerous as the stars of heaven and as the sand that is on the seashore” because he obeyed the command.  Kierkegaard’s text is presented in the voice of one Johannes de Silentio, who claims not to be a philosopher and to be rendered speechless by Abraham’s faith.  Speaking of the authorial voices in his early texts, Kierkegaard suggests that they allow “the educative effect of companionship with an ideality which imposes distance” (CUP, 552).  Silentio suggests fairly early on that “Abraham was the greatest of all, great by that power whose strength is powerlessness, great by that wisdom whose secret is foolishness, great by that hope whose form is madness, great by the love that is hatred to oneself” (16-17).  There is a central paradox to Abraham: his greatness requires that he explicitly intend to do what is obviously unethical.  Hence Silentio’s unwillingness to explain Abraham in (Hegelian) conceptual terms.  Derrida explains the paradox this way:

    (more…)
  • No, the quote isn’t a new marketing slogan for OpenAI.  I’m actually referring to a budding issue in patent law.  The Patent Act says that “whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title” (35 U.S.C. §101).  Although this is very broad, Supreme Court precedent says that it exempts abstract ideas, laws of nature, and natural phenomena.

    As I argued in my IP book (from which I’m lifting some of the discussion below), the rise of the information economy has made understanding these exemptions quite difficult.  In an industrial setting, all of these patentable things tended to occur in certain objects that could then be claimed.  As Dan Burk notes, “products, at least to the extent that they constitute objects, are inherent in the concept of process …. Making and using entail some type of object: some thing is made, and some thing is used.  In classic industrial setting, the substrates of the process were fairly apparent, and extant in what is now §101; machines and materials visibly interacted as inputs generating outputs” (527).

    With the rise of “immaterial” goods and a post-Fordist economy, however, it is increasingly difficult to point to discrete things either at the level of product or process, and the ability to characterize immaterial goods informatically suggests that they could be understood as either thing or process.  Burk argues that the Supreme Court cases on §101 are therefore more about drawing judicial limits on what patents can cover.  As he puts it, “excluding conceptual inventions from patent eligibility pushes exclusivity further downstream to the stage of finished products, requiring narrower claims on concrete implementations, rather than allowing conceptual patents early in the development of a technology” (535).  Still, the devil lies in the details of how to make this work.

    (more…)
  • Leif Weatherby does not care for Derrida.  At least, in Language Machines (see here for a synopsis/initial take on this important book), he suggests that Derrida’s (mis)reading of Saussure is a significant part of “how the humanities lost language, allowing both cognitive science and NLP to update analytical and technological approaches that literary theory rarely engaged” (73).  In particular, Derrida’s move to the critique of metaphysics and his tendency to lump pretty much everything together under that umbrella risk abstraction – it’s a proposal that “itself floats above the fray” (73).  This gets to the same place Chomsky did, albeit by a different route:

    “By sweeping structuralism’s focus on a concrete object to one side in the name of opposition to metaphysics, poststructuralism fumbled the object itself. Where Chomsky avoids external language by excluding it from science, Derrida finds the law not in cognition but rather at a level of abstraction about culture that ends up having the same effect: a lack of a link between the ‘conditions of im/possibility’ and the expressions so conditioned” (73).

    The Derridean critique, in other words, is so abstract that “it is simply not clear that we need Derrida’s revision of structuralism to proceed with a concrete analysis of computational language” (73).  Worse, post-structuralism in its Derridean version doesn’t have much to say about how language “interfaces with other sign-systems … primarily because it has never taken other sign-systems particularly seriously, perhaps especially mathematics” (73).

    There’s a lot going on here, and I’m certainly not in a position to defend Derrida’s level of abstraction.  After all, I lean Foucauldian.  In what follows, I want to say something about the abstraction problem, and then something about why I think Derrida nevertheless has something to offer.

    (more…)
  • Last time, I talked about Leif Weatherby’s fantastic Language Machines (for my initial synopsis and thoughts on the book, see here) and his identification of a Kantian problematic behind what he calls the syntax view of language, which is prominently associated with Chomsky.  Although Chomsky called his book Cartesian Linguistics, Weatherby thinks the better reference is to Kant.  I think this makes a lot of sense, and it helps (this was the trajectory last time) to understand why structuralist, post-structuralist and Wittgensteinian work seems to have real traction when applied to language models.

    Here I want to step back a little and note part of what motivates the Kantian account, because I think it shows the political stakes of Kantianism.  On a standard epistemological reading, Kant was awakened from his dogmatic slumber by Humean empiricism.  Causality demands necessity, and empiricism can’t get you there (see B123-4).  I have no quarrel with the epistemological reading, but it’s worth noting that the language of the First Critique is also full of juridical terminology.  For example, we need to “institute a tribunal which will assure to reason its lawful claims, and dismiss all groundless pretensions, not by despotic decrees, but in accordance with its own eternal and unalterable laws” (A xiii).  As David Lachterman showed, this kind of language is all over the First Critique and is critical to the project of disciplining reason.  In starting the Deduction, Kant distinguishes a question of right from a question of fact and applies the distinction to our use of the categories:

    (more…)
  • In Language Machines (see here), Leif Weatherby argues that what he calls the “syntax” view of language, which is most closely associated with Chomsky, is better viewed as a Kantian system than a Cartesian one:

    “Syntax, universal grammar, principles and parameters, and the more recent ‘minimalist program’ with its key idea of ‘merge’ – all these are attempts to isolate and formalize the ability to use language as a distinctively human operation shared neither by animals nor by machines. For this reason, I think that his linguistics is more Kantian than Cartesian. Chomskyan linguistics is the search for the categories of a transcendental logic as it exists extensively, to find the rules that we impose on sound or paper …. The search for the rules of that knowledge in the empirical order is futile, Kant argued, and Chomsky’s argument against statistics has its analog here, not in Descartes or in Humboldt” (46-7).

    Chomsky’s aversion to empiricism (in this Kantian sense) comes “at the cost of defining” language “not as actually spoken languages but as the formal production unit – in the brain or some computational formalism – that achieves the fit between knowing and saying, the internal and external aspects of the linguistic act” (51).  On the Chomskyan argument, it is not possible to bootstrap from semantics to syntax; the cost is explaining “how the deep structure of syntax actually imposes form on specific languages, like English or Lao” (51).

    (more…)
  • Regular readers of this space will know that I think large language models are deeply fascinating, in addition to being a little scary (depending on their use).  I also think that we can get some traction on both of those things by way of post-structuralist language theory, or at least, by way of Derrida.  I was thus very happy to finally read Leif Weatherby’s Language Machines: Cultural AI and the End of Remainder Humanism, which came out earlier this year.  Weatherby’s thesis is, in brief, that the structuralists were right about language, and that we need to see this to have any hope of understanding language models and directing them to good use.  I’ll hopefully have more to say about various parts of the book later, but for now I want to offer a high-level outline.

    Weatherby begins by arguing that “nothing less than the problem of meaning, in a holistic sense, surfaces when language is algorithmically reproducible,” such that “this problem can be addressed only if linguistics is extended to include poetics … reversing the assumption that reference is the primary function of language, grasping it rather as an internally structured web of signs” (2).  This is because “the new AI is constituted in and conditioned by language, but not as a grammar or a set of rules.  Taking in vast swaths of real language in use, these algorithms rely on language in extenso: culture, as a machine” (5).

    (more…)
  • The preprint is freshly posted on SSRN; the paper is forthcoming in a volume on Privacy Resignation (aka privacy cynicism). In it I argue that privacy resignation is usefully understood as an adaptive preference. Here is the abstract:

    Adaptive preferences are preferences that change in response to the availability of what someone desires.  The concept has had considerable uptake in the literature on human development, where it is used to understand how socially marginalized people come to accept their marginal status.  Here, I apply the framework to privacy resignation in two ways.  On a substantive interpretation, adaptive preferences indicate a normative problem; in the case of privacy, the problem lies with substantive autonomy and the importance of privacy to a number of core human capabilities.  On a formal interpretation, adaptive preferences are irrational because they involve changing one’s assessment of something without the thing itself having changed.  Here I argue that this sort of preference (“privacy is unavailable, therefore it is bad”) is a goal of the data industry, which wants to change social norms against privacy to serve its own purposes and to deflect critical thinking away from its practices.

  • I was saddened to learn this past weekend of the death on March 3 of Timothy J. Reiss, emeritus Professor of Comparative Literature at NYU.  Tim was the outside reader for my dissertation and was incredibly generous in supporting the project.  I had first encountered his work as an undergraduate, when I somewhat randomly pulled a copy of his Discourse of Modernism off the shelf of a used bookstore.  I was working on a thesis about Heidegger’s metaphors for thought; Discourse seemed interesting and like the kind of thing I’d enjoy.  I read it and didn’t understand it all that well, though my underlining and marginal notes suggest I worked pretty hard at it.  A few years later, I was researching my dissertation, which was half on Hobbes, relied heavily on minor seventeenth-century primary texts, and made a claim about what “modern” political philosophy consisted in.  I had already been thoroughly influenced by David Lachterman’s Ethics of Geometry and had this idea that something in Discourse of Modernism might be helpful.  So I picked it up again.  That time I did understand it, and I remember evenings sitting by the gas fire in an underheated Oxford flat working my way through it.

    At some point in this process, during which I also picked up The Meaning of Literature as well as Knowledge, Voyage and Discovery in Early Modern Europe, I cold-emailed him, describing my project and asking if he’d be willing to be my outside reader.  A couple of days later, I got a brief but polite reply to the effect that he was already on too many projects.  A few days after that, I got another email – this time to say that he’d been looking over my proposal again, that it sounded very interesting, and that he’d very much like to be one of my readers.  After I finished my dissertation, he generously cited it in one of his later books.  He also offered both positive feedback and some needed encouragement for my heterodox treatment of Hobbes’s mathematics, which was my first post-dissertation work on Hobbes.

    Tim’s erudition was astonishing – he published a long list of books, much longer than I’ve recounted here.  All of them were meticulously documented and carefully argued across wide-ranging primary and secondary sources.  He was a “Renaissance scholar” in that he worked on texts of the European Renaissance, but the scope of his reading and thought was truly global, extending to contemporary work.  In the late Renaissance and early modern period in Europe, he moved seamlessly between literature and philosophy (both in Europe and outside it), locating them both in their shared cultural moment and refusing to abide by our much later disciplinary boundaries.  When I teach Modern, I’m still informed by his treatment of Descartes and the ways he shows both that Descartes was deeply attuned to his own cultural moment and that Descartes could ultimately be read as working toward a community of thinkers, rather than the isolated ego best known for a “retreat from the polis to the poêle,” as Tim cited Lachterman as having once quipped.

    NYU has a memorial notice here, and you should really read the appreciations gathered at the bottom – they speak to his scholarly contributions, his personal generosity, and his immense influence in growing and transforming the NYU comp lit department.

  • Last time, I took a detour from the discussion (part one, two, three, four, five, six, seven) of Platonism (in Derrida’s sense) in language models to look at Plato’s work itself, emphasizing how important mythmaking and storytelling are to it.  Behind that, it seems to me that Derrida’s critique of Plato and Hegel on writing offers some useful points for thinking about what LLMs do.  On the one hand, LLMs show that the priority of speech over writing, insofar as that priority is based on some sort of metaphysical preference for speech as the correct representation of an eidetic truth function, makes no sense.  It’s better to read that priority as a political preference and to treat it as such.  A central point in the deconstruction of this priority is the displacement of the word as the fundamental unit of language.  This displacement is also evident in Wittgensteinian approaches to language and, as Lydia Liu argues, shows up in early research on machine translation.  That research makes explicit use of written Chinese as a model for thinking about meaning as distributional.  Chinese is also near the bottom of the Hegelian hierarchy of languages, and one could imagine it at the absolute bottom of a Platonic one.

    On the other hand, a second Platonism is evident in the assumed priority of a unified speaking subject behind language production.  Whatever else they are, language models aren’t unified speaking subjects in any meaningful sense of the term. To the extent that LLMs appear as unified subjects, that is an artifact of some very specific coding and training decisions made for (broadly construed) social and political reasons.
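
    To give a concrete, hedged sense of what those decisions look like, here is a minimal sketch of the kind of scaffolding a chat deployment wraps around the underlying model.  The role markers, the persona text, and the function are hypothetical illustrations (not any vendor’s actual template); the point is that the “unified subject” a user encounters is largely this wrapper plus post-training, laid over plain next-token prediction.

```python
# Hypothetical chat scaffolding (illustrative only: the role markers and the
# persona text are made up, not any vendor's real template).
SYSTEM_PERSONA = "You are a helpful assistant. Speak in one consistent, polite voice."

def build_prompt(history: list[tuple[str, str]], user_msg: str) -> str:
    """Flatten a conversation into the single text string the model actually sees."""
    lines = [f"<|system|> {SYSTEM_PERSONA}"]
    for role, text in history:                  # prior turns are re-sent every time
        lines.append(f"<|{role}|> {text}")
    lines.append(f"<|user|> {user_msg}")
    lines.append("<|assistant|>")               # the model simply continues from here
    return "\n".join(lines)

print(build_prompt([("user", "Who are you?"), ("assistant", "I'm an AI assistant.")],
                   "Do you have opinions of your own?"))
```

    Nothing in the underlying model individuates a speaker: the persona is reassembled from scratch, as text, on every call.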

    Both of these suggest that the attention to Platonism is worthwhile for another reason: it draws attention to the ways that storytelling and mythmaking around language and computation are essential to the social meaning of language models. 

    Seeing all of this mythmaking and storytelling may very well require reading Derrida against himself, or at least against the grain.  As Claudia Baracchi says in a paper dedicated to the Phaedrus, one of the most emphasized aspects of Socrates’ ethos in the dialogue is his receptivity, his willingness to be infiltrated and informed by his environment, both the natural environment outside the city and the daimon influencing his speeches.  Socrates is “a subject of the world who is subject to the world” (40).  She adds:

    “It is easier now [after presenting this reading] to understand the degree to which such a [Socratic] saying and such enacting may be incompatible with a practice like that of the rhetoricians – writing in order to read, mechanically reproduce. Those who strictly adhere to this practice have virtually no access to the possibility disclosed to Socrates – the possibility of reconsidering, perhaps even reversing one’s position. Indeed, such a reversal becomes genuinely possible thanks to the vulnerability inherent in exposing oneself to the surrounding suggestions …. In this sense it is possible to see … how the critique of writing with which Socrates concludes the dialogue is not so much a quintessentially metaphysical attempt, as if in a proto-Husserlian vein, to subordinate the sign and its sensible exteriority to the primacy of the voice, incorporeal cipher of the interiority of meaning in its pure presence (as Derrida, more willfully, and better than others, has argued). According to what was said so far, the Socratic critique seems rather to give itself as perplexity before a practice of writing that abstracts itself from life and is unable to respond and correspond to it.  What is critically assessed seems to be writing as a tyrannical instrumentalization that, from its alleged atemporality, would impose itself on silence without encountering” (40-1).

    That kind of reconsideration is something language models seemingly either can’t do, or can do only with extreme difficulty.  This is a perverse result: LLMs are entirely products of their environment.  Yet at the same time, their construction resists change because it is built on normalized patterns of language.  There is an in-built regression to the mean, to the “fuzzy gif” of the internet and all the post-training.  Speaking situations that call for novelty, like telling jokes, are ones that LLMs handle less well.
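
    A toy calculation can make the regression-to-the-mean point visible.  The scores below are invented for illustration (real models rank tens of thousands of tokens), but the mechanism is standard: next-token probabilities come from a softmax over the model’s scores, and common decoding settings concentrate probability mass on the most statistically typical continuation.

```python
import math

def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Invented scores for three possible continuations of a sentence.
candidates = {"the usual phrase": 4.0, "a common phrase": 3.2, "a surprising joke": 0.5}

for t in (1.0, 0.7, 0.3):
    probs = softmax(list(candidates.values()), temperature=t)
    print(f"temperature {t}: " +
          ", ".join(f"{tok} {p:.3f}" for tok, p in zip(candidates, probs)))
```

    On these invented numbers the surprising continuation starts at about two percent and effectively vanishes as the temperature drops – one way of picturing why situations that call for novelty go badly.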

    (more…)