Last time, I looked at Derrida’s Gift of Death to understand the logic of sacrifice there.  Briefly, the decision to do one thing involves sacrificing all of the other things one could do.  So when I choose to feed this cat, I sacrifice all the other cats.  My ethics are impeccable, but the decision to prefer one cat over all the others is one that cannot be ultimately justified.  This is the lesson Derrida takes from Kierkegaard’s Abraham.  I then suggested that Derrida thinks a similar logic works in language, with evidence from passages where he suggests that speaking here and now in a certain language (French, in his case and examples) involves not speaking in other ways and other languages.  As he says in Grammatology, the justification of a particular discourse is only possible on historical grounds, not absolute ones.

What does any of this have to do with language models?  A viable chatbot does a lot more than next-token prediction.  I’ve talked a lot about the various normative decisions that go into making models work – everything from de-toxifying training data to all of the efforts (of which RLHF is perhaps the best-known) to massage the outputs into something a person would find palatable.  The models also make a significant break with the English language in that they operate on word tokens, not words: the very architecture of the model involves a strategic process of winnowing the range of iterability (for more: one, two, three).  Here I want to look at something different, something analogous to the sense of “decision” in Derrida.

Language models work on statistical probabilities, such that when I ask “where does the rain in Spain mainly fall,” the model is aware that “on the plain” is what I’m probably referring to (even if, as it turns out, I have slightly misremembered the line).  It also might demur:

In this case, ChatGPT decided to keep its options open and acknowledge both the literary allusion and the possibly literal meaning of the question, refusing to treat the latter as the ideal in quite the way that Austin did (on the other hand, the answer was wearyingly literal, in the anti-play sense that Derrida attributes to Platonism: “The best sense of play [for Plato] is play that is supervised and contained within the safeguards of ethics and politics” (Dissemination, 156)).

What this example highlights is that chatbots must have not just a statistical model, but also a “sampling strategy” to choose what words to use next.  Given a statistical distribution, the model still has to select which of the “likely” terms comes next.  Consider a simple example: “After I get up, I say to my children: ‘good…’”  It’s likely, but not certain, that the next word is “morning.”  It could also be “day.”  Or perhaps I slept late, in which case it’s “afternoon.”  Or maybe the children have overslept and the next words are “grief! You’re going to be late!”  It is almost certainly not “brussels sprouts” or “2 + 2 = 5,” because nothing like those is going to have occurred in the training data, at least not often enough to be more than a distant outlier.
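To make the shape of the problem concrete, here is a toy sketch of such a next-token distribution.  The numbers are invented for illustration; a real model assigns a probability to every token in its vocabulary.

```python
# A toy next-token distribution for the prompt
# "After I get up, I say to my children: 'good ...'"
# The numbers are invented; a real model scores every token in its vocabulary,
# and the remaining probability mass is spread over the rest of it.
next_token_probs = {
    "morning": 0.70,
    "day": 0.15,
    "afternoon": 0.08,
    "grief": 0.05,
    "brussels": 0.0000001,  # a distant outlier, but never entirely ruled out
}
```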

How to select among them?  It turns out that there are several possibilities.  The least interesting is a deterministic “greedy search,” where the model mechanically picks the highest-probability answer.  If it doesn’t do that, it is likely to use some sort of random number to select among the possibilities.  For example, if the probability of “morning” is 70%, then 70% of the time the random number generator is going to point to “morning.”
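A minimal sketch of the two approaches, using the same kind of invented numbers:

```python
import random

# Toy distribution from the "good ..." example (invented numbers).
probs = {"morning": 0.70, "day": 0.15, "afternoon": 0.08, "grief": 0.07}

# Greedy search: deterministically take the highest-probability token.
greedy_choice = max(probs, key=probs.get)  # always "morning"

# Weighted random sampling: "morning" comes up roughly 70% of the time.
sampled_choice = random.choices(list(probs), weights=list(probs.values()), k=1)[0]

print(greedy_choice, sampled_choice)
```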

Different kinds of outcomes can be generated by weighting the possibilities differently.  One option is “temperature scaling,” which modifies the sharpness of the probability distribution.  If the first-place option is 70% and the second is 30%, that could be rescaled to 80-20, or to 60-40.  The former would make the model more predictable, the latter more “creative.”  Another option is to set some sort of floor – through “top-k” sampling, which limits the options to the k highest-probability tokens, or “top-p” sampling, which keeps the smallest set of most-likely tokens whose cumulative probability exceeds p.  (Either of these would eliminate the model choosing “brussels sprouts” here.)  These options aren’t necessarily mutually exclusive; the point is simply that the model designers have to pick a way to set the odds before going to the random number generator.
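Here is a rough sketch of those three reweighting schemes, applied to the same toy distribution (the function names and numbers are mine, not any particular library’s):

```python
import math

# Toy next-token probabilities (invented numbers, as in the examples above).
probs = {"morning": 0.70, "day": 0.15, "afternoon": 0.08, "grief": 0.07}

def apply_temperature(p, temperature):
    """Rescale a distribution: T < 1 sharpens it, T > 1 flattens it.
    Raising probabilities to the power 1/T and renormalizing is equivalent
    to dividing the logits by T before the softmax."""
    scaled = {tok: math.exp(math.log(prob) / temperature) for tok, prob in p.items()}
    total = sum(scaled.values())
    return {tok: v / total for tok, v in scaled.items()}

def top_k(p, k):
    """Keep only the k most probable tokens, then renormalize."""
    kept = dict(sorted(p.items(), key=lambda kv: kv[1], reverse=True)[:k])
    total = sum(kept.values())
    return {tok: v / total for tok, v in kept.items()}

def top_p(p, threshold):
    """Keep the smallest set of most-likely tokens whose cumulative
    probability reaches the threshold, then renormalize."""
    kept, cumulative = {}, 0.0
    for tok, prob in sorted(p.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = prob
        cumulative += prob
        if cumulative >= threshold:
            break
    total = sum(kept.values())
    return {tok: v / total for tok, v in kept.items()}

print(apply_temperature(probs, 0.5))  # sharper: more predictable
print(apply_temperature(probs, 1.5))  # flatter: more "creative"
print(top_k(probs, 2))                # only "morning" and "day" survive
print(top_p(probs, 0.9))              # the long tail ("grief") is cut off
```

Each scheme narrows or reshapes the odds before the random number generator ever gets involved.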

Apparently ChatGPT varies its strategy in order to sound more “human,” and the chatbot interface doesn’t let you change that.  The API, however, lets you both set temperature and top-p and seed the random number generator, which enables the repetition of certain outcomes (this suggests that ChatGPT is using a pseudo-random generator, which is itself an interesting design decision).
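For instance, here is a sketch of what setting those knobs through the API might look like, assuming the OpenAI Python client; the model name and parameter values are placeholders, and the seed is documented as best-effort rather than a hard guarantee.

```python
# A sketch of setting the sampling knobs through the API rather than the
# chatbot interface.  Assumes the OpenAI Python client; the model name and
# values are placeholders, and "seed" is documented as best-effort.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Where does the rain in Spain mainly fall?"}],
    temperature=0.2,  # sharpen the distribution: more predictable
    top_p=0.9,        # nucleus sampling: drop the long tail of unlikely tokens
    seed=42,          # request reproducible sampling
)

print(response.choices[0].message.content)
```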

This is a significant difference between LLM and human language use.  There’s a lot of anthropocentrism in the development and presentation of language models (which may or may not be good for them as useful artifacts; they also differ from people in other ways).  But this marks a limit.  The Derridean narrative I reconstructed last time about human language is essentially that this moment – where I decide to say “good morning” or “good day” – is ultimately arbitrary.  That’s not a metaphysical claim, but “arbitrary” here can indicate something like “unknowable” and “without full explanation/justification.”  The LLM is not arbitrary in the same way: there is a deliberate strategy or strategies for navigating the moment, and those are presumably chosen for specific reasons.  There is still iterability and there is still an ultimate lack of justification for the whole enterprise, but the LLM shifts where that demarcation is from human language use.

(aside: Leif Weatherby in Language Machines uses Bataille to help think through the distinction between structuralism (which for him is the correct theory to understand LLMs) and Derrida’s post-structuralism.  I think Derrida is more useful than Weatherby does, but the use of Bataille indirectly supports calling this a sacrificial logic.  After all, one of Bataille’s examples of a general economy and the excessive consumption it inspires is Aztec sacrifice rituals.  There is a reduction strategy, and that sacrifice follows an internal logic, even if that logic is without ultimate justification.)

If the broad comparison is legit, and if some sort of sacrificial logic is embedded in all language, this suggests a few things apropos of LLMs.

First, there is a lot of language about hospitality in Derrida’s later writings.  I think it’s relevant here because it’s the flip side of the point about sacrificing cats in Gift of Death.  There, I necessarily fail in my obligation to many others by choosing one.  Here, I fail in my obligation to an individual other at the moment I address them in a specific way (since all address is specific, I will always fail in some way).  The discussion is again couched in Biblical terms, with oblique reference to the obligation to the widow, the orphan and the stranger (though oddly, the words “widow” and “orphan” do not appear in On Hospitality, unless my command-F has failed me).  The obligation of hospitality toward the stranger is a little different from the other two, as it requires an openness without knowing anything about who the stranger is.  The distinction is between an absolute hospitality and one that’s regulated or governed by a schema of sorts.  As he puts it, between “an unconditional law or an absolute desire for hospitality on the one hand and, on the other, a law, a politics, a conditional ethics, there is distinction, radical heterogeneity, but also indissociability” (On Hospitality, 147).  I think the value of the Hospitality volume in this context is limited, but I will note that Derrida again ties the question to language.  There is language in the sense of cultural norms and expectations vs. language in the proper sense (French vs. English); there is also the question of the (un)translatability of the proper name (137-8).  And there is technology, “all the technological prostheses whose refinements and complication are in principle unlimited (the mobile phone is only a figure for this)” (137).

Absolute hospitality is never actually going to happen; there are always obligations to those we know, to those “within” the system of norms.  The moment of linguistic address, even if it is a welcome, already limits the relationship by inscribing it within a language, and language includes both the words and the cultural baggage (of what an invitation means in one’s society, for example).  The reduction to an actually existing language culture will nonetheless leave traces of this unfulfilled obligation to the absolute stranger, in their alterity.  One way of characterizing some of the work on epistemic injustice in LLMs – for example, the work that points out the epistemic injustice of models being trained largely on English and (therefore) being less adept at morphologically complex languages – is that these mark moments of sacrifice.  The model is trained on this language and not others, and that is a choice that is ultimately without justification.  Failures in morphologically complex languages are traces of a responsibility that remains unfulfilled, perhaps permanently – but it can nonetheless be marked (a hospitality “to come,” in Derrida-speak).

This suggests, second, that there is a need for archival work to discover injustices built into the sacrificial logic of language, and (this is the point here) how those injustices work differently in LLMs and human speech. 

More concretely, one might suggest (third) that there ought to be a double standard between humans and AI.  In a recent paper, Mario Günther and Atoosah Kasirzadeh argue that there are situations where we should adopt a double standard for algorithmic transparency, requiring more of algorithms than we do of humans.  In their first example, design considerations are more helpful than intentional-state ones.  In a Boeing 737 Max crash, a faulty algorithm that relied too much on one sensor turned out to have been significantly responsible for the problem.  Hence:

“The example illustrates that intentional stance explanations [the algorithm falsely believed that the plane was descending] are inevitably determined by the algorithmic design. The same ‘belief’ of an algorithmic system might be determined by many design choices. And it matters in the Boeing case that it was this particular design which determined the algorithmic output. The specific design-level details matter to explain why an algorithm has decided falsely.”

In such situations, where livelihoods are on the line, we need to break with the anthropomorphism and look at the design if that would be helpful.  As they note, “had the designer anticipated that the sensor might send an inaccurate signal, she could have changed the design to make it more robust. At the very least, we should demand a design to be so that one malfunctioning sensor cannot have disastrous effects.”  Here, we can think about the sampling strategy as a design decision that affects what “beliefs” language models have.

Their second kind of example amplifies the point.  It includes complex neural nets, the inner workings of which are fully opaque.  In situations such as these, “for proper black boxes, we are lacking epistemic access to the genuine ‘beliefs’ and ‘reasons’ of the algorithm. Sometimes we simply do not know the propositional content of a certain ‘belief’. Hence, we can neither describe this ‘belief’ in mentalistic terms nor ascribe it to the algorithm.”

It seems that something like this applies to language models as well.  Intentional stance language is pervasive in talking about LLMs (“ChatGPT thinks you have the flu”), and there is a sense in which this is fine.  But at another level, whatever utility intentional stance explanations have, using them obscures the extent to which design decisions shape the output, even at the level of whether the model finishes a sentence with “morning” or “day.”  That isn’t true of human language users.  We can of course trace the ways that humans use language, and suggest the reasons why people tend to use certain words over others (their cultural background, etc.), but those aren’t the same as design decisions.  In the case of language models, the sacrificial logic is a matter of design, which means that it can be made accountable.

Fourth and finally, if one is concerned with human accountability for AI use, or human agency, one way to increase it is to allow humans to pick the logic used and thereby have greater influence over the outputs.  That is, there ought to be a way both to open up and, possibly, to control sampling strategies.

All of these push back against the invisible adoption of normativities in LLM design.  Such normativity is inevitable. That’s the sacrificial logic.  But it doesn’t have to be invisible.  That’s the difference between humans and LLMs.  Because LLMs are artifacts, the available space of accountability extends farther.
