New APPS

Language Modeling Bias, Iterability, Grammatology and Epistemic Injustice

March 14, 2024

By Gordon Hull

In a recent paper in Ethics and Information Technology, Paul Helm and Gábor Bella argue that current large language models (LLMs) exhibit what they call language modeling bias, a series of structural and design issues that serve as a significant and underappreciated form of epistemic injustice. As they explain the concept, “A resource or tool exhibits language modeling bias if, by design, it is not capable of adequately representing or processing certain languages while it is for others” (2). Their basic argument is that the standard way of proceeding with non-English languages, which is more or less to throw more data at the model, builds in structural biases against other languages, especially those that are more morphologically complex than English (=df those with lots of inflections).

The proof of concept is in multi-lingual tools:

“The subject of language modeling bias are not just languages per se but also the design of language technology: corpora, lexical databases, dictionaries, machine translation systems, word vector models, etc. Language modeling bias is present in all of them, but it is easiest to observe with respect to multilingual resources and tools, where the relative correctness and completeness for each language can be observed and compared” (6)

They identify several kinds of such structural bias. The first is that prominent current architectures “tend to train slower on morphologically complex (synthetic, agglutinate) languages, meaning that more training data are required for these languages to achieve the same performance on downstream language understanding tasks” (7). Given the percentage of the available training data that’s in English, this magnifies what’s already a problem. Second, the models perform poorly on untranslatable words. Third, they cite a study showing “that both lexicon and morphology tend to become poorer in machine-translated text with respect to the original (untranslated) corpora: for example, features of number or gender for nouns tend to decrease. This is a form of language modeling bias against morphologically rich languages” (7).

(more…)
Why did the chatbot cross the road?

March 7, 2024

By Gordon Hull

The diversity of language has been a philosophical problem for a while. Hobbes was willing to bite the bullet and declare that language was arbitrary, but he was an outlier. One common tactic in the seventeenth century was to try to resolve the complexity of linguistic origins with a reference to Biblical Hebrew. Future meanings could be stabilized with reference to Adamite naming. I’ve essentially been arguing (one, two, three, four) that we’re seeing echoes of this fusion of orality, intentionality and origin in the various kinds of implicit normativity that make it into Large Language Models (LLMs) like ChatGPT. In particular, LLMs depend on iterability as articulated by Derrida, but we tend to understand them with models of intentionality that occlude subtle (and not so subtle) normativities that get embedded into them. Last time, I looked at Derrida’s critique of Searle for what it had to say about intentionality. I also suggested that there was a second aspect of Derrida’s critique that is relevant – Derrida accuses Searle of relying too much on standardized speech situations. I want to pursue that thought here.

Let’s start with a joke:

(more…)
LLM, Inc.

February 27, 2024

By Gordon Hull

In previous posts (one, two, three), I’ve been exploring the issue of what I’m calling the implicit normativity in language models, especially those that have been trained with RLHF (reinforcement learning from human feedback). In the most recent one, I argued that LLMs are dependent on what Derrida called iterability in language, which most generally means that any given unit of language, to be language, has to be repeatable and intelligible as language, in indefinitely many other contexts. Here I want to pursue that thought a little further, in the context of Derrida’s (in)famous exchange with John Searle’s speech act theory.

Searle begins his “What is a Speech Act?” essay innocently enough, with “a typical speech situation involving a speaker, a hearer, and an utterance by the speaker.”

That is enough for Derrida! In Limited, Inc., he responds by accusing Searle over and over of falling for intentionality, on the one hand, and for illicitly assuming that a given speech situation is “typical,” on the other.

Let’s look at intentionality first. In responding to Searle, Derrida explains that he finds himself “to be in many respects quite close to Austin, both interested in and indebted to his problematic” and that “when I do raise questions or objections, it is always at points where I recognize in Austin’s theory presuppositions which are the most tenacious and the most central presuppositions of the continental metaphysical tradition” (38). Derrida means by this a reliance on things like subjectivity and representation – the sorts of things that Foucault is getting at when he complains in the 1960s about philosophies of “the subject” (think: Sartre and phenomenology). Derrida is involved in the same general effort against phenomenology, though he adds a page later that he thinks the archaeological Foucault falls into this tendency to treat speech acts or discursive events in a “fundamentally moralistic” way (39). No doubt Searle is relieved to know that he belongs in the same camp as Foucault. In any case, Derrida explicitly says a few pages later that “the entire substratum of Sarl’s discourse, is phenomenological in character” (56) in that it is over-reliant on intentionality.

(more…)
Study Philosophy in Charlotte!

February 15, 2024

The MA Program at UNC Charlotte has a number of funded lines (in-state tuition plus $14k a year) for our two-year MA program in philosophy. We're an eclectic, practically-oriented department that emphasizes working across disciplines and philosophical traditions. If that sounds like you, or a student you know – get in touch!

We also have a new Concentration in Research and Data Ethics, designed to prepare students for jobs in areas like research ethics and compliance offices, healthcare ethics, or other fields requiring training in the ethics of research, big data, and AI.

Feel free to email me (ghull@charlotte.edu) with questions about the program, or our Graduate Program Director, Lisa Rasmussen (lrasmuss@charlotte.edu). The flyer below has some information and a QR code. Or visit the department page or the graduate program page.

Note that you need to apply by March 15 to be eligible for funding.
New Paper: “Unlearning Descartes: Sentient AI is a Political Problem”

January 3, 2024

Just published in the Journal of Social Computing, as part of a special issue on the question of the sentience of AI systems. The paper is here (open access); here's the abstract:

The emergence of Large Language Models (LLMs) has renewed debate about whether Artificial Intelligence (AI) can be conscious or sentient. This paper identifies two approaches to the topic and argues: (1) A “Cartesian” approach treats consciousness, sentience, and personhood as very similar terms, and treats language use as evidence that an entity is conscious. This approach, which has been dominant in AI research, is primarily interested in what consciousness is, and whether an entity possesses it. (2) An alternative “Hobbesian” approach treats consciousness as a sociopolitical issue and is concerned with what the implications are for labeling something sentient or conscious. This both enables a political disambiguation of language, consciousness, and personhood and allows regulation to proceed in the face of intractable problems in deciding if something “really is” sentient. (3) AI systems should not be treated as conscious, for at least two reasons: (a) treating the system as an origin point tends to mask competing interests in creating it, at the expense of the most vulnerable people involved; and (b) it will tend to hinder efforts at holding someone accountable for the behavior of the systems. A major objective of this paper is accordingly to encourage a shift in thinking. In place of the Cartesian question—is AI sentient?—I propose that we confront the more Hobbesian one: Does it make sense to regulate developments in which AI systems behave as if they were sentient?
Iterability and Implicit Normativity

December 18, 2023

By Gordon Hull

In a couple of previous posts (first, second), I looked at what I called the implicit normativity in Large Language Models (LLMs) and how that interacted with Reinforcement Learning from Human Feedback (RLHF). Here I want to start to say something more general, and it seems to me like Derrida is a good place to start. According to Derrida, any given piece of writing must be “iterable,” by which he means repeatable outside its initial context. Here are two passages from the opening “Signature, Event, Context” essay in Limited, Inc.

First, writing cannot function as writing without the possible absence of the author and the consequent absence of a discernable authorial “intention:”

“For a writing to be a writing it must continue to ‘act’ and to be readable even when what is called the author of the writing no longer answers for what he has written, for what he seems to have signed, be it because of a temporary absence, because he is dead or, more generally, because he has not employed his absolutely actual and present intention or attention, the plenitude of his desire to say what he means, in order to sustain what seems to be written ‘in his name.’ …. This essential drift bearing on writing as an iterative structure, cut off from all absolute responsibility, from consciousness as the ultimate authority, orphaned and separated at birth from the assistance of its father, is precisely what Plato condemns in the Phaedrus” (8).

Second, iterability puts a limit to the use of “context:”

“Every sign, linguistic or nonlinguistic, spoken or written (in the current sense of this opposition), in a small or large unit, can be cited, put between quotation marks, in so doing it can break with every given context, engendering an infinity of new contexts in a manner which is absolutely illimitable. This does not mean that the mark is valid outside of a context, but on the contrary that there are only contexts without any center or absolute anchorage” (12)

It seems to me that Derrida’s remarks on iterability are relevant in the context of LLMs because they indicate that LLMs are radically dependent on iterability. This is true in at least three ways, each of which points to an important source of their implicit normativity.

(more…)
Shane MacGowan, 1957-2023

December 1, 2023

By Gordon Hull

I first listened to the Pogues late in high school. I had started moving beyond the music I could hear on the radio – basically top 40 and classic rock – and I discovered the Pogues’ Rum, Sodomy and the Lash at about the same time I discovered Midnight Oil’s Diesel and Dust. I didn’t know music could be like “Sally MacLennane” or “The Sick Bed of Cuchulainn” or “A Pair of Brown Eyes,” and I was hooked. I listened more and more, and even had a chance to see them perform in London at the Academy Brixton.

I say all of this of course because the Pogues’ lead singer and primary songwriter, Shane MacGowan, died yesterday. The Pogues managed to sound a little Irish and a little Punk without exactly being either, and their work is central to a lot of the contemporary Irish music community. A 60th birthday tribute gala for MacGowan drew artists like Bono and Sinead O’Connor. The Cranberries’ Dolores O’Riordan praised the music (she also died hours before MacGowan’s gala; the entire who’s-who of Irish music paid tribute to her before switching to him). O’Connor and MacGowan were very close, and he credited her with getting him off of heroin. I remember that when she died, some reports were worried about what telling him would do to his very fragile health. The Pogues also spawned an entire genre of bands like the Dropkick Murphys, The Dreadnoughts and Flogging Molly.

(more…)
A Least-Bad Solution for Language Model Defamation?

November 29, 2023

By Gordon Hull

Large Language Models (LLMs) are well-known to “hallucinate,” which is to say that they generate text that is plausible-sounding but completely made-up. These difficulties are persistent, well-documented, and well-publicized. The basic issue is that the model is indifferent to the relation between its output and any sort of referential truth. In other words, as Carl Bergstrom and C. Brandon Ogbunu point out, the issue isn’t so much hallucination in the drug sense, but “bullshitting” in Harry Frankfurt’s sense. One of the reasons this matters is defamation: saying false and bad things about someone can be grounds to get sued. Last April, ChatGPT made the news (twice!) for defamatory content. In one case, it fabricated a sexual harassment story and then accused a law professor. In another, it accused a local politician in Australia of corruption.

Can LLMs defame? According to a recent and thorough analysis by Eugene Volokh, the answer is almost certainly yes. Volokh looks at two kinds of situation. One is when the LLM defames public figures, which is covered by the “actual malice” standard. Per NYT v. Sullivan, “The constitutional guarantees require … a federal rule that prohibits a public official from recovering damages for a defamatory falsehood relating to his official conduct unless he proves that the statement was made with ‘actual malice’ – that is, with knowledge that it was false or with reckless disregard of whether it was false or not” (279-80).

(more…)
Deepfake Research: ChatGPT can produce fake data

November 13, 2023

By Gordon Hull

There’s been a lot of concern about the role of language models in research. I had some initial thoughts on some of that based around Foucault and authorial responsibility (part 1, part 2, part 3). A lot of those concerns have to do with the role of ChatGPT or other LLM-based products and how to process that. The consensus of the journal editorial policies that are emerging is that AI cannot be an author, and my posts largely agreed with that.

Now there’s news of a whole other angle on these questions: a research letter in JAMA Ophthalmology reports that the authors were able to use ChatGPT-4’s Advanced Data Analysis capabilities to produce a fake dataset validating their preferred research results. Specifically:

“The LLM was asked to fabricate data for 300 eyes belonging to 250 patients with keratoconus who underwent deep anterior lamellar keratoplasty (DALK) or penetrating keratoplasty (PK). For categorical variables, target percentages were predetermined for the distribution of each category. For continuous variables, target mean and range were defined. Additionally, ADA was instructed to fabricate data that would result in a statistically significant difference between preoperative and postoperative values of best spectacle-corrected visual acuity (BSCVA) and topographic cylinder. ADA was programmed to yield significantly better visual and topographic results for DALK compared with PK”

This is a very technical request! It took a bit of tweaking, but soon “the LLM created a seemingly authentic database, showing better results for DALK than PK,” P < .001.

The authors suggest some possible strategies to manage this, but suffice it to say it is terrifying. There is already a longstanding, huge problem with fabricated, doctored or otherwise bogus scientific research out there. One report suggests that 70,000 “paper mill” (= almost completely faked) papers were published in the last year alone. In real papers, references are often inaccurate. Publishers already are having to grapple with lots of problematic doctored images, and Pharma has long tilted the entire scientific enterprise to produce results favorable to its products. At the end of last year, Stanford’s president was forced out over research misconduct in his labs. In an initial report into the Stanford investigation, STAT News reported data from Retraction Watch to the effect that a paper is retracted, on average, every other day for image manipulation. Retraction Watch had, at that time (Dec. 2022), 37,000 papers in its database. The top 5 most-retracted authors have at least 100 retracted papers each.

Into that mess, enter the ability to generate bespoke data on demand.
RLHF and Curation Transparency

November 9, 2023

By Gordon Hull

Last time, I followed a reading of Kathleen Creel’s recent “Transparency in Complex Computational Systems” to think about the ways that RLHF (Reinforcement Learning from Human Feedback) in Large Language Models (LLMs) like ChatGPT necessarily involves an opaque, implicit normativity. To recap: RLHF improves the models by involving actual humans (usually gig workers) in their training: the model presents two possible answers to a prompt, and the human tells it which one is better. As I suggested, and will pursue in a later post, this introduces all sorts of weird and difficult-to-measure normative aspects into the model performance, above and beyond those that are lurking in the training data. Here I want to pause to consider this as a question of opacity and transparency. I’m going to end up by proposing that there’s a fourth kind of transparency that we should care about, for both epistemic and moral reasons, which I’ll call “curation transparency.”
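
To make the recap concrete, here is a minimal sketch of how a single "which answer is better?" judgment can be turned into a training signal. It assumes a Bradley-Terry style preference model, which is a common choice in the RLHF literature but is not something the post itself specifies. The function names and the two scores are hypothetical, chosen only for illustration.

```python
import math


def preference_probability(reward_a: float, reward_b: float) -> float:
    """Bradley-Terry probability that answer A is preferred over answer B."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))


def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood of the rater's choice under the model above."""
    return -math.log(preference_probability(reward_chosen, reward_rejected))


# Hypothetical reward-model scores for two candidate answers to one prompt.
score_a, score_b = 1.2, 0.3

# If the rater picked A, the scores already agree with the rater, so the loss is small.
loss_if_a_chosen = pairwise_loss(score_a, score_b)
# If the rater picked B, the scores disagree with the rater, so the loss is larger.
loss_if_b_chosen = pairwise_loss(score_b, score_a)

print(loss_if_a_chosen < loss_if_b_chosen)
```

The point of the sketch is that the only thing the loss ever sees is which answer the rater picked. Whatever reasons or norms lay behind that choice do not appear anywhere in the number being optimized.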

(more…)
