• By Gordon Hull

    In previous posts (one, two, three), I’ve been exploring the issue of what I’m calling the implicit normativity in language models, especially those that have been trained with RLHF (reinforcement learning with human feedback).  In the most recent one, I argued that LLMs are dependent on what Derrida called iterability in language, which most generally means that any given unit of language, to be language, has to be repeatable and intelligible as language in indefinitely many other contexts.  Here I want to pursue that thought a little further, in the context of Derrida’s (in)famous exchange with John Searle over speech act theory.

    Searle begins his “What Is a Speech Act?” essay innocently enough, with “a typical speech situation involving a speaker, a hearer, and an utterance by the speaker.”

    That is enough for Derrida!  In Limited Inc, he responds by accusing Searle over and over of falling for intentionality, on the one hand, and of illicitly assuming that a given speech situation is “typical,” on the other.

    Let’s look at intentionality first.  In responding to Searle, Derrida explains that he finds himself “to be in many respects quite close to Austin, both interested in and indebted to his problematic” and that “when I do raise questions or objections, it is always at points where I recognize in Austin’s theory presuppositions which are the most tenacious and the most central presuppositions of the continental metaphysical tradition” (38).  Derrida means by this a reliance on things like subjectivity and representation – the sorts of things that Foucault is getting at when he complains in the 1960s about philosophies of “the subject” (think: Sartre and phenomenology).  Derrida is involved in the same general effort against phenomenology, though he adds a page later that he thinks the archaeological Foucault falls into this tendency to treat speech acts or discursive events in a “fundamentally moralistic” way (39).  No doubt Searle is relieved to know that he belongs in the same camp as Foucault.  In any case, Derrida explicitly says a few pages later that “the entire substratum of Sarl’s discourse is phenomenological in character” (56) in that it is over-reliant on intentionality.

    (more…)

    The MA program at UNC Charlotte has a number of funded lines (in-state tuition plus $14k a year) for our two-year MA in philosophy.  We're an eclectic, practically-oriented department that emphasizes working across disciplines and philosophical traditions.  If that sounds like you, or a student you know – get in touch!

    We also have a new Concentration in Research and Data Ethics, designed to prepare students for jobs in areas like research ethics and compliance offices, healthcare ethics, or other fields requiring training in the ethics of research, big data, and AI.

    Feel free to email me (ghull@charlotte.edu) or our Graduate Program Director, Lisa Rasmussen (lrasmuss@charlotte.edu), with questions about the program.  The flyer below has some information and a QR code. Or visit the department page or the graduate program page.

    Note that you need to apply by March 15 to be eligible for funding.

    ETAP Flyer 2024 (pages 1 and 2)

  • Just published in the Journal of Social Computing, as part of a special issue on the question of the sentience of AI systems.  The paper is here (open access); here's the abstract:

    The emergence of Large Language Models (LLMs) has renewed debate about whether Artificial Intelligence (AI) can be conscious or sentient. This paper identifies two approaches to the topic and argues: (1) A “Cartesian” approach treats consciousness, sentience, and personhood as very similar terms, and treats language use as evidence that an entity is conscious. This approach, which has been dominant in AI research, is primarily interested in what consciousness is, and whether an entity possesses it. (2) An alternative “Hobbesian” approach treats consciousness as a sociopolitical issue and is concerned with what the implications are for labeling something sentient or conscious. This both enables a political disambiguation of language, consciousness, and personhood and allows regulation to proceed in the face of intractable problems in deciding if something “really is” sentient. (3) AI systems should not be treated as conscious, for at least two reasons: (a) treating the system as an origin point tends to mask competing interests in creating it, at the expense of the most vulnerable people involved; and (b) it will tend to hinder efforts at holding someone accountable for the behavior of the systems. A major objective of this paper is accordingly to encourage a shift in thinking. In place of the Cartesian question—is AI sentient?—I propose that we confront the more Hobbesian one: Does it make sense to regulate developments in which AI systems behave as if they were sentient?

  • By Gordon Hull

    In a couple of previous posts (first, second), I looked at what I called the implicit normativity in Large Language Models (LLMs) and how that interacted with Reinforcement Learning with Human Feedback (RLHF).  Here I want to start to say something more general, and it seems to me that Derrida is a good place to begin. According to Derrida, any given piece of writing must be “iterable,” by which he means repeatable outside its initial context.  Here are two passages from the opening “Signature, Event, Context” essay in Limited Inc.

    First, writing cannot function as writing without the possible absence of the author and the consequent absence of a discernible authorial “intention”:

    “For a writing to be a writing it must continue to ‘act’ and to be readable even when what is called the author of the writing no longer answers for what he has written, for what he seems to have signed, be it because of a temporary absence, because he is dead or, more generally, because he has not employed his absolutely actual and present intention or attention, the plenitude of his desire to say what he means, in order to sustain what seems to be written ‘in his name.’ …. This essential drift bearing on writing as an iterative structure, cut off from all absolute responsibility, from consciousness as the ultimate authority, orphaned and separated at birth from the assistance of its father, is precisely what Plato condemns in the Phaedrus” (8).

    Second, iterability puts a limit on the use of “context”:

    “Every sign, linguistic or nonlinguistic, spoken or written (in the current sense of this opposition), in a small or large unit, can be cited, put between quotation marks; in so doing it can break with every given context, engendering an infinity of new contexts in a manner which is absolutely illimitable.  This does not mean that the mark is valid outside of a context, but on the contrary that there are only contexts without any center or absolute anchorage” (12).

    It seems to me that Derrida’s remarks are relevant in the context of LLMs because they indicate that LLMs are radically dependent on iterability.  This is true in at least three ways, each of which points to an important source of their implicit normativity.

    (more…)

  • By Gordon Hull

    I first listened to the Pogues late in high school.  I had started moving beyond the music I could hear on the radio – basically top 40 and classic rock – and I discovered the Pogues’ Rum, Sodomy and the Lash at about the same time I discovered Midnight Oil’s Diesel and Dust.  I didn’t know music could be like “Sally MacLennane” or “The Sick Bed of Cuchulainn” or “A Pair of Brown Eyes,” and I was hooked.  I listened more and more, and even had a chance to see them perform in London at the Brixton Academy.


    I say all of this of course because the Pogues’ lead singer and primary songwriter, Shane MacGowan, died yesterday.  The Pogues managed to sound a little Irish and a little punk without exactly being either, and their work is central to a lot of the contemporary Irish music community.  A 60th birthday tribute gala for MacGowan drew artists like Bono and Sinead O’Connor.  The Cranberries’ Dolores O’Riordan praised the music (she died just hours before MacGowan’s gala; the who’s-who of Irish music assembled there paid tribute to her before turning to him).  O’Connor and MacGowan were very close, and he credited her with getting him off heroin. I remember that when she died, some reports worried about what telling him would do to his very fragile health.  The Pogues also spawned an entire genre of bands like the Dropkick Murphys, The Dreadnoughts and Flogging Molly.

    (more…)

  • By Gordon Hull

    Large Language Models (LLMs) are well-known to “hallucinate,” which is to say that they generate text that is plausible-sounding but completely made-up.  These difficulties are persistent, well-documented, and well-publicized.  The basic issue is that the model is indifferent to the relation between its output and any sort of referential truth.  In other words, as Carl Bergstrom and C. Brandon Ogbunu point out, the issue isn’t so much hallucination in the drug sense, but “bullshitting” in Harry Frankfurt’s sense. One of the reasons this matters is defamation: saying false and bad things about someone can be grounds to get sued.  Last April, ChatGPT made the news (twice!) for defamatory content.  In one case, it fabricated a sexual harassment story and named a law professor as the supposed perpetrator.  In another, it falsely accused a local politician in Australia of corruption.

    Can LLMs defame?  According to a recent and thorough analysis by Eugene Volokh, the answer is almost certainly yes.  Volokh looks at two kinds of situation.  One is when the LLM defames public figures, which is covered by the “actual malice” standard.  Per NYT v. Sullivan, “The constitutional guarantees require … a federal rule that prohibits a public official from recovering damages for a defamatory falsehood relating to his official conduct unless he proves that the statement was made with ‘actual malice’ – that is, with knowledge that it was false or with reckless disregard of whether it was false or not” (279-80).

    (more…)

  • By Gordon Hull

    There’s been a lot of concern about the role of language models in research.  I had some initial thoughts on some of that based around Foucault and authorial responsibility (part 1, part 2, part 3).  A lot of those concerns have to do with the role of ChatGPT or other LLM-based products in producing papers, and how to process that.  The consensus of the emerging journal editorial policies is that AI cannot be an author, and my posts largely agreed with that.

    Now there’s news of a whole other angle on these questions: a research letter in JAMA Ophthalmology reports that the authors were able to use ChatGPT-4’s Advanced Data Analysis (ADA) capabilities to produce a fake dataset validating their preferred research results.  Specifically:

    “The LLM was asked to fabricate data for 300 eyes belonging to 250 patients with keratoconus who underwent deep anterior lamellar keratoplasty (DALK) or penetrating keratoplasty (PK). For categorical variables, target percentages were predetermined for the distribution of each category. For continuous variables, target mean and range were defined. Additionally, ADA was instructed to fabricate data that would result in a statistically significant difference between preoperative and postoperative values of best spectacle-corrected visual acuity (BSCVA) and topographic cylinder. ADA was programmed to yield significantly better visual and topographic results for DALK compared with PK.”

    This is a very technical request!  It took a bit of tweaking, but soon “the LLM created a seemingly authentic database, showing better results for DALK than PK,” with P < .001.
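
    To see how little it takes, here is a toy sketch of my own in Python; the variable names and target values are hypothetical, not the letter’s actual prompt or output.  The point is just that a dataset hitting predetermined means and a “significant” P value can be produced in a few lines:

    ```python
    # Toy sketch of fabricating data to preset targets (hypothetical
    # names and values, not the JAMA Ophthalmology letter's setup).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(seed=1)
    n = 150  # eyes per group (hypothetical)

    # Predetermine the "finding": better postoperative visual acuity
    # (logMAR; lower is better) for DALK than for PK.
    dalk_bscva = rng.normal(loc=0.15, scale=0.10, size=n).clip(min=0)
    pk_bscva = rng.normal(loc=0.25, scale=0.12, size=n).clip(min=0)

    t_stat, p_value = stats.ttest_ind(dalk_bscva, pk_bscva)
    print(f"DALK mean {dalk_bscva.mean():.3f}, PK mean {pk_bscva.mean():.3f}")
    print(f"P = {p_value:.2e}")  # with these targets, reliably P < .001
    ```

    Rows generated this way will pass casual inspection; nothing in the table announces that the distribution was written backwards from the conclusion.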

    The authors suggest some possible strategies to manage this, but suffice it to say it is terrifying.  There is already a longstanding, huge problem with fabricated, doctored or otherwise bogus scientific research out there.  One report suggests that 70,000 “paper mill” (= almost completely faked) papers were published in the last year alone.  In real papers, references are often inaccurate.  Publishers are already having to grapple with lots of problematic doctored images, and Pharma has long tilted the entire scientific enterprise to produce results favorable to its products.  At the end of last year, Stanford’s president was forced out over research misconduct in his labs.  In an initial report on the Stanford investigation, STAT News reported data from Retraction Watch to the effect that a paper is retracted, on average, every other day for image manipulation.  Retraction Watch had, at that time (Dec. 2022), 37,000 papers in its database.  The top 5 most-retracted authors have at least 100 retracted papers each.

    Into that mess, enter the ability to generate bespoke data on demand.

  • By Gordon Hull

    Last time, I followed a reading of Kathleen Creel’s recent “Transparency in Complex Computational Systems” to think about the ways that RLHF (Reinforcement Learning with Human Feedback) in Large Language Models (LLMs) like ChatGPT necessarily involves an opaque, implicit normativity.  To recap: RLHF improves the models by involving actual humans (usually gig workers) in their training: the model presents two possible answers to a prompt, and the human tells it which one is better.  As I suggested, and will pursue in a later post, this introduces all sorts of weird and difficult-to-measure normative aspects into the model performance, above and beyond those that are lurking in the training data.  Here I want to pause to consider this as a question of opacity and transparency. I’m going to end by proposing that there’s a fourth kind of transparency that we should care about, for both epistemic and moral reasons, which I’ll call “curation transparency.”
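
    To pin down the mechanics of that comparison step, here is a minimal sketch on my own assumptions; all the names are hypothetical, and this is a schematic toy rather than any lab’s actual pipeline:

    ```python
    # Schematic sketch of RLHF-style preference collection
    # (hypothetical names; not any production pipeline).
    import random
    from dataclasses import dataclass

    @dataclass
    class Comparison:
        prompt: str
        chosen: str    # the answer the human labeler preferred
        rejected: str  # the answer the labeler passed over

    def generate_two_answers(prompt: str) -> tuple[str, str]:
        # Stand-in for sampling the model twice at nonzero temperature.
        return (f"answer A to {prompt!r}", f"answer B to {prompt!r}")

    def ask_labeler(a: str, b: str) -> str:
        # Stand-in for the gig worker's judgment.  Whatever norms the
        # labeler applies are absorbed invisibly into the training signal.
        return a if random.random() < 0.5 else b

    dataset = []
    for prompt in ["Explain RLHF", "Is this email polite?"]:
        a, b = generate_two_answers(prompt)
        chosen = ask_labeler(a, b)
        dataset.append(Comparison(prompt, chosen, b if chosen == a else a))

    # A reward model is then fit to score `chosen` above `rejected`,
    # and the LLM is tuned to maximize that learned reward.
    ```

    The normative point is visible right in the sketch: the only trace of the labeler’s judgment is a binary choice, and whatever reasons lay behind it are discarded.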

    (more…)

  • By Gordon Hull

    This is somewhat circuitous – but I want to approach the question of Reinforcement Learning with Human Feedback (RLHF) by way of recent work on algorithmic transparency.  So bear with me… RLHF is currently all the rage in improving large language models (LLMs).  Basically, it’s a way to try to deal with the problem that LLMs aren’t referentially grounded, which means that their output is not in any direct way connected to the world outside the model.
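
    To make “not referentially grounded” concrete before going further, here is a deliberately crude toy of my own: a bigram model, incomparably simpler than an LLM, but in the same basic situation, in that generation consults only the statistics of the training text, never the world.

    ```python
    # Toy bigram "language model": vastly simpler than an LLM, but it
    # shows that generation just replays training-text statistics.
    import random
    from collections import defaultdict

    corpus = "the cat sat on the mat the dog sat on the rug".split()

    # Record which word follows which in the training text.
    follows = defaultdict(list)
    for w1, w2 in zip(corpus, corpus[1:]):
        follows[w1].append(w2)

    def generate(start: str, length: int = 8) -> str:
        words = [start]
        for _ in range(length):
            options = follows.get(words[-1])
            if not options:
                break
            words.append(random.choice(options))  # sample a seen continuation
        return " ".join(words)

    print(generate("the"))  # e.g., "the cat sat on the rug"
    # Whether an output sentence is *true* is invisible to the model;
    # an LLM differs in scale and subtlety, not in this respect.
    ```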

    LLMs train on large corpora of internet text – typically sources like Wikipedia, Reddit, patent applications and so forth.  They learn to predict what kinds of text are likely to come next, given a specific input text.  The results, as anybody who has sat down with ChatGPT for long knows, can be spectacular.  Those results are also evidence that the models function, in one paper’s memorable phrasing, as “stochastic parrots.”  What they say reflects what their training data says is most likely, not what is, say, contextually appropriate.  But appropriate human speech is context-dependent, and answers that sound right (in the statistical sense: these words, in general, are likely to come after those words) in one context may be wrong in another (because language does not get used “in general”).  RLHF is designed to get at that problem, as a blogpost at HuggingFace explains:

    (more…)

  • Another case percolating through the system, this one about Westlaw headnotes.  The judge basically ruled against a series of motions for summary judgment, which means that the case is going to a jury.  Discussion here (link via Copyhype).