• I want to take a break from Derrida and language models this week to explore an emerging policy issue.  As is impossible to miss, "AI" is everywhere.  Not everything that claims to be "AI" really is, but it's getting hard to avoid things that call themselves "AI" as the AI companies look to make the technology profitable.  This is happening despite the decidedly lukewarm public attitude toward AI.  Recent Pew research, for example, shows that AI experts are very enthusiastic about it, while the public isn't: only 17% of the adults surveyed thought AI would have a positive effect on the US over the next 20 years, and concern is growing.

    This has generated at least three industry responses.  One is to push for deregulation of AI at the federal level.  Industry advocates nearly snuck a total ban on state regulation of AI into Trump's spending bill; it was excised at the last minute by the Senate on a 99-1 vote.  Industry has simultaneously tried to get the executive branch to push (mostly unregulated) AI as vital to national economic competitiveness and security.  Trump has obliged repeatedly, starting with an executive order all the way back in January.  Trump is all in on this AI narrative, but it has been a consistent U.S. approach to, and story about, AI for quite a while.

    The second and third approaches try to (for lack of a better term) engineer stronger public support.  The second takes the form of PR campaigns about the inevitability and magnificence of AI and the need for it to be shepherded by the incumbent AI companies.  Those who aren't as fully on board the train – women, for example – are chastised and presented as doing damage to their careers; their concerns are frequently ignored.  The third is related: an all-out push to get AI into education at every level.  Ohio State and Florida have mandated that AI be part of the entire curriculum (what this means, other than a branding exercise, nobody knows).  OpenAI is doing everything it can to make itself ubiquitous on college campuses.  Microsoft is dropping a cool $4 billion on AI education in K-12, and OpenAI and Microsoft are sponsoring teacher training.  A couple of weeks ago, Trump dropped an executive order promoting AI in education.

    (more…)

  • By Gordon Hull

    As part of thinking through the implications of Lydia Liu's papers (here and here) demonstrating a Wittgensteinian influence on the development of large language models, I've made a detour into Derrida's critique of writing (my earlier parts: one, two, three).  My initial suggestion last time was that Derrida's discussion is designed to show that "Platonism" is a political move (not a metaphysical one).  For Derrida the Platonic priority of voice over writing disguises the fact that both are (in his own terms) repetitions of the eidos, and so the claim that writing is bad is the claim that it's the wrong kind of repetition.  I suggested that for Platonism as read by Derrida, one could easily imagine a hierarchy of writing systems, based on their proximity to voice/speech.  Chinese ideography – which Liu argues is central to Masterman's breakthroughs in computer language modeling – would be at the very bottom of a Platonic hierarchy.  But because this is Derrida, we can neither proceed quickly nor proceed without talking about Hegel.  So I closed last time with a long passage from Hegel in which he denigrates Chinese writing for being insufficiently spiritual and too hard to learn.  Today I'll start with why that passage is relevant.

    (1) First, Derrida takes up Hegel’s understanding of language in “The Pit and the Pyramid,” first delivered in 1968 and thus almost exactly contemporaneous with “Plato’s Pharmacy.”  There, working from the other end of metaphysics (Hegel, not Plato), Derrida describes such a hierarchy:

    (more…)

  • By Gordon Hull

    I've been looking (part 1, part 2) at a couple of articles by Lydia Liu (here and here) demonstrating a Wittgensteinian influence on the development of large language models.  Specifically, Wittgenstein's emphasis on the meaning of words as determined by their contexts and placement relative to other words gets picked up by Margaret Masterman's lab at Cambridge and then becomes integrated into the vector semantics models that underlie current LLMs.  Along the way, Liu argues that the Masterman approach to language, which learns a lot from Chinese ideographs, in this sense goes farther than the Derridean critique of logocentrism.  Here I want to transition to Derrida's critique, not to criticize Liu's account, but to see an additional point in Wittgenstein, one that Derrida takes further.

    To recall, Liu notes that one effect of the Wittgenstein-Masterman approach is to overturn the logocentrism in Western writing, but that Masterman is doing something different from Derrida, who remains in the space of alphabetic writing:

    “For Masterman, to overcome Western logocentrism means opening up the ideographic imagination beyond what is possible by the measure of alphabetical writing. This is important, as it follows that the scientist’s and philosopher’s reliance on conceptual categories derived from alphabetical writing in their commitment to logical precision and systematization as well as their deconstruction must likewise be subjected to post-Wittgensteinian critique” (Witt., 437)

    As she adds a few pages later, “I am fully convinced that Masterman is the first modern philosopher to push the critique of Western metaphysics beyond what is possible by the measure of alphabetical writing, and, unlike deconstruction, her translingual philosophical innovation refuses to stay within the bounds of self-critique” (Witt., 444).

    (more…)

  • Last time, I started a look at the work of the early AI researcher Margaret Masterman of the Cambridge Language Research Unit (CLRU).  As demonstrated by Lydia Liu in a pair of articles (here and here), Masterman proceeded from Wittgenstein to a thorough deconstruction of traditional ideas of word meaning, moving instead to treating meaning as a function of a word's associations, as we might find in a thesaurus.  This approach is a clear forerunner to the distributed view of language applied in current LLMs.  Here I'll outline the basics of the Masterman approach and show how it applies to LLMs.

    Masterman's starting point is a Wittgensteinian point about the distinction between a word and a pattern.  Counting with words would be "one, two, three."  Counting with patterns would be "-, --, ---."  But what if we counted "one, one one, one one one"?  Can words function as patterns?  Masterman applies the thought to the classical Chinese character "zi" (字, which I'll write here as "zi"), the meaning of which depends on its context and placement in a given text.  Thus, "for Masterman, the zi is what makes the general and abstract category of the written sign possible, for not only does the zi override the Wittgensteinian distinction of word and pattern, but it also renders the distinction of word and nonword superfluous" (Witt., 442).
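
    Since "meaning as a function of a word's associations" can sound abstract, here is a minimal sketch of the distributional intuition in Python.  The toy corpus, window size, and word choices are all invented for illustration – this is not Masterman's actual CLRU procedure, only the vector-semantics descendant of it that current LLMs inherit: a word's "meaning" is read off the pattern of its co-occurrences with other words.

        # Toy sketch (not Masterman's method): build each word's vector of neighboring
        # words, then compare words by the similarity of those vectors.
        from collections import Counter, defaultdict
        from math import sqrt

        corpus = [
            "the judge read the opinion and wrote a ruling",
            "the court issued a ruling and the judge wrote an opinion",
            "the student read a poem and wrote an essay",
            "the poet wrote a poem about the sea",
        ]

        window = 2                      # neighbors on each side that count as "context"
        vectors = defaultdict(Counter)  # word -> Counter of co-occurring neighbors

        for sentence in corpus:
            words = sentence.split()
            for i, w in enumerate(words):
                for j in range(max(0, i - window), min(len(words), i + window + 1)):
                    if j != i:
                        vectors[w][words[j]] += 1

        def cosine(u, v):
            # Cosine similarity between two sparse count vectors.
            dot = sum(u[k] * v[k] for k in u)
            norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
            return dot / norm if norm else 0.0

        print(cosine(vectors["judge"], vectors["court"]))
        print(cosine(vectors["judge"], vectors["poem"]))

    On a four-sentence corpus the numbers mean very little; the point is the mechanism.  Nothing in the procedure consults a definition or a referent – "meaning" is nothing but a word's position in a web of associations, which is the thesaurus intuition scaled down, and which embedding models scale up.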

    (more…)

  • I've been loosely tracking the AI and copyright cases, most notably the Thaler litigation, where Thaler keeps losing the argument that work produced solely by an AI should get copyright protection.  To summarize: everybody who has ruled on the question has said that only work involving humans can get copyright protection.  As I said at the time, I think a good policy reason in support of this position is that if pure AI output could get copyright, an AI could produce millions of copyrighted images in almost zero time.  That's got nothing to do with incentivizing human creation.  It was easy to miss given the deluge of atrocious Supreme Court decisions, but last week a pair of district court judges ruled on a different (but not unrelated, in terms of markets) AI copyright question – whether scraping online text for training data is fair use.  Both cases are in the Northern District of California, so we can expect the 9th Circuit to have the first appellate decision on this topic.

    By way of background: fair use is an affirmative defense against copyright infringement.  That means that if you accuse me of infringement, I can defend myself as having engaged in "fair use," which basically means "use that the copyright owner doesn't like, but that we as a society think should be allowed for policy reasons."  It could also mean "use that everybody thinks is ok, but for which licensing would be so inefficient that a licensing market would never emerge."  Fair use is supposed to be decided case-by-case.  It depends on four factors: the "purpose and character" of the (allegedly infringing) use, the nature of the copyrighted work, the amount of the work used, and the market effects of the infringing use.  The middle two factors tend not to matter much.  The first factor is usually decided by asking whether the use in question is "transformative."  For example, consider parody: the Supreme Court ruled back in 1994 that a 2 Live Crew parody of Roy Orbison's "Pretty Woman" was fair use.  The most closely analogous case I know of to the training-data question was an appellate decision about Google thumbnails.

    (more…)

  • By Gordon Hull

    There's an emerging literature on Large Language Models (LLMs, like ChatGPT) that basically argues that they undermine a bunch of our existing assumptions about how language works.  As I argued in a paper a year and a half ago, there's an underlying Cartesianism in a lot of our reflections on AI, which relies on a mind/body distinction (people have minds, other things don't), and then takes language use as sufficient evidence that one's interlocutor has a mind.  As I argued there, part of what makes LLMs so alarming is that they clearly do not possess a mind, but they do use language.  They're the first example we have of artifacts that can use language; language use is no longer sufficient to indicate mindedness.  In that paper, I drew the implication that we need to abandon our Cartesianism about AI (caring whether it "has a mind") and become more Hobbesian (thinking about the sociopolitical and regulatory implications of language-producing artifacts).  Treating LLMs as the origin points of speech has real risks, including making the human labor that produces them invisible and making it harder to impose liability, since machines can't meet a standard scienter requirement for assigning tort liability.

    Here I want to take up a somewhat different thread, one that I started exploring a while ago under the general topic of iterability in language models.  This thread takes the literature on language models seriously; where I want to go with it is to talk about an under-discussed latent Platonism in how we tend to approach language (and language models).  I'll start with the literature, which divides into a couple of strands, a Wittgensteinian and a Derridean.

    1. The Wittgensteinian Rejection of Cartesian AI

    Lydia Liu makes the case for a direct Wittgensteinian influence on the development of ML, via the Cambridge researcher Margaret Masterman.  I only ran into this work recently, so on the somewhat hubristic assumption that other folks in philosophy also don't know it, I'll offer a basic summary here (in my defense: Liu says that "the news of AI researchers' longtime engagement with Wittgenstein has been slow to arrive."  She then adds that "the truth is that Wittgenstein's philosophy of language is so closely bound up with the semantic networks of the computer from the mid-1950s down to the present that we can no longer turn a blind eye to its embodiment in the AI machine" (Witt., 427)).

    (more…)

  • The NSF had attempted to reduce indirect costs (F&A) on all future grants to 15%, in a somewhat more coherent version of the NIH's effort to do so for all ongoing and future grants.  A federal court today enjoined the rate cut, vacating the new rule and finding that the "National Science Foundation's 15% Indirect Cost Rate and the Policy Notice: Implementation of Standard 15% Indirect Cost Rate, NSF 25-034 are invalid, arbitrary and capricious, and contrary to law."

     

  • Former NewAPPS blogger Helen de Cruz passed away yesterday, Friday, June 20, 2025.  I never met Helen personally, but they were part of a vibrant community of bloggers at NewAPPS that welcomed me and provided the supportive context for my own development as a blogger ten years ago.  That experience was, and is, very important to me, and I will always be grateful for it.

    I recommend Eric Schliesser's remarks here.

    Helen's NewAPPS writings are here.


  • "Your Brain on ChatGPT": I wish I'd come up with that title, but it actually belongs to a new study led by Natalia Kosmyna of the MIT Media Lab.  The study integrates brain imaging with questions and behavioral data to explore what happens when people write essays using large language models (LLMs) like ChatGPT.  I haven't absorbed it all yet – and some of the parts on brain imaging are well beyond my capacity to assess – but the gist of it is to confirm what one might suspect: that writing essays with ChatGPT isn't really very good exercise for your brain.  The study assigned participants to one of three groups and had each write essays.  One got to use an LLM, one used Google search, and one wasn't allowed to use either.

    The results weren’t a total surprise:

    “Taken together, the behavioral data revealed that higher levels of neural connectivity and internal content generation in the Brain-only group correlated with stronger memory, greater semantic accuracy, and firmer ownership of written work. Brain-only group, though under greater cognitive load, demonstrated deeper learning outcomes and stronger identity with their output. The Search Engine group displayed moderate internalization, likely balancing effort with outcome. The LLM group, while benefiting from tool efficiency, showed weaker memory traces, reduced self-monitoring, and fragmented authorship. This trade-off highlights an important educational concern: AI tools, while valuable for supporting performance, may unintentionally hinder deep cognitive processing, retention, and authentic engagement with written material. If users rely heavily on AI tools, they may achieve superficial fluency but fail to internalize the knowledge or feel a sense of ownership over it” (138)

    These results were corroborated by the brain imaging, and "brain connectivity systematically scaled down with the amount of external support: the Brain‑only group exhibited the strongest, widest‑ranging networks, Search Engine group showed intermediate engagement, and LLM assistance elicited the weakest overall coupling" (2, from the abstract).  That is:

    (more…)

  • This paper, which has been forthcoming in the Journal of Medicine and Philosophy for a while, is my foray into AI and healthcare, particularly medical imaging.  It synthesizes some of what I have to say about structural injustice in AI use (and why "bias" isn't the right way to assess it), and uses a really interesting case study from the literature to explore why it's important to understand AI as part of sociotechnical systems – and how understanding it as part of sociotechnical systems makes a big difference in seeing when/how it can be helpful (or not).  Here's the abstract:

    Enthusiasm about the use of AI in medicine has been tempered by concern that algorithmic systems can be unfairly biased against racially minoritized populations. This paper uses work on racial disparities in knee osteoarthritis diagnoses to underline that achieving justice in the use of AI in medical imaging will require attention to the entire sociotechnical system within which it operates, rather than isolated properties of algorithms. Using AI to make current diagnostic procedures more efficient risks entrenching existing disparities; a recent algorithm points to some of the problems in current procedures while highlighting systemic normative issues that need to be addressed while designing further AI systems. The paper thus contributes to a literature arguing that bias and fairness issues in AI be considered as aspects of structural inequality and injustice and to highlighting ways that AI can be helpful in making progress on these.