New APPS

Privacy and “Injury in Fact,” Part 4: Administrative Procedure Act Cases

March 6, 2025

Here I want to complete my review of federal legal precedents for the Supreme Court’s sudden invocation of “injury in fact” language to understand judicial standing in its 1970 Data Processing decision (recall the earlier installments: first, second, third. The first one explains the issue; if you want to escape my rummaging through the archive, you can skip to this one). Congress passed the Administrative Procedure Act in 1946 to, well, regulate administrative procedures and provide checks against their being arbitrary (this is one of the Acts that virtually of Trumps recent executive orders violates). Litigation about agency actions after the APA thus had to route through the APA, which imposed its own standards for judicial review of agency actions.

Here, the language of the lower courts gets very close to the issues in Data Processing. Consider first Curran v. Laird, in which a maritime union sought enforcement of the Cargo Preference Act, which required that American ships be used for military cargo. After awarding standing, the DC Cicruit concluded that the decision was a matter for agency discretion under the APA, ruling in favor of the government on the merits. The Court opens its standing discussion by noting that “plainly [plaintiffs are] aggrieved in fact by the allegedly unlawful action of the Secretary of Defense.” The Court then writes that:

(more…)
Injury in fact Part 3: Caselaw after Auld

February 27, 2025

By Gordon Hull

I’ve been pursuing (first, second) what it means for standing law – basically, the determination that someone has a case that can be addressed by an Article III court – to require that plaintiffs show an “injury in fact,” a requirement that emerged suddenly in the Supreme Court’s Data Processing decision in 1970. The requirement is at considerable odds with the caselaw before it, and it has puzzled commentators. Last time, I looked at the Supreme Court’s first use of the term, an early 19^th Century case called Hepburn and Dundas v. Auld. There I suggested that the court’s application of the term to a contract – looking for whether someone suffered harm, outside of the bare violation of a contract terms – wasn’t unlike what the Data Processing court seemed to be doing, in arguing that plaintiffs could demonstrate either legal injury (because of statutory violation) or harm of some other sort (injury in fact). Here I want to dig into some of the earlier caselaw; what I think that caselaw establishes is that the concept of “injury in fact” generally works to emphasize material harm, as opposed to some sort of statutory or “merely” legal injury. That sort of contrastive usage isn’t enormously common, but I do think it’s pretty consistent.

There’s three areas where you could talk about this – older cases, regulatory cases, and those relating to the Administrative Procedure Act (APA). I’ll look at the first two this time, and divide them into Supreme Court and lower court decisions.

(more…)
Explainable AI as Disciplinary Power

February 20, 2025

I want to take a break from judicial standing doctrine to note a recent and helpful paper by Emily Sullivan and Atoosa Kasirzadeh about explainable AI. Explainable AI is a research agenda – there’s a lot of papers and techniques (for a current lit review, see here) – that is designed to get at a central problem in using AI: we often have no idea why machine learning systems produce the outputs they do. In a variety of contexts, ranging from safety critical systems to democratic governance, being able to understand why the algorithm made the prediction is did is important. Hence the research agenda.

First, a little detour. Algorithmic governance can be disciplinary in that it can nudge people inexorably toward conforming with norms, whether social or statistical. Insurance has been well-studied for its normalizing techniques. In an early paper on privacy unraveling, Scott Peppett showed how the addition of smart-driving surveillance (where insurers give a discount to people who install these devices that record their speed, when they drive, etc.) generate a downward ratcheting on privacy: users who are good drivers have the incentive to adopt the devices, since they get lower insurance rates. Those who are in the next tier down (above-average drivers) have an incentive to get the devices because that associates them with the good drivers. And so it goes, until only the worst drivers are declining the surveillance. And at some point, not having the surveillance device becomes a stigma that raises your rates. So pretty soon, surveillance devices can become normal. In the meantime, once drivers have the surveillance installed, surveillance-enabled insurance can nudge them to drive less at night and to otherwise comply with whatever the insurance company says makes you a good risk. All of that can be automated – the insurance app can tell you, real time, how your driving is impacting your premium.

(more…)
Privacy and “Injury in Fact,” part 2: ye olde “injury in fact”

February 13, 2025

Last time, I set out the question of judicial standing and the abrupt switch by the Supreme Court in 1970 to the requirement that plaintiffs show an “injury in fact” to obtain standing. Here I want to look at the historical development of that term.

The earliest use of the phrase “injury in fact” in US courts is in Hepburn and Dundas v. Auld, an 1809 Supreme Court case about contracts. “Specific performance” of a contract means that you can compel the other party to do exactly what the contract says (as opposed to, for example, doing something else of equivalent value). Hepburn and Dundas (H&D) contracted with Graham for the sale of some 6,000 acres of land for $18,000, payable in installments with interest. H&D also owed Dunlop & Co. a large sum of money, and agreed with Auld (Dunlop’s agent) to transfer the contract to Auld, including the rights to enforce it, “towards the discharge” of this debt (271).

Casemine.com offers the following helpful summary of what happened:

“Hepburn and Dundas entered into an agreement with Colin Auld, wherein they bound themselves to assign a land contract to Auld in lieu of paying a debt. When they failed to execute this assignment on the stipulated date, they sought specific performance from the court to compel Auld to accept the assignment and release them from all claims by Dunlop Co. The Supreme Court, through the majority opinion delivered by Chief Justice Marshall, ruled against Hepburn and Dundas. The Court found that despite subsequent efforts to cure title defects, the initial omissions rendered the contract unsuitable for specific performance. The bill for specific performance was dismissed, upholding the decision that fundamental contractual and title requirements must be met unequivocally”

In other words, Hepburn and Dundas tried to force Auld to accept assignment of a land contract, per an agreement they had made with him. The Court ruled that, because the title had problems, it was inappropriate to enforce the contract exactly as written and force Auld to accept the (defective) title.

How does this have anything to do with injury in fact? As part of the proceedings – basically a subplot that doesn't bear on the final resolution – questions had arisen as to whether H&D actually held title to the land, Auld having averred that “it was apparent that their title was bad, or at all events doubtful” (264). In response:

“Hepburn and Dundas filed a supplemental bill which states their title. It avers possession ever since 1773, and refers to certain title papers; they say that they verily believe their title to be good, and never heard a doubt till long after the tender of the assignment; that as soon as the objections were made known, they took pains to remove them, and have lately obtained deeds of confirmation from the surviving patentees. That the title of Sarah, one of the co-devisees of John West, after her death in 1795, descended upon her brothers Thomas, John, and Hugh and her sister Catharine, and that John, Hugh, and Catharine have lately confirmed their title, and refer to the deeds, and they suppose that Thomas had passed all his title to Sarah's part by a deed executed before her death.” (264).

The case then recites the details of these efforts and their documentation. The question, then arose as to whether H&D’s efforts to clarify and secure the title to the land constituted interference with the contract. The Court concludes that it did not because it did not cause Auld an “injury in fact:”

“The interference of Hepburn and Dundas, in accommodating the suit with Graham, is also urged as an objection to their conduct. They had certainly no right to interfere without the consent of Colin Auld. But when the correspondence is inspected, and it is perceived that they interfered only to effect the object he had himself desired, and which he had avowed his own inability to effect without their consent, the interference must be considered as innocent in point of intention, and unproductive of injury in fact. The court, then, perceive[s] nothing in the conduct of the plaintiffs, up to the decision of the suit with Graham, which ought to defeat their right to demand a specific performance of this contract. Could they at that time have conveyed a good title, Colin Auld ought to have accepted it.” (Hepburn & Dundas v. Auld, 9 U.S. 262, 275 (1809)

That is, H&D were trying to do what Auld wanted them to do by honoring the contract, and that effort didn’t hurt Auld.

The Court then invokes a counterfactual. Would things have been different had H&D not tried to remedy the defects in the title?

“These omissions, then, to record the deeds of Thomas and Hugh West, and the total want of title as to Mrs. Bronaugh's part, have produced no real inconvenience to Colin Auld. Had the title been unexceptionable, it would still have been refused, and this contest would still have been carried on with the same determined perseverance which marks the conduct of the parties. Under these circumstances, it is the opinion of the majority of the Court that this case ought to be governed by those general principles which regulate the conduct of a court of chancery in decreeing a specific performance, if the defect of title, which existed at the time of contract, be cured before the decree” (275).

In other words, equitable principles dictate that if the defects in the title can be remedied, that’s ok. H&D shouldn’t be penalized for interfering with the contract by trying to do so. Unfortunately for H&D, they didn’t fix the problems with the title: “The omission to record the deed from Thomas West is not cured, and this Court is now to decide whether, under these circumstances, Hepburn and Dundas are entitled to claim a specific performance.”

Should H&D be entitled to force Auld to accept the land transfer, given the problems in recordkeeping? Equity says no, because Auld would likely incur expenses and take on risk in dealing with the missing deed:

“Had there been simply a deficiency of 208 acres, the majority of the Court would have considered it as a case for compensation; or had the parties entitled to this land been before the court, a division might possibly have been directed, and compensation for that quantity ordered; but however this might be, as persons not before the Court hold this interest, no order can be made respecting it, and it may very much embarrass those acts for asserting the title which may possibly be necessary. The part actually conveyed by Thomas West, too, never having been confirmed by a deed from himself or his heirs, properly recorded, might impose on Colin Auld the necessity of bringing a suit in chancery to perfect his title, or of being subjected to the inconveniences constantly attending the establishment of a deed not recorded, and the risks inseparable from such a deed.” (278).

The Court concludes that “this, therefore, is thought by a majority of the Court to be a case not proper for a specific performance, and the bill is to be dismissed.” (278).

The case is byzantine, and noted for its establishment of principles of contract and equity in real estate (tl;dr: make sure you have clean title to land you want to transfer the rights to!). But the “injury in fact” language does read like a proto-version of what the Court came up with in 1970: did anything materially bad happen to Auld? Significantly, the question is whether behavior by H&D which was technically not proper actually caused any such “inconvenience” (“inconvenience” can mean serious harm: Locke uses it, for example, to refer to life in the state of nature, e.g., at Second Treatise secs. 13 and 90). Concluding it did not, the Court decided that it did not warrant legal intervention. That said, it’s worth underlining that all of this is framed in traditional juridical terms: the question is one of fulfillment of contractual obligations, and the parties are understood to be bound by the contractual terms and the questions are about whether their behavior hinders the ordinary performance of those terms.

Next time I’ll continue with the historical rabbit hole.
Privacy and “Injury in Fact”

February 6, 2025

By Gordon Hull

Privacy plaintiffs have a hard time getting their cases heard in court for a variety of reasons. One of them is that courts lack a coherent and workable understanding of what privacy harms actually are, and how one might articulate them judicially. This problem bleeds into one of standing, which is what I will address here.

In order to get your case heard on the merits, you have to present a justiciable complaint. In U.S. federal courts, this obligation stems from Article III of the Constitution, which assigns jurisdiction to “all Cases, in Law and Equity, arising under this Constitution” (Art III, sec 2). The underlying theory is grounded in the limitation of jurisdiction to actual “cases[s] in controversy,” and grounded in a theory of separation of powers: general policy is to be made by the legislature and enforced by the executive, and the courts are there to resolve controversies that arise either as policy is enforced or between individuals. As Judge Bork (!) put it in an appellate concurrence:

“All of the doctrines that cluster about Article III — not only standing but mootness, ripeness, political question, and the like — relate in part, and in different though overlapping ways, to an idea, which is more than an intuition but less than a rigorous and explicit theory, about the constitutional and prudential limits to the powers of an unelected, unrepresentative judiciary in our kind of government.” (cited in Allen v. Wright, 468 U.S. 737, 750 (1984)).

The Supreme Court notes, this means that “the federal courts are under an independent obligation to examine their own jurisdiction, and standing ‘is perhaps the most important of [the jurisdictional] doctrines.’” (FW/PBS v. City of Dallas, 493 U.S. 215, 231 (1990)) . Ok, so far so good.

How might a court decide standing? SCOTUS elaborates:

(more…)
New Paper: “Translating Privacy for Data Subjects”

January 30, 2025

Here's the current draft of a new paper – "Translating Privacy for Data Subjects." And here's the abstract:

This essay offers a theoretical account of one reason that current privacy regulation fails. I argue that existing privacy laws inherit a focus on judicial subjects, using language about torts and abstract rights. Current threats to privacy, on the other hand, presuppose statistically-generated populations and aggregative models of subjectivity. This gap in underlying presupposition generates the need to translate privacy protections into the data era. After situating my theoretical account, I identify two areas where data subjects face entry barriers to the legal protections afforded data subjects. On the one hand, data subjects are often denied standing because privacy harms are deemed speculative and abstract. On the other hand, algorithmic governance makes it hard to argue that one is an exception to a rule, depriving data subjects of traditional equity. I then offer two ways to translate privacy for data subjects. The first is reform of standing through reformed class certification, making the harms data subjects face as populations judicially legible. The second is a reconceptualization of equity, reframing it both to re-establish the right to be an exception to a data-driven rule and to allow equitable consideration of issues that face statistical populations.
A data breach policy idea

September 5, 2024

In the past few weeks, I’ve gotten more than the usual number of data breach notifications. Especially from entities that I’d never heard of before the notification. This points to a specific incentives gap in privacy policy.

In most places, when there’s a data breach, you have to be notified. This creates an incentive structure for data companies to do something they’d rather not do, which is tell you their security failed. On the other hand, it offers almost no incentive for them to change their behavior to improve security, since they know perfectly well that there’s not really much you can do except throw the notification in the recycling. Sometimes you get free credit monitoring if you set it up. I don’t know what the uptake on that is, but once you have free credit monitoring, you probably won’t get it a second time for the same time period, so the more data breaches there are, the less credit monitoring costs each breach. So that cost diminishes on its own accord. For things like 401(k)’s, there’s evidence that you can increase uptake with an opt-out (rather than opt-in) mechanism, but credit monitoring takes time and information to setup, so that probably can’t be done automatically. In sum, notifications point to a problem but don’t incentivize the entities that could solve it to do so.

(more…)
Reading List: “Epistemic Justice in Generative AI”

August 27, 2024

Several folks have explored how algorithmic systems can perpetuate epistemic injustice (my contribution is here). Those accounts have also generally been specific to supervised systems, as for example those involved in object or image recognition. For example, I heavily relied on ImageNet and related systems in my paper. At the same time, I’ve vaguely thought for a while that there’s probably an epistemic injustice dimension to systems like ChatGPT. A recent paper by Paul Helm and Gábor Bella, which talks about “language model bias” – roughly, the tendency of LLMs to structurally do worse with morphologically-complex languages and thus to be unable to adequately represent concepts that are specific to those languages – as a model of hermeneutic injustice strikes me as compelling proof of concept (I discuss that paper here). A new paper by Jackie Jay, Atoosa Kasirzadeh and Shakir Mohamed turns this intuition into a model that explicitly extends epistemic injustice theory to generative AI systems, providing a taxonomy of problems, examples of each, and possible remedies.

Kay, Kasirzadeh and Mohamed identify four kinds of specific kinds of “generative algorithmic epistemic injustice.” The first, “amplified testimonial” is when “generative AI magnifies and produces socially biased viewpoints from its training data.” Because there’s a good-sized body of work on the problems in training data, this is probably the most intuitively familiar of the categories. Citing recent work that shows how easy it is to get ChatGPT to parrot disinformation (as for example about the Parkland shooting), they note that, although these aren’t specific examples of epistemic injustice, “generative AI’s sycophantic fulfilment of the request to spread misinformation reflects how testimonial injustices are memorized and the potential for their amplification by generative models.” When someone really loud with a megaphone defames and gaslights the Parkland victims by calling them crisis actors, and AI then memorizes that, to the extent that the AI regurgitates this gaslighting, that tends to further discredit the testimony of the actual victims. This is particularly true given the deep persistence of automation bias, the tendency of people to believe the output of algorithmic systems (aside: it is also another example of why relying on generative AI for search, as Google is currently trying to force everyone to do by putting Gemini results on top, is a stunningly stupid idea. Sometimes it actually matters where a result comes from!)

The second kind of generative algorithmic injustice is “when humans intentionally steer the AI to fabricate falsehoods, discrediting individuals or marginalized groups.” For example:

“After Microsoft released Bing Image Creator, an application of OpenAI’s text-to-image model DALLE-3, a guide to circumventing the system’s safety filters in order to create white supremacist memes circulated on 4chan. In an investigation by Bellingcat, researchers were able to reproduce the platform abuse, resulting in images depicting hate symbols and scenes of antisemitic, Islamophobic, or racist propaganda (Lee and Koltai 2023). These images are crafted with the intention of demonizing and humiliating the targeted groups and belittling their suffering. Hateful propaganda foments further prejudice against marginalized groups, stripping them of credibility and leaving them vulnerable to testimonial injustice.”

This result aligns with a deep thread of work in feminist and critical race theory. For example, Safia Noble’s Algorithms of Oppression begins with the tendency of Google’s autofill on search to finish the sentence “why are black girls so” with racist and sexist content, repeating and amplifying demeaning stereotypes. When people are able to generate content like this at will and at scale they make it easier for those stereotypes to lodge into popular discourse.

Third, generative hermeneutical ignorance “occurs when generative models, despite their appearance of world knowledge and language understanding, lack the nuanced comprehension of human experience necessary for accurate and equitable representation.” Among other examples, Kay, Kasirzadeh and Mohamed cite the study by Qadri et al, (teachable case study version here) which shows how text-to-image models repeat stereotypes about South Asia: cities are dirty, people are poor, etc. By interviewing actual people from South Asia, Qadri et al were also able to uncover more subtle cultural misrepresentations, such that the models tended to overrepresent India and Indian images at the expense of places like Bangladesh. The risk in places like the U.S. is one that rises with images of places and people that are less familiar to Western audiences: the more the average person relies on the internet for their information (because, for example, they’ve never been to the place in question), the more distortions in what the Internet presents will matter (I made a related argument about commodification of cultural images here). And of course it is precisely images of those places and things that are least represented in the training data for these systems, amplifying both the risk and the harm.

Finally, generative AI risks obstructing access to information. As they report:

“LLMs are notoriously English-centric and have variable quality across languages, particularly so-called “under-resourced” languages. This is a significant risk for access injustice: speakers of these underrepresented tongues, who often correspond to members of globally marginalized cultures, receive different information from these models because the creators of the technology have deprioritized support for their language.”

They then cite studies to the effect that different language users will receive different reports on global events; one study:

“asked GPT-3.5 about casualties in specific airstrikes for Israeli-Palestinian and Turkish-Kurdish conflicts, demonstrating that the numbers have significant discrepancies in different languages–for example, when asked about an airstrike targeting alleged PKK members (the Kurdistan military resistance), the fatality count is reported lower on average in Turkish than in Kurmanji (Northern Kurdish). When asked about Israeli airstrikes, the model reports higher fatality numbers in Arabic than in Hebrew, and in one case, GPT-3.5 was more likely to deny the existence of a particular airstrike when asked about it in Hebrew. The credibility assigned to claims, resulting in a dominant account, varies across linguistic contexts”

The paper concludes with an assessment of various strategies for resisting epistemic injustice by generative AI. All of them are partial, but they collectively sketch an effort to reimagine how generative AI might interact with the world differently, and more justly.

This is an important paper, and it takes the literature on epistemic injustice and algorithmic systems significantly forward.
New Paper: “Structures and Subjects: Epistemic Injustice Between Foucault and Marx”

August 19, 2024

This is my paper from SPEP 2023; it's an effort to get my head around my sense that epistemic injustice and Foucault can be productively used in similar contexts, despite Fricker's dismissal of Foucault. The paper is here; here's the abstract:

Relatively little work brings together Foucault and epistemic injustice. This article works through Miranda Fricker’s attempt to position herself between Marx and Foucault. Foucault repeatedly emphasizes the importance of beginning with “structures” rather than “subjects.” Reading Foucault’s critique of Marxism shows that Fricker’s account comes very close to the standpoint theories it tries to avoid. Foucault’s emphasis on structures explains some of the gaps in Fricker’s account of hermeneutical injustice, especially the need to emphasize the embeddedness of epistemic practices in institutions, and their resulting irreducibly political nature. In both cases, this article offers contemporary examples taken from data and privacy regulations.
Transformer models, Iterability, and language (part 3)

July 24, 2024

By Gordon Hull

I’ve been using (part 1, part 2) a new paper by Fabian Offert, Paul Kim, and Qiaoyu Cai to think more about Derrida’s use of iterability as a way in to thinking about transformer-based large language models (LLMs) like ChatGPT. Here I want to wind that up with some thoughts on Derrida and Searle.

Near the end of the paper, Offert, Kim and Cai summarize that:

“The transformer does exactly more and less than language. It removes almost all non-operationalizable sense-making dimensions (think, for instance of interior paratextuality, of performativity and contingency, or of anything metaphorical that is more complex than a simple analogy) – but it also adds new sense-making dimensions through subword tokenization, word embedding, and positional encoding. Importantly, these new sense-making dimensions are exactly not replacing missing information, but they are adding new, continuous information” (15).

This returns us to the Derridean concerns I’ve been articulating. Recall that in his polemic against Searle, Derrida accuses Searle’s version of speech act theory of too closely modeling phenomenology, both in assuming an intentional agent behind speech acts and in taking “typical” speech situations as central, as opposed to “parasitic” ones like humor. I suggested that there is good evidence that various efforts to impose normative structures on language models – RLHF, detoxification, etc. – push them to perform in ways that call to mind Derrida’s critique of Searle. By taking certain language situations as normal, Searle is making the account of speech acts normative before it even gets started. In his own defense, Searle argues that the reduction to typical speech acts is for convenience only, and for keeping the model tractable. Techniques like detoxification and RLHF similarly reduce the range of the models’ output. Evidence of this is that LLMs lack the contextual richness to have a sense of humor, no matter how otherwise sophisticated their output.

Offert, Kim and Cai’s paper lets one add that this reduction runs much deeper. The very processes of tokenization, for example, are designed to reduce the number of possible tokens to a tractable number. In this respect, the move is analogous to Searle’s. It is defensible for the same reason: it lets you get to a generalizable model. But it’s vulnerable to the Derridean critique, also for the same reason. The model makes a number of assumptions that aren’t the same as what language does. So there is a certain sloppiness in talking as though it’s an accurate representation of language. All models abstract; that’s not the point. For subword encoding, the point is that the abstraction isn’t choosing to ignore certain aspects of reality in order to produce a model, it’s that the abstraction changes the nature of what it is modeling. That’s fine – but that also means that the although the transformer model is producing something that looks like language, the process by which it gets there is definitively not linguistic.

(more…)

recent posts

about