By Gordon Hull

Large Language Models (LLMs) like ChatGPT burst into public consciousness sometime in the second half of last year, and ChatGPT’s impressive results have led to a wave of concern about the future viability of any profession that depends on writing, or on teaching writing in education.  A lot of this is hype, but one issue that is emerging is the role of AI authorship in academic and other publications; there are already a handful of submissions that list AI co-authors.  An editorial in Nature published on Feb. 3 outlines the scope of the issues at hand:

“This technology has far-reaching consequences for science and society. Researchers and others have already used ChatGPT and other large language models to write essays and talks, summarize literature, draft and improve papers, as well as identify research gaps and write computer code, including statistical analyses. Soon this technology will evolve to the point that it can design experiments, write and complete manuscripts, conduct peer review and support editorial decisions to accept or reject manuscripts”

As a result:

“Conversational AI is likely to revolutionize research practices and publishing, creating both opportunities and concerns. It might accelerate the innovation process, shorten time-to-publication and, by helping people to write fluently, make science more equitable and increase the diversity of scientific perspectives. However, it could also degrade the quality and transparency of research and fundamentally alter our autonomy as human researchers. ChatGPT and other LLMs produce text that is convincing, but often wrong, so their use can distort scientific facts and spread misinformation.”

The editorial then gives examples of LLM-based problems: incomplete results, bad generalizations, inaccurate summaries, and other easily generated errors.  It emphasizes accountability for the content of material (the use of AI should be clearly documented) and the need for the development of truly open AI products as part of a push toward transparency.

There are lots of examples of LLMs misbehaving, but here’s one from Jeremy Faust of MedPage Today that should be alarming about the danger of trusting AI-based research.  Faust asked OpenAI to diagnose a patient whom he described (using medical jargon) as “age 35, female, no past medical history, presents with chest pain which is pleuritic — worse with breathing — and she takes oral contraception pills. What's the most likely diagnosis?”  The AI did really well with the diagnosis, but it also reported that the condition was “exacerbated by the use of oral contraceptive pills.”  Faust had never heard this before, so he asked the AI for its source.  Things went rapidly downhill:

“OpenAI came up with this study in the European Journal of Internal Medicine that was supposedly saying that. I went on Google and I couldn't find it. I went on PubMed and I couldn't find it. I asked OpenAI to give me a reference for that, and it spits out what looks like a reference. I look up that, and it's made up. That's not a real paper.  It took a real journal, the European Journal of Internal Medicine. It took the last names and first names, I think, of authors who have published in said journal. And it confabulated out of thin air a study that would apparently support this viewpoint.”

So much for the ChatGPT lit review.

Journal editors have been trying to get ahead of matters. Eric Topol has been following these developments, from which I’ll extract a bit here.  Nature – and the other Springer Nature journals – just established the following policy:

“First, no LLM tool will be accepted as a credited author on a research paper. That is because any attribution of authorship carries with it accountability for the work, and AI tools cannot take such responsibility.  Second, researchers using LLM tools should document this use in the methods or acknowledgements sections. If a paper does not include these sections, the introduction or another appropriate section can be used to document the use of the LLM”

The editorial announcing this policy also speaks in terms of the credibility of the scientific enterprise: “researchers should ask themselves how the transparency and trustworthiness that the process of generating knowledge relies on can be maintained if they or their colleagues use software that works in a fundamentally opaque manner.”   Science has updated its policies “to specify that text generated by ChatGPT (or any other AI tools) cannot be used in the work, nor can figures, images, or graphics be the products of such tools. And an AI program cannot be an author. A violation of these policies will constitute scientific misconduct no different from altered images or plagiarism of existing works.”  The JAMA network journals have similarly added to their “author responsibilities” that “Nonhuman artificial intelligence, language models, machine learning, or similar technologies do not qualify for authorship” and that:

“If these models or tools are used to create content or assist with writing or manuscript preparation, authors must take responsibility for the integrity of the content generated by these tools. Authors should report the use of artificial intelligence, language models, machine learning, or similar technologies to create content or assist with writing or editing of manuscripts in the Acknowledgment section or the Methods section if this is part of formal research design or methods”

One hesitates to speak of a “consensus” at this point, but these are three of the most influential journals in science and medicine, and so their position is at least going to be very influential.

There is now also a call for a more nuanced conversation; in a new white paper, Ryan Jenkins and Patrick Lin argue that Nature’s policy is “hasty.”  On the one hand, reducing AI to the acknowledgments page could understate and obscure its role in a paper, undermining the goal of transparency.  On the other hand, the accountability argument strikes them as too blunt. “For instance, authors are sometimes posthumously credited, even though they cannot presently be held accountable for what they said when alive, nor can they approve of a posthumous submission of a manuscript; yet it would clearly be hasty to forbid the submission or publication of posthumous works.”  They thus argue for assessing matters on a continuum with two axes: continuity (“How substantially are the contributions of AI writers carried through to the final product?”) and creditworthiness (“Is this the kind of product a human author would normally receive credit for?”) (9).

Some of these concerns will be recognizable from pre-AI authorship debates.  For example, it is difficult to know when someone’s contribution to a paper is sufficient to warrant crediting them as an author, particularly for scientific work that draws on a lot of different anterior processes, subparts, and so forth.  Authorship rules have also tended to discount the labor of those whose material support has been vital to the outcome (such as childcare workers) or whose involvement and creative contribution have been essential in the development of the project (this is a particular concern when researchers go and study disadvantaged communities).  In both cases, institutional practices seem complicit in perpetuating unjustified (and inaccurate) understandings of how knowledge is produced.  Max Liboiron’s papers try to redress this with much longer author lists than one would ordinarily expect, as part of a broader effort to decolonize scientific practices.

I mention Liboiron because, at the risk of sounding old-fashioned, something that seems important to me is that the people they include are, well, people, and in particular people facing structural injustice.  Lin’s concern with posthumous authorship isn’t about living people, but it is about personhood more broadly.  Of course I am not going to argue that metaphysical personhood is either interesting or relevant in this context.  What I am going to argue is that there is an issue of legal personhood that underlies the question of accountability and authorship. In other words, whether AI should be an author is basically the question of whether it makes sense to assign personhood to it, at least in this context.  It seems to me that this is what should drive questions about AI authorship, and not either metaphysical questions about whether AI “is” an author, or questions about the extent to which its output resembles that of a person.

In Foucauldian terms, “author” is a political category, and we have historically used it precisely to negotiate accountability for creation.  As Foucault writes in his “What is an Author?” essay, authorship is a historically specific function, and “texts, books, and discourses really begin to have authors … to the extent that authors became subject to punishment, that is, to the extent that discourse could be transgressive” (Reader, 108).  In other words, it’s about accountability and individuation: “The coming into being of the notion of ‘author’ constitutes the privileged moment of individualization in the history of ideas, knowledge, literature, philosophy, and the sciences” (101).  We see this part of the author function at work in intellectual property, where the “author” is also the person who can get paid for the work (there’s litigation brewing in the IP world about AI authorship and invention).  As works-for-hire doctrine indicates, the person who actually produces the work may not ever be the author: if I write code for Microsoft for a living, I am probably not the author of the code I write.  Microsoft is.

Given that “author” names a political and juridical function, it seems to me that there are three reasons to hesitate about assigning the term “author” to an AI, which I’ll start on next time.


2 responses to “Some Reasons to be Skeptical of AI Authorship, Part 1: What is an (AI) Author?”

  1. Michael Muller

    I agree and disagree, for complicated reasons.
    0. Positionality
    I write this from a research role in industry, but I often write for academic audiences. My projects involve humans and AI(s), sometimes with a concern for social justice. I try to remain aware that “science” means many different things, and that humans think and write from many perspectives. For the past two years, I have organized workshops on Human-Centered AI and generative AI at academic conferences, while writing papers about human data work in data science.
    1. Agreement
    I agree that ChatGPT is not a reliable or accountable co-author. When ChatGPT is used to create text, I think it serves as a kind of unreliable “digest service,” selectively repeating portions of works that it has “read.” If that service is considered as authorship, then Google Scholar is also a potential co-author (and it lies less often). The same could be said of Semantic Scholar and even of ResearchGate. And of course arXiv.
    When I write as a member of a team, I’m often tasked with writing parts of the Related Work section, and then the Discussion section (which shows how the project changes our view of the Related Work). That contribution makes me a kind of “intelligent digest service.” I’m pretty sure that I write more critically than ChatGPT, and that I keep in mind the goal of the paper and the arc of the argument. Do those attributes constitute a difference in kind, or a difference in degree?
    So far, I’m cautiously agreeing with you.
    2. Disagreement
    However, I think we might want to revisit our concept of authorship – or our concepts (plural) of authorship. Some academic papers have been written by collectives. Some academic papers have been written by anonymous authors, who fear retribution for what they have said. (If you scoff at this fear, and if you have tenure, then please consider early career scholars who write in precarity, and also industry scholars like me who are permanently in precarity.) It is difficult to hold Anonymous accountable for what they write. It would also be academically tragic if we refused to publish Anonymous’s work.
    Further, a growing number of papers in Geography are written with Land or Country as a co-author (where Land and Country are reflected through Indigenous and Aboriginal Nations and cultures). In some cases, a particular Country is the lead author. You can find examples by searching Google Scholar for “Bawaka Country” as author, or by following the publications of the Gay’wu Group of Women; more generally, you can search for “sentient landscape.” The humans involved with these more-than-human entities are knowledge-holders for their Nations, and thus they are the ones to decide who and what is an author. If we require that authorship follow the EuroWestern model, then we are asserting one set of cultural values in a hegemonic way.
    3. The Cultural Challenge
    I think there is a challenge here to decolonize our concepts of authorship (and I thank you for citing Liboiron’s work in this area). In the EuroWestern tradition, we often treat authorship as a kind of individualist ownership of intellectual property. We may learn more about the international and intercultural spectrum of authorship by considering broader perspectives – such as collectivist ideas of community responsibility, and relational views on reality to include more-than-human entities.
    4. Thank You
    Thank you for a very interesting essay. I hope we’ll have more to say about this, and more to say to one another.
    Michael Muller, michael_muller@us.ibm.com
    (I write as an individual. IBM may or may not agree with what I have written.)


  2. Gordon

    Thanks for the comment! I think we’re going in the same direction on this, though it’s probably not that clear from what I’ve written so far. So the initial point I want to hold on to is that authorship is a political function that’s changed historically. I probably emphasize this b/c I’ve done a lot of work on intellectual property, where the author function does specific legal work, and it doesn’t always even intersect with the creator of a work. One of the things I think IP tends to do is uncritically elevate a view of authorship derived from literary romanticism (the myth of the genius up in his loft, creating great works completely by himself), and so I tend to start from a position that’s critical of that. None of that is here – but in the third part of this (in my head but not drafted very coherently yet), I’m going to try to make a social justice point. That’s why I’m mentioning folks like Liboiron here, to set that up. I’ll be interested to see if you think that’s responsive to the issues you’re raising here, because I 100% agree that they’re important issues. If I’m understanding you correctly, the examples you use indicate that authorship can be used (or not used) to protect vulnerable folks, and so we should view it almost as a tool of antisubordination (that’s not phrasing I’ve tried before, but I’m thinking about the scholarship that argues that “equal protection” should be used in this way), either by protecting the vulnerable or by elevating collective efforts. At any rate, the third part is going to try to theoretically ground that kind of argument. It’ll be the first time I’ve tried to put the argument together, so I’ll be interested in feedback.
    Like you, I like to think I am a better contributor to a team than ChatGPT! For one, I won’t lie about what the lit says… my sense is that the journal editors are right to emphasize accountability as the central variable here – that’s the claim of the next one of these posts. I played with it a bunch in early December, and have the screenshots saved to talk through this issue in the context of getting it to talk about Heidegger and technology (sort of a philosophy inside joke, because he hated technology). It… wasn’t good. I came away from it thinking that it’s producing what Aristotle called “doxa” – the sort of general, ambient view of something. I should probably work that up, because it’s evidence in support of your first point – part of what I did was use google, and it immediately got a bunch of things right that ChatGPT got wrong, and of course (this is really important to an academic like me) it pointed me to its sources. Yesterday I gave it a prompt I gave my undergrad existentialism class, and the results were similar. I’ll put that up today.
    That’s a long answer, and I realize it’s substantially a promissory note – hopefully I’ll provoke more response as I go along!

