  • By Gordon Hull

    There’s been a lot of concern about the role of language models in research.  I had some initial thoughts on some of that based around Foucault and authorial responsibility (part 1, part 2, part 3).  A lot of those concerns have to do with the role of ChatGPT or other LLM-based products and how to process that.  The consensus of the journal editorial policies that are emerging is that AI cannot be an author, and my posts largely agreed with that.

    Now there’s news of a whole other angle on these questions: a research letter in JAMA Ophthalmology reports that the authors were able to use ChatGPT-4’s Advanced Data Analysis (ADA) capabilities to produce a fake dataset validating their preferred research results.  Specifically:

    “The LLM was asked to fabricate data for 300 eyes belonging to 250 patients with keratoconus who underwent deep anterior lamellar keratoplasty (DALK) or penetrating keratoplasty (PK). For categorical variables, target percentages were predetermined for the distribution of each category. For continuous variables, target mean and range were defined. Additionally, ADA was instructed to fabricate data that would result in a statistically significant difference between preoperative and postoperative values of best spectacle-corrected visual acuity (BSCVA) and topographic cylinder. ADA was programmed to yield significantly better visual and topographic results for DALK compared with PK”

    This is a very technical request!  It took a bit of tweaking, but soon “the LLM created a seemingly authentic database, showing better results for DALK than PK,” P < .001.

    The authors suggest some possible strategies to manage this, but suffice it to say it is terrifying.  There is already a longstanding, huge problem with fabricated, doctored or otherwise bogus scientific research out there.  One report suggests that 70,000 “paper mill” (= almost completely faked) papers were published in the last year alone.  In real papers, references are often inaccurate.  Publishers already are having to grapple with lots of problematic doctored images, and Pharma has long tilted the entire scientific enterprise to produce results favorable to its products.  At the end of last year, Stanford’s president was forced out over research misconduct in his labs.  In an initial report into the Stanford investigation, STAT News reported data from Retraction Watch to the effect that a paper is retracted, on average, every other day for image manipulation.  Retraction Watch had, at that time (Dec. 2022), 37,000 papers in its database.  The top 5 most-retracted authors have at least 100 retracted papers each.

    Into that mess, enter the ability to generate bespoke data on demand.

  • By Gordon Hull

    Last time, I followed a reading of Kathleen Creel’s recent “Transparency in Complex Computational Systems” to think about the ways that RLHF (Reinforcement Learning from Human Feedback) in Large Language Models (LLMs) like ChatGPT necessarily involves an opaque, implicit normativity.  To recap: RLHF improves the models by involving actual humans (usually gig workers) in their training: the model presents two possible answers to a prompt, and the human tells it which one is better.  As I suggested, and will pursue in a later post, this introduces all sorts of weird and difficult-to-measure normative aspects into the model performance, above and beyond those that are lurking in the training data.  Here I want to pause to consider this as a question of opacity and transparency.  I’m going to end up by proposing that there’s a fourth kind of transparency that we should care about, for both epistemic and moral reasons, which I’ll call “curation transparency.”
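
    To make the recap concrete, here is a minimal sketch of how “the human tells it which one is better” can be turned into a training signal.  It assumes a Bradley-Terry style preference model, which is one common way of formalizing pairwise comparisons; it is not a description of how ChatGPT or any particular system is actually trained.  The answer labels, the function names and the learning-rate settings are all hypothetical.  Note that the only thing recorded is which answer won each comparison, never why.

```python
import math


def preference_probability(reward_a: float, reward_b: float) -> float:
    """Bradley-Terry probability that answer A is preferred over answer B."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))


def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood of the human's recorded choice."""
    return -math.log(preference_probability(reward_chosen, reward_rejected))


def fit_rewards(comparisons, answers, lr=0.1, epochs=200):
    """Fit one scalar reward per answer from (chosen, rejected) pairs.

    Illustrative assumption: a real reward model scores arbitrary text with
    a neural network; here each answer simply gets its own number.
    """
    rewards = {answer: 0.0 for answer in answers}
    for _ in range(epochs):
        for chosen, rejected in comparisons:
            p = preference_probability(rewards[chosen], rewards[rejected])
            # Gradient step on the pairwise loss: raise the chosen answer's
            # reward and lower the rejected one's by the same amount.
            rewards[chosen] += lr * (1.0 - p)
            rewards[rejected] -= lr * (1.0 - p)
    return rewards


# Hypothetical labels standing in for three candidate answers to one prompt.
answers = ["polite", "curt", "rude"]
# Each tuple records a single human judgment: (preferred, not preferred).
comparisons = [("polite", "curt"), ("curt", "rude"), ("polite", "rude")]

rewards = fit_rewards(comparisons, answers)
ranking = sorted(answers, key=lambda a: rewards[a], reverse=True)
print(ranking)
```

    Whatever norms led the raters to prefer one answer over another survive only as these fitted numbers, which is one way of seeing why the normativity involved is implicit and hard to measure.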

    (more…)

  • By Gordon Hull

    This is somewhat circuitous – but I want to approach the question of Reinforcement Learning from Human Feedback (RLHF) by way of recent work on algorithmic transparency.  So bear with me… RLHF is currently all the rage in improving large language models (LLMs).  Basically, it’s a way to try to deal with the problem that LLMs aren’t referentially grounded, which means that their output is not in any direct way connected to the world outside the model.

    LLMs train on large corpora of internet text – typically sources like Wikipedia, Reddit, patent applications and so forth.  They learn to predict what kinds of text are likely to come next, given a specific input text.  The results, as anybody who has sat down with ChatGPT for long knows, can be spectacular.  Those results also evidence that the models function, in one paper’s memorable phrasing, as “stochastic parrots.”  What they say reflects what their training data makes most likely, not what is, say, contextually appropriate.  But appropriate human speech is context-dependent, and answers that sound right (in the statistical sense: these words, in general, are likely to come after those words) in one context may be wrong in another (because language does not get used “in general”).  RLHF is designed to get at that problem, as a blogpost at HuggingFace explains:

    (more…)

  • Another case percolating through the system, this one about Westlaw headnotes.  The judge basically ruled against a series of motions for summary judgment, which means that the case is going to a jury.  Discussion here (link via Copyhype).

  • This article from Gizmodo reports on research done over at Mozilla.  Newer cars – the ones that connect to the internet and have lots of cameras – are privacy disasters.  Here’s a paragraph to give you a sense of the epic scope of the disaster:

    “The worst offender was Nissan, Mozilla said. The carmaker’s privacy policy suggests the manufacturer collects information including sexual activity, health diagnosis data, and genetic data, though there’s no details about how exactly that data is gathered. Nissan reserves the right to share and sell “preferences, characteristics, psychological trends, predispositions, behavior, attitudes, intelligence, abilities, and aptitudes” to data brokers, law enforcement, and other third parties.”

    Nissan’s response tells you everything that’s wrong with current privacy legislation:

    ““When we do collect or share personal data, we comply with all applicable laws and provide the utmost transparency,” said Lloryn Love-Carter, a Nissan spokesperson. “Nissan’s Privacy Policy incorporates a broad definition of Personal Information and Sensitive Personal Information, as expressly listed in the growing patchwork of evolving state privacy laws, and is inclusive of types of data it may receive through incidental means.””

    Let’s translate.  Nissan is probably compliant.  Also, privacy compliance is a joke.  Also, compliance apparently only requires that you receive NOTICE that they take your data AND CONSENT to that policy, probably merely by driving the vehicle.  Also, they probably reserve the right to change their privacy policies unilaterally, at will.  Also, they almost certainly do not let you opt out of any of it while CONSENTING by driving the car.  It’s a very special kind of “contract” and “consent.”  Also, how do they know about your sex life?  Also, even if you have sex in the car, there is basically no answer to that question that is not beyond creepy!

    As you may have guessed, NOTICE AND CONSENT is an utter sham and has been for a while.  The Gizmodo article spells out some of the particular absurdities here – for example, you may not want to ride in one of these cars either, as passengers are “users” deemed to have CONSENTED to the privacy policy.  Your driver should probably provide you NOTICE beforehand!  “A number of car brands say it’s the driver’s responsibility to let passengers know about their car’s privacy policies—as if the privacy policies are comprehensible to drivers in the first place.”  No wonder folks are cynical and resigned about corporate privacy – they’re manipulated into it by corporations.  Also, they’re confused, frustrated and angry about the fact they don’t actually get to consent.

    This is the best example I’ve seen of all that in a while, and a crystal-clear indicator of why we need not just new privacy legislation (we do!) but a new direction (more real regulation, less soft compliance and “notice and consent” fig-leaves).

    PS – not picking only on Nissan:

    “Other brands didn’t fare much better. Volkswagen, for example, collects your driving behaviors such as your seatbelt and braking habits and pairs that with details such as age and gender for targeted advertising. Kia’s privacy policy reserves the right to monitor your “sex life,” and Mercedes-Benz ships cars with TikTok pre-installed on the infotainment system, an app that has its own thicket of privacy problems.”

  • By Gordon Hull

    I’ve been developing (first, second, third, fourth) some reflections on what Foucault means by a reference to “Chardino-Marxism,” a disturbing trend that he credits Althusser with “courageously fighting.”  The real opposition point seems to be Roger Garaudy, a PCF intellectual who is a leader in the effort to establish a post-Stalinist humanist Marxism, and who had a real sympathy for religion.  Last time, I traced some of Garaudy’s sources on religion to Engels.  Some of what Garaudy says also sounds like it’s coming straight from the Russian Marxist Anatoly Lunacharsky’s Religion and Socialism (1908).  The claim here is categorically not that Garaudy read Lunacharsky – as will become evident in a minute, I think that’s highly unlikely.  What I do want to underscore is that there is a coherent line of thought behind Garaudy’s religious impulse.  As I’ll note when I get back to Garaudy and Althusser, there is a very specific political context to Althusser’s attacks on Garaudy having to do with the latter’s role in the PCF and his effort to use a humanist Marxism as a (from Althusser’s point of view, failed) alternative to Stalinism. 

    I know very little about Lunacharsky (Wikipedia here), but apparently he was tolerated by Lenin (despite being criticised heavily), and fell out of favor under Stalin.  He died before he could be repressed, but in 1936-8, his memoirs were banned and he was erased from the official histories of communism.  He enjoyed something of a revival after Stalin’s death.  Religion and Socialism is very obscure now: Google books reports a Yiddish translation (!) as well as a Spanish one from the 1970s.  It’s not been translated into English or French.  Marxists.org refers only to his later works in English and in French, and he doesn’t even show up on the German part.

    Most of the work available on Lunacharsky now seems to be attributable to patient work by Roland Boer (upon whom I am completely dependent here).  Religion and Socialism fell out of favor due to Lenin’s denunciation after it was published, was left out of Lunacharsky’s collected works, and was reduced to a few copies.  Here is Boer in his paper on the text:

    (more…)

  • By Gordon Hull

    Over the course of a few posts (first, second, third), I’ve been exploring the question of what Foucault means when he refers disparagingly to “Chardino-Marxism” in a mid-1960s interview, comparing it unfavorably to what Althusser and his circle are doing.  Although the “Chardino” part refers to Teilhard de Chardin, it’s fairly clear that the real target is humanist Marxism, of which Roger Garaudy is taken to be a leading example, probably due to his role in the PCF.  Here I want to take an initial look at the chapter “Marxism and Religion” in Garaudy’s Marxism in the Twentieth Century and situate it with reference to Engels and Marx.

    Garaudy’s chapter is long, and his general concern is to vindicate the idea that a non-institutional version of religion (or of faith, not religion) has its place in Marxist discourse.  This is a carefully-defined position: he is at pains to distinguish the sense of faith he is talking about from most of what passes under the name.  In the present context, a few features should be noted.  First, Teilhard de Chardin is one of his principal reference points for a contemporary person who is going in the direction he wants (the other is Dietrich Bonhoeffer).  Second, this is a recuperative project of humanism: it is “man” that is of concern the entire time.  Third, Garaudy gets his Marx references almost entirely from the early writings.  There are very few references to Capital in the chapter, and relatively few to Engels: most of the work is done in the 1845 and earlier writings.  This is significant not least because of Althusser’s well-known denunciation of early Marx as ideological and pre-scientific.  In other words, the version of Marx being used to support the humanist reading is the same one Althusser wants to get rid of in the name of anti-humanism (I will complicate this point below).  Finally, Garaudy puts a lot of emphasis on the possibility of “love,” which he thinks can be rescued from a Platonic version that goes via God (I love the other insofar as I see God in them) to the direct love of alterity and other people.

    (more…)

  • By Gordon Hull

    This one has been percolating a while… Stephen Thaler’s AI created a picture (below the fold), and Thaler has been using it to push for the copyrightability of AI-generated material.  That endeavor has been getting nowhere, and a DC District Court just ruled on the question of “whether a work generated autonomously by a computer falls under the protection of copyright law upon its creation,” in the same way as a work generated by a person.  Copyright attaches to human work very generously – this blog post is copyrighted automatically when I write it, and so are doodles you make on napkins.  You get lots of extra protections and litigation benefits if you register, but registration is not a requirement for copyright in itself.  Per 17 U.S.C. Sec. 102, copyright subsists in “original works of authorship fixed in any tangible medium of expression, now known or later developed, from which they can be perceived, reproduced, or otherwise communicated, either directly or with the aid of a machine or device.”  Given this, it’s not hard to see why someone would want to know whether AI could be an “author” in the relevant sense.

    The Court ruled that “United States copyright law protects only works of human creation.”  This is not a surprise.  The central argument is that “Copyright is designed to adapt with the times. Underlying that adaptability, however, has been a consistent understanding that human creativity is the sine qua non at the core of copyrightability, even as that human creativity is channeled through new tools or into new media.”  Indeed, “human authorship is a bedrock requirement of copyright.”  The Court both cites historical precedent and grounds it in the purpose of Copyright, which is constitutionally to incentivize the creation of new works:

    (more…)

  • By Gordon Hull

    The last couple of times (here then here), I’ve started trying to work through a disparaging reference in the mid-1960s Foucault to “Chardino-Marxism.”  Foucault is associating it with Marxist humanism, and comparing it unfavorably to the Althusserian alternative.  As I noted, the name Foucault uses is Teilhard de Chardin, but the consistent target of the Foucault-aligned theorists appears to be Roger Garaudy.

    So why, exactly, might Teilhard appeal to Marxism?  More precisely, in what sense would Teilhard appeal to Garaudy?  In a 1969 paper, Ladis KD Kristof offers some context (for Kristof’s remarkable life, see the memorial notice here).  The “Phenomenon Teilhard” was widely discussed within the Soviet bloc countries, and within the USSR as early as 1962; a Russian translation of Teilhard’s Phenomenon of Man appeared in 1965.  Kristof suggests that the initial Marxist attraction to Teilhard lies simply in that he has a world view – something they can respect, as opposed to (for example) American positivism or empiricism.  More specifically, Teilhard: (a) has a scientific worldview, in that he has a Baconian belief that science can solve all problems; (b) has an evolutionary worldview, arguably even more so than Marx. On Kristof’s account, the difference is first in scope: Teilhard’s evolution is cosmic and Marx’s human.

    This leads to a second fundamental difference; following Engels in Anti-Dühring, Marxists think that when man [I am following 1960s usage here – this is the generic “man”] starts taking control of nature (= making history), that is the final qualitative change, and that future changes are quantitative.  Teilhard, on the other hand, thinks that the end of the process of what he calls “hominization” will involve a qualitative leap.  However, both camps are fundamentally anthropocentric in that “man” is the focus throughout.  Finally, (c) Marxism involves a movement of faith: if one is struggling for the revolution, this requires a prior faith that one can effect progress and so forth; in this, there is a convergence with Teilhard’s optimism.  Something of the sense of all this is conveyed in the following (long) passage from Teilhard’s Future of Mankind (I’m getting it from Kristof, who quotes part of it):

    (more…)

  • By Gordon Hull

    Last time, I noted that mid-late 1960s Foucault aligned himself in favor of Althusser’s work on Marx, and against what he called “Chardino-Marxism,” which turns out to be a shorthand for humanist Marxism, in particular any efforts to synthesize Marx and Teilhard de Chardin, as well as (or rather, as exemplified by) the work of PCF intellectual Roger Garaudy.  Foucault’s opposition to “humanism” is well-known, but his differentiation of Marxism into less-desirable humanist varieties and a more-desirable Althusserian one is less so.  I therefore want to pursue the Chardino-Marxism critique further, because it helps us understand the context in which the humanist critique appears, as well as Foucault’s subsequent efforts to position himself relative to Marxism in the 1970s (obligatory self-promotion: my foray into that is here).

    In the 1966 interview, “L’homme est-il mort? [Is man dead?]”, Foucault gives as clear a position statement as I’ve seen on all of this.  The interview is roughly contemporaneous with Order of Things, and certainly the more detailed exposition of humanism and Marxism’s place in it there needs to be taken into account in any full discussion.  The interviewer had asked if Foucault differentiated among different kinds of humanism, naming Sartre.  Foucault responds that “if you set aside the facile humanism that Teilhard and Camus represent, the problem of Sartre appears completely different.”  Foucault then stops talking about Sartre and offers a general characterization: “humanism, anthropology and dialectical thought are related.  What ignores man, is contemporary analytic reason which we saw born with Russell, [and] which appears in Levi-Strauss and the linguists.”  On the other hand, dialectics, Foucault says, promotes the idea that the human being “will become an authentic and true man.”  That is, it “promotes man to man and, to this extent, it is indissociable from humanist morality.  In this sense, the great officials of contemporary humanism are evidently Hegel and Marx” (D&E I, 569).  So we are back to the Lindung interview, where Foucault accuses Garaudy of having indiscriminately “picked up everything from Hegel to Teilhard de Chardin” (discussed last time), though with perhaps an emerging sense of what that lineage looks like.

    (more…)