I am also sure you don’t, because the disease is made up. Unfortunately, that didn’t stop major LLMs from credulously talking about it. Almira Osmanovic Thunström, a Swedish researcher, wrote a couple of obviously fake papers inventing the condition and posted them on a preprint server. Within weeks, according to this report in Nature by Chris Stokel-Walker, LLMs were treating the condition as real, and it later showed up in peer-reviewed papers.
It’s worth emphasizing just how obviously fake the papers are. Stokel-Walker reports:
“Osmanovic Thunström planted many clues in the preprints to alert readers that the work was fake. Izgubljenovic [the made-up author] works at a non-existent university called Asteria Horizon University in the equally fake Nova City, California. One paper’s acknowledgements thank “Professor Maria Bohm at The Starfleet Academy for her kindness and generosity in contributing with her knowledge and her lab onboard the USS Enterprise”. Both papers say they were funded by “the Professor Sideshow Bob Foundation for its work in advanced trickery. This works is a part of a larger funding initiative from the University of Fellowship of the Ring and the Galactic Triad”. Even if readers didn’t make it all the way to the ends of the papers, they would have encountered red flags early on, such as statements that “this entire paper is made up” and “Fifty made-up individuals aged between 20 and 50 years were recruited for the exposure group”.”
Why did the LLMs nonetheless accept them? It’s not obvious, but one answer the article suggests is that they were formatted to look like medical journal articles. Stokel-Walker cites research by Mahmud Omar showing “that LLMs are more prone to hallucinate and elaborate on misinformation when the text they’re processing looks professionally medical — formatted like a hospital discharge note or clinical paper — than when it comes from social-media posts.” There is a lot of debate about whether LLMs “understand” what they ingest. There are good reasons those debates are misplaced, but I think this episode does tell us something about how these models can overfit to irrelevant criteria. It’s a standard example in discussions of image models: if you train a model to recognize a “ball” on a set of mostly tennis-ball images, it may fail to recognize a basketball because it has learned that balls are yellow. Something analogous seems to be going on here.
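To make that overfitting point concrete, here is a minimal, made-up sketch of the “yellow ball” failure mode. It is not from the article; the data, features, and the toy nearest-centroid classifier are all invented purely for illustration. Because every training ball is a yellow tennis ball, the classifier effectively learns “yellow means ball” and rejects an orange basketball.

```python
# Toy illustration of shortcut learning (all data invented for this sketch).
# Features: (hue in degrees, roundness 0-1). Roundness is what actually
# makes something a ball; hue is the spurious shortcut.

def centroid(points):
    return tuple(sum(dim) / len(points) for dim in zip(*points))

def distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

def classify(x, ball_centroid, other_centroid):
    return "ball" if distance(x, ball_centroid) < distance(x, other_centroid) else "not a ball"

# Training data: every ball the classifier ever sees is a yellow tennis ball;
# the non-balls happen to be reddish, flat-ish objects (a brick, a cone, a stop sign).
balls = [(55, 0.98), (58, 0.97), (60, 0.99)]
not_balls = [(10, 0.30), (25, 0.15), (5, 0.25)]

ball_c, other_c = centroid(balls), centroid(not_balls)

# An orange basketball: very round, but the "wrong" colour.
basketball = (25, 0.98)
print(classify(basketball, ball_c, other_c))
# -> "not a ball": hue, measured on a 0-360 scale, swamps roundness (0-1)
#    in the distance, so the classifier has in effect learned "yellow = ball".
```

In the preprint case, professional medical formatting seems to play the role that hue plays here: a surface cue that correlates with trustworthy text, but carries no guarantee of it.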
Another way of describing the problem is that, like the rest of us, the LLM relies on heuristics to filter information. It has learned not to trust social media (good!), but it is easier to trust a preprint on an academic server, especially if you don’t really understand the words in it. These particular papers were seeded with evidence that they were fake, but this sort of proof of concept should be alarming, given how often people go to LLMs for medical advice. Bixonimania was, supposedly, a condition related to screen exposure, and so not in itself something likely to lead to further problems. Stokel-Walker writes:
“Another concern is that models could be gamed — potentially for commercial benefit. Osmanovic Thunström says that a bad actor could exploit the same technique she used, for profit. “What if I was a salesman of blue-light glasses and I wanted to use this as an argument?” she says. A would-be salesperson could say, “You can just talk to ChatGPT, and they’ll tell you this is a problem. You can avoid it with these really expensive glasses,” she suggests.”
One can also skew things in more malevolent ways. For example, how can an LLM assess the credibility of vaccine information on government sites in the RFK era? Those sites used to be good! Of course, fake research is a long-standing problem. I remember, from my college debate days, discussions of the petroleum-industry-financed “studies” reporting that greenhouse gases weren’t a thing to worry about. But LLMs make such fakes a lot easier to produce. Back in late 2023, researchers demonstrated that ChatGPT could produce an entire fake dataset to prove a desired medical thesis. In this regard, it’s also alarming that Bixonimania showed up in a couple of peer-reviewed papers. They got retracted, but that highlights another problem: lots of bibliographies cite papers their authors haven’t actually read.
Fake research is incredibly hard to get rid of, and LLMs are susceptible to it, too. Again, this isn’t exactly new: retracted papers get cited, often heavily, after their formal retraction. Language models are statistical creatures, and so are not natively equipped to handle that either. Stokel-Walker reports:
“AI’s uncritical tendency to suck up information, often without verifying its accuracy, means there is a risk we could see an “information asymmetry”, says Jennifer Byrne, a molecular oncologist and research-integrity sleuth at the University of Sydney in Australia. A single corrective paper about cancer research, for example, can be overwhelmed by hundreds of papers repeating a false claim, she says. “ChatGPT is pretty confident to fill in the gaps and give people all kinds of information about where that cell line came from, the patient from which it originated, how it’s been used in the literature, its research utility and so on,” she says.”
In any case, you should read the article. It’s disconcerting.
