Interesting to note how similar this seems to what happened with Benj Edwards at Ars Technica. AI was used to extract or summarize information, and quotes found in the summary were then used as source material for the final writing and never double checked against the actual source.
I’ve run into a similar problem myself - working with a big transcript, I asked an AI to pull out passages that related to a certain topic, and only because of oddities in the timestamps extracted did I realize that most of the quotes did not exist in the source at all.
It might be a solved problem in the sense that a solution exists, but not in the sense that it no longer happens with the tools most people would expect to handle the task.
Looking at the media ecosystem at large gives me a case of gallows humor.
In some sections of the ecosystem, firms still penalize journalists for errors. In other sections, checking reduces the velocity of attention-grabbing headlines. The difference in treatment is… farcical.
We need more good journalists, and more good journalism - but we no longer have ways to subsidize such work. Ads / classifieds are dead, and revenue accrues to only a few.
We can't square this circle. It's why they're all A/B testing headlines (resulting in the most deranged partisan clickbait), killed off their (too expensive) editorial desks (especially for international news), rely solely on (barely) rewriting AP, Reuters, and PRNewswire copy, and fill their sites with opinion rather than factual reporting, all while arguing for government handouts to the sector.
Out of curiosity, if you asked for the same text extraction multiple times, each inside a fresh context, would it likely fabricate different quotes each time? And if so, (a) might that be a procedure we train humans to perform to better understand LLM unreliability, and (b) could we instrumentalize the behavior, measuring answer overlap with non-LLM statistical tools?
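The overlap measurement in (b) needs no LLM machinery at all: treat each run's extracted quotes as a set and compare the sets pairwise with Jaccard similarity. A minimal sketch, where the three quote sets below are hypothetical stand-ins for the output of three fresh-context extraction runs:

```python
from itertools import combinations

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two quote sets (1.0 = identical sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical quote sets, standing in for three fresh-context runs.
runs = [
    {"the budget was cut in march", "we never approved that"},
    {"the budget was cut in march", "it was a surprise to everyone"},
    {"we never approved that", "it was a surprise to everyone"},
]

# Mean pairwise similarity across all runs; consistently low overlap
# would suggest the quotes are being invented rather than extracted.
scores = [jaccard(a, b) for a, b in combinations(runs, 2)]
mean_overlap = sum(scores) / len(scores)
```

With genuinely extracted quotes you would expect `mean_overlap` near 1.0; heavy fabrication should drag it toward 0.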
Also, quote-presence testing/linking against source would seem to be a trivial layer to build on a chat interface, no LLM required. Just highlight and link the longest common strings.
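That layer can be almost trivially small: score each quote by the longest run of characters it shares verbatim with the source. A minimal sketch using Python's standard-library difflib; the source text and quotes are made up for illustration:

```python
from difflib import SequenceMatcher

def quote_support(quote: str, source: str) -> float:
    """Fraction of the quote covered by its longest verbatim match in the source."""
    q, s = quote.lower(), source.lower()
    if not q:
        return 0.0
    match = SequenceMatcher(None, q, s, autojunk=False).find_longest_match(
        0, len(q), 0, len(s)
    )
    return match.size / len(q)

source = "The committee met on Tuesday and voted to delay the rollout until spring."
real_quote = "voted to delay the rollout"
fake_quote = "unanimously endorsed the plan"
```

A quote present verbatim scores 1.0; a fabricated one scores near 0, and anything in between flags a paraphrase worth checking by hand. A chat UI could highlight and link the matched span directly.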
e.g.: https://docs.cloud.google.com/vertex-ai/generative-ai/docs/g...
I have no idea how we square this circle.