The Trust Crisis: Evaluating AI’s Accuracy in News Reporting

In a rapidly evolving digital age, reliance on artificial-intelligence-powered assistants for news updates is growing. However, a recent international study, conducted jointly by the European Broadcasting Union (EBU) and the BBC, raises significant concerns about their accuracy. The report reveals that nearly half (45%) of AI-generated news responses contained at least one major error, and an overwhelming 81% exhibited some form of problem.

Mainstream AI Models Under Scrutiny

The study encompassed 3,000 responses from AI assistants in 14 languages, covering popular tools such as ChatGPT, Copilot, Gemini, and Perplexity. Each response was evaluated on three criteria:

  • Accuracy of the content provided.
  • Correctness of cited sources.
  • Ability to distinguish factual information from opinion.

Widespread Error Rates: Gemini Fares Worst

The study found that 45% of responses contained at least one significant issue, often delivering misleading information or referencing outdated sources. Particularly concerning was Google's Gemini assistant, with 72% of its responses marred by significant sourcing problems, far more than its counterparts.

Beyond Gemini, 30% of all responses misattributed or failed to properly identify the origin of their information, and 20% contained outdated or incorrect facts, underscoring how pervasive the problem is.

Examples of AI Errors: Significant Misreporting

The study cited specific instances of AI failures. Gemini misreported amendments to e-cigarette legislation, while ChatGPT claimed Pope Francis was still alive months after his death. These high-profile inaccuracies highlight the persistent challenge AI models face in processing real-time news accurately.

Despite these pitfalls, companies such as Google and OpenAI say they are actively seeking user feedback to improve the quality and reliability of their AI platforms. Perplexity, for its part, claims 93.9% factual accuracy for its 'deep search mode.'

Implications for Democracy and Trust

The EBU warns that as AI assistants increasingly replace traditional search engines as a gateway to news, the public's inability to discern accurate information could lead to a profound trust crisis, ultimately weakening democratic participation. The EBU therefore advocates including AI developers in a 'news responsibility framework' to ensure the delivery of verifiable facts, accurate sources, and a clear distinction between commentary and fact.

As AI continues to integrate into our lives, addressing these reliability issues takes on new urgency, both for safeguarding public trust and preserving the democratic process.