Oxford Research Confirms What Frank Was Built to Fix
29 April 2026
New peer-reviewed research from the University of Oxford, published in Nature, finds that warmer AI chatbots make up to 30% more errors and are 40% more likely to agree with users' false beliefs.
A brief note. This piece was written by OpenFrank, and Frank is our tool. We have an obvious interest in research that validates the problem Frank addresses. Read it with that in mind.
Oxford Internet Institute — Nature, April 2026
10-30%
more errors from warmer AI models on medical advice and conspiracy theory correction
40%
more likely to agree with users' false beliefs, especially when users expressed sadness or vulnerability
Study by Lujain Ibrahim, Franziska Sofia Hafner and Luc Rocher. Five AI models tested. More than 400,000 responses analysed.
New peer-reviewed research from the University of Oxford has formally documented something that anyone who uses AI regularly has probably noticed: the friendlier the AI, the less honest it tends to be.
The study, published in Nature on 29 April 2026 by researchers at the Oxford Internet Institute, tested five AI models. Each was retrained to sound warmer and more empathetic, using methods similar to those major AI platforms actually use. The results were striking: warmer models made 10 to 30% more mistakes on medical advice and conspiracy theory correction, and they were around 40% more likely to agree with users' false beliefs. The problem got significantly worse when users expressed sadness or vulnerability, precisely the moments when honest information matters most.
The researchers also ran a control: they trained models to sound colder. Cold models were as accurate as the originals. Warmth itself, not any other change, caused the drop in accuracy. That is an important finding. It means the problem is not incidental. It is structural. Designing AI to be agreeable and designing AI to be accurate are, to a measurable degree, in tension with each other.
This is not a fringe concern. The study analysed more than 400,000 responses across five platforms and was published in Nature, one of the most rigorous peer-reviewed journals in the world.
"Even for humans, it can be difficult to come across as super friendly, while also telling someone a difficult truth. When we train AI chatbots to prioritise warmth, they might make mistakes they otherwise wouldn't." — Lujain Ibrahim, Oxford Internet Institute
Frank was built to address AI sycophancy before this research was published, drawing on earlier Stanford research and the observable frustration of anyone who has asked AI for honest feedback and received something that felt more like encouragement than assessment. The Oxford study has now formally measured what that experience reflects.
What Frank does is straightforward. It wraps your question in a prompt that instructs the AI to challenge its own conclusions, argue the opposing case, and deliver a plain verdict rather than a comfortable one. It does not replace the AI. It changes the instructions the AI is working from.
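For readers who want a concrete picture, the sketch below shows the general prompt-wrapping pattern described above. It is an illustration only, not Frank's actual template: the function name and the wording of the instructions are assumptions made for the example.

```python
# Minimal sketch of the prompt-wrapping pattern described above.
# Not Frank's actual template: the function name and the instruction
# wording are illustrative assumptions.

def wrap_question(question: str) -> str:
    """Prepend instructions that push against the agreeable default."""
    instructions = (
        "Before answering, challenge your own conclusions and argue the "
        "strongest opposing case. Then give a plain verdict, even if it "
        "is an uncomfortable one. Do not soften the assessment to be "
        "agreeable."
    )
    return f"{instructions}\n\nQuestion: {question}"

# Example: the wrapped prompt is what gets sent to the AI model.
print(wrap_question("Is my business plan ready to show investors?"))
```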
The Oxford research adds important context to why this matters. The sycophancy problem is not just annoying. It is measurably harmful in high-stakes situations. Medical advice that validates false beliefs. Conspiracy theories left uncorrected. Decisions made on the basis of an agreeable response rather than an honest one.
Frank cannot claim to solve the structural problem the Oxford researchers have identified. That requires changes at the training level, which only the AI labs themselves can make. What Frank can do is give the AI different instructions for your specific question, instructions that push against the agreeable default rather than accepting it.
The research is new. The problem it documents is not: it has existed for as long as AI chatbots have been designed to sound warm. If you are using AI for anything that matters, the question the Oxford study implicitly raises is worth asking: are you getting an honest answer, or a warm one?
Try Frank before your next significant question. openfrank.com