New research from the Oxford Internet Institute is warning of the dangers of Large Language Models (LLMs) used in chatbots. These models are capable of generating false content and presenting it as accurate, posing a direct threat to science and scientific truth.
The paper, published in Nature Human Behaviour, highlights that LLMs are designed to produce helpful and convincing responses without any guarantee of accuracy or alignment with fact. Although LLMs are treated as knowledge sources and used to generate information in response to questions or prompts, the data they are trained on may not be factually correct.
One reason for this is that LLMs often rely on online sources, which can contain false statements, opinions, and inaccurate information. Because LLMs are designed as helpful, human-sounding agents, users often trust them as they would a human information source. This can lead users to believe that responses are accurate even when they have no basis in fact or present a biased or partial version of the truth.
Researchers at the Oxford Internet Institute stress the importance of information accuracy in science and education and urge the scientific community to use LLMs as “zero-shot translators.” This means that users should provide the model with the appropriate data and ask it to transform it into a conclusion or code rather than relying on the model itself as a source of knowledge. This approach makes it easier to verify that the output is factually correct and aligned with the provided input.
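As a rough illustration of the “zero-shot translator” pattern described above, the Python sketch below wraps user-supplied data in a prompt that asks only for a transformation, then runs a simple check on the result. It is not taken from the paper. The function names (`build_translation_prompt`, `unsupported_numbers`, `translate`), the prompt wording, and the `model` callable are all hypothetical, and the number check is only one crude way to compare output against input.

```python
import re
from typing import Callable, Set, Tuple

# Matches integers and simple decimals such as "8" or "12.5".
_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def build_translation_prompt(source_data: str, task: str) -> str:
    """Build a prompt that supplies the data and asks only for a transformation.

    The wording is an assumption for illustration, not a recommended template.
    """
    return (
        "Use ONLY the data between the markers below. "
        "Do not add facts that are not present in it.\n"
        f"Task: {task}\n"
        "<data>\n"
        f"{source_data}\n"
        "</data>"
    )


def unsupported_numbers(source_data: str, output: str) -> Set[str]:
    """Return numbers that appear in the output but not in the supplied data."""
    return set(_NUMBER.findall(output)) - set(_NUMBER.findall(source_data))


def translate(
    source_data: str, task: str, model: Callable[[str], str]
) -> Tuple[str, Set[str]]:
    """Send the data-bearing prompt to `model` and flag unsupported numbers.

    `model` is a hypothetical stand-in for any function that maps a prompt
    string to a response string; no particular LLM API is assumed.
    """
    output = model(build_translation_prompt(source_data, task))
    return output, unsupported_numbers(source_data, output)
```

With a stand-in model that returns "Site A averaged 12.5 mm; site B averaged 9.1 mm." for the data "A,12.5" and "B,8.0", the check would flag "9.1" as absent from the input. A check like this catches only invented figures, not invented claims, so the output still needs to be read against the supplied data.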
While LLMs will undoubtedly assist with scientific workflows, it is crucial for scientists to use them responsibly and maintain clear expectations of how they can contribute while also avoiding their potential pitfalls.
In conclusion, researchers at the Oxford Internet Institute are cautioning about the dangers of Large Language Models (LLMs) used in chatbots, emphasizing their tendency to hallucinate, that is, to generate false content and present it as accurate, and the threat this poses to science and scientific truth. They suggest using LLMs as “zero-shot translators” by providing appropriate data for transformation into conclusions or code rather than relying solely on the models as knowledge sources.