On the Linux.do forum, a user tested the web search capabilities of three mainstream AI models, Claude, GPT, and Gemini, measuring hallucination rates on questions with scarce information sources. Claude Sonnet 4.5 performed best, with a 0% hallucination rate and correct information found in just three search rounds. GPT 5.2 had a 70% hallucination rate and searched inefficiently. Gemini 3 Pro had a hallucination rate above 90% and returned poor search results. The author emphasized that Claude is far ahead in tool use, such as project management and file operations, and has switched from GPT to Claude as their primary tool. The article calls on AI companies to strengthen tool integration in order to improve productivity and break through model bottlenecks. The test offers AI users a practical reference on how the models differ in performance and where their development may head next.
Original Link: Linux.do
Latest Comments
I don't think the title of your article matches the content lol. Just kidding, mainly because I had some doubts after reading the article.
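This study of the state of AI is very in-depth and draws on a lot of data. It is a valuable reference.
I occasionally read this travel website. It inspires me to look at routes.
The article has real depth. The development trends of AI models are worth watching.
Rich content, and the analysis of future trends is quite on point.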
Thank you for your sharing. I am worried that I lack creative ideas. It is your article that makes me full of hope. Thank you. But, I have a question, can you help me?
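Fiber optic technology is really impressive, and the article analyzes it quite thoroughly.
The article is very practical. I would like to learn more related techniques.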