Artificial intelligence (AI) trained on large amounts of data has evolved to the point where it can perform a wide variety of tasks at a high level, but there are still areas where it is weak and cases where it is prone to mistakes. A joint study by the University of California, San Diego and Tsinghua University has shown that teaching AI "when to rely on external tools," rather than having it rely solely on built-in knowledge, improved performance by 28%.
[2411.00412] Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation
https://arxiv.org/abs/2411.00412
UC San Diego, Tsinghua University researchers just made AI way better at knowing when to ask for help | VentureBeat
https://venturebeat.com/ai/uc-san-diego-tsinghua-university-researchers-just-made-ai-way-better-at-knowing-when-to-ask-for-help/
AI sometimes outputs plausible-looking content that is actually inaccurate or fabricated. This phenomenon is called "hallucination," and errors caused by hallucinations are among the risks that companies considering adopting generative AI worry about most. Simon Hughes, an engineer at AI company Vectara, which released the open-source hallucination evaluation model HEM, said, "For organizations to effectively implement generative AI, they need a clear understanding of the risks and potential downsides." According to Hughes, when HEM evaluated the output of models asked to summarize 1,000 documents, the worst-performing model had a hallucination rate of 27.2%.
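Vectara publishes the evaluation model on Hugging Face as `vectara/hallucination_evaluation_model`. Below is a minimal sketch of scoring a summary against its source text, assuming the CrossEncoder interface described on the model card; the interface and the 0.5 cutoff are assumptions from public documentation, not details given in this article.

```python
# Sketch of hallucination scoring with Vectara's open-source model.
# Assumes the CrossEncoder interface from the Hugging Face model card;
# the 0.5 cutoff is an illustrative threshold, not an official value.
from sentence_transformers import CrossEncoder

model = CrossEncoder("vectara/hallucination_evaluation_model")

source = "The company reported revenue of $10 million in Q3."
summary = "The company reported revenue of $100 million in Q3."

# The model returns a factual-consistency score in [0, 1]:
# close to 1.0 = supported by the source, close to 0.0 = likely hallucinated.
score = model.predict([[source, summary]])[0]
print(f"factual consistency: {score:.3f}")
if score < 0.5:
    print("likely hallucination")
```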
Vectara releases open source evaluation model that can objectively verify the risk of large-scale language models causing “hallucinations” – GIGAZINE
As an approach to preventing AI hallucinations, the paper by the University of California, San Diego and Tsinghua University proposes a new AI training process named "Adapting While Learning." Integrating large language models (LLMs) with external tools traditionally improves the reliability of results, but it tends to create over-reliance on those tools, and the model's ability to solve simple problems through basic reasoning tends to decline.
In Adapting While Learning, the model first internalizes the knowledge it references by learning from solutions generated with external tools. It then learns to classify problems as "easy" or "difficult" and to decide whether to use a tool accordingly. In other words, by enabling the AI to assess the difficulty of the task it is tackling, it can choose whether to rely on a tool based on that difficulty.
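The paper trains this behavior into the model's weights, but the resulting inference-time flow can be sketched roughly as follows. The stub classes, helper names, and two-way easy/difficult split here are illustrative assumptions, not the authors' code:

```python
# Illustrative sketch of difficulty-gated tool use. The stubs stand in
# for a fine-tuned LLM and an external tool; names are hypothetical.
class StubModel:
    """Stand-in for the fine-tuned LLM; a real system would call the model."""
    def generate(self, prompt: str) -> str:
        if prompt.startswith("Classify"):
            return "difficult" if "integral" in prompt else "easy"
        return f"[model answer to: {prompt[:40]}...]"

class StubTool:
    """Stand-in for an external tool such as a calculator or simulator."""
    def run(self, question: str) -> str:
        return "42"

def solve(model, tool, question: str) -> str:
    # 1) The fine-tuned model first judges the problem's difficulty.
    label = model.generate(f"Classify this problem as easy or difficult: {question}")
    if "difficult" not in label.lower():
        # 2a) Easy: answer from internalized knowledge -- no tool call,
        #     so no extra latency or compute cost.
        return model.generate(f"Answer directly: {question}")
    # 2b) Difficult: delegate to the external tool and let the model
    #     interpret the tool's output.
    return model.generate(f"Use this tool result to answer '{question}': {tool.run(question)}")

print(solve(StubModel(), StubTool(), "What is 2 + 2?"))
print(solve(StubModel(), StubTool(), "Evaluate the integral of x^2 e^x."))
```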
One of the key aspects of Adapting While Learning is its efficiency-first approach. The researchers used an LLM with about 8 billion parameters, far fewer than major LLMs such as GPT-4. Compared against state-of-the-art models such as GPT-4o and Claude-3.5, they report a 28.18% improvement in answer accuracy and a 13.89% improvement in tool usage accuracy.
Online media outlet VentureBeat points out that this research aligns with an industry trend: major AI companies are entering a phase of "AI downsizing," releasing smaller yet more capable LLMs. The research suggests that the ability to decide between internal knowledge and external tools may matter more for AI than raw model size or computational power.
Most current AI systems either always rely on external tools or try to solve everything internally. AI that constantly accesses external tools drives up computation costs and slows down simple operations, while AI that relies only on internal knowledge performs poorly in areas where it has not been sufficiently trained. Both approaches risk errors on complex problems that require specialized tools.
This inefficiency is not only a technical problem but also a business problem. Companies using AI in practice often pay high fees for cloud computing resources to run external tools, even for basic tasks the AI should be able to handle internally, while systems that fail to use the appropriate tool at the right time make mistakes, so satisfactory performance is hard to obtain either way. A model in which the AI makes "human-like decisions" about when to use tools is expected to be especially valuable in fields such as scientific research, financial modeling, and medical diagnostics, where both efficiency and accuracy matter.
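To make the trade-off concrete, here is a toy cost calculation with invented numbers (none of these figures come from the paper or this article). It compares the expected per-query cost of always calling a tool, never calling one, and routing by difficulty:

```python
# Toy cost model with invented numbers (not from the paper) comparing
# three policies: always call the tool, never call it, or route by difficulty.
P_EASY = 0.7            # assumed fraction of queries answerable internally
COST_INTERNAL = 0.001   # assumed cost per internal answer (arbitrary units)
COST_TOOL = 0.02        # assumed cost per external tool call

always_tool = COST_TOOL                # every query pays the tool price
never_tool = COST_INTERNAL             # cheap, but hard queries fail
adaptive = P_EASY * COST_INTERNAL + (1 - P_EASY) * COST_TOOL

print(f"always tool: {always_tool:.4f} per query")
print(f"never tool:  {never_tool:.4f} per query (fails on ~{1 - P_EASY:.0%} of queries)")
print(f"adaptive:    {adaptive:.4f} per query")
```

Under these assumed numbers the adaptive policy costs about a third as much as always calling the tool, while still routing hard queries to it; the exact savings depend entirely on the workload's mix of easy and difficult problems.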