ChatGPT and other language models have transformed the field of natural language processing, yet they struggle with fundamental tasks such as arithmetic and fact-checking. Recently, Meta researchers introduced Toolformer, an AI language model that can teach itself to use external tools such as search engines, calculators, and calendars without sacrificing its core language-modeling abilities.
Toolformer works through APIs (application programming interfaces). During training, the researchers gave Toolformer a small set of human-written examples demonstrating how each API is used, then let the model annotate a large language-modeling dataset with candidate API calls. It did this in a "self-supervised" way, meaning it could learn without explicit human guidance. The model learned to predict each text-based API call as if it were any other piece of text. At run time (when generating text from human input), it can insert the calls as needed. Moreover, Toolformer can "decide" for itself which tool to use for the proper context, and how to use it.
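To make the self-supervised step concrete, here is a minimal Python sketch of the kind of filtering the paper describes: a sampled API call is kept only if inserting the call together with its result makes the following text easier for the model to predict. The helpers `lm_loss` and `execute` are hypothetical stand-ins, and the keep/discard criterion is simplified from the paper's.

```python
# Minimal sketch of Toolformer-style self-supervised filtering (not Meta's code).
# Assumptions: lm_loss(text) returns the language-modeling loss the model assigns
# to the tokens after the insertion point, and execute(call) runs the tool and
# returns its output as a string.

def keep_api_call(prefix: str, call: str, suffix: str,
                  lm_loss, execute, threshold: float = 0.5) -> bool:
    """Keep a sampled API call only if inserting the call *and its result*
    lowers the loss on the continuation compared to having no call at all."""
    result = execute(call)                                   # e.g. "0.29"
    loss_plain = lm_loss(prefix + suffix)                    # no API call inserted
    loss_with_call = lm_loss(f"{prefix}[{call} -> {result}]{suffix}")
    # Calls whose loss reduction clears a threshold are kept as training data;
    # the rest are discarded, so no human has to label them.
    return loss_plain - loss_with_call >= threshold
```

Because the filter only needs the model's own loss, the annotation loop can run over a large corpus with no explicit human labeling, which is what makes the approach self-supervised.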
These API calls let Toolformer use external software tools such as search engines, calculators, language translators, and factual references. For example, large language models (LLMs) are notorious for not being particularly good at arithmetic; Toolformer can work around that limitation by calling a calculator program. Or if someone wanted an LLM-based assistant to add a date to their calendar, Toolformer could handle the task through an API link to a calendar app.
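As an illustration of how such inline calls could be resolved at generation time, here is a small, self-contained Python sketch. The bracketed `[Tool(args)]` syntax mirrors the examples in the Toolformer paper; the regex, the dispatch table, and the `resolve_calls` helper are our own illustrative assumptions, not Meta's code.

```python
import re

# When the model emits an inline call such as "[Calculator(400 / 1400)]",
# generation pauses, the tool runs, and the result is spliced back into the
# text before generation continues.

TOOLS = {
    # Demo only: a real calculator tool would use a safe expression parser,
    # not eval().
    "Calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

CALL_PATTERN = re.compile(r"\[(\w+)\(([^)]*)\)\]")

def resolve_calls(text: str) -> str:
    """Replace every [Tool(args)] marker with [Tool(args) -> result]."""
    def run(match: re.Match) -> str:
        tool, args = match.group(1), match.group(2)
        result = TOOLS[tool](args)
        return f"[{tool}({args}) -> {result}]"
    return CALL_PATTERN.sub(run, text)

print(resolve_calls("Out of 1400 participants, [Calculator(400 / 1400)] passed."))
# -> "Out of 1400 participants, [Calculator(400 / 1400) -> 0.2857142857142857] passed."
```

Because the call markers are ordinary text to the model, this kind of loop slots in around any text generator; only the dispatch table needs to know how each tool actually works.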
Toolformer is based on a pre-trained GPT-J model with 6.7 billion parameters. In tests the researchers conducted on various tool-using tasks, Toolformer appears to outperform the much larger GPT-3 model, which has 175 billion parameters.
This isn’t the first time researchers have tried to overcome the limitations of language models. In fact, the Bing Chat model making headlines this week can search the web on its own when needed, and others have already attempted integrations with browsers, calculators, and search engines. According to Meta’s researchers, most existing approaches to integrating tools into language models either rely on large amounts of human annotation or are limited to specific, task-tailored settings. In contrast, Toolformer can learn to use a range of tools in a generalized way without requiring specialized training for specific tasks.
With techniques like those found in Toolformer, we can envision a potential future where LLMs enhanced with the ability to use external apps become far more versatile and (seemingly) more reliable assistants. But the ability to execute API calls may also increase an LLM’s capacity to harm user data (in apps) or create trouble in the outside world (through a web browser or communications tools), possibilities it might inadvertently invoke while providing an answer.