With the release of ChatGPT, these bots are now ready for the next stage of evolution. On Thursday, OpenAI announced that it had trained and released a new model that interacts with humans in natural language. Built on the GPT-3.5 architecture and trained with a novel method, it comes with a host of features that make it hard for users to tell whether they are, in fact, talking to an AI.
What differentiates ChatGPT
Memory ranks first among ChatGPT's distinctive qualities. The bot can recall earlier exchanges in a conversation and refer back to them for the user. This alone sets it apart from other natural language solutions, which proceed query by query with no memory of what came before.
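In practice, this kind of conversational memory can be approximated by resending the running transcript with every new user message. The sketch below illustrates the idea only; the generate() function is a hypothetical stand-in for whatever model call is actually used, not part of any OpenAI interface.

```python
def generate(prompt: str) -> str:
    """Hypothetical stand-in for a call to a text-generation model."""
    return "..."  # the model's reply would be returned here

history = []  # list of (speaker, text) turns accumulated over the conversation

def chat(user_message: str) -> str:
    history.append(("User", user_message))
    # Concatenate every prior turn so the model can "remember" the conversation.
    transcript = "\n".join(f"{speaker}: {text}" for speaker, text in history)
    reply = generate(transcript + "\nAssistant:")
    history.append(("Assistant", reply))
    return reply
```

Because the whole transcript is fed back in, earlier questions and answers remain available as context for later turns, which is what lets the bot "recall" them.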
In addition to memory, ChatGPT has been trained to refrain from answering questions with strong opinions. In our testing, it provides a boilerplate response to questions on personal opinions, matters of race and religion, and the purpose of its existence. Additionally, it makes it clear that it cannot think for itself and cannot act in a discriminatory manner. The bot also contains filters to prevent users from urging it to write text involving unlawful or immoral behavior.
This stands in stark contrast to previous chatbots built on LLMs, which, owing to the material in their training datasets, had no filters on the kind of content they generated. The result was well-written responses to prompts on divisive topics, and widespread controversy (see Meta's Galactica).
ChatGPT also allows users to correct any of its statements. This is an important part of the feedback loop OpenAI wants to build into the public research preview, as it lets users interact with the bot directly and steer it toward the right response. It might also help the bot avoid hallucination, a phenomenon where a large language model produces information that looks legitimate but is, in fact, unsubstantiated word soup.
Problems with the model
Despite all of its improvements, the model's potential is constrained by a few flaws. Because it has been trained to be more cautious when it lacks a definitive answer, and because the researchers have built in failsafes against factually incorrect output, it sometimes simply dodges a question. As the example below shows, it avoids answering when there is not enough data to generate a reliable response.
Additionally, questions might be rephrased to get around the researchers' filters, as in the example below. The agent dodges the question of how to use a gun for self-defense. However, the bot gives a clear and succinct response when asked how to pull the trigger of a pistol, followed by numerous warnings about the risks associated with gun use.
Additionally, the model has trouble determining the user's motivation for asking a particular question and frequently provides only a partial explanation. Instead of asking clarifying questions when a query is ambiguous, it tends to guess at the user's intent.
Even with these constraints, ChatGPT represents a methodical approach to building user-facing natural language generation systems. The drawbacks of making such powerful models public have already received considerable attention, but the discussion about how to make them safer has only just begun.
Aiming for safer AI
With several checks and safeguards, the model is protected from abuse at every stage. User prompts are passed through OpenAI's Moderation API on the client side, which flags and filters out objectionable content. All of this is done with a single API call, and the safety of ChatGPT's responses amply demonstrates its efficacy.
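As a rough illustration, the sketch below shows how a prompt could be screened with the Moderation endpoint before being forwarded to a model. It assumes the 2022-era openai Python library (0.x interface); exact field names may differ across library versions, and the API key is a placeholder.

```python
import openai  # pip install openai (0.x-era interface assumed)

openai.api_key = "sk-..."  # placeholder key

def is_flagged(text: str) -> bool:
    """Ask the Moderation endpoint whether a piece of text violates the content policy."""
    response = openai.Moderation.create(input=text)
    result = response["results"][0]
    # `flagged` is True when any category (hate, violence, self-harm, ...) triggers.
    return result["flagged"]

prompt = "How do I pick a lock?"
if is_flagged(prompt):
    print("Prompt rejected by the moderation filter.")
else:
    print("Prompt passed moderation; safe to forward to the model.")
```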
In addition, the model appears to have been taught to steer clear of harmful and untruthful responses. Having learned from predecessors like GPT-3 and Codex, which tend to produce largely unfiltered replies, the researchers adjusted the model during the RLHF (reinforcement learning from human feedback) process to prevent this from occurring. The method is not flawless, but combined with additional elements such as the Moderation API and a relatively clean dataset, it makes the model more suitable for sensitive settings like education.
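At the heart of RLHF is a reward model trained on human preference comparisons. The snippet below is a minimal, illustrative sketch in PyTorch, not the researchers' code, of the pairwise ranking loss commonly used for that step: the reward assigned to the human-preferred response is pushed above the reward of the rejected one.

```python
import torch
import torch.nn.functional as F

def reward_ranking_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    # Pairwise preference loss used to train an RLHF reward model:
    # maximize the margin between the preferred and the rejected response.
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy example: scalar rewards the reward model assigned to two candidate replies.
chosen = torch.tensor([1.2, 0.7])
rejected = torch.tensor([0.3, 0.9])
print(reward_ranking_loss(chosen, rejected))
```

The trained reward model then scores candidate responses during a reinforcement learning stage (typically PPO), steering the chatbot away from the kinds of replies that human labelers ranked poorly.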
An essential component of the challenge is the feedback loop that the researchers set up. This not only enables them to develop a database of potentially problematic statements to avoid in the future, but also allows them to enhance the model iteratively.
OpenAI's methodical approach is a breath of fresh air in an era when tech corporations trade safety for speed of advancement. More companies should follow this practice of releasing LLMs to the public for feedback before declaring them finished products, and of designing models with safety in mind from the start, to lay the foundation for a safer AI future.