October 20, 2023

Why ChatGPT And Other LLMs Generate Different Answers to Same Questions

By Ayfie · 2 minute read

Large language models such as OpenAI's GPT-3 and GPT-4, Google Bard, and Meta's LLaMA have taken the world by storm, demonstrating remarkable abilities to generate human-like responses to text prompts. But how do they actually work, and why do they sometimes provide different answers to the same question? In this blog post, we'll explore the inner workings of large language models and provide insights into how to craft effective prompts to double-check their answers.

How Large Language Models Work

At a high level, large language models (LLMs) such as GPT-4 are massive neural networks trained on vast amounts of text, such as web pages and books. During training, the model learns to recognise patterns in that text, which largely comes down to learning which words are likely to follow a given context. This training allows the model to generate coherent and contextually relevant responses to new text prompts.

When a user inputs a prompt, the large language model uses its trained neural network to generate a response. The model doesn't simply regurgitate pre-written responses, but rather generates a response on the fly by drawing on its learned patterns and context from the prompt. This is why large language models can generate human-like responses that seem tailored to the specific prompt.
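
To make the idea of "learned patterns" more concrete, here is a deliberately tiny sketch in Python. It is not how GPT-style models are built (they use neural networks with billions of parameters); it is a toy word-pair model, with a made-up corpus and hypothetical function names, that only illustrates the same two phases: learn from text, then generate a response one word at a time.

```python
import random
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count which word follows which word in the training text."""
    counts = defaultdict(Counter)
    words = corpus.split()
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts

def generate(counts, start_word, max_words, rng):
    """Build a response word by word, sampling each next word in proportion to its count."""
    output = [start_word]
    for _ in range(max_words - 1):
        followers = counts.get(output[-1])
        if not followers:
            break  # the last word never had a continuation in the training text
        candidates = list(followers.keys())
        weights = list(followers.values())
        output.append(rng.choices(candidates, weights=weights, k=1)[0])
    return " ".join(output)

# Toy training text (an assumption for illustration only).
corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_bigram_model(corpus)
print(generate(model, "the", 6, random.Random()))
```

Running the last line several times will usually print different sentences, none of which has to appear verbatim in the training text.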

Why Large Language Models Sometimes Provide Different Answers

While large language models are often accurate, they can also provide different answers to the same question. This is because the models are probabilistic: at each step, the model assigns a probability to every possible next word (more precisely, every token) and then typically samples from that distribution rather than always choosing the single most likely option. Because of this built-in randomness, the same prompt can produce different responses on different runs, and even small changes in the prompt can shift the probabilities and lead to a different answer.
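
The probabilistic behaviour can be sketched in a few lines of Python. The example below is a simplified illustration, not the code of any particular model: the candidate words and their scores are invented, and `temperature` stands in for the sampling setting that many LLM services expose under that name. Lower values make the most likely word dominate, while higher values spread the choices out.

```python
import math
import random

def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(tokens, scores, temperature, rng):
    """Pick one token at random, weighted by its softmax probability."""
    probs = softmax(scores, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical scores for the word that follows "The capital of France is".
tokens = ["Paris", "a", "located", "known"]
scores = [4.0, 2.0, 1.5, 1.0]

rng = random.Random()
for temperature in (0.2, 1.0, 2.0):
    picks = [sample_next_token(tokens, scores, temperature, rng) for _ in range(10)]
    print(temperature, picks)
```

At the lowest temperature the picks are almost always "Paris"; at the highest, the other words appear more often. The same mechanism, repeated for every word of a response, is why two runs of an identical prompt can diverge.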

Large language models can also be influenced by biases in the training data. For example, if the model is trained on text that contains biased language or stereotypes, it may generate responses that reflect these biases.

Crafting Effective Prompts to Double-Check Large Language Model Answers

Given the probabilistic nature of large language models, it's important to craft effective prompts to double-check their answers.

Here are some tips for creating good prompts:

Be specific: Try to provide as much detail as possible in your prompt to help the model generate a more accurate response.
Ask for clarification: If the model generates a response that isn't quite what you were looking for, try asking for clarification or more information to help steer it in the right direction.
Cross-check with other sources: If you're unsure about the accuracy of the model's response, try checking it against other sources of information to see if they match up.
Avoid biased language: Be mindful of the language you use in your prompt, as biased language can influence the model's response.

Here are some examples of prompts to double-check the answer:

Example prompt: "What is the capital of France?"
- Prompt to double-check: "What are some other major cities in France?"

Example prompt: "Who played the lead role in the movie 'Forrest Gump'?"
- Prompt to double-check: "What are some other notable movies that the same actor has appeared in?"

Example prompt: "What is the boiling point of water?"
- Prompt to double-check: "How does the boiling point of water change at higher altitudes?"

Example prompt: "Who was the first president of the United States?"
- Prompt to double-check: "What were some of the key accomplishments of this president during their time in office?"
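
Because the same prompt can yield different answers, another simple check is to ask a question several times and see how often the replies agree. The sketch below assumes a hypothetical `ask_model` function that sends a prompt to whichever model you use and returns its reply as text; here it is replaced by a stand-in that replays canned replies, so the example runs without any external service. The attempt count and the 0.8 threshold are arbitrary choices for illustration.

```python
from collections import Counter

def normalize(answer):
    """Lowercase and trim whitespace and end punctuation so trivial differences are ignored."""
    return answer.strip().strip(".!?").lower()

def agreement(answers):
    """Return the most common normalized answer and the fraction of replies that match it."""
    counts = Counter(normalize(a) for a in answers)
    top_answer, top_count = counts.most_common(1)[0]
    return top_answer, top_count / len(answers)

def double_check(ask_model, prompt, attempts=5, threshold=0.8):
    """Ask the same prompt several times and flag the answer if the replies disagree too often."""
    answers = [ask_model(prompt) for _ in range(attempts)]
    top_answer, share = agreement(answers)
    return {"answer": top_answer, "agreement": share, "needs_review": share < threshold}

def make_fake_model(replies):
    """Stand-in for a real model call: replays canned replies (hypothetical, illustration only)."""
    replies_iter = iter(replies)
    return lambda prompt: next(replies_iter)

fake_model = make_fake_model(["Paris", "paris.", "Paris", "Lyon", "Paris"])
print(double_check(fake_model, "What is the capital of France?"))
```

High agreement does not prove an answer is correct, since a model can be consistently wrong, so this kind of check complements cross-checking against other sources rather than replacing it.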

In conclusion, large language models are powerful tools that can generate human-like responses to text prompts. However, they are probabilistic and can generate different answers to the same question. By crafting effective prompts and double-checking their answers, we can harness the power of large language models while catching more of their mistakes.

Make sure to check out our AI Personal Assistant based on ChatGPT to see how you can unleash the power of language models on the documents you choose to interact with.
