RAG for LLMs explained in 3 minutes

Published 2024-02-20
How I Explain Retrieval Augmented Generation (RAG) to Business Managers

(in 3 Minutes)

Large language models have been a huge hit for personal and consumer use cases. But what happens when you bring them into your business or use them for enterprise purposes? Well, you encounter a few challenges. The most significant one is the lack of domain expertise.

Remember, these large language models are trained on publicly available datasets. This means they might not possess the detailed knowledge specific to your domain or niche. Moreover, the training data won't include your Standard Operating Procedures (SOPs), records, intellectual property (IP), guidelines, or other relevant content. So, if you're considering using AI assistants "out of the box," they're going to lack much of that context, rendering them nearly useless for your specific business needs.

However, there's a solution that's becoming quite popular and has proven to be robust: RAG, or Retrieval Augmented Generation. In this approach, we add an extra step before a prompt is sent to an AI assistant. This step involves searching through a corpus of your own data—be it documents, PDFs, or transactions—to find information relevant to the user's prompt.

The information found is then added to the prompt that goes into the AI assistant, which subsequently returns the answer to the user. It turns out this is an incredibly effective way to add context for an AI assistant. Doing so also helps reduce hallucinations, which is another major concern.
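The retrieve-then-augment flow described above can be sketched in a few lines of Python. This is a toy illustration, not a production pipeline: real RAG systems score relevance with vector embeddings and a vector database, while the keyword-overlap scoring and the sample corpus here are stand-ins invented for this example.

```python
# Minimal RAG sketch: retrieve relevant documents, then splice them
# into the prompt before it goes to the AI assistant.

def score(query: str, doc: str) -> int:
    """Toy relevance score: count query words that appear in the document."""
    doc_words = set(doc.lower().split())
    return sum(1 for word in query.lower().split() if word in doc_words)

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k most relevant documents from the corpus."""
    ranked = sorted(corpus, key=lambda d: score(query, d), reverse=True)
    return ranked[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Augment the user's question with retrieved context."""
    context = "\n".join(retrieve(query, corpus))
    return (
        "Use the following context to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

# Hypothetical internal documents an out-of-the-box model would never have seen.
corpus = [
    "SOP-12: refunds over $500 require manager approval.",
    "Holiday schedule: offices close December 24-26.",
    "SOP-7: all refunds are processed within 5 business days.",
]

prompt = build_prompt("How are refunds approved?", corpus)
print(prompt)
```

Running this retrieves the two refund-related SOPs and leaves out the irrelevant holiday schedule, so the assistant answers from your documents instead of guessing; that grounding in retrieved text is also what helps reduce hallucinations.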

Hope you find this overview helpful. Have any questions or comments? Please drop them below.

If you're an AI practitioner and believe I've overlooked something, or you wish to contribute to the discussion, feel free to share your insights. Many people will be watching this, and your input could greatly benefit others.

All Comments (14)
  • @DanielBoueiz
    Does the LLM first check the additional datastore we gave it for data relevant to the user's prompt, and if it finds some, respond without consulting its original training data? And if it doesn't find anything relevant in the datastore, does it then act as if RAG wasn't implemented and respond from its original training data, or am I getting it wrong?
  • @victormustin2547
    So does that mean that the data needs to fit the llm context window ? Or is the data going through some sort of compression ?
  • @jasondsouza3555
    Just wanted to clear my confusion, would i yield better results by applying RAG to a fine-tuned model (i.e. fine-tuned in my field of work) or is RAG on a stock LLM good enough?
  • @farexBaby-ur8ns
    Very Nice. However an example would’ve helped augment the answer. Like ask it the gdp of Chad in 2023 when using ChatGPT.
  • @adipai
    thank you for the video George Santos :)
  • @krishras23
    great video, just confused on if this is eliminating the need for LangChain, or does RAG and LangChain serve different purposes?
  • In the context of an AI chat bot on a company website, this would mean training the bot on company data?