OK, so how will our actual Q&A chain work?

As an avid soccer player and coach of my Sunday league team, I receive countless emails from the association containing PDFs full of regulations. Although these documents contain valuable information, I’ve never opened a single one of them.

Let’s make a “ChatGPT” style interface where I can talk to these PDFs whenever I have a question about the rules and regulations.

To do this, we’ll need to set up a couple of things:

  1. We’ll need to extract the text from the PDFs and store it in a vector database (Redis) alongside generated embeddings. We’ll do this in a Python notebook!
  2. We’ll then need to set up a Q&A chain on Relevance AI, which will take our questions, search the vector database for context, and generate answers with GPT-3.5.
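To make step 1 a little more concrete, here is a minimal sketch of the kind of record we might store for each passage. Everything in it is an assumption made for illustration: the `Record` shape, the `split_into_chunks` and `build_records` helpers, the chunk size and overlap, and the idea of passing the embedding model in as a plain `embed` function. It leaves out PDF parsing and the Redis write entirely, and it is not necessarily how the notebook does it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Record:
    text: str               # a passage extracted from a PDF
    embedding: list[float]  # vector generated for that passage
    source_url: str         # link back to the PDF, handy for references later


def split_into_chunks(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (sizes are assumptions)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + size].strip()
        if piece:
            chunks.append(piece)
        if start + size >= len(text):
            break  # the current window already reached the end of the text
    return chunks


def build_records(
    text: str,
    source_url: str,
    embed: Callable[[str], list[float]],
) -> list[Record]:
    """Pair every chunk with its embedding and the PDF it came from."""
    return [
        Record(text=chunk, embedding=embed(chunk), source_url=source_url)
        for chunk in split_into_chunks(text)
    ]
```

In a real notebook, `embed` would call whichever embedding model you use, and each `Record` would be written to Redis rather than kept in a Python list.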

You can have a look at our Q&A chat chain here and clone it into your Relevance AI project: https://chain.relevanceai.com/notebook/bcbe5a/4dc088f2dcfc-4e60-807e-353c334d4a5b/5ce7f6ba-45de-42da-98b9-d7a12d2694a7/split

The beauty of Relevance is that you can customise this chain entirely! Let’s analyse some of the more interesting steps:

Modifying the user’s question based on chat history

This step is crucial! We use a GPT prompt that takes the chat history into account and rewrites the user’s question so that it works better as a vector search query.

Note that we plug in the chat history from the previous step and the question from the API params, then ask the model to generate a “better” standalone question.

Essentially, we want to consider the chat history and modify the user's question by adding any relevant context from the conversation.

For instance, suppose we have been asking many questions about the topic of "concussions", and the user asks, "When can I play again?" On its own, this question may not retrieve any concussion-related context from our vector database.

Therefore, this LLM step might change the user's initial question to: "When can I play soccer again if I have a concussion?"
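If you were to write this step by hand, the prompt could be assembled roughly like this. This is only a sketch: the function name `build_standalone_prompt`, the `(role, message)` history format, and the prompt wording are all assumptions, not the exact prompt used in the Relevance AI chain.

```python
def build_standalone_prompt(chat_history: list[tuple[str, str]], question: str) -> str:
    """Build a prompt asking the LLM to rewrite a follow-up as a standalone question."""
    # Render the history as one "role: message" line per turn.
    history_lines = "\n".join(f"{role}: {message}" for role, message in chat_history)
    return (
        "Given the conversation below, rewrite the follow-up question so that it "
        "can be understood without the conversation. Keep any relevant context.\n\n"
        f"Chat history:\n{history_lines}\n\n"
        f"Follow-up question: {question}\n"
        "Standalone question:"
    )


history = [
    ("user", "What are the rules around concussions?"),
    ("assistant", "(previous answer from the chain)"),
]
prompt = build_standalone_prompt(history, "When can I play again?")
print(prompt)
```

The string in `prompt` is what would be sent to the model; the model's reply then becomes the query for the vector search in the next step.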

Retrieve context from our Redis vector database

Now that we have a better standalone question, we can tap into our Redis database to find relevant content from the regulation PDFs. We do this with a technique called “vector similarity search”, which we’ll discuss further in the next chapter.
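As a rough preview of the idea, the sketch below ranks a handful of stored vectors by cosine similarity to a query vector. It is a pure-Python stand-in, not the Redis query itself: the in-memory `docs` list, the helper names, the tiny two-dimensional vectors, and the choice of cosine similarity as the metric are all assumptions made for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: list[float],
    docs: list[tuple[str, list[float]]],
    k: int = 3,
) -> list[tuple[str, float]]:
    """Return the k documents most similar to the query, best match first."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real embeddings.
docs = [
    ("concussion protocol", [1.0, 0.0]),
    ("kit colours", [0.0, 1.0]),
    ("return to play", [0.9, 0.1]),
]
for text, score in top_k([1.0, 0.0], docs, k=2):
    print(text, round(score, 3))
```

A vector database does the same kind of ranking, but over real embeddings and with an index so that it does not have to compare the query against every stored vector.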

Finally, let's ask GPT for an answer!

Now we’re ready to set up our final prompt, which will return an answer as well as links to the relevant PDFs that the answers were sourced from.

We can see that we have fed in the context from Redis, the transformed question from the first LLM prompt, and then requested an answer in JSON format.

This means that for every question, we should receive a JSON object shaped like this:

 {
   answer: string;
   references: string[]; // URLs of the PDFs used as context
 }
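Since the model is only asked to reply in this shape, it is worth validating the reply before using it. Here is one way that could look on the Python side. The `ChainAnswer` class and the `parse_chain_answer` helper are hypothetical names for this sketch, and treating a missing `references` field as an empty list is an assumption about how lenient you want to be.

```python
import json
from dataclasses import dataclass


@dataclass
class ChainAnswer:
    answer: str
    references: list[str]  # URLs of the PDFs used as context


def parse_chain_answer(raw: str) -> ChainAnswer:
    """Parse the model's reply and check it matches the expected shape."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    answer = data.get("answer")
    references = data.get("references", [])  # assumption: missing means no sources
    if not isinstance(answer, str):
        raise ValueError("'answer' must be a string")
    if not isinstance(references, list) or not all(isinstance(r, str) for r in references):
        raise ValueError("'references' must be a list of strings")
    return ChainAnswer(answer=answer, references=references)
```

Anything that fails these checks raises a `ValueError`, so the calling code can retry the prompt or show a fallback message instead of displaying a malformed answer.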