Demo
In this demo, you’ll see a simple RAG application in action. The community has done a great job of abstracting away a lot of the internals. In basically three steps, you can have a complete RAG application of your own. And you can do all of this in fewer than 50 lines of code.
This JupyterLab session has a notebook open that contains a full RAG application. Here’s a walkthrough of that application at work.
The first cell simply sets an environment variable that helps keep a record of operations that pertain to this application. Nothing fancy. It’s worth noting the power of notebooks here. With notebooks, you get to build and test your application one cell at a time. You can run each cell on its own, and later cells can use the variables and data created by the cells you’ve already run.
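If you want to picture what such a first cell might look like, here’s a minimal sketch. The variable name `RAG_DEMO_RUN_NAME` and its value are hypothetical, made up for illustration; the notebook’s actual variable may well be different.

```python
import os

# Hypothetical variable name and value, used only for illustration.
os.environ["RAG_DEMO_RUN_NAME"] = "olympics-rag-demo"

# A later cell in the same session can read the value back,
# because all cells share one running Python process.
run_name = os.environ["RAG_DEMO_RUN_NAME"]
print(f"Recording operations under: {run_name}")
```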
The next cell contains the code that initializes the external data source. Remember that a RAG application first retrieves data. In this cell, the data is the Wikipedia page on the 2024 Summer Olympics. The model for this RAG has information on events up to 2021. It knows nothing about an event that happened in 2024. But in this RAG application, you’re feeding it information about a more recent event. This is one of the key features that makes RAG super useful.
After loading the data source, the next step is to split the data into manageable chunks. Smaller chunks are easier to index, and they let the RAG system retrieve just the passages that are relevant to a question instead of the whole page.
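To get a feel for what splitting does, here’s a minimal sketch in plain Python. It isn’t the splitter the notebook uses. The function name `split_into_chunks` and the `chunk_size` and `overlap` values are assumptions chosen for illustration, and the sketch cuts on character counts only, which is simpler than what a real text splitter typically does.

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far each new window moves forward
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Tiny input so the windows are easy to see.
sample = "abcdefghij"
chunks = split_into_chunks(sample, chunk_size=4, overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

Because of the overlap, text near a chunk boundary shows up in two neighboring chunks, which makes it less likely that a boundary cuts away the context a passage needs.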
The next cell isn’t strictly required for a RAG to work, but it’s certainly important: it adds persistence. In this cell, the Chroma vector database stores the Wikipedia data that the previous cells loaded and split. The cell also uses an existing model from Ollama, the same model used for this RAG application.
This next cell tests the database to make sure it returns what you expect. This is a notebook, so it’s perfectly fine to test portions of the app, cell by cell.
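If you’re curious what storing chunks and then testing a search involves, here’s a toy sketch. It is not Chroma and doesn’t use Chroma’s API. The `ToyVectorStore` class, the `embed` function, and the sample chunks are all hypothetical, and the word-count “embedding” is only a stand-in for a real embedding model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """In-memory store that keeps each chunk next to its vector."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, chunks: list[str]) -> None:
        for chunk in chunks:
            self._items.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 2) -> list[str]:
        """Return the k chunks most similar to the query."""
        query_vector = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(query_vector, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


store = ToyVectorStore()
store.add([
    "Paris hosted the 2024 Summer Olympics.",
    "Chroma is a vector database.",
    "Llamas live in South America.",
])

# A quick test query, similar in spirit to testing the database in its own cell.
results = store.search("Which city hosted the Olympics?", k=1)
print(results[0])
```

A real vector database does the same two jobs, storing vectors and ranking them by similarity to a query, but with learned embeddings and storage that survives a restart.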
This cell sets up the Ollama model. The specific model used here is llama3.1:latest. There are many other models with different parameter counts, features, and sizes.
Next is a quick test of the model. Here, you can see that it doesn’t have any information about events beyond 2021.
Finally, this cell puts everything together. It starts with a prompt, which guides the model into giving a more accurate response.
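Here’s a rough idea of what such a prompt can look like, written as a plain Python string template. The wording of the template and the `build_prompt` helper are assumptions for illustration, not the notebook’s actual prompt.

```python
# Hypothetical template: it tells the model to rely on the retrieved context.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below. "
    "If the context does not contain the answer, say you don't know.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(context_chunks: list[str], question: str) -> str:
    """Fill the template with the retrieved chunks and the user's question."""
    context = "\n\n".join(context_chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)


prompt = build_prompt(
    ["Paris hosted the 2024 Summer Olympics."],
    "Which city hosted the 2024 Summer Olympics?",
)
print(prompt)
```

The point of the template is that the model sees the retrieved text right next to the question, so it can answer from that text rather than from what it learned during training.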
It then creates a chain that connects the source data, the prompt, the model, and a class that converts the model’s output into a string.
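The idea of a chain can be sketched in plain Python as a series of steps, each one passing its output to the next. Everything here is hypothetical: `make_chain`, the fixed retriever, and especially `fake_model`, which just echoes the context instead of calling a real language model. The notebook’s own chain uses library classes rather than these stand-ins.

```python
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Stand-in for the message object a chat model might return."""
    content: str


def make_chain(*steps):
    """Compose steps so the output of each one feeds the next."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run


def retrieve(question: str) -> dict:
    # Hypothetical retriever: always returns the same chunk.
    return {
        "question": question,
        "context": "The 2024 Summer Olympics were held in Paris.",
    }


def format_prompt(inputs: dict) -> str:
    return f"Context: {inputs['context']}\nQuestion: {inputs['question']}\nAnswer:"


def fake_model(prompt: str) -> FakeMessage:
    # Not a real model: it echoes the context line back as its "answer".
    first_line = prompt.splitlines()[0]
    return FakeMessage(content=first_line[len("Context: "):])


def to_string(message: FakeMessage) -> str:
    """Output parser: pull the plain text out of the message object."""
    return message.content


rag_chain = make_chain(retrieve, format_prompt, fake_model, to_string)
answer = rag_chain("Where were the 2024 Summer Olympics held?")
print(answer)
```

The last step matters because a model usually returns a richer object than plain text, and the output parser reduces it to the string you actually want to show.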
The result is a valid, natural answer to a question about an event that happened years after the model was trained. This is just awesome, isn’t it? Well, there you have it.
In the upcoming sections, you’ll learn more about all the things you’ve seen in this demo. See you there.