A basic RAG system consists of three stages: indexing, retrieval, and generation. Several other steps, such as storage, prompt construction, and translation, can be layered on top of these to build an advanced RAG system. When SportsBuddy generates a response, it displays only the top result from a list of candidates. You can adjust the retriever's k search argument to return more documents. The ranking behind that top result comes from the vector store itself, which orders matches by similarity score automatically; think of it as a rudimentary, built-in form of re-ranking.
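For example, with a LangChain-style retriever, raising k is a one-line change. The sketch below assumes the notebook's existing vector_store object, and the query is only a placeholder:

```python
# Minimal sketch: assumes the notebook's `vector_store`; the query is a placeholder.
retriever = vector_store.as_retriever(search_kwargs={"k": 4})

docs = retriever.invoke("Which country topped the medal table?")
print(len(docs))  # up to 4 documents, ordered by the store's similarity score
```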
The starter notebook currently has a few modifications from the basic RAG implementation. It doesn't use the RAG prompt from "rlm/rag-prompt" because that prompt conditions the response, which wouldn't serve the purpose here. Because the response will be long, the notebook prints the response's length instead of the response itself; you can print the full response yourself if you'd like to see it. Finally, the chunk_overlap has been reduced because you're working with a relatively small dataset. With these changes, a quick test reveals that the retriever returns four documents for the given query.
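The following sketch shows the general shape of those modifications. The exact sizes and the variable names docs, rag_chain, and retriever are placeholders rather than the notebook's actual values:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Smaller overlap, since the dataset is relatively small.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# Print the response's length rather than the (long) response itself.
response = rag_chain.invoke("your question here")
print(len(response))

# Quick check: how many documents come back for the query?
print(len(retriever.invoke("your question here")))
```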
It could just as well be three or more, depending on the query. How, then, were you getting the kinds of responses you saw earlier? Mainly because of the prompt, but there's more to it. Your RAG prompt forced the LLM to condense the output into the best single answer that fit. That can be undesirable in some cases, because it may leave out plenty of useful information. A setup like this might even set the retriever's k argument to 1, delegating to the vector store the responsibility of picking the single best match. Although the vector store's search capabilities are good, they're not optimized to always return the best results.
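You can see both effects for yourself by inspecting the hub prompt and building a deliberately restricted retriever. This is a sketch that assumes the langchainhub package is installed and reuses the notebook's vector_store:

```python
from langchain import hub

# 1. The hub prompt that conditions answers toward short, condensed output.
rag_prompt = hub.pull("rlm/rag-prompt")
print(rag_prompt)

# 2. A retriever limited to one document, which hands the "best match" decision
#    entirely to the vector store's similarity ranking.
single_doc_retriever = vector_store.as_retriever(search_kwargs={"k": 1})
```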
Assessing the response again, you'll notice that the top document, although the most relevant to the given query, also contains irrelevant information. So apart from some retrieved documents being irrelevant to the query altogether, even the relevant ones can contain irrelevant text. There are a few ways to tackle this, and contextual compression is one of them.
Contextual Compression
Contextual compression is a technique that compresses retrieved documents based on the query, filtering out irrelevant content. It's essentially smoothing out the rough edges: even when the retrieved documents are the best match for the query, contextual compression adds a post-processing phase that removes noise, resulting in a better response.
In essence, it's a compression of what the retriever you just created returns. You'll get compressed output back: typically one or two documents rather than the three, four, or more that the default similarity search produces. Additionally, the content of these results isn't simply a copy of the documents; they're refined responses based on the given query.
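Here's a minimal sketch of what this can look like using LangChain's ContextualCompressionRetriever with an LLM-based extractor. The model name and the vector_store variable are assumptions, not the lesson's exact setup:

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # placeholder model

# The extractor re-reads each retrieved document and keeps only the passages
# that are relevant to the query.
compressor = LLMChainExtractor.from_llm(llm)

compression_retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=vector_store.as_retriever(search_kwargs={"k": 4}),
)

compressed_docs = compression_retriever.invoke("your question here")
print(len(compressed_docs))  # typically fewer, shorter documents than the raw retriever returns
```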
Introducing Re-ranking
Contextual compression is at the core of many re-ranking techniques. Vector databases, by default, attach a relevance score to each result, and responses can be ranked by that score. Re-ranking strategies do something similar, but they score each retrieved document against the given query. That, combined with contextual compression strategies and other fine-tuning techniques, produces more accurate responses.
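Many LangChain vector stores can return that score alongside each document, which is a handy way to peek at the ranking you're getting before any re-ranking is applied. This sketch assumes vector_store is a store such as Chroma or FAISS:

```python
# Assumes the notebook's `vector_store`; the query is a placeholder.
results = vector_store.similarity_search_with_score("your question here", k=4)

for doc, score in results:
    # Whether lower or higher is better depends on the store's distance metric,
    # but the ordering reflects the store's idea of relevance.
    print(round(score, 3), doc.page_content[:80])
```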
A popular re-ranking tool is the Cohere Reranking API. It improves the results of your database search by employing various re-ranking techniques. It has integrations for many platforms, like Elasticsearch, OpenSearch, and vector stores. It's a database-agnostic tool, and it doesn't rely on the initial ranking of your results. It produces relevance scores that help you judge how well each document matches the query. When it returns a list of documents, it automatically sorts them by descending relevance.
Next, you'll see a demo of the reranking strategy in code.
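In the meantime, here's a rough sketch of how a hosted re-ranker like Cohere's is commonly wired into a retrieval pipeline. It assumes the langchain-cohere package, a COHERE_API_KEY environment variable, and the notebook's vector_store; the lesson's own demo may look different:

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain_cohere import CohereRerank

# Scores each (query, document) pair and keeps the top_n documents,
# sorted by descending relevance.
reranker = CohereRerank(model="rerank-english-v3.0", top_n=3)  # model name may change

rerank_retriever = ContextualCompressionRetriever(
    base_compressor=reranker,
    base_retriever=vector_store.as_retriever(search_kwargs={"k": 10}),
)

reranked_docs = rerank_retriever.invoke("your question here")
```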