In this lesson, you’ll build a RAG app called SportsBuddy. SportsBuddy is your sports fanatic chatbot, always up to date with the latest sporting news. Just give SportsBuddy some context, and it’ll tell you what you need to know about a sporting event. Unlike older chatbots, which offered predefined responses to a limited set of questions, SportsBuddy lets you chat in natural English and get accurate sports facts. You won’t find these features in the free version of ChatGPT, which is trained on data only up to 2021 (as of this writing). So why pay for the pro version when you have SportsBuddy? Time to get started.
Setting up an OpenAI Developer Account
To begin, make sure you have a valid OpenAI API key. OpenAI is widely regarded as one of the most comprehensive and versatile LLM platforms available. Numerous leaderboards try to measure how effective LLMs are, and each one weighs a different set of criteria. Across a wide range of apps and respected leaderboards, OpenAI’s models consistently rank among the top LLMs. You can find some of these leaderboards at https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard, https://www.trustbit.tech/en/llm-benchmarks, and https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard. Keep in mind that there’s a lot of healthy competition: Many open-source LLMs have emerged in recent years and earned a strong reputation in the AI community. Be sure to explore them later.
Visit https://platform.openai.com/signup to sign up for an API key. You’ll have to pay a small fee to enable the API key. Go ahead and choose the cheapest option available; it’s enough for SportsBuddy. Later, you can compare OpenAI with other options that you might find equally good or even better and cheaper. Because you’ll be using LangChain, making such a change should be relatively easy. When you receive the key, store it securely on your computer. You’ll use it soon.
Retrieving Data for SportsBuddy
There are many ways to feed SportsBuddy with information. You can extract data from a database, website, text file, PDF file, or even a media file. You’ll use Wikipedia for now. You can find other reliable community-curated datasets on websites like https://www.kaggle.com/datasets and https://data.world. Open Jupyter Lab with:
jupyter lab
In the Launcher tab, open a terminal to install LangChain, LangChain for Chroma, and OpenAI if you haven’t already:
Open the notebook for Lesson 1 from the Launcher tab or the File menu. The first cell contains the setup for your USER_AGENT environment variable. This is to help identify your OpenAI session. Keeping your usage of the OpenAI API organized is a good practice. In the next cell, you set up the OpenAI key and model:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o-mini")
You’ve specified the gpt-4o-mini model of OpenAI. You can leave it out or specify another model depending on the type of subscription package you have. As of this writing, the most recent training data for this model is from 2021. Add the following to the bottom of this cell to verify:
response_message = llm.invoke(
"What is the cutoff date for your training data?"
)
print(response_message.content)
You get something like:
My training data goes up until October 2021. If you have any questions or
need information based on that timeframe, feel free to ask!
That’s quite a long time ago! In sports, there are many events year-round, every year. But here you are with an LLM that doesn’t know about the 2024 Olympics. Next, you’re about to equip your RAG with credible information from Wikipedia about the most recent summer Olympics.
Delete the code you just added. In the next cell, the imports include a WebBaseLoader to retrieve data from a web URL. Uncomment the code below #TODO: Load documents to retrieve the data:
The chunk_size specifies the maximum size of a chunk. Depending on the amount of text you’re analyzing, you might need to use a bigger or lower value. The chunk_overlap determines how many characters are allowed to flow into other chunks. This prevents the risk of losing data in the text. It helps to preserve the context, too. A moderate overlap, well below the chunk size, is usually recommended.
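To see how a chunk size and an overlap interact, here’s a minimal sketch in plain Python. It’s an assumption-laden illustration, not LangChain’s splitter: the helper split_with_overlap is hypothetical and simply slides a fixed-size window over the text, whereas a real text splitter also tries to break on natural boundaries such as paragraphs and sentences.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows where neighbors share chunk_overlap characters.

    Hypothetical helper for illustration only; it is not part of LangChain.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Each chunk starts with the last three characters of the chunk before it. That shared tail is what keeps a sentence that straddles a boundary from being cut off from its context.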
Right below this section, uncomment the code below # TODO: Store documents in the Chroma vector database to store the data in your Chroma database:
# TODO: Store documents in the Chroma vector database
database = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
Given the code above, you get a reference to the database of retrieved data.
A retriever lets you search the vector store for documents based on their vector representations. It uses the available search methods of the database, such as similarity search, to perform queries. The retriever interface also provides additional features, such as search parameters like threshold scores and the ability to specify the number of documents to return.
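If you’re curious what a similarity search with a score threshold and a result limit boils down to, here’s a rough sketch in plain Python. Everything in it is an assumption made for illustration: the retrieve helper is hypothetical, the three-number vectors are hand-made stand-ins for real embeddings, and a vector database like Chroma uses far more efficient indexing than this loop.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vector, store, k=2, score_threshold=0.0):
    """Hypothetical helper: return up to k texts scoring at least score_threshold."""
    scored = [(cosine_similarity(query_vector, vector), text) for text, vector in store]
    scored = [pair for pair in scored if pair[0] >= score_threshold]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # best match first
    return [text for _, text in scored[:k]]


# Hand-made toy vectors; real embeddings have hundreds or thousands of dimensions.
toy_store = [
    ("Four events were dropped from weightlifting.", [0.9, 0.1, 0.0]),
    ("The overall event total for canoeing remained at 16.", [0.7, 0.3, 0.1]),
    ("An unrelated sentence about the weather.", [0.0, 0.2, 0.9]),
]

query = [1.0, 0.2, 0.0]
top_docs = retrieve(query, toy_store, k=2, score_threshold=0.5)
print(top_docs)
```

The unrelated sentence points in a very different direction from the query, so it falls below the threshold and never reaches the results.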
The following code gives you the retriever for the Chroma database:
retriever = database.as_retriever()
Building the Prompt
The AI community has created a collection of pre-defined prompts designed to enhance the accuracy of responses from LLMs. Explore these prompts at https://smith.langchain.com/hub/rlm.
In this scenario, you’re using “rlm/rag-prompt”, which instructs the LLM as follows:
You are an assistant for question-answering tasks. Use the following pieces of
retrieved context to answer the question. If you don't know the answer,
just say that you don't know. Use three sentences maximum and keep the
answer concise.
Question: {question}
Context: {context}
Answer:
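The {question} and {context} placeholders get replaced with real text before anything is sent to the LLM. Here’s a small sketch of that substitution using plain Python string formatting. It’s only an approximation of the idea: LangChain’s prompt templates do this for you, and the variable names and the way the chunks are joined here are assumptions for illustration.

```python
# The wording of the prompt shown above, kept as an ordinary Python string.
template = (
    "You are an assistant for question-answering tasks. Use the following pieces of "
    "retrieved context to answer the question. If you don't know the answer, "
    "just say that you don't know. Use three sentences maximum and keep the "
    "answer concise.\n"
    "Question: {question}\n"
    "Context: {context}\n"
    "Answer:"
)

# Stand-ins for the chunks a retriever might return.
retrieved_chunks = [
    "Four events were dropped from weightlifting for the 2024 Olympics.",
    "The overall event total for canoeing remained at 16.",
]

filled_prompt = template.format(
    question="Which programmes were dropped from the 2024 Olympics?",
    context="\n\n".join(retrieved_chunks),  # assumed way of joining the chunks
)
print(filled_prompt)
```

The printed text is what the model actually reads: instructions, your question, and the retrieved context, all in one string.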
The RunnablePassthrough class ensures your question is passed directly to the prompt without alterations. You also can use it to add data to the output if needed.
The StrOutputParser is responsible for converting the LLM’s response into a readable string format.
A key element of this prompt is the use of pipes (|). This powerful LangChain feature allows you to chain operations together. The | operator visually represents the flow of data, with each operation’s output feeding into the next. This flexible syntax lets you create complex LLM workflows tailored to your needs.
In this specific prompt, the dictionary containing the question and context is passed to the “rlm/rag-prompt” template. The resulting formatted prompt is then sent to the LLM, and its response is finally converted to a string using the StrOutputParser.
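To build some intuition for how a | between steps can pass one step’s output to the next, here’s a toy sketch in plain Python. It isn’t LangChain’s implementation: the Step class and the four stand-in steps are hypothetical, and the “LLM” is a fake function so the example runs without an API key.

```python
class Step:
    """Hypothetical wrapper that lets plain functions be chained with |."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # a | b builds a new step that runs a first, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Stand-ins for the real pieces: inputs, prompt template, LLM and output parser.
build_inputs = Step(lambda question: {
    "context": "Four events were dropped from weightlifting.",
    "question": question,
})
format_prompt = Step(
    lambda data: f"Question: {data['question']}\nContext: {data['context']}\nAnswer:"
)
fake_llm = Step(
    lambda prompt: {"content": f"(stand-in reply to a {len(prompt)}-character prompt)"}
)
parse_output = Step(lambda message: message["content"])

toy_chain = build_inputs | format_prompt | fake_llm | parse_output
result = toy_chain.invoke("Which programmes were dropped?")
print(result)
```

Reading the chain from left to right mirrors the order in which the data moves: a dictionary goes in, a formatted prompt comes out, the model replies, and the parser hands you a plain string.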
It’s crucial to remember that the quality of your prompts plays a significant role in the success of your LLM interactions, alongside the LLM’s training data.
Now, put this into action. Execute the chain by calling rag_chain.invoke() and providing your question. Because SportsBuddy has access to the 2024 Olympics data, feel free to query it based on the information retrieved from the Wikipedia page.
rag_chain.invoke("Which programmes were dropped from the 2024 Olympics?")
You get a response along the lines of:
'Four events were dropped from weightlifting for the 2024 Olympics.
Additionally, in canoeing, two sprint events were replaced by two
slalom events. The overall event total for canoeing remained at 16.'
You might encounter some warnings in the output. This is normal because libraries are continually updated, which can lead to certain features becoming deprecated.
And there you have it! You’ve created a basic RAG AI chat app. You’ve harnessed the power of an existing LLM to generate relevant and precise responses based on the latest information. The potential applications are vast. For example, this could be a powerful tool for academic research: Simply provide your RAG with reliable data and get accurate and insightful answers, almost like conversing with your professor.
Next Steps
To further explore SportsBuddy’s capabilities, try another question. Create a new cell and ask:
rag_chain.invoke("Was there a podium sweep in the 2024 Olympics?")
Expect an answer like this:
"Yes, there was one podium sweep during the 2024 Olympics. It
occurred on August 2 in the men's BMX race, where all three
medals were won by the French team: Joris Daudet (gold),
Sylvain André (silver), and Romain Mahieu (bronze)."
In the next section, you’ll delve into a complete demonstration of building a basic RAG app from start to finish.
This content was released on Nov 12 2024. The official support period is six months from this date.