In this demo, we’ll implement hybrid search using a sparse-vector retrieval algorithm from LangChain alongside dense embeddings. Start a notebook and add the following code:
import os

from langchain_chroma import Chroma
from langchain_community.document_loaders import WikipediaLoader
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Set up a User Agent for this session
os.environ['USER_AGENT'] = 'sports-buddy-advanced'

llm = ChatOpenAI(model="gpt-4o-mini")
loader = WikipediaLoader(query="2024_Summer_Olympics")
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000,
                                               chunk_overlap=0)
splits = text_splitter.split_documents(docs)

database = Chroma.from_documents(documents=splits,
                                 embedding=OpenAIEmbeddings())
retriever = database.as_retriever()
There’s nothing new here. It’s the same initial setup for a basic RAG.
Create a new cell. In it, build a retriever based on the Best Match 25 (BM25) algorithm. This will allow you to run a sparse-vector search.
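The next cells invoke normal_chain and sparse_chain, which this page doesn’t define. Here’s a minimal sketch of one possible wiring, assuming BM25Retriever (which requires the rank_bm25 package), an EnsembleRetriever to blend the sparse and dense results, and RetrievalQA chains:
from langchain.chains import RetrievalQA
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever

# Keyword-based (sparse) retriever built from the same splits.
bm25_retriever = BM25Retriever.from_documents(splits)

# Blend the dense Chroma retriever with BM25 for hybrid search.
hybrid_retriever = EnsembleRetriever(
    retrievers=[retriever, bm25_retriever],
    weights=[0.5, 0.5],
)

# normal_chain answers from the dense retriever alone;
# sparse_chain routes through the BM25-backed hybrid retriever.
normal_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)
sparse_chain = RetrievalQA.from_chain_type(llm=llm,
                                           retriever=hybrid_retriever)
With both chains in place, query the normal chain first: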
normal_response = normal_chain.invoke(
    "What happened at the opening ceremony of the 2024 Summer Olympics")
print(normal_response['result'])
Observe the output:
The opening ceremony of the 2024 Summer Olympics was held outside
of a stadium for the first time in modern Olympic history.
Athletes were paraded by boat along the Seine River in Paris.
Finally, run the sparse_chain cell:
sparse_response = sparse_chain.invoke(
    "What happened at the opening ceremony of the 2024 Summer Olympics")
print(sparse_response['result'])
And note its output:
The opening ceremony of the 2024 Summer Olympics took place
outside of a stadium for the first time in modern Olympic
history, with athletes being paraded by boat along the
Seine River in Paris. This unique setting was part of the
ceremony, making it a significant and memorable event in
Olympic history.
Notice how the keywords of the query contribute to a more elaborate response in the hybrid search.
Citing in RAG
Citations add source information to your responses, so you know where the answers came from. Open a new notebook to learn how to add citations to SportsBuddy. In the notebook, start with the following code:
from langchain_community.retrievers import WikipediaRetriever
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
import os
llm = ChatOpenAI(model="gpt-4o-mini")
system_prompt = (
    "You're a helpful AI assistant. Given a user question "
    "and some Wikipedia article snippets, answer the user "
    "question. If none of the articles answer the question, "
    "just say you don't know."
    "\n\nHere are the Wikipedia articles: "
    "{context}"
)

retriever = WikipediaRetriever(top_k_results=6, doc_content_chars_max=2000)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)
This setup instructs the WikipediaRetriever to fetch relevant articles based on the input message. Heads up: it may not always search quite right. Because it’s limited to Wikipedia articles, it fetches the articles it thinks best answer the question based on its own understanding of how you word your query.
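To see what the retriever pulls back on its own, you can probe it directly; this optional check uses WikipediaRetriever’s standard Runnable interface and the title metadata key its documents carry:
# Optional: peek at which articles the retriever fetches.
sample_docs = retriever.invoke("2024 Summer Olympics opening ceremony")
for doc in sample_docs:
    print(doc.metadata["title"])
In the next cell, create a chain: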
from typing import List

from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

def format_docs(docs: List[Document]):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    RunnablePassthrough.assign(context=(lambda x: format_docs(x["context"])))
    | prompt
    | llm
    | StrOutputParser()
)

retrieve_docs = (lambda x: x["input"]) | retriever

chain = RunnablePassthrough.assign(context=retrieve_docs).assign(
    answer=rag_chain
)
Execute a simple query and examine the structure of the response object:
result = chain.invoke(
    {"input": "How did the USA fare at the 2024 Summer Olympics"})
print(result.keys())
dict_keys(['input', 'context', 'answer'])
The response contains input (the query), context (the reference material), and answer. This information is accessible due to OpenAI’s tool-calling support. You can now funnel this output into a citation model.
Let’s use the CitedAnswer model:
from typing import List
from pydantic import BaseModel, Field

class CitedAnswer(BaseModel):
    """Answer the user question based only on the given sources,
    and cite the sources used."""

    answer: str = Field(
        ...,
        description="The answer to the user question, "
        "which is based only on the given sources.",
    )
    citations: List[int] = Field(
        ...,
        description="The integer IDs of the SPECIFIC sources "
        "which justify the answer.",
    )
To use the citation model, invoke it with the following:
structured_llm = llm.with_structured_output(CitedAnswer)
query = """How did the USA fare at the 2024 Summer Olympics"""
result = structured_llm.invoke(query)
result
The model extracts the values for answer and citations by interpreting the field descriptions. The response is wrapped in a CitedAnswer class.
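Because the result is a CitedAnswer instance rather than a dict, you read its fields as attributes; for example, with the result from the cell above:
print(result.answer)     # the generated answer text
print(result.citations)  # the list of integer source IDs
You could instead have the model cite source URLs by changing the citations field to hold strings: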
citations: List[str] = Field(
    ...,
    description="The string URLs of the SPECIFIC sources "
    "which justify the answer.",
)
However, take note that because the documents aren’t retrieved verbatim, these URLs might be incorrect and lead to 404 errors. If you instead want to have a portion of the retrieved document, consider using a model like this:
class Citation(BaseModel):
    source_id: int = Field(
        ...,
        description="The integer ID of a SPECIFIC source "
        "which justifies the answer.",
    )
    quote: str = Field(
        ...,
        description="The VERBATIM quote from the specified source "
        "that justifies the answer.",
    )

class QuotedAnswer(BaseModel):
    """Answer the user question based only on the given sources,
    and cite the sources used."""

    answer: str = Field(
        ...,
        description="The answer to the user question, "
        "which is based only on the given sources.",
    )
    citations: List[Citation] = Field(
        ...,
        description="Citations from the given sources that "
        "justify the answer.",
    )
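The chain below calls a format_docs_with_id helper that doesn’t appear earlier in the lesson. It plays the same role as format_docs, but numbers each snippet so the model can cite sources by their integer IDs. Here’s a minimal sketch, assuming the retrieved Wikipedia documents carry a title metadata key:
def format_docs_with_id(docs: List[Document]) -> str:
    # Tag each snippet with the Source ID the model will cite.
    formatted = [
        f"Source ID: {i}\nArticle Title: {doc.metadata['title']}\n"
        f"Article Snippet: {doc.page_content}"
        for i, doc in enumerate(docs)
    ]
    return "\n\n" + "\n\n".join(formatted)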
You can use it similarly:
rag_chain = (
    RunnablePassthrough.assign(
        context=(lambda x: format_docs_with_id(x["context"]))
    )
    | prompt
    | llm.with_structured_output(QuotedAnswer)
)
retrieve_docs = (lambda x: x["input"]) | retriever

chain = RunnablePassthrough.assign(context=retrieve_docs).assign(
    answer=rag_chain
)

result = chain.invoke(
    {"input": "How did the USA fare at the 2024 Summer Olympics"})
result
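Each Citation in the structured answer carries the source ID and the verbatim quote that backs it, so you can print them for review:
answer = result["answer"]  # a QuotedAnswer instance
print(answer.answer)
for citation in answer.citations:
    print(f"[{citation.source_id}] {citation.quote}")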
And that’s it! You’ve now added citations to your RAG’s responses.