Optimizing document retrieval

Retrieval Augmented Generation (RAG) with LangChain

Meri Nova

Machine Learning Engineer

Putting the R in RAG...

Documents being retrieved from a vector database and sent back to the application to generate a response.

Dense

Encode each chunk as a single vector with mostly non-zero components

Embedded terms in a vector space, with semantically similar terms grouped closer together.

  • Pros: Capturing semantic meaning
  • Cons: Computationally expensive

Sparse

Encode chunks using word matching, giving vectors with mostly zero components

Documents containing instances of particular terms, with the documents containing the most terms highlighted.

  • Pros: Precise, explainable, rare-word handling
  • Cons: Limited generalizability (misses synonyms and paraphrases)
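
To make the contrast concrete, here is a minimal sketch in plain Python. The sparse encoder is a simple bag-of-words count over a small vocabulary, and the dense vectors are hand-made, hypothetical values standing in for what an embedding model might produce. None of these names come from LangChain.

```python
import math
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def sparse_encode(text, vocabulary):
    """Bag-of-words count vector: one slot per vocabulary word."""
    tokens = tokenize(text)
    return [tokens.count(word) for word in vocabulary]

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

chunks = [
    "Python is a popular language for machine learning",
    "The PyTorch library is used for AI",
]
vocabulary = sorted({word for chunk in chunks for word in tokenize(chunk)})

# Sparse: only the slots for words that actually occur are non-zero.
sparse_query = sparse_encode("machine learning", vocabulary)
zero_share = sparse_query.count(0) / len(sparse_query)
print(f"Share of zero components in the sparse vector: {zero_share:.2f}")

# Dense: hypothetical 3-dimensional "embeddings" (illustrative values only).
dense_ml = [0.9, 0.8, 0.1]
dense_ai = [0.8, 0.9, 0.2]
dense_cooking = [0.1, 0.2, 0.9]
print(cosine_similarity(dense_ml, dense_ai) > cosine_similarity(dense_ml, dense_cooking))
```

The sparse vector only matches on exact words, while the dense vectors can place related terms close together even when they share no words.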

Sparse retrieval methods

TF-IDF: Encodes documents using the words that make the document unique

Documents containing instances of particular terms, with the documents containing the most terms highlighted.

BM25: Helps prevent high-frequency words from saturating the encoding
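
A rough sketch of both ideas in plain Python, using common textbook forms of the formulas rather than any particular library's implementation. The function names and the toy documents are assumptions for illustration, and the BM25 part shows only the term-frequency component, with document-length normalization left out.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Per-document TF-IDF weights using tf = count / length, idf = log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

def bm25_tf(tf, k1=1.5):
    """BM25 term-frequency component (no length normalization)."""
    return tf * (k1 + 1) / (tf + k1)

docs = ["python python guido", "python ml", "pytorch python ml"]
weights = tf_idf(docs)
# "python" is in every document, so it carries no weight;
# "guido" is unique to the first document, so it stands out.
print(weights[0])

# Raw counts grow without limit, but the BM25 component levels off below k1 + 1.
for tf in (1, 10, 100):
    print(tf, round(bm25_tf(tf), 3))
```

Repeating a word many times therefore adds less and less to its BM25 score, which is the saturation behavior described above.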

BM25 retrieval

from langchain_community.retrievers import BM25Retriever

chunks = [
    "Python was created by Guido van Rossum and released in 1991.",
    "Python is a popular language for machine learning (ML).",
    "The PyTorch library is a popular Python library for AI and ML."
]
bm25_retriever = BM25Retriever.from_texts(chunks, k=3)

results = bm25_retriever.invoke("When was Python created?")
print("Most Relevant Document:")
print(results[0].page_content)
Most Relevant Document:
Python was created by Guido van Rossum and released in 1991.
  • "Python was created by Guido van Rossum and released in 1991."
  • "Python is a popular language for machine learning (ML)."
  • "The PyTorch library is a popular Python library for AI and ML."
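
To see why the first chunk ranks highest, here is a minimal from-scratch sketch of BM25 scoring. It is not what `BM25Retriever` runs internally: the tokenizer, the IDF variant, and the `bm25_rank` helper are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return (score, doc) pairs sorted from most to least relevant."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scored = []
    for doc, tokens in zip(docs, tokenized):
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            tf = counts[term]
            if tf == 0:
                continue  # query term does not occur in this document
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Longer-than-average documents are penalized slightly.
            norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + norm)
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored

chunks = [
    "Python was created by Guido van Rossum and released in 1991.",
    "Python is a popular language for machine learning (ML).",
    "The PyTorch library is a popular Python library for AI and ML.",
]
ranked = bm25_rank("When was Python created?", chunks)
print(ranked[0][1])
```

"Python" appears in every chunk, so it contributes little. The rarer words "was" and "created" occur only in the first chunk, which is what lifts it to the top.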

BM25 in RAG

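from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# chunks must be Document objects here; prompt and llm are assumed to be defined earlier
retriever = BM25Retriever.from_documents(documents=chunks, k=5)

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)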
1 https://www.datacamp.com/blog/what-is-retrieval-augmented-generation-rag

print(chain.invoke("How can LLM hallucination impact a RAG application?"))
The RAG application may generate responses that are off-topic or inaccurate.
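
The data flow through the first step of the chain can be sketched in plain Python. In the real chain, the dictionary passes the same input to both the retriever and `RunnablePassthrough()`. The functions below are hypothetical stand-ins that mimic that behavior, not LangChain APIs, and the word-overlap retriever is a deliberately crude substitute for BM25.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question, chunks, k=1):
    """Stand-in retriever: rank chunks by how many words they share with the question."""
    query_terms = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(query_terms & tokenize(c)), reverse=True)
    return ranked[:k]

def build_inputs(question, chunks):
    """Mimics {"context": retriever, "question": RunnablePassthrough()}."""
    return {
        "context": retrieve(question, chunks),  # retriever receives the question
        "question": question,                   # passthrough forwards it unchanged
    }

def format_prompt(inputs):
    """Stand-in prompt template that combines the context and the question."""
    context = "\n".join(inputs["context"])
    return f"Answer using only this context:\n{context}\n\nQuestion: {inputs['question']}"

chunks = [
    "RAG combines document retrieval with text generation.",
    "Bananas are rich in potassium.",
]
inputs = build_inputs("What is RAG?", chunks)
print(format_prompt(inputs))
```

The resulting prompt string is what would then be sent to the LLM, with the output parser extracting the text of its reply.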

Let's practice!