Improving graph retrieval

Retrieval Augmented Generation (RAG) with LangChain

Meri Nova

Machine Learning Engineer

Techniques

Main limitation: the reliability of translating user questions into Cypher

Strategies for improving graph retrieval systems:

  • Filtering the graph schema
  • Validating Cypher queries
  • Few-shot prompting

Filtering

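from langchain_community.chains.graph_qa.cypher import GraphCypherQAChain
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(api_key="...", model="gpt-4o-mini", temperature=0)

# Exclude Concept nodes from the schema that is passed to the LLM
# (graph is assumed to be an existing Neo4j graph connection)
chain = GraphCypherQAChain.from_llm(
    graph=graph, llm=llm, exclude_types=["Concept"], verbose=True
)
print(graph.get_schema)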
Node properties:
Document {title: STRING, id: STRING, text: STRING, summary: STRING, source: STRING}
Organization {id: STRING}
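
The effect of `exclude_types` can be pictured with a small pure-Python sketch. This is not LangChain's implementation: the `schema` dictionary, the `MENTIONS` relationship, and the `filter_schema` helper are hypothetical names used only to illustrate the idea of dropping a label, and every relationship that touches it, before the schema reaches the LLM.

```python
# Illustrative only: a simplified model of schema filtering.
# The schema below is an assumption loosely based on the labels shown above.
schema = {
    "node_props": {
        "Document": ["title", "id", "text", "summary", "source"],
        "Organization": ["id"],
        "Concept": ["id"],
    },
    # (start label, relationship type, end label); these triples are assumed.
    "relationships": [
        ("Document", "MENTIONS", "Organization"),
        ("Document", "MENTIONS", "Concept"),
        ("Organization", "DEVELOPS", "Concept"),
    ],
}


def filter_schema(schema, exclude_types):
    """Return a copy of the schema without the excluded labels or relationship types."""
    excluded = set(exclude_types)
    node_props = {
        label: props
        for label, props in schema["node_props"].items()
        if label not in excluded
    }
    # Drop a relationship if its start label, type, or end label is excluded.
    relationships = [
        (start, rel, end)
        for start, rel, end in schema["relationships"]
        if not {start, rel, end} & excluded
    ]
    return {"node_props": node_props, "relationships": relationships}


filtered = filter_schema(schema, exclude_types=["Concept"])
print(sorted(filtered["node_props"]))  # ['Document', 'Organization']
print(filtered["relationships"])       # [('Document', 'MENTIONS', 'Organization')]
```

A smaller schema means fewer labels and relationships the model can pick by mistake when it writes Cypher.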

Validating Cypher queries

  • LLMs have difficulty interpreting relationship direction
chain = GraphCypherQAChain.from_llm(
    graph=graph, llm=llm, verbose=True, validate_cypher=True
)
  1. Detects nodes and relationships
  2. Determines the relationship directions
  3. Checks the graph schema
  4. Updates the relationship directions
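
The four steps above can be sketched in plain Python. This is a deliberately simplified stand-in, not what `validate_cypher=True` runs internally: the `SCHEMA` set, the `HOP` regular expression, and the `fix_direction` helper are assumptions made for illustration, and the sketch only handles single-hop patterns with explicit labels.

```python
import re

# Assumed schema: DEVELOPS only runs from Organization to Concept.
SCHEMA = {("Organization", "DEVELOPS", "Concept")}

# Step 1: detect nodes and relationships in a single-hop pattern.
HOP = re.compile(
    r"\((\w*):(\w+)([^)]*)\)"     # left node: variable, label, optional property map
    r"(<?-)\[(\w*):(\w+)\](->?)"  # relationship and its arrow heads
    r"\((\w*):(\w+)([^)]*)\)"     # right node: variable, label, optional property map
)


def fix_direction(cypher):
    """Flip a relationship arrow when only the reversed direction exists in SCHEMA."""

    def repair(match):
        lvar, llabel, lrest, left, rvar, rel, right, nvar, nlabel, nrest = match.groups()
        # Step 2: determine the direction written in the query.
        if left == "<-" and right == "-":
            start, end = nlabel, llabel
        elif left == "-" and right == "->":
            start, end = llabel, nlabel
        else:
            return match.group(0)  # undirected or two-headed: leave untouched
        # Step 3: check the direction against the schema.
        if (start, rel, end) in SCHEMA:
            return match.group(0)
        # Step 4: update the direction if the reverse is valid.
        if (end, rel, start) in SCHEMA:
            left, right = ("<-", "-") if left == "-" else ("-", "->")
            return f"({lvar}:{llabel}{lrest}){left}[{rvar}:{rel}]{right}({nvar}:{nlabel}{nrest})"
        return match.group(0)  # unknown relationship: this sketch leaves it as is

    return HOP.sub(repair, cypher)


wrong = "MATCH (m:Concept)-[:DEVELOPS]->(o:Organization) RETURN DISTINCT o.id"
print(fix_direction(wrong))
```

Here the arrow in `wrong` points from `Concept` to `Organization`, which the assumed schema does not allow, so the sketch rewrites it to `(m:Concept)<-[:DEVELOPS]-(o:Organization)`.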

Few-shot prompting

# Literal Cypher braces are doubled ({{ }}) so the f-string style
# prompt template does not treat them as input variables
examples = [
    {
        "question": "How many notable large language models are mentioned in the article?",
        "query": "MATCH (m:Concept {{id: 'Large Language Model'}}) RETURN count(DISTINCT m)",
    },
    {
        "question": "Which companies or organizations have developed the large language models mentioned?",
        "query": "MATCH (o:Organization)-[:DEVELOPS]->(m:Concept {{id: 'Large Language Model'}}) RETURN DISTINCT o.id",
    },
    {
        "question": "What is the largest model size mentioned in the article, in terms of number of parameters?",
        "query": "MATCH (m:Concept {{id: 'Large Language Model'}}) RETURN max(m.parameters) AS largest_model",
    },
]

Implementing few-shot prompting

from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

example_prompt = PromptTemplate.from_template(
    "User input: {question}\nCypher query: {query}"
)
cypher_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=(
        "You are a Neo4j expert. Given an input question, create a "
        "syntactically correct Cypher query to run.\n\n"
        "Here is the schema information\n{schema}.\n\n"
        "Below are a number of examples of questions and their "
        "corresponding Cypher queries."
    ),
    suffix="User input: {question}\nCypher query: ",
    # The prefix uses {schema}, so declare it alongside {question}
    input_variables=["question", "schema"],
)
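
To see how the pieces of a few-shot prompt fit together, here is a minimal pure-Python sketch of the assembly: each example is filled into the per-example template, then prefix, examples, and suffix are joined with blank lines and the remaining variables are filled in. It is a simplified model, not LangChain's code; `render_prompt` and the sample schema string are hypothetical. Because the joined text is formatted a second time in this sketch, literal Cypher braces in the examples are written as `{{ }}`.

```python
# Two of the few-shot examples; literal Cypher braces are doubled because the
# assembled text goes through str.format once more in stage 2.
examples = [
    {
        "question": "How many notable large language models are mentioned in the article?",
        "query": "MATCH (m:Concept {{id: 'Large Language Model'}}) RETURN count(DISTINCT m)",
    },
    {
        "question": "Which companies or organizations have developed the large language models mentioned?",
        "query": "MATCH (o:Organization)-[:DEVELOPS]->(m:Concept {{id: 'Large Language Model'}}) RETURN DISTINCT o.id",
    },
]

EXAMPLE_TEMPLATE = "User input: {question}\nCypher query: {query}"
PREFIX = (
    "You are a Neo4j expert. Given an input question, create a syntactically "
    "correct Cypher query to run.\n\nHere is the schema information\n{schema}.\n\n"
    "Below are a number of examples of questions and their corresponding Cypher queries."
)
SUFFIX = "User input: {question}\nCypher query: "


def render_prompt(question, schema, separator="\n\n"):
    """Assemble prefix, formatted examples, and suffix into one prompt string."""
    # Stage 1: fill each example into the per-example template.
    rendered_examples = [EXAMPLE_TEMPLATE.format(**ex) for ex in examples]
    # Stage 2: join all pieces, then fill in the question and the schema.
    template = separator.join([PREFIX, *rendered_examples, SUFFIX])
    return template.format(question=question, schema=schema)


# The schema string here is a made-up placeholder for illustration.
print(render_prompt("How many papers were published in 2016?", "Organization {id: STRING}"))
```

The printed text ends with an open `Cypher query: ` line, which is where the model is expected to continue.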

The complete prompt

You are a Neo4j expert. Given an input question, create a syntactically correct Cypher query to run.

Below are a number of examples of questions and their corresponding Cypher queries.

User input: How many notable large language models are mentioned in the article?
Cypher query: MATCH (m:Concept {id: 'Large Language Model'}) RETURN count(DISTINCT m)

User input: Which companies or organizations have developed the large language models mentioned?
Cypher query: MATCH (o:Organization)-[:DEVELOPS]->(m:Concept {id: 'Large Language Model'}) RETURN DISTINCT o.id

User input: What is the largest model size mentioned in the article, in terms of number of parameters?
Cypher query: MATCH (m:Concept {id: 'Large Language Model'}) RETURN max(m.parameters) AS largest_model

User input: How many papers were published in 2016?
Cypher query:

Adding the few-shot examples

chain = GraphCypherQAChain.from_llm(
    graph=graph, llm=llm, cypher_prompt=cypher_prompt,
    verbose=True, validate_cypher=True
)

Let's practice!
