Improving graph retrieval

Retrieval Augmented Generation (RAG) with LangChain

Meri Nova

Machine Learning Engineer

Techniques

Main limitation: the reliability of translating user questions into Cypher

Strategies for improving graph retrieval:

Filtering the graph schema
Validating the Cypher query
Few-shot prompting

Filtering

from langchain_community.chains.graph_qa.cypher import GraphCypherQAChain
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(api_key="...", model="gpt-4o-mini", temperature=0)

# Exclude the Concept node type from the schema that is passed to the LLM
chain = GraphCypherQAChain.from_llm(
    graph=graph, llm=llm, exclude_types=["Concept"], verbose=True
)

# The filtered schema lives on the chain; graph.get_schema still returns the full schema
print(chain.graph_schema)

Node properties:
Document {title: STRING, id: STRING, text: STRING, summary: STRING, source: STRING}
Organization {id: STRING}
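
To make the effect of exclude_types concrete, here is a minimal pure-Python sketch of schema filtering. It is not LangChain's implementation: the dictionary layout, the filter_schema helper, and the MENTIONS relationship are assumptions made for illustration.

```python
# Hypothetical structured schema, loosely modeled on the graph in these slides.
schema = {
    "node_props": {
        "Document": ["title", "id", "text", "summary", "source"],
        "Organization": ["id"],
        "Concept": ["id"],
    },
    "relationships": [
        {"start": "Organization", "type": "DEVELOPS", "end": "Concept"},
        {"start": "Document", "type": "MENTIONS", "end": "Organization"},
    ],
}


def filter_schema(schema, exclude_types):
    """Return a copy of the schema without the excluded labels or relationship types."""
    excluded = set(exclude_types)
    # Drop excluded node labels together with their properties.
    node_props = {
        label: props
        for label, props in schema["node_props"].items()
        if label not in excluded
    }
    # Drop any relationship whose type or endpoints are excluded.
    relationships = [
        rel
        for rel in schema["relationships"]
        if not ({rel["start"], rel["type"], rel["end"]} & excluded)
    ]
    return {"node_props": node_props, "relationships": relationships}


filtered = filter_schema(schema, exclude_types=["Concept"])
print(sorted(filtered["node_props"]))
print(filtered["relationships"])
```

A smaller schema leaves the LLM fewer labels to choose from when it writes Cypher, which is the point of filtering.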

Validating the Cypher query

LLMs have difficulty interpreting the direction of relationships

chain = GraphCypherQAChain.from_llm(
    graph=graph, llm=llm, verbose=True, validate_cypher=True
)

Detects nodes and relationships
Determines the directions of the relationships
Checks them against the graph schema
Updates the direction of the relationships
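
The four steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the logic that validate_cypher runs: the SCHEMA set, the regular expression, and the fix_direction helper are hypothetical, and only single-hop patterns with labeled nodes are handled.

```python
import re

# Hypothetical schema: allowed (start label, relationship type, end label) triples.
SCHEMA = {("Organization", "DEVELOPS", "Concept")}

# Step 1: detect a labeled node, a typed relationship, and a second labeled node.
PATTERN = re.compile(
    r"\((?P<left>\w*:(?P<llabel>\w+)[^)]*)\)"
    r"(?P<larrow><-|-)\[:(?P<rel>\w+)\](?P<rarrow>->|-)"
    r"\((?P<right>\w*:(?P<rlabel>\w+)[^)]*)\)"
)


def fix_direction(query):
    """Flip relationship arrows that contradict SCHEMA; leave everything else alone."""

    def repair(match):
        rel = match["rel"]
        # Step 2: determine the direction written in the query.
        points_right = match["rarrow"] == "->"
        points_left = match["larrow"] == "<-"
        if points_right == points_left:
            return match.group(0)  # undirected or malformed: do not touch
        if points_right:
            start, end = match["llabel"], match["rlabel"]
        else:
            start, end = match["rlabel"], match["llabel"]
        # Step 3: check the triple against the schema.
        if (start, rel, end) in SCHEMA:
            return match.group(0)
        # Step 4: if only the reversed triple exists, flip the arrow.
        if (end, rel, start) in SCHEMA:
            if points_right:
                return f"({match['left']})<-[:{rel}]-({match['right']})"
            return f"({match['left']})-[:{rel}]->({match['right']})"
        return match.group(0)  # unknown relationship: out of scope for this sketch

    return PATTERN.sub(repair, query)


wrong = "MATCH (m:Concept {id: 'Large Language Model'})-[:DEVELOPS]->(o:Organization) RETURN DISTINCT o.id"
fixed = fix_direction(wrong)
print(fixed)
```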

Few-shot prompting

# Literal braces in the queries are doubled ({{ }}) so that the prompt
# template does not treat them as input variables when it is formatted
examples = [
    {
        "question": "How many notable large language models are mentioned in the article?",
        "query": "MATCH (m:Concept {{id: 'Large Language Model'}}) RETURN count(DISTINCT m)",
    },
    {
        "question": "Which companies or organizations have developed the large language models mentioned?",
        "query": "MATCH (o:Organization)-[:DEVELOPS]->(m:Concept {{id: 'Large Language Model'}}) RETURN DISTINCT o.id",
    },
    {
        "question": "What is the largest model size mentioned in the article, in terms of number of parameters?",
        "query": "MATCH (m:Concept {{id: 'Large Language Model'}}) RETURN max(m.parameters) AS largest_model",
    },
]

Implementing few-shot prompting

from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate


example_prompt = PromptTemplate.from_template(
    "User input: {question}\nCypher query: {query}"
)


cypher_prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix=(
        "You are a Neo4j expert. Given an input question, create a "
        "syntactically correct Cypher query to run.\n\n"
        "Here is the schema information\n{schema}.\n\n"
        "Below are a number of examples of questions and their "
        "corresponding Cypher queries."
    ),
    suffix="User input: {question}\nCypher query: ",
    # The prefix uses {schema} and the suffix uses {question}
    input_variables=["question", "schema"],
)

The complete prompt

You are a Neo4j expert. Given an input question, create a syntactically correct Cypher query to run.

Below are a number of examples of questions and their corresponding Cypher queries.

User input: How many notable large language models are mentioned in the article?
Cypher query: MATCH (m:Concept {id: 'Large Language Model'}) RETURN count(DISTINCT m)

User input: Which companies or organizations have developed the large language models mentioned?
Cypher query: MATCH (o:Organization)-[:DEVELOPS]->(m:Concept {id: 'Large Language Model'}) RETURN DISTINCT o.id

User input: What is the largest model size mentioned in the article, in terms of number of parameters?
Cypher query: MATCH (m:Concept {id: 'Large Language Model'}) RETURN max(m.parameters) AS largest_model

User input: How many papers were published in 2016?
Cypher query:
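
A prompt like the one above is assembled in two formatting passes: each example is rendered first, then the prefix, examples, and suffix are joined and formatted as one template. The sketch below imitates that with plain str.format to show why literal braces in an example query have to be doubled. The render_few_shot_prompt helper is hypothetical, and the shortened prefix and schema string are placeholders.

```python
examples = [
    {
        "question": "Which companies or organizations have developed the large language models mentioned?",
        # Doubled braces survive the second formatting pass as single braces.
        "query": "MATCH (o:Organization)-[:DEVELOPS]->(m:Concept {{id: 'Large Language Model'}}) RETURN DISTINCT o.id",
    },
]


def render_few_shot_prompt(prefix, examples, suffix, separator="\n\n", **variables):
    """Hypothetical helper that mimics a few-shot template with str.format."""
    # Pass 1: render each example; the values are inserted verbatim.
    example_strings = [
        "User input: {question}\nCypher query: {query}".format(**example)
        for example in examples
    ]
    # Pass 2: join everything and fill in the remaining variables.
    template = separator.join([prefix, *example_strings, suffix])
    return template.format(**variables)


prompt = render_few_shot_prompt(
    prefix="Here is the schema information\n{schema}.",
    examples=examples,
    suffix="User input: {question}\nCypher query: ",
    schema="Organization {id: STRING}",
    question="How many papers were published in 2016?",
)
print(prompt)
```

With single braces in the example query, the second pass would try to resolve id as a template variable and fail.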

Adding the few-shot examples

chain = GraphCypherQAChain.from_llm(
    graph=graph, llm=llm, cypher_prompt=cypher_prompt,
    verbose=True, validate_cypher=True
)

Let's practice!
