Storing and querying documents

Retrieval Augmented Generation (RAG) with LangChain

Meri Nova

Machine Learning Engineer

Instantiating the Neo4j database

from langchain_community.graphs import Neo4jGraph

graph = Neo4jGraph(url="bolt://localhost:7687", username="neo4j", password="...")

import os

# Read the connection details from environment variables instead of hard-coding them
url = os.environ["NEO4J_URI"]
user = os.environ["NEO4J_USERNAME"]
password = os.environ["NEO4J_PASSWORD"]

graph = Neo4jGraph(url=url, username=user, password=password)
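
The environment lookup can also be wrapped in a small helper. This is only a sketch: the name neo4j_settings and the fallback values for the URL and username are assumptions for illustration, not part of LangChain. The password deliberately has no fallback.

```python
import os


def neo4j_settings(env=None):
    """Collect Neo4j connection settings from environment variables.

    Hypothetical helper: the name and the fallback values are assumptions.
    """
    env = os.environ if env is None else env
    return {
        "url": env.get("NEO4J_URI", "bolt://localhost:7687"),
        "username": env.get("NEO4J_USERNAME", "neo4j"),
        # No fallback for the password: a missing value raises KeyError
        "password": env["NEO4J_PASSWORD"],
    }


# Usage (requires a running Neo4j instance):
# graph = Neo4jGraph(**neo4j_settings())
```

Passing a plain dict instead of os.environ makes the helper easy to check without touching real credentials.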

¹ https://neo4j.com/download/

Storing graph documents

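from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_openai import ChatOpenAI

# temperature=0 keeps the extracted entities and relationships as repeatable as possible
llm = ChatOpenAI(api_key="...", temperature=0, model="gpt-4o-mini")
llm_transformer = LLMGraphTransformer(llm=llm)

# documents: the list of LangChain Document objects loaded and split beforehand
graph_documents = llm_transformer.convert_to_graph_documents(documents)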

graph.add_graph_documents(
    graph_documents,
    include_source=True,
    baseEntityLabel=True
)

include_source=True: store each source document as a node and link it to the nodes it mentions with a MENTIONS edge
baseEntityLabel=True: add an extra __Entity__ label to every node
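
Once both options are switched on, the extra label and the MENTIONS edges can be used in queries. A minimal sketch, assuming the graph was stored as above: the helper name most_mentioned and the query itself are illustrative, and graph can be any object exposing Neo4jGraph's query(query, params=...) method.

```python
# Count how many source documents mention each entity.
# The backticks around __Entity__ quote the label inside Cypher.
MENTIONS_QUERY = """
MATCH (d:Document)-[:MENTIONS]->(e:`__Entity__`)
RETURN e.id AS entity, count(d) AS mentions
ORDER BY mentions DESC
LIMIT $limit
"""


def most_mentioned(graph, limit=5):
    """Hypothetical helper: return the entities mentioned by the most documents."""
    return graph.query(MENTIONS_QUERY, params={"limit": limit})


# Usage (requires a populated Neo4j instance):
# print(most_mentioned(graph, limit=3))
```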

The graph documents represented as nodes and edges.

A zoomed-in version of the previous image showing nodes about OpenAI models.

Database schema

print(graph.get_schema)

Node properties:
Concept {id: STRING}
Architecture {id: STRING}
Organization {id: STRING}
Event {id: STRING}
Paper {id: STRING}

The relationships:
(:Concept)-[:DEVELOPED_BY]->(:Person)
(:Architecture)-[:BASED_ON]->(:Concept)
(:Organization)-[:PROPOSED]->(:Concept)
(:Document)-[:MENTIONS]->(:Event)
(:Paper)-[:BASED_ON]->(:Concept)

Querying Neo4j - Cypher Query Language

A node called James with a relationship called friends pointing to a mystery node person.
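
The pattern in the picture can be written as a Cypher query held in a Python string. This is a sketch only: the Person label, the name property, and the FRIENDS relationship type are assumptions based on the image description, not on a real database.

```python
# Cypher pattern anatomy:
#   (variable:Label {property: value})  -> a node
#   -[:TYPE]->                          -> a directed relationship
FRIENDS_OF_JAMES = """
MATCH (james:Person {name: "James"})-[:FRIENDS]->(friend:Person)
RETURN friend.name AS friend
"""

# Usage (requires a Neo4j instance that contains such data):
# print(graph.query(FRIENDS_OF_JAMES))
```

The friend variable plays the role of the mystery node: MATCH binds it to every Person that James points to through a FRIENDS relationship.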

Querying the LLM graph

# Which organization developed GPT-4?
results = graph.query("""
MATCH (gpt4:Model {id: "Gpt-4"})-[:DEVELOPED_BY]->(org:Organization)
RETURN org
""")

print(results)

[{'org': {'id': 'Openai'}}]
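
The result is a list of dictionaries, one per matched row, so it can be unpacked with ordinary Python. A minimal sketch, assuming the same graph as above: the helper name developers_of is hypothetical, and the model id is passed as a query parameter through the params argument of query instead of being written into the Cypher string.

```python
def developers_of(graph, model_id):
    """Hypothetical helper: return the ids of organizations that developed a model."""
    query = """
    MATCH (m:Model {id: $model_id})-[:DEVELOPED_BY]->(org:Organization)
    RETURN org
    """
    rows = graph.query(query, params={"model_id": model_id})
    # Each row looks like {'org': {'id': ...}}
    return [row["org"]["id"] for row in rows]


# Usage (requires the populated Neo4j instance from the previous steps):
# print(developers_of(graph, "Gpt-4"))
```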

Let's practice!
