Developing LLM Applications with LangChain
Jonathan Bennion
AI Engineer & LangChain Contributor


Line 1:
Recurrent neural networks, long short-term memory [13] and gated recurrent [7] neural networks
Line 2:
in particular, have been firmly established as state of the art approaches in sequence modeling and


CharacterTextSplitter
RecursiveCharacterTextSplitter

quote = '''One machine can do the work of fifty ordinary humans.
No machine can do the work of one extraordinary human.'''
len(quote)
108
chunk_size = 24
chunk_overlap = 3
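
As a rough illustration of what these two parameters mean, the sketch below cuts a string into fixed-size windows in which each window shares chunk_overlap characters with its neighbour. It is not how LangChain's splitters work internally (they split on separators first); the function name sliding_chunks and the sample string are made up for this example.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap  # how far each new window advances
    # Stop once the remaining tail is already covered by the previous window.
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


sample = "abcdefghij"  # hypothetical input, chosen to keep the output short
chunks = sliding_chunks(sample, chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

Each chunk ends with the character that starts the next one, which is the context an overlap is meant to preserve across chunk boundaries.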
from langchain_text_splitters import CharacterTextSplitter

ct_splitter = CharacterTextSplitter(
    separator='.',
    chunk_size=chunk_size,
    chunk_overlap=chunk_overlap)

docs = ct_splitter.split_text(quote)
print(docs)
print([len(doc) for doc in docs])

['One machine can do the work of fifty ordinary humans', 'No machine can do the work of one extraordinary human']
[52, 53]
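
The result above can be reproduced with plain string operations, which also shows why both chunks overshoot chunk_size: once the text is cut at every '.', there is no smaller unit left to fall back on. This is a sketch of the observable behaviour only, not LangChain's actual implementation.

```python
quote = ("One machine can do the work of fifty ordinary humans.\n"
         "No machine can do the work of one extraordinary human.")
chunk_size = 24

# Cut at the single separator, trim surrounding whitespace, drop empty pieces.
pieces = [piece.strip() for piece in quote.split(".")]
pieces = [piece for piece in pieces if piece]

lengths = [len(piece) for piece in pieces]
oversized = [piece for piece in pieces if len(piece) > chunk_size]

print(lengths)         # [52, 53]
print(len(oversized))  # 2: with only '.' to split on, neither piece can shrink further
```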
The splitter tries to keep each chunk under chunk_size, but this may not always succeed!

from langchain_text_splitters import RecursiveCharacterTextSplitter

rc_splitter = RecursiveCharacterTextSplitter(
    separators=["\n\n", "\n", " ", ""],
    chunk_size=chunk_size,
    chunk_overlap=chunk_overlap)

docs = rc_splitter.split_text(quote)
print(docs)
separators=["\n\n", "\n", " ", ""]

['One machine can do the',
'work of fifty ordinary',
'humans.',
'No machine can do the',
'work of one',
'extraordinary human.']

The separators "\n\n", "\n", and " " are tried in that order.

from langchain_community.document_loaders import UnstructuredHTMLLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

loader = UnstructuredHTMLLoader("white_house_executive_order_nov_2023.html")
data = loader.load()

rc_splitter = RecursiveCharacterTextSplitter(
    chunk_size=chunk_size,
    chunk_overlap=chunk_overlap,
    separators=['.'])

docs = rc_splitter.split_documents(data)
print(docs[0])
Document(page_content="To search this site, enter a search term [...]
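
Returning to the quote example, the six chunks produced by the recursive splitter can be reproduced with a much simplified sketch: split on "\n" first, then pack whole words into chunks of at most chunk_size characters. This is an assumption-laden illustration, not LangChain's implementation; it ignores chunk_overlap and the "" fallback, and the function names pack_words and simple_recursive_split are hypothetical.

```python
def pack_words(line: str, chunk_size: int) -> list[str]:
    """Greedily pack whole words into chunks of at most chunk_size characters."""
    chunks: list[str] = []
    current = ""
    for word in line.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= chunk_size or not current:
            current = candidate      # still fits (or a single oversized word)
        else:
            chunks.append(current)   # close the chunk and start a new one
            current = word
    if current:
        chunks.append(current)
    return chunks


def simple_recursive_split(text: str, chunk_size: int) -> list[str]:
    """Split on newlines first; fall back to word packing for long lines."""
    chunks: list[str] = []
    for line in text.split("\n"):
        if not line:
            continue
        if len(line) <= chunk_size:
            chunks.append(line)
        else:
            chunks.extend(pack_words(line, chunk_size))
    return chunks


quote = ("One machine can do the work of fifty ordinary humans.\n"
         "No machine can do the work of one extraordinary human.")
result = simple_recursive_split(quote, chunk_size=24)
print(result)
print(max(len(chunk) for chunk in result))  # 22
```

Falling back from coarse separators to finer ones is what lets every chunk stay under the limit here, whereas splitting on '.' alone could not.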