Working with Hugging Face
Jacob H. Marquez
Lead Data Engineer
Question: "What is the total revenue of Q3?"
US-Employee_Policy.pdf
from pypdf import PdfReader

# Load the PDF file
reader = PdfReader("US-Employee_Policy.pdf")

# Extract text from all pages
document_text = ""
for page in reader.pages:
    document_text += page.extract_text()
Welcome to the US Employee Policy document...
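The extraction loop above can be wrapped in a small helper. This is a minimal sketch, not part of the original slides: the names join_page_text and load_document_text are hypothetical, and the guard for pages with empty or missing text is an assumption about what scanned or image-only pages may return.

```python
def join_page_text(pages):
    """Concatenate text from page-like objects that expose extract_text()."""
    parts = []
    for page in pages:
        # Assumption: a page without extractable text may yield "" or None.
        text = page.extract_text() or ""
        if text.strip():
            parts.append(text)
    # Separate pages with a newline so words at page boundaries do not fuse.
    return "\n".join(parts)


def load_document_text(path):
    """Read a PDF from disk and return its combined text (requires pypdf)."""
    # Imported here so join_page_text stays usable without pypdf installed.
    from pypdf import PdfReader

    return join_page_text(PdfReader(path).pages)
```

With pypdf installed, document_text = load_document_text("US-Employee_Policy.pdf") would produce the same kind of string as the loop above, with a newline between pages.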
from transformers import pipeline

# Load the question-answering pipeline
qa_pipeline = pipeline(
    task="question-answering",
    model="distilbert-base-cased-distilled-squad"
)

question = "How many volunteer days are offered annually?"

# Get the answer from the QA pipeline
result = qa_pipeline(question=question, context=document_text)
print(f"Answer: {result['answer']}")
Answer: 1
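Besides 'answer', the result dictionary returned by the question-answering pipeline also carries a 'score' and the 'start' and 'end' character offsets of the answer within the context. The sketch below shows one way these fields might be used to flag weak answers and show the surrounding text. The function name describe_answer, the 0.5 threshold, and the 40-character window are illustrative assumptions, not values from the slides.

```python
def describe_answer(result, context, min_score=0.5, window=40):
    """Summarize a QA pipeline result with its confidence and nearby text."""
    answer = result["answer"]
    score = result["score"]

    # Assumption: below min_score the answer should be treated as unreliable.
    if score < min_score:
        return f"Low confidence ({score:.2f}): {answer!r}"

    # start and end are character offsets of the answer inside the context.
    lo = max(0, result["start"] - window)
    hi = result["end"] + window
    nearby = context[lo:hi]
    return f"Answer: {answer} (score {score:.2f}, near: {nearby!r})"
```

It could be called as print(describe_answer(result, document_text)) right after the pipeline call. A short answer such as "1" is much easier to check when the sentence it was extracted from is shown next to it.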
Use PdfReader from pypdf to load and read PDF files
Use .pages and .extract_text() to collect the text into document_text
Create a question-answering pipeline
Pass question and context to the pipeline