Working with Hugging Face
Jacob H. Marquez
Lead Data Engineer
Extractive:
✅ Selects key sentences from the text
✅ Efficient, needs fewer resources
❌ Lacks flexibility; may be less cohesive
Abstractive:
✅ Generates new, rephrased text
✅ Clearer and more readable
❌ Requires more resources and processing
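To make the extractive idea concrete, here is a minimal sketch that scores each sentence by the average frequency of its words and returns the top-scoring sentences unchanged. It is a toy illustration only: the function name `extractive_summary`, the regex-based sentence splitting, and the frequency scoring are assumptions made for this example, not how any Hugging Face model works internally.

```python
import re
from collections import Counter


def extractive_summary(text, n_sentences=1):
    """Return the n highest-scoring sentences, in their original order.

    Hypothetical helper for illustration; not part of transformers.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Word frequencies over the whole text (lowercased).
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    # Rank sentence indices by score, keep the top n, restore document order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in keep)


text = (
    "Data science combines statistics and computing. "
    "Data science teams use data every day. "
    "Lunch was fine."
)
print(extractive_summary(text, n_sentences=1))
```

Every sentence in the output is copied verbatim from the input, which is why extractive methods are cheap but can read as less cohesive than an abstractive rewrite.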
from transformers import pipeline
# Load the extractive summarization pipeline
summarizer = pipeline("summarization", model="nyamuda/extractive-summarization")
text = "This is my really large text about Data Science..."
summary_text = summarizer(text)
print(summary_text[0]['summary_text'])
"data science is a field that combines mathematics, statistics...."
from transformers import pipeline
# Load the abstractive summarization pipeline
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
text = "This is my really large text about Data Science..."
summary_text = summarizer(text)
print(summary_text[0]['summary_text'])
"The global data science platform market is projected to reach $140.9 billion by 2025..."
min_length & max_length: Control summary length
summarizer = pipeline(task="summarization", min_length=10, max_length=150)
Example Warning
Your max_length is set to 150, but your input_length is only 81.
Since this is a summarization task, where outputs shorter than the input are
typically wanted, you might consider decreasing max_length manually,
e.g. summarizer('...', max_length=40)
Decrease max_length for short inputs
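One way to avoid this message is to derive the length limits from the input's token count instead of hard-coding them. The sketch below is a hypothetical helper (`choose_summary_lengths`, with its `ratio`, `floor`, and `cap` parameters, is not part of transformers) that caps `max_length` at roughly half the input and keeps `min_length` below it. The commented lines at the end assume a `summarizer` pipeline and a `text` string like the ones created earlier.

```python
def choose_summary_lengths(input_length, ratio=0.5, floor=10, cap=150):
    """Pick (min_length, max_length) in tokens for an input of input_length tokens.

    Hypothetical helper for illustration; not part of transformers.
    """
    if input_length < 4:
        raise ValueError("input is too short to summarize")
    # Aim for about `ratio` of the input, but never more than `cap`.
    max_length = min(cap, max(2, int(input_length * ratio)))
    # The summary should always be shorter than the input itself.
    max_length = min(max_length, input_length - 1)
    # min_length must stay strictly below max_length.
    min_length = min(floor, max_length - 1)
    return min_length, max_length


print(choose_summary_lengths(81))

# Assumed usage with an existing pipeline (requires a model download, so not run here):
# n_tokens = len(summarizer.tokenizer(text)["input_ids"])
# min_len, max_len = choose_summary_lengths(n_tokens)
# summary = summarizer(text, min_length=min_len, max_length=max_len)
```

The limits are counted in tokens, not words or characters, so the token count should come from the pipeline's own tokenizer rather than from `len(text)`.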