Natural Language Processing (NLP) in Python
Fouad Trad
Machine Learning Engineer

Understanding the topic of a text

Tasks that require every word in the text

NLTK provides stop word lists for several languages
import nltk
from nltk.corpus import stopwords

nltk.download('stopwords')
stop_words = stopwords.words('english')
print(stop_words[:10])
['a', 'about', 'above', 'after', 'again', 'against', 'ain', 'all', 'am', 'an']
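from nltk.tokenize import word_tokenize

# word_tokenize needs NLTK's Punkt tokenizer data, downloaded once with
# nltk.download ('punkt', or 'punkt_tab' in recent NLTK releases)
text = "This is an example to demonstrate removing stop words."
tokens = word_tokenize(text)
# Lowercase each token so capitalized stop words such as "This" still match
filtered_tokens = [word for word in tokens if word.lower() not in stop_words]
print(filtered_tokens)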
['example', 'demonstrate', 'removing', 'stop', 'words', '.']

Tasks that look for common or important words in a document

Tasks that need to preserve sentence structure for clarity

import string
print(string.punctuation)
!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
text = "This is an example to demonstrate removing stop words."
tokens = word_tokenize(text)
filtered_tokens = [word for word in tokens if word.lower() not in stop_words]
clean_tokens = [word for word in filtered_tokens if word not in string.punctuation]
print(clean_tokens)
['example', 'demonstrate', 'removing', 'stop', 'words']
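The two filtering steps above can be combined into one helper. The following is a minimal sketch, not part of the original slides. The function name clean and the tiny SAMPLE_STOP_WORDS set are hypothetical, chosen so the example runs without downloading NLTK data. It also assumes a stricter punctuation test than the slide code: `word not in string.punctuation` is a substring check, so a multi-character token such as "..." would survive it, whereas checking every character removes such tokens too.

```python
import string

# Hypothetical mini stop word list for illustration only; in practice this
# would be stopwords.words('english') from NLTK.
SAMPLE_STOP_WORDS = {"this", "is", "an", "to"}


def clean(tokens, stop_words):
    """Drop stop words (case-insensitively) and punctuation-only tokens."""
    stop_set = set(stop_words)  # set lookup avoids scanning a list per token
    punctuation = set(string.punctuation)
    cleaned = []
    for token in tokens:
        if token.lower() in stop_set:
            continue
        # A token counts as punctuation only if every character is punctuation,
        # so "..." is removed as well as "." (empty tokens are dropped too).
        if all(ch in punctuation for ch in token):
            continue
        cleaned.append(token)
    return cleaned


# Tokens are written by hand here; word_tokenize would normally produce them.
tokens = ["This", "is", "an", "example", "to", "demonstrate",
          "removing", "stop", "words", "..."]
result = clean(tokens, SAMPLE_STOP_WORDS)
print(result)
```

Passing the full NLTK list instead of SAMPLE_STOP_WORDS should work the same way, since the helper only needs an iterable of lowercase words.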