Sentiment Analysis in Python
Violeta Misheva
Data Scientist
Stemming reduces words to their root form (the stem), even when that stem is not a real word.
staying, stays, stayed ----> stay
house, houses, housing ----> hous
Lemmatization is similar to stemming, but it reduces words to valid words in the language.
stay, stays, staying, stayed ----> stay
house, houses, housing ----> house
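The contrast above can be sketched in a few lines of plain Python. This is a toy illustration only: the suffix list and the lemma dictionary are hypothetical and hand-written for these six words. It is neither the Porter algorithm nor WordNet. It shows that a rule-based stemmer can produce a non-word such as 'hous', while a dictionary-based lemmatizer returns a valid word.

```python
# Toy illustration only: NOT the Porter algorithm and NOT WordNet.
SUFFIXES = ("ing", "ed", "es", "s", "e")  # hypothetical rule list, checked in order

def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Hypothetical hand-written lemma dictionary covering only the slide's examples
LEMMAS = {
    "stays": "stay", "staying": "stay", "stayed": "stay",
    "houses": "house", "housing": "house",
}

def toy_lemmatize(word):
    """Look the word up in the dictionary; unknown words are returned unchanged."""
    return LEMMAS.get(word, word)

for w in ["staying", "stays", "stayed", "house", "houses", "housing"]:
    print(f"{w:8} stem={toy_stem(w):5} lemma={toy_lemmatize(w)}")
```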
Stemming
Lemmatization
from nltk.stem import PorterStemmer
porter = PorterStemmer()
porter.stem('wonderful')
'wonder'
Snowball Stemmer: Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Portuguese, Romanian, Russian, Spanish, Swedish
from nltk.stem.snowball import SnowballStemmer
DutchStemmer = SnowballStemmer("dutch")
DutchStemmer.stem("beginnen")
'begin'
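# The stemmer treats its input as a single word, so a whole sentence
# comes back lowercased but otherwise unchanged
porter.stem('Today is a wonderful day!')
'today is a wonderful day!'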
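# Tokenize the sentence first, then stem each token separately
from nltk.tokenize import word_tokenize
tokens = word_tokenize('Today is a wonderful day!')
stemmed_tokens = [porter.stem(token) for token in tokens]
stemmed_tokens
['today', 'is', 'a', 'wonder', 'day', '!']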
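The tokenize-then-stem pattern can be wrapped in a small function. A sketch under stated assumptions: stem_sentence is a hypothetical helper name, and wordpunct_tokenize is swapped in for word_tokenize because it is a regular-expression tokenizer that needs no downloaded tokenizer data. The two tokenizers can split some inputs differently (contractions, for example).

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize

porter = PorterStemmer()

def stem_sentence(text):
    """Hypothetical helper: split text into tokens and stem each one."""
    # wordpunct_tokenize separates runs of word characters from punctuation
    tokens = wordpunct_tokenize(text)
    return [porter.stem(token) for token in tokens]

print(stem_sentence('Today is a wonderful day!'))
# ['today', 'is', 'a', 'wonder', 'day', '!']
```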
from nltk.stem import WordNetLemmatizer
WNlemmatizer = WordNetLemmatizer()
# pos='a' tells the lemmatizer to treat the word as an adjective
WNlemmatizer.lemmatize('wonderful', pos='a')
'wonderful'