Implementing the encoder

Machine Translation with Keras

Thushan Ganegedara

Data Scientist and Author

Understanding the data

Printing some of the data in the dataset

for en_sent, fr_sent in zip(en_text[:3], fr_text[:3]):
  print("English: ", en_sent)
  print("\tFrench: ", fr_sent)
English:  new jersey is sometimes quiet during autumn , and it is snowy in april .
    French:  new jersey est parfois calme pendant l' automne , et il est neigeux en avril .

English:  the united states is usually chilly during july , and it is usually freezing in november .
    French:  les états-unis est généralement froid en juillet , et il gèle habituellement en novembre .

English:  california is usually quiet during march , and it is usually hot in june .
    French:  california est généralement calme en mars , et il est généralement chaud en juin .

Tokenizing the sentences

Tokenization

  • The process of breaking a sentence or phrase into individual tokens (e.g. words)

Tokenizing the words in a sentence

first_sent = en_text[0]
print("First sentence: ", first_sent)
first_words = first_sent.split(" ")
print("\tWords: ", first_words)
First sentence:  new jersey is sometimes quiet during autumn , and it is snowy in april .
    Words:  ['new', 'jersey', 'is', 'sometimes', 'quiet', 'during', 'autumn', ',', 
             'and', 'it', 'is', 'snowy', 'in', 'april', '.']
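
The sentences in this dataset appear to be pre-tokenized, with exactly one space between tokens (including punctuation), which is why `split(" ")` works here. As a small sketch on a made-up sentence (not taken from the dataset), the following shows how `split(" ")` and the argument-free `split()` differ when the spacing is less regular:

```python
# Hypothetical input with a double space between "is" and "sometimes"
sentence = "new jersey is  sometimes quiet ."

# Splitting on a single space keeps an empty string for the extra space
by_space = sentence.split(" ")
print(by_space)       # ['new', 'jersey', 'is', '', 'sometimes', 'quiet', '.']

# Splitting on any run of whitespace drops the empty token
by_whitespace = sentence.split()
print(by_whitespace)  # ['new', 'jersey', 'is', 'sometimes', 'quiet', '.']
```

Empty tokens like `''` would inflate both the sentence lengths and the vocabulary size, so `split()` may be the safer choice for text that has not been cleaned.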

Computing the length of sentences

Computing the mean sentence length and vocabulary size (English)

import numpy as np

# Number of space-separated tokens in each English sentence
sent_lengths = [len(en_sent.split(" ")) for en_sent in en_text]
mean_length = np.mean(sent_lengths)
print('(English) Mean sentence length: ', mean_length)
(English) Mean sentence length:  13.20662

Computing the size of the vocabulary

all_words = []
for sent in en_text:
    all_words.extend(sent.split(" "))
vocab_size = len(set(all_words))
print("(English) Vocabulary size: ", vocab_size)
  • A set object contains only unique items, with no duplicates
(English) Vocabulary size:  228

The encoder


Implementing the encoder with Keras

  • Input layer
    en_inputs = Input(shape=(en_len, en_vocab))
    
  • GRU layer
    en_gru = GRU(hsize, return_state=True)
    en_out, en_state = en_gru(en_inputs)
    
  • Keras model
    encoder = Model(inputs=en_inputs, outputs=en_state)
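
The three steps above can be put together as a self-contained sketch. The imports are an assumption (the `tensorflow.keras` namespace), and the values `en_len = 15`, `en_vocab = 150` and `hsize = 48` are read off the shapes in the model summary rather than stated on this slide. The all-zero batch is placeholder data used only to check the shape of the returned state.

```python
import numpy as np
from tensorflow.keras.layers import Input, GRU
from tensorflow.keras.models import Model

# Assumed sizes, inferred from the model summary shapes
en_len = 15     # tokens per (padded) English sentence
en_vocab = 150  # size of the one-hot vector for each token
hsize = 48      # number of GRU hidden units

# Input layer: one sentence is an (en_len, en_vocab) matrix
en_inputs = Input(shape=(en_len, en_vocab))

# GRU layer: return_state=True makes the layer return (output, final state)
en_gru = GRU(hsize, return_state=True)
en_out, en_state = en_gru(en_inputs)

# The encoder model maps a sentence to the GRU's final state
encoder = Model(inputs=en_inputs, outputs=en_state)

# Placeholder batch of 2 "sentences" to inspect the state shape
dummy_batch = np.zeros((2, en_len, en_vocab), dtype="float32")
state = encoder.predict(dummy_batch, verbose=0)
print(state.shape)  # (2, 48)
```

Only `en_state` is used as the model output, because it is the final state that summarizes the whole input sentence.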
    

Understanding the Keras model summary

encoder.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 15, 150)           0         
_________________________________________________________________
gru (GRU)                    [(None, 48), (None, 48)]  28656     
=================================================================
Total params: 28,656
Trainable params: 28,656
Non-trainable params: 0
_________________________________________________________________
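
The `Param #` value for the GRU layer can be checked by hand. The sketch below assumes the standard GRU layout of three weight blocks (update gate, reset gate, candidate state), each with input weights, recurrent weights and a bias. The helper `gru_param_count` is hypothetical and not part of Keras. The figure of 28,656 matches the variant with one bias vector per block; with `reset_after=True`, which is the default in TensorFlow 2.x Keras, each block has two bias vectors and the count differs.

```python
def gru_param_count(input_dim, units, reset_after=False):
    """Count trainable parameters in one GRU layer (hypothetical helper)."""
    kernel = input_dim * units   # input-to-hidden weights per block
    recurrent = units * units    # hidden-to-hidden weights per block
    # One bias vector per block, or two when reset_after=True
    bias = 2 * units if reset_after else units
    return 3 * (kernel + recurrent + bias)

# Sizes taken from the summary: input (None, 15, 150), 48 hidden units
classic = gru_param_count(150, 48)
newer = gru_param_count(150, 48, reset_after=True)
print(classic)  # 28656, as in the summary above
print(newer)    # 28800
```

Note that the sentence length (15) does not enter the count, because the same weights are reused at every time step.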

Let's practice!
