How to use the Keras Embedding layer with 3D tensor input?

Abdul Karim Khan

I am having trouble using the Keras Embedding layer with one-hot encoded input data.

The following toy code reproduces the problem.

Import packages

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers.embeddings import Embedding
from keras.optimizers import Adam
import matplotlib.pyplot as plt
import numpy as np
import openpyxl
import pandas as pd
from keras.callbacks import ModelCheckpoint
from keras.callbacks import ReduceLROnPlateau

The input data is text-based, as follows.

Train and Test data

X_train_orignal= np.array(['OC(=O)C1=C(Cl)C=CC=C1Cl', 'OC(=O)C1=C(Cl)C=C(Cl)C=C1Cl',
       'OC(=O)C1=CC=CC(=C1Cl)Cl', 'OC(=O)C1=CC(=CC=C1Cl)Cl',
       'OC1=C(C=C(C=C1)[N+]([O-])=O)[N+]([O-])=O'])

X_test_orignal=np.array(['OC(=O)C1=CC=C(Cl)C=C1Cl', 'CCOC(N)=O',
       'OC1=C(Cl)C(=C(Cl)C=C1Cl)Cl'])

Y_train=np.array(([[2.33],
       [2.59],
       [2.59],
       [2.54],
       [4.06]]))

Y_test=np.array([[2.20],
   [2.81],
   [2.00]])

Creating dictionaries

Now I create two dictionaries: one mapping characters to indices and one mapping indices back to characters. The number of unique characters is given by len(charset), and the maximum string length plus 5 additional characters is stored in embed. Each string is prefixed with ! as a start character and padded at the end with E.

charset = set("".join(list(X_train_orignal))+"!E")
char_to_int = dict((c,i) for i,c in enumerate(charset))
int_to_char = dict((i,c) for i,c in enumerate(charset))
embed = max([len(smile) for smile in X_train_orignal]) + 5
print (str(charset))
print(len(charset), embed)
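A side note on the dictionaries above: iterating over a Python set of strings can give a different order from one interpreter run to the next (string hashing is randomized by default), so the character-to-index mapping may change between runs. The sketch below is a suggestion, not part of the original code: it sorts the character set to get a stable mapping.

```python
# Training strings copied from the question.
smiles = [
    "OC(=O)C1=C(Cl)C=CC=C1Cl",
    "OC(=O)C1=C(Cl)C=C(Cl)C=C1Cl",
    "OC(=O)C1=CC=CC(=C1Cl)Cl",
    "OC(=O)C1=CC(=CC=C1Cl)Cl",
    "OC1=C(C=C(C=C1)[N+]([O-])=O)[N+]([O-])=O",
]

# Sorting the set fixes the order, so every run assigns the same index
# to the same character (assumption: a stable mapping is wanted, e.g.
# when saved weights are reloaded in a later session).
charset = sorted(set("".join(smiles) + "!E"))
char_to_int = {c: i for i, c in enumerate(charset)}
int_to_char = {i: c for i, c in enumerate(charset)}

print(len(charset))
print(char_to_int)
```

With the sorted list, the start character ! always gets index 0 because it has the lowest code point in this character set.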

One hot encoding

I convert the training and test data to a one-hot encoding as follows.

def vectorize(smiles):
        one_hot =  np.zeros((smiles.shape[0], embed , len(charset)),dtype=np.int8)
        for i,smile in enumerate(smiles):
            #encode the startchar
            one_hot[i,0,char_to_int["!"]] = 1
            #encode the rest of the chars
            for j,c in enumerate(smile):
                one_hot[i,j+1,char_to_int[c]] = 1
            #Encode endchar
            one_hot[i,len(smile)+1:,char_to_int["E"]] = 1

        return one_hot[:,0:-1,:]

X_train = vectorize(X_train_orignal)
print(X_train.shape)
X_test = vectorize(X_test_orignal)
print(X_test.shape)

After one-hot encoding, the data has shape (5, 44, 14) for the training set and (3, 44, 14) for the test set. For the training set, 5 is the number of examples, 44 is the padded sequence length, and 14 is the number of unique characters. Examples with fewer characters are padded with E up to the maximum length.
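The sequence length of 44 follows from the longest training string (40 characters) plus 5, minus the final position that vectorize slices off. A minimal string-level sketch of the same padding rule is given here; the helper name pad is hypothetical and only mirrors what vectorize does with one-hot rows.

```python
# Training strings copied from the question.
smiles = [
    "OC(=O)C1=C(Cl)C=CC=C1Cl",
    "OC(=O)C1=C(Cl)C=C(Cl)C=C1Cl",
    "OC(=O)C1=CC=CC(=C1Cl)Cl",
    "OC(=O)C1=CC(=CC=C1Cl)Cl",
    "OC1=C(C=C(C=C1)[N+]([O-])=O)[N+]([O-])=O",
]

# Same rule as in the question: longest string plus 5 extra positions.
embed = max(len(s) for s in smiles) + 5


def pad(smile, total):
    """Hypothetical helper: start char, the string, E-padding, last slot dropped."""
    full = "!" + smile + "E" * (total - len(smile) - 1)
    return full[:-1]  # mirrors one_hot[:, 0:-1, :] in vectorize


padded = [pad(s, embed) for s in smiles]
print(embed, {len(p) for p in padded})
```

Every padded string has embed - 1 positions, which is the middle dimension of the one-hot array.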

Verifying the correct padding

The following code verifies that the padding was done correctly.

mol_str_train=[]
mol_str_test=[]
for x in range(5):

    mol_str_train.append("".join([int_to_char[idx] for idx in np.argmax(X_train[x,:,:], axis=1)]))

for x in range(3):
    mol_str_test.append("".join([int_to_char[idx] for idx in np.argmax(X_test[x,:,:], axis=1)]))

Let's see what the training set looks like.

mol_str_train

['!OC(=O)C1=C(Cl)C=CC=C1ClEEEEEEEEEEEEEEEEEEEE',
 '!OC(=O)C1=C(Cl)C=C(Cl)C=C1ClEEEEEEEEEEEEEEEE',
 '!OC(=O)C1=CC=CC(=C1Cl)ClEEEEEEEEEEEEEEEEEEEE',
 '!OC(=O)C1=CC(=CC=C1Cl)ClEEEEEEEEEEEEEEEEEEEE',
 '!OC1=C(C=C(C=C1)[N+]([O-])=O)[N+]([O-])=OEEE']

Now it is time to build the model.

Model

model = Sequential()
model.add(Embedding(len(charset), 10, input_length=embed))
model.add(Flatten())
model.add(Dense(1, activation='linear'))

def coeff_determination(y_true, y_pred):
    from keras import backend as K
    SS_res =  K.sum(K.square( y_true-y_pred ))
    SS_tot = K.sum(K.square( y_true - K.mean(y_true) ) )
    return ( 1 - SS_res/(SS_tot + K.epsilon()) )

def get_lr_metric(optimizer):
    def lr(y_true, y_pred):
        return optimizer.lr
    return lr


optimizer = Adam(lr=0.00025)
lr_metric = get_lr_metric(optimizer)
model.compile(loss="mse", optimizer=optimizer, metrics=[coeff_determination, lr_metric])



callbacks_list = [
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-15, verbose=1, mode='auto',cooldown=0),
    ModelCheckpoint(filepath="weights.best.hdf5", monitor='val_loss', save_best_only=True, verbose=1, mode='auto')]


history =model.fit(x=X_train, y=Y_train,
                              batch_size=1,
                              epochs=10,
                              validation_data=(X_test,Y_test),
                              callbacks=callbacks_list)

Error

ValueError: Error when checking input: expected embedding_3_input to have 2 dimensions, but got array with shape (5, 44, 14)

The Embedding layer expects a two-dimensional array. How can I deal with this issue so that it accepts the one-hot encoded data?

All of the above code is runnable.

Nomiluks

Your input shape was not defined properly in the Embedding layer. The following code works for me. Instead of adding steps to convert your data to 2D, you can pass the 3D input directly to the Embedding layer.

# The missing pieces
#_________________________________________
Y_train = Y_train.reshape(5)  # flatten the targets to a 1D array for the single-unit Dense output
max_len = len(charset)
max_features = embed-1
inputshape = (max_features, max_len)  # the input shape was not defined; passing input_shape lets the Embedding layer accept the 3D input
#__________________________________________

model = Sequential()
#model.add(Embedding(len(charset), 10, input_length=14))

model.add(Embedding(max_features, 10, input_shape=inputshape))#input_length=max_len))
model.add(Flatten())
model.add(Dense(1, activation='linear'))
print(model.summary())

optimizer = Adam(lr=0.00025)
lr_metric = get_lr_metric(optimizer)
model.compile(loss="mse", optimizer=optimizer, metrics=[coeff_determination, lr_metric])


callbacks_list = [
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-15, verbose=1, mode='auto',cooldown=0),
    ModelCheckpoint(filepath="weights.best.hdf5", monitor='val_loss', save_best_only=True, verbose=1, mode='auto')]

history =model.fit(x=X_train, y=Y_train,
                              batch_size=10,
                              epochs=10,
                              validation_data=(X_test,Y_test),
                              callbacks=callbacks_list)
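One thing worth noting about the approach above: an Embedding layer treats every entry of its input as an integer index into its weight table. Fed a one-hot array, it looks up each 0 and 1 individually, so only rows 0 and 1 of the table are ever used and the output gains an extra axis. A common alternative is to collapse the one-hot axis back to integer indices first. The sketch below shows this with NumPy only. The toy data and the names one_hot, indices and table are illustrative, not from the question.

```python
import numpy as np

# Toy one-hot batch: 2 sequences, 4 positions, vocabulary of 3 symbols.
one_hot = np.array(
    [
        [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]],
        [[1, 0, 0], [0, 0, 1], [0, 0, 1], [0, 0, 1]],
    ],
    dtype=np.int8,
)

# Collapse the last axis: each position becomes the index of its 1.
indices = one_hot.argmax(axis=-1)
print(indices.shape)  # 2D: (batch, sequence length)

# An embedding lookup is row selection from a (vocab_size, dim) table.
# This random table stands in for the layer's trainable weights.
rng = np.random.default_rng(0)
table = rng.normal(size=(3, 10))
embedded = table[indices]
print(embedded.shape)  # (batch, sequence length, embedding dim)
```

For the data in the question this would mean passing X_train.argmax(axis=-1), of shape (5, 44), to the model. Assuming the Keras version used in the question, the layer would then be declared as Embedding(len(charset), 10, input_length=embed - 1), because vectorize drops the last position.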
