Word2vec

Word2vec comes in two variants:

CBOW (continuous bag of words) and skip-gram

In CBOW, we try to predict the target word from its neighboring context words.

In skip-gram, we try to predict the context words from the target word.

CBOW working:

Word2vec is usually trained on a large corpus such as Wikipedia, but to demonstrate how CBOW works, let's take a simple sentence: "Hope can set you free". Let's consider a window size of 3, so the window covers "hope can set". Here the input is "hope" and "set", and the output is "can".
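To make the windowing concrete, here is a minimal sketch in plain Python that slides a window of size 3 over the example sentence and lists the (context, target) examples CBOW would see. The function name cbow_examples and the handling of words at the sentence edges (using whatever neighbours exist) are assumptions made for illustration, not part of the original word2vec code.

```python
def cbow_examples(tokens, window_size=3):
    """Return (context words, target word) examples for CBOW.

    The window size counts the target itself, so a window of 3
    means one neighbour on each side of the target.
    """
    half = window_size // 2
    examples = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - half):i]      # neighbours before the target
        right = tokens[i + 1:i + half + 1]     # neighbours after the target
        examples.append((left + right, target))
    return examples


tokens = "hope can set you free".split()
examples = cbow_examples(tokens)
for context, target in examples:
    print(context, "->", target)
```

The second example printed is the one used in this post: the context "hope" and "set" with the target "can".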

Since this is CBOW, we predict the target word from the context words. In this sentence, that means predicting the word "can" from the context words "hope" and "set".

As input, we pass the one-hot encodings of the words "hope" and "set", and the neural network tries to predict the target word "can".
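As a small illustration of what those inputs look like, the sketch below one-hot encodes the words of the example sentence. The vocabulary order (order of appearance in the sentence) is an assumption; any fixed order works as long as it stays the same during training.

```python
# Vocabulary of the example sentence, in order of appearance (assumed order).
vocab = ["hope", "can", "set", "you", "free"]
word_to_index = {word: i for i, word in enumerate(vocab)}


def one_hot(word):
    """Return a list with a 1 at the word's index and 0 everywhere else."""
    vec = [0] * len(vocab)
    vec[word_to_index[word]] = 1
    return vec


hope_vec = one_hot("hope")   # input
set_vec = one_hot("set")     # input
can_vec = one_hot("can")     # target the network should predict
print(hope_vec, set_vec, can_vec)
```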

Between the input and the hidden layer we have a 3x5 weight matrix: 3 hidden neurons by 5 vocabulary words. The network outputs a score for every word in the vocabulary, and training adjusts the weights so that the prediction moves toward the actual one-hot encoding of the target word. These weights are updated through backpropagation.
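The training loop can be sketched in plain Python as follows. This is a toy version under stated assumptions, not the real word2vec implementation: it uses a full softmax over the 5-word vocabulary, a second 5x3 hidden-to-output matrix (W_out, which this post does not name), averaging of the context words in the hidden layer, and an arbitrary learning rate and step count.

```python
import math
import random

vocab = ["hope", "can", "set", "you", "free"]
index = {w: i for i, w in enumerate(vocab)}
V, N = len(vocab), 3   # vocabulary size 5, hidden size 3

random.seed(0)
# W_in is the 3x5 input-to-hidden matrix; W_out (5x3) is an assumed
# hidden-to-output matrix needed to turn the hidden layer into scores.
W_in = [[random.uniform(-0.5, 0.5) for _ in range(V)] for _ in range(N)]
W_out = [[random.uniform(-0.5, 0.5) for _ in range(N)] for _ in range(V)]


def forward(context):
    # Multiplying W_in by a one-hot vector selects one column, so the
    # hidden layer is the average of the context words' columns.
    cols = [index[w] for w in context]
    h = [sum(W_in[r][c] for c in cols) / len(cols) for r in range(N)]
    scores = [sum(W_out[j][r] * h[r] for r in range(N)) for j in range(V)]
    top = max(scores)                       # subtract max for stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return h, [e / total for e in exps]     # softmax probabilities


def train_step(context, target, lr=0.2):
    h, probs = forward(context)
    t = index[target]
    # Cross-entropy gradient w.r.t. the scores: prediction minus one-hot.
    err = [p - (1.0 if j == t else 0.0) for j, p in enumerate(probs)]
    # Gradient w.r.t. the hidden layer, computed before W_out changes.
    grad_h = [sum(err[j] * W_out[j][r] for j in range(V)) for r in range(N)]
    for j in range(V):
        for r in range(N):
            W_out[j][r] -= lr * err[j] * h[r]
    for c in (index[w] for w in context):
        for r in range(N):
            W_in[r][c] -= lr * grad_h[r] / len(context)
    return -math.log(probs[t])              # loss before this update


losses = [train_step(["hope", "set"], "can") for _ in range(300)]
_, probs = forward(["hope", "set"])
predicted = vocab[probs.index(max(probs))]
print("predicted:", predicted)
print("first loss:", losses[0], "last loss:", losses[-1])
```

Real word2vec avoids the full softmax on large vocabularies (it uses hierarchical softmax or negative sampling), but the weight update idea is the same.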

Skip-gram working:

In skip-gram, we try to predict the context words from the target word.

Here our input is "can", and from it we try to predict the context words "hope" and "set".
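For comparison with CBOW, the sketch below lists the (input, output) pairs skip-gram would train on for the same sentence and window size 3. The helper name skipgram_pairs and the edge handling (a word at the start or end of the sentence simply has fewer neighbours) are assumptions for illustration.

```python
def skipgram_pairs(tokens, window_size=3):
    """Return (target word, context word) pairs for skip-gram."""
    half = window_size // 2
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - half)
        end = min(len(tokens), i + half + 1)
        for j in range(start, end):
            if j != i:                      # skip the target itself
                pairs.append((target, tokens[j]))
    return pairs


tokens = "hope can set you free".split()
pairs = skipgram_pairs(tokens)
can_pairs = [p for p in pairs if p[0] == "can"]
print(can_pairs)
```

Each pair is a separate training example, so the single CBOW example (hope, set) -> can becomes two skip-gram examples with "can" as the input.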

The output of the neural network is a vector the size of the vocabulary, trained to match the one-hot encoding of a context word. After training, we convert a word into its embedding by multiplying its one-hot vector by the trained weights.

So for the word "hope" (5x1), we multiply its one-hot encoding by the model's weight matrix (3x5) to get its embedding (3x1): dot(3x5, 5x1) --> 3x1
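The shape arithmetic above can be checked with a few lines of plain Python. The weight values below are made up for illustration (they are not trained weights), and matvec is a hypothetical helper standing in for a matrix library's dot product.

```python
vocab = ["hope", "can", "set", "you", "free"]

# Hypothetical "trained" weights: 3 rows (hidden units) x 5 columns (words).
W = [
    [0.1, 0.2, 0.3, 0.4, 0.5],
    [0.6, 0.7, 0.8, 0.9, 1.0],
    [1.1, 1.2, 1.3, 1.4, 1.5],
]


def one_hot(word):
    return [1 if w == word else 0 for w in vocab]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


embedding = matvec(W, one_hot("hope"))            # dot(3x5, 5x1) -> 3x1
column = [row[vocab.index("hope")] for row in W]  # the column for "hope"
print(embedding, column)
```

Because the one-hot vector has a single 1, the multiplication just picks out the column of the weight matrix that belongs to "hope", which is why the weight matrix can be read directly as a table of embeddings.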

In the actual word2vec training, Google used a window size of 5 to 10 and 300-dimensional vectors, i.e. the hidden layer of the neural network has 300 neurons.
