Word2vec
Word2vec comes in two variants: CBOW (continuous bag of words) and Skip-gram.
In CBOW, we try to predict the target word from its neighboring context words.
In Skip-gram, we try to predict the context words from the target word.
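Word2vec is normally trained on a very large text corpus (Wikipedia dumps are a common choice; Google's released pretrained vectors used Google News text), but to demonstrate how CBOW works, let's take a simple sentence: "Hope can set you free". Let's consider a window size of 3, that is, the target word plus one context word on each side. So here the input will be "hope" and "set", and the output is "can".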
Since this is CBOW, we predict the target word from the context words. In this sentence, we try to predict the word "can" from the context words "hope" and "set".
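To make the windowing concrete, here is a minimal sketch that builds the CBOW training pairs for the example sentence. The helper name cbow_pairs and the half_window parameter are my own (they are not part of any word2vec library); half_window=1 corresponds to the window size of 3 used above.

```python
def cbow_pairs(tokens, half_window=1):
    """Return (context_words, target_word) pairs for CBOW.

    half_window is the number of context words taken on each side of the
    target, so half_window=1 means a total window of 3 words.
    """
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - half_window):i]
        right = tokens[i + 1:i + 1 + half_window]
        pairs.append((left + right, target))
    return pairs


tokens = "hope can set you free".split()
pairs = cbow_pairs(tokens)
for context, target in pairs:
    print(context, "->", target)
```

The second pair is the one discussed above: context ["hope", "set"] with target "can". Words at the edges of the sentence simply get a shorter context.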
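Between the input and hidden layer, we have a 3x5 weight matrix: 5 columns, one for each word in the vocabulary, and 3 rows, one for each hidden neuron. The network updates these weights so that its predicted output (a softmax probability for each word in the vocabulary) gets as close as possible to the actual one-hot encoding of the target word. During backpropagation, the weights are adjusted according to the prediction error.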
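The following is a rough sketch of that training step in plain Python, not the actual word2vec implementation (which uses tricks such as negative sampling or hierarchical softmax). It assumes a full softmax output, a 5x3 hidden-to-output matrix W_out, random initial weights, and a learning rate of 0.5; all of those choices, and the function names, are mine for illustration.

```python
import math
import random

vocab = ["hope", "can", "set", "you", "free"]
index = {w: i for i, w in enumerate(vocab)}
V, H = len(vocab), 3  # 5 words in the vocabulary, 3 hidden neurons

random.seed(0)
# W_in is the 3x5 input-to-hidden matrix from the text.
# W_out (5x3, hidden-to-output) is an assumption of this sketch.
W_in = [[random.uniform(-0.5, 0.5) for _ in range(V)] for _ in range(H)]
W_out = [[random.uniform(-0.5, 0.5) for _ in range(H)] for _ in range(V)]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(context_ids):
    # Hidden layer: average of the W_in columns of the context words.
    n = len(context_ids)
    h = [sum(W_in[k][c] for c in context_ids) / n for k in range(H)]
    scores = [sum(W_out[j][k] * h[k] for k in range(H)) for j in range(V)]
    return h, softmax(scores)


def train_step(context_ids, target_id, lr=0.5):
    h, probs = forward(context_ids)
    loss = -math.log(probs[target_id])
    # Cross-entropy gradient w.r.t. the scores: predicted minus one-hot.
    d_scores = [p - (1.0 if j == target_id else 0.0) for j, p in enumerate(probs)]
    # Backpropagate to the hidden layer before W_out is changed.
    d_h = [sum(W_out[j][k] * d_scores[j] for j in range(V)) for k in range(H)]
    for j in range(V):
        for k in range(H):
            W_out[j][k] -= lr * d_scores[j] * h[k]
    for k in range(H):
        for c in context_ids:
            W_in[k][c] -= lr * d_h[k] / len(context_ids)
    return loss


context = [index["hope"], index["set"]]
target = index["can"]
losses = [train_step(context, target) for _ in range(200)]
_, probs = forward(context)
print("first loss:", round(losses[0], 3), "last loss:", round(losses[-1], 3))
print("predicted word:", vocab[probs.index(max(probs))])
```

Training on a single pair like this only shows the mechanics: the loss shrinks as the predicted distribution moves toward the one-hot vector of "can". Useful embeddings need many pairs from a large corpus.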
Skip-gram working:
In Skip-gram, we try to predict the context words using the target word.
Here our input will be "can", and from it we try to predict the context words "hope" and "set".
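As a small illustration, the sketch below lists the (target, context) training pairs Skip-gram would see for the same sentence. The helper name skipgram_pairs and the half_window parameter are hypothetical; half_window=1 again means one context word on each side.

```python
def skipgram_pairs(tokens, half_window=1):
    """Return (target_word, context_word) pairs for Skip-gram."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - half_window)
        hi = min(len(tokens), i + half_window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the target word itself
                pairs.append((target, tokens[j]))
    return pairs


tokens = "hope can set you free".split()
sg_pairs = skipgram_pairs(tokens)
print([p for p in sg_pairs if p[0] == "can"])
# prints [('can', 'hope'), ('can', 'set')]
```

Compared with CBOW, each context word becomes its own training example, so the target "can" contributes two pairs instead of one.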
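The output of the neural network is a probability distribution over the vocabulary, which is compared with the one-hot vector of each context word. After training, we get the embedding of a word by multiplying its one-hot vector by the trained weights between the input and hidden layer.
So for the word "hope" (5x1), we multiply the model's weight matrix (3x5) by its one-hot encoding to get its embedding (3x1): dot(3x5, 5x1) --> 3x1. In effect, this picks out the column of the weight matrix that belongs to "hope".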
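Here is a tiny sketch of that lookup. The numbers in W are invented purely for illustration (they are not trained weights), and one_hot and matvec are helper names of my own.

```python
vocab = ["hope", "can", "set", "you", "free"]

# Hypothetical 3x5 weight matrix; the values are made up for this example.
W = [
    [0.2, -0.4, 0.7, 0.1, -0.3],
    [0.5, 0.9, -0.2, 0.0, 0.6],
    [-0.1, 0.3, 0.8, -0.5, 0.4],
]


def one_hot(word):
    """5x1 one-hot vector for a word in the vocabulary."""
    return [1.0 if w == word else 0.0 for w in vocab]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


embedding = matvec(W, one_hot("hope"))  # dot(3x5, 5x1) --> 3x1
print(embedding)
# prints [0.2, 0.5, -0.1]
```

The result is just the first column of W, which is why real implementations skip the multiplication and look the column (or row) up directly by the word's index.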
In the actual word2vec training, Google used a window size of 5 to 10 and 300-dimensional vectors, i.e. the hidden layer of the neural network has 300 neurons.