Skip Grams
Created on 2020-08-17T21:40:53.379733
TODO how does the hidden layer get extracted to perform document vectorization
"Skip Grams" are like CBOW but it guesses multiple output words given a single input word. It excels at "small data sets with rare words."
Network
- There is an input layer, a single hidden layer, and an output layer.
- Input is a "one hot" of the single word going into the network.
- More than one word appears in the output to supply the surrounding "context."
- Each training target is a "one hot" of one predicted context word.
The output layer becomes a list of probabilities (a softmax over the vocabulary) for each target word; a minimal network sketch follows below.
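A minimal sketch of that network in plain NumPy, assuming a tiny vocabulary; the layer sizes, variable names, and softmax helper are illustrative assumptions.

```python
import numpy as np

vocab_size, hidden_size = 5, 3  # assumed tiny sizes for illustration

rng = np.random.default_rng(0)
W_in = rng.normal(size=(vocab_size, hidden_size))   # input -> hidden weights
W_out = rng.normal(size=(hidden_size, vocab_size))  # hidden -> output weights

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x = np.zeros(vocab_size)
x[2] = 1.0                 # one-hot input word

h = x @ W_in               # hidden layer: just row 2 of W_in
scores = h @ W_out         # one raw score per vocabulary word
probs = softmax(scores)    # probability of each word appearing in the context
print(probs)
```

During training, the same `probs` vector is compared against each context word's one-hot target. Note that the hidden layer `h` is simply the input word's row of `W_in`; those rows are the learned word vectors, which is the hidden layer the TODO above asks about extracting.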
One-hot
- One neuron for each word the network knows how to handle.
- All neurons are set to zero except the one for the correct word, which is set to one.
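A small sketch of one-hot encoding under an assumed toy vocabulary; the word list and function name are illustrative.

```python
import numpy as np

vocab = ["the", "quick", "brown", "fox", "jumps"]  # assumed toy vocabulary
index = {word: i for i, word in enumerate(vocab)}

def one_hot(word):
    # One neuron per known word: all zeros except the correct word's slot.
    vec = np.zeros(len(vocab))
    vec[index[word]] = 1.0
    return vec

print(one_hot("brown"))  # [0. 0. 1. 0. 0.]
```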