CakeChat: Emotional Generative Dialog System
Created on 2023-04-30T10:43:20-05:00
Seems like a mini ChatGPT built on recurrent networks rather than the Transformer networks introduced in "Attention Is All You Need".
Overall it seems to be a proto-Transformer architecture. Hidden context is passed to the output layer through thought vectors instead of being fed directly into the recurrent cells.
RNNs with GRU cells are used for sequence-to-sequence conversion.
Input is encoded into "context vectors"; there is also some mention of "thought vectors", which are passed along to the decoders.
Hierarchical Recurrent Encoder-Decoder (HRED)
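To make the HRED idea concrete for myself, here is a minimal pure-Python sketch, not CakeChat's actual code. It assumes a bias-free GRU cell, random untrained weights, and tiny made-up sizes (EMBED, HIDDEN); all class and function names are hypothetical. One GRU encodes each utterance, a second GRU runs over the utterance vectors to produce the thought vector, and the decoder receives that thought vector on every step.

```python
import math
import random

def make_matrix(rows, cols, rng):
    """Small random weight matrix (stand-in for trained parameters)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class GRUCell:
    """Plain GRU cell without biases: update gate z, reset gate r, candidate n."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        size = input_size + hidden_size
        self.w_z = make_matrix(hidden_size, size, rng)
        self.w_r = make_matrix(hidden_size, size, rng)
        self.w_n = make_matrix(hidden_size, size, rng)

    def step(self, x, h):
        z = [sigmoid(a) for a in matvec(self.w_z, x + h)]
        r = [sigmoid(a) for a in matvec(self.w_r, x + h)]
        gated = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a) for a in matvec(self.w_n, x + gated)]
        # Blend the candidate state with the previous state.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def run(self, inputs):
        """Feed a whole sequence and return the final hidden state."""
        h = [0.0] * self.hidden_size
        for x in inputs:
            h = self.step(x, h)
        return h

EMBED, HIDDEN = 4, 6  # toy sizes, assumptions for illustration only
rng = random.Random(0)
utterance_encoder = GRUCell(EMBED, HIDDEN, rng)   # tokens -> utterance vector
context_encoder = GRUCell(HIDDEN, HIDDEN, rng)    # utterance vectors -> thought vector
decoder = GRUCell(EMBED + HIDDEN, HIDDEN, rng)    # input is [token ; thought]

def encode_dialog(dialog):
    """dialog: list of utterances, each a list of EMBED-sized token vectors."""
    utterance_vectors = [utterance_encoder.run(u) for u in dialog]
    return context_encoder.run(utterance_vectors)

def decode_step(prev_token_vec, thought, h):
    """One decoder step; the thought vector is concatenated to the input every time."""
    return decoder.step(prev_token_vec + thought, h)

def random_token():
    return [rng.uniform(-1, 1) for _ in range(EMBED)]

# Two utterances of 3 and 2 tokens, using random stand-in embeddings.
dialog = [[random_token() for _ in range(3)], [random_token() for _ in range(2)]]
thought = encode_dialog(dialog)
state = decode_step(random_token(), thought, [0.0] * HIDDEN)
```

A real decoder would project `state` onto the vocabulary and sample the next token; that part is left out to keep the hierarchy itself visible.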
Word embeddings use word2vec models.
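A rough sketch of what an embedding lookup does, assuming a toy table of hand-written 3-dimensional vectors (real word2vec vectors are learned from a corpus and are much larger). The table contents and the function names are made up for illustration.

```python
import math

# Toy vectors, invented for this example; not real word2vec output.
PRETRAINED = {
    "hello": [0.9, 0.1, 0.0],
    "hi": [0.8, 0.2, 0.1],
    "cake": [0.0, 0.1, 0.9],
}
UNKNOWN = [0.0, 0.0, 0.0]  # fallback for out-of-vocabulary tokens

def embed(tokens):
    """Map each token to its vector, or to UNKNOWN when it is not in the table."""
    return [PRETRAINED.get(t, UNKNOWN) for t in tokens]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

vectors = embed(["hello", "hi", "cake", "zzz"])
```

With these toy values, "hello" and "hi" point in nearly the same direction while "cake" does not, which is the property the encoder relies on.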
Dataset was scraped or acquired from Twitter. Threads were used to harvest conversational nodes.
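How threads could be turned into dialogs, as a guess rather than the project's actual pipeline: follow reply links from the last tweet of each thread back to its root. The record fields `id`, `reply_to`, and `text` are hypothetical, and the sketch assumes reply links never form a cycle.

```python
def build_dialogs(tweets):
    """Return one list of texts per thread, ordered from root to last reply."""
    by_id = {t["id"]: t for t in tweets}
    replied_to = {t["reply_to"] for t in tweets if t["reply_to"] is not None}
    dialogs = []
    for tweet in tweets:
        if tweet["id"] in replied_to:
            continue  # someone replied to this tweet, so it is not a thread end
        chain = []
        node = tweet
        while node is not None:
            chain.append(node["text"])
            node = by_id.get(node["reply_to"])  # None at the root or a missing parent
        if len(chain) > 1:  # a lone tweet is not a conversation
            dialogs.append(list(reversed(chain)))
    return dialogs

# Made-up sample data.
sample = [
    {"id": 1, "reply_to": None, "text": "Anyone tried the new bakery?"},
    {"id": 2, "reply_to": 1, "text": "Yes, the cake is great"},
    {"id": 3, "reply_to": 2, "text": "Good to know, thanks"},
    {"id": 4, "reply_to": None, "text": "Unrelated post"},
]
dialogs = build_dialogs(sample)
```

Each resulting list is one multi-turn context of the kind an HRED-style model consumes.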
Relied on the DeepMoji network to do initial sentiment (emotion) analysis on messages.