CakeChat: Emotional Generative Dialog System

Created on 2023-04-30T10:43:20-05:00


This card pertains to a resource available on the internet.

This card can also be read via Gemini.

Seems like a mini ChatGPT, built before the Transformer architecture introduced in "Attention Is All You Need".

Overall it seems to be a proto-Transformer architecture: hidden context is given to the output layer via thought vectors rather than being fed directly into the recurrent cells.

RNNs with GRU cells are used for sequence-to-sequence conversion.

Encoding produces "context vectors", with some mention of "thought vectors" that are passed along to the decoder.
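A minimal PyTorch sketch of the idea (not the project's original code): the encoder compresses the input into a single thought vector, and the decoder is initialized from that vector instead of receiving the encoder's per-token states. All names and sizes here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=128, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src_ids, tgt_ids):
        # Encode the whole input; the final hidden state is the "thought vector".
        _, thought = self.encoder(self.embed(src_ids))
        # The decoder starts from the thought vector rather than being fed
        # the encoder's hidden states directly at every step.
        dec_states, _ = self.decoder(self.embed(tgt_ids), thought)
        return self.out(dec_states)  # per-step logits over the vocabulary

model = Seq2Seq()
src = torch.randint(0, 10000, (2, 12))  # batch of 2 input sequences
tgt = torch.randint(0, 10000, (2, 9))   # teacher-forced target prefixes
logits = model(src, tgt)                # shape: (2, 9, 10000)
```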

The overall architecture is a Hierarchical Recurrent Encoder-Decoder (HRED).
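A hedged sketch of the HRED idea, under the same illustrative assumptions as above: one GRU encodes each utterance into a vector, and a second GRU runs over the sequence of utterance vectors to produce a conversation-level context vector for the decoder.

```python
import torch
import torch.nn as nn

class HREDEncoder(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=128, utt_dim=256, ctx_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.utterance_rnn = nn.GRU(emb_dim, utt_dim, batch_first=True)
        self.context_rnn = nn.GRU(utt_dim, ctx_dim, batch_first=True)

    def forward(self, dialog_ids):
        # dialog_ids: (batch, n_turns, n_tokens)
        b, t, n = dialog_ids.shape
        flat = self.embed(dialog_ids.view(b * t, n))   # encode each turn separately
        _, utt_vecs = self.utterance_rnn(flat)          # (1, b*t, utt_dim)
        utt_vecs = utt_vecs.view(b, t, -1)              # one vector per turn
        _, context = self.context_rnn(utt_vecs)         # (1, b, ctx_dim)
        return context  # conversation-level state handed to the decoder

encoder = HREDEncoder()
dialog = torch.randint(0, 10000, (2, 3, 10))  # 2 dialogs, 3 turns, 10 tokens each
ctx = encoder(dialog)                          # shape: (1, 2, 512)
```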

Word embeddings come from word2vec models.
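For reference, training word2vec embeddings looks like this in gensim; the toy corpus and parameters are placeholders, not the project's actual setup.

```python
from gensim.models import Word2Vec

corpus = [
    ["how", "are", "you", "today"],
    ["i", "am", "fine", "thanks"],
    ["what", "are", "you", "doing"],
]

model = Word2Vec(
    sentences=corpus,
    vector_size=128,  # embedding dimensionality
    window=5,         # context window size
    min_count=1,      # keep every token in this toy corpus
    sg=1,             # skip-gram variant
)

vector = model.wv["you"]                        # 128-dim embedding for a token
similar = model.wv.most_similar("you", topn=3)  # nearest neighbors in the space
```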

The dataset was scraped or otherwise acquired from Twitter; reply threads were used to harvest conversational exchanges.
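An illustrative sketch of how a reply thread can be flattened into training pairs: each prefix of the thread becomes the context and the next turn becomes the target. The data shapes here are a guess, not the project's actual pipeline.

```python
def harvest_pairs(thread):
    """thread: list of messages in reply order, oldest first."""
    pairs = []
    for i in range(1, len(thread)):
        context = thread[:i]   # everything said so far
        response = thread[i]   # the next turn is the training target
        pairs.append((context, response))
    return pairs

thread = ["how was the game?", "we lost again", "oh no, sorry to hear that"]
for context, response in harvest_pairs(thread):
    print(context, "->", response)
```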

Relied on the DeepMoji network to do initial emotion analysis on messages, labeling each utterance with an emotion category.
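A hypothetical sketch of that labeling step: run an emotion classifier (DeepMoji in the original work) over each response and attach the predicted category to the training example, so the decoder can later be asked to generate in a given emotional register. `classify_emotion` is a stand-in heuristic, not a real DeepMoji API call, and the five categories match those CakeChat reportedly used.

```python
EMOTIONS = ["neutral", "joy", "anger", "sadness", "fear"]

def classify_emotion(text: str) -> str:
    """Placeholder for a DeepMoji-style classifier returning one category."""
    return "joy" if "!" in text else "neutral"

def label_dataset(pairs):
    # Attach an emotion label to each response so the model can be
    # conditioned on emotion during training.
    return [
        {"context": ctx, "response": resp, "emotion": classify_emotion(resp)}
        for ctx, resp in pairs
    ]

labeled = label_dataset([(["hi there"], "so glad to see you!")])
print(labeled[0]["emotion"])  # "joy" under this toy heuristic
```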