Rapid Automatic Keyword Extraction (RAKE)
Created on 2023-09-10T21:11:24-05:00
Words from a list of "stop words" are first removed from the input.
Prose is then broken into "phrases" by splitting every time you see a phrase delimiter.
Create a matrix of how often words occur with one another inside those phrases, such as "keyword" and "extraction" in "keyword extraction" (a bigram).
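Score the n-grams. In the usual RAKE formulation, each word scores its degree (its row total in the matrix) divided by its frequency, and an n-gram scores the sum of its words' scores.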
Take the top N n-grams by score, where N is the number of keywords you want.
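
The steps above can be sketched in Python. This is a minimal illustration rather than a reference implementation, and it makes several assumptions. The stop list is a tiny made-up sample. Each stop word is treated as a split point as well as being dropped, which is how RAKE is usually described. Scoring uses degree divided by frequency, which these notes do not spell out. The names `STOP_WORDS`, `candidate_phrases`, `score_phrases`, and `rake` are hypothetical.

```python
import re
from collections import defaultdict

# Tiny illustrative stop list (an assumption; real lists are far longer).
STOP_WORDS = {"a", "an", "and", "are", "as", "for", "from",
              "in", "is", "of", "on", "the", "to", "with"}


def candidate_phrases(text):
    """Split text into phrases at punctuation and at stop words."""
    phrases = []
    for fragment in re.split(r"[.,;:!?()\n]+", text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9'-]+", fragment):
            if word in STOP_WORDS:
                # Assumption: a stop word ends the current phrase.
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def score_phrases(phrases):
    """Score each distinct phrase as the sum of its words' degree/frequency."""
    freq = defaultdict(int)   # how many times each word appears
    cooc = defaultdict(int)   # sparse co-occurrence matrix keyed by (w, v)
    for phrase in phrases:
        for w in phrase:
            freq[w] += 1
            for v in phrase:
                cooc[(w, v)] += 1  # includes the diagonal (w with itself)

    # A word's degree is the total of its row in the matrix.
    degree = defaultdict(int)
    for (w, _), count in cooc.items():
        degree[w] += count

    word_score = {w: degree[w] / freq[w] for w in freq}
    return {p: sum(word_score[w] for w in p) for p in set(phrases)}


def rake(text, top_n=3):
    """Return the top_n (phrase, score) pairs, highest score first."""
    scores = score_phrases(candidate_phrases(text))
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(" ".join(p), s) for p, s in ranked[:top_n]]


if __name__ == "__main__":
    sample = "Keyword extraction is fast. Rapid keyword extraction of text is useful."
    for phrase, score in rake(sample, top_n=2):
        print(f"{score:.1f}  {phrase}")
```

On the sample sentence, longer phrases built from frequently co-occurring words rank first, which is the bias the degree-over-frequency score is meant to produce. Swapping in a different scoring rule only requires changing how `word_score` is computed.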