A common baseline for sentence vectors that often works really well is simply to average the word vectors of the words in the sentence. So let's say I have a sentence vector made this way. If I have a dictionary with lots of words and their associated word vectors, can I recover the most likely sequence of words that produced this sentence vector?
I'm fine with assuming the sentence has 8 or fewer words, so unscrambling the words is easy, but I'd like the algorithm to work with millions of words in my dictionary.
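To make the setup concrete, here is a minimal sketch of how I'm building the sentence vector. The dictionary `word_vectors`, its 3-dimensional values, and the helper `sentence_vector` are made up for illustration; the real dictionary would have millions of entries and much higher-dimensional vectors.

```python
# Toy dictionary mapping words to word vectors (made-up values for illustration;
# the real dictionary would hold millions of words with larger vectors).
word_vectors = {
    "the": [1.0, 0.0, 0.0],
    "cat": [0.0, 2.0, 0.0],
    "sat": [0.0, 0.0, 4.0],
    "dog": [2.0, 2.0, 0.0],
}


def sentence_vector(words, vectors):
    """Average the word vectors of the given words, component by component."""
    if not words:
        raise ValueError("need at least one word")
    dim = len(vectors[words[0]])
    total = [0.0] * dim
    for word in words:
        for i, value in enumerate(vectors[word]):
            total[i] += value
    return [component / len(words) for component in total]


# The vector I start from; the goal is to get the words back out of it.
target = sentence_vector(["the", "cat", "sat", "cat"], word_vectors)
```

Averaging throws away word order, so any permutation of the same words gives the same `target`. What I'm really asking for is the set of words (with repeats) whose average is closest to the given vector; ordering them afterwards is the unscrambling step.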