If word2vec encounters the same word multiple times in the same window, what occurs? Obviously it is meaningless to decrease the distance between the vectors for the input word and the target word. But will the repetition strengthen the relationship between the repeated word and the context words?
2 Answers
I think your last question is worth discussing, but forgive me for skipping the details of the model and just leaving a quick answer here :P
Repeating a sentence in your corpus would definitely change the learning result and strengthen the relationship between the words in that sentence, because one of the models behind word2vec is skip-gram, which assumes the center word can be used to predict its surrounding words. Each repetition contributes additional (center, context) training pairs.
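To make the pair-counting argument concrete, here is a minimal sketch of how skip-gram training pairs could be enumerated. It assumes a fixed symmetric window and ignores subsampling, dynamic window sizes, and negative sampling, all of which real implementations may apply. The helper name `skipgram_pairs` and the toy sentence are made up for illustration.

```python
from collections import Counter

def skipgram_pairs(tokens, window=2):
    """Yield (center, context) pairs for a fixed symmetric window.

    Simplified sketch: only the center *position* is skipped, so a word
    that appears twice inside the window produces duplicate pairs.
    """
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, tokens[j]

# "the" occurs twice, so it shows up twice in the windows of "cat" and "saw".
sentence = ["the", "cat", "saw", "the", "dog"]
counts = Counter(skipgram_pairs(sentence, window=2))

for pair, n in sorted(counts.items()):
    print(pair, n)
```

Under these assumptions, the pair `("cat", "the")` is generated twice while `("cat", "saw")` is generated once, so the repeated word receives proportionally more updates against its neighbours. With a wider window (for example `window=3`), this sketch also emits the self-pair `("the", "the")`, which is the case the original question asks about.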
But this raises a follow-up question: what is our purpose in using word2vec?
- To find semantically and syntactically similar words, which is useful for search and information retrieval.
- To model sequence data such as click sequences with skip-gram, which can be used in recommendation.
- I can see how repeating the sentence might do that, but my question is specifically about repeating one of the words in the sentence. – jamesmf, Sep 24, 2015 at 20:02
- I believe you meant "syntactic" not "synthetic". – Vladislavs Dovgalecs, Feb 23, 2016 at 16:52