  • Not necessarily; it depends on the geometric properties of the embedding. I just recently came across a paper showing skip-gram-trained embeddings are narrowly clustered in a single orthant (aclweb.org/anthology/D17-1308). In Facebook's MUSE project, they also investigate the Euclidean method for alignment. Commented Jan 10, 2019 at 1:10
  • The curse of dimensionality, and its relevance to real-world data, was covered in Jeremy Howard and Rachel Thomas's fast.ai course. I found his view thought-provoking. Quoting his Deep Learning for Coders video: "'The more columns you have, it basically creates a space that's more and more empty.' That turns out just not to be the case. It's not the case for a number of reasons… in practice, building models on lots and lots of columns works really, really well." Commented Jul 18, 2019 at 7:29
  • @JulianH I am not arguing against using high-dimensional data. It rocks. However, Euclidean distance is often not as useful, because of the curse. Commented Jul 1, 2020 at 22:47
  • Hmm, I don't think that's quite true. When you say "so if two vectors are pointing in the same direction, that's already pretty good," you seem to imply that cosine distance is not really affected by the curse of dimensionality. But just as you can prove that two random points in 200 dimensions are far from one another, you can similarly prove that the angle between two random points is almost surely close to 90 degrees. See "0.2 Funny facts" here: cs.princeton.edu/courses/archive/fall13/cos521/lecnotes/… Commented Nov 19, 2023 at 16:40
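
Both concentration effects raised in this thread — pairwise Euclidean distances bunching together, and pairwise angles bunching around 90° — are easy to check empirically. Below is a minimal NumPy sketch; the dimension 200 matches the "200d" mentioned in the last comment, while the sample size and random seed are arbitrary choices of mine:

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, arbitrary choice
d, n = 200, 1000                 # 200 dimensions as in the comment; n is arbitrary

# Sample n points with i.i.d. standard normal coordinates.
X = rng.standard_normal((n, d))

# Pairwise angles: normalize rows so dot products are cosines.
U = X / np.linalg.norm(X, axis=1, keepdims=True)
off_diag = ~np.eye(n, dtype=bool)
cos = (U @ U.T)[off_diag]
angles = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
print(f"angles: mean {angles.mean():.1f} deg, std {angles.std():.1f} deg")
# -> mean ~90 deg with a spread of only a few degrees:
#    random directions in high dimensions are nearly orthogonal.

# Distance concentration: squared distances via the Gram-matrix identity
# ||x - y||^2 = ||x||^2 + ||y||^2 - 2 <x, y>, avoiding an n*n*d tensor.
sq = (X * X).sum(axis=1)
d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
dist = np.sqrt(np.maximum(d2[off_diag], 0.0))
print(f"distances: std/mean {dist.std() / dist.mean():.3f}, "
      f"min/max {dist.min() / dist.max():.2f}")
# -> std/mean around 0.05: nearest and farthest neighbors are hard to
#    distinguish by Euclidean distance, the effect the comments describe.
```

The spread of the cosine between two random directions scales like 1/√d, which is why the angles cluster ever more tightly around 90° as the dimension grows — the point of the last comment.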