In August, Google open sourced a tool called word2vec that lets developers and data scientists experiment with language-based deep learning models. Now, the company has published a research paper showing off another use for the technology — automatically detecting the similarities between different languages to create, for example, more accurate dictionaries.
The method works by analyzing how words are used in different languages and representing those relationships as vectors, which the paper plots on a two-dimensional graph. A computer doesn’t need a visualization to understand the results of the computations, but this one from the paper is instructive in showing the general idea of what the technique does.
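To make the idea of matching geometric arrangements concrete, here is a minimal sketch in Python. It is not the authors' code. The two-dimensional vectors are invented for illustration (real word2vec vectors have many more dimensions), the Spanish vectors are assumed to be an exact rotated and scaled copy of the English ones, and the helper names fit_linear_map and translate are hypothetical. Under those assumptions, a linear map fitted on three known word pairs is enough to place the remaining English words next to their Spanish counterparts.

```python
import numpy as np

# Toy 2-D "embeddings", invented for illustration (not real word2vec output).
english = {
    "one": [1.0, 0.0],
    "two": [0.8, 0.6],
    "three": [0.0, 1.0],
    "four": [-0.6, 0.8],
    "five": [-1.0, 0.0],
}
pairs = [("one", "uno"), ("two", "dos"), ("three", "tres"),
         ("four", "cuatro"), ("five", "cinco")]

# Assumption: the Spanish space is the English space rotated by 40 degrees
# and scaled by 1.5, so both languages share the same geometric arrangement.
theta = np.deg2rad(40.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
spanish = {es: 1.5 * rotation @ np.array(english[en]) for en, es in pairs}


def fit_linear_map(source_vecs, target_vecs):
    """Return the least-squares W with source_vecs @ W close to target_vecs."""
    W, *_ = np.linalg.lstsq(source_vecs, target_vecs, rcond=None)
    return W


def translate(word, W):
    """Project an English vector and return the closest Spanish word by cosine."""
    projected = np.array(english[word]) @ W

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    return max(spanish, key=lambda es: cosine(projected, spanish[es]))


# Fit the map on three known pairs only; "four" and "five" are held out.
train = pairs[:3]
X = np.array([english[en] for en, _ in train])
Z = np.array([spanish[es] for _, es in train])
W = fit_linear_map(X, Z)

print(translate("four", W))
print(translate("five", W))
```

Because the toy Spanish space is an exact linear transform of the English one, the held-out words land exactly on their translations. With real embeddings the fit would only be approximate, and the nearest neighbor would be a candidate translation rather than a guaranteed one.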
Here’s how authors Tomas Mikolov, Quoc V. Le and Ilya Sutskever describe the concept and the chart:
“In Figure 1, we visualize the vectors for numbers and animals in English and Spanish, and it can be easily seen that these concepts have similar geometric arrangements. The…