Abstract
We introduce a word embedding method that generates a set of real-valued word vectors from a distributional semantic space. The semantic space is built with a set of context units (words) which are selected by an entropy-based feature selection approach with respect to the certainty involved in their contextual environments. We show that the most predictive context of a target word is its preceding word. An adaptive transformation function is also introduced that reshapes the data distribution to make it suitable for dimensionality reduction techniques. The final low-dimensional word vectors are formed by the singular vectors of a matrix of transformed data. We show that the resulting word vectors are as good as other sets of word vectors generated with popular word embedding methods.
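The pipeline the abstract describes (count contexts of each target word, transform the counts, then reduce dimensionality with SVD) can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the entropy-based context selection and the adaptive transformation function are replaced by a plain log transform, the preceding word is used as the only context (as the abstract suggests it is the most predictive), and the corpus is a toy example.

```python
import numpy as np

# Toy corpus; a stand-in for real training text.
corpus = "the cat sat on the mat the dog sat on the rug".split()

vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# Co-occurrence matrix: rows = target words, cols = preceding-word contexts.
C = np.zeros((len(vocab), len(vocab)))
for prev, target in zip(corpus, corpus[1:]):
    C[idx[target], idx[prev]] += 1

# Simple log transform as a stand-in for the paper's adaptive transformation.
T = np.log1p(C)

# Low-dimensional word vectors from the singular vectors of the
# transformed matrix (truncated SVD).
U, S, Vt = np.linalg.svd(T, full_matrices=False)
k = 2  # target dimensionality (hypothetical choice)
word_vectors = U[:, :k] * S[:k]
print(word_vectors.shape)  # one k-dimensional vector per vocabulary word
```

Each row of `word_vectors` is then a dense real-valued embedding for the corresponding vocabulary word.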
| Original language | English |
|---|---|
| Journal | Journal of Experimental and Theoretical Artificial Intelligence |
| Volume | 32 |
| Issue number | 4 |
| Pages (from-to) | 557-579 |
| Number of pages | 23 |
| ISSN | 0952-813X |
| DOIs | |
| Publication status | Published - 3 Jul 2020 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2019 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
Keywords
- context selection
- dependency parsing
- entropy
- singular value decomposition
- transformation
- word embeddings