Abstract
We study the presence of linguistically motivated information in word embeddings generated with statistical methods. The Swedish nominal distinctions uter/neuter, common/proper, and count/mass are selected to represent, respectively, grammatical, semantic, and mixed types of nominal categories within languages. Our results indicate that typical grammatical and semantic features are easily captured by word embeddings. In our experiments, which were based on a single-layer feed-forward neural network, the classification of semantic features required significantly fewer neurons than the classification of grammatical features. However, semantic features also generated higher entropy in the classification output despite their high accuracy. Furthermore, the count/mass distinction posed difficulties for the model, even though the number of neurons was tuned almost to its maximum.
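The abstract's observation that a classifier can be highly accurate and still produce high-entropy output can be illustrated with a minimal sketch. It assumes that "entropy" means the Shannon entropy of the predicted class distribution, which the abstract does not specify. The function names and the probability values are hypothetical and are not taken from the paper.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in bits) of one predicted class distribution."""
    # Terms with p == 0 contribute nothing, so they are skipped.
    return -sum(p * math.log2(p) for p in probs if p > 0)


def accuracy_and_mean_entropy(predictions, gold_labels):
    """Return (accuracy, mean entropy) over a list of class distributions."""
    correct = 0
    total_entropy = 0.0
    for probs, gold in zip(predictions, gold_labels):
        # The predicted class is the index with the highest probability.
        predicted = max(range(len(probs)), key=lambda i: probs[i])
        correct += int(predicted == gold)
        total_entropy += shannon_entropy(probs)
    n = len(gold_labels)
    return correct / n, total_entropy / n


# Hypothetical binary outputs (illustrative values only, not results
# from the paper). Both classifiers pick the gold class for every item.
gold = [0, 1, 0]
confident = [[0.99, 0.01], [0.02, 0.98], [0.97, 0.03]]
hesitant = [[0.60, 0.40], [0.45, 0.55], [0.65, 0.35]]

if __name__ == "__main__":
    for name, preds in (("confident", confident), ("hesitant", hesitant)):
        acc, ent = accuracy_and_mean_entropy(preds, gold)
        print(f"{name}: accuracy={acc:.2f}, mean entropy={ent:.3f} bits")
```

Both hypothetical classifiers label every item correctly, but the second spreads its probability mass more evenly, so its mean entropy is higher. Accuracy only looks at the top-ranked class, while entropy reflects how decisively that class was chosen.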
Original language | English |
---|---|
Title of host publication | Agents and Artificial Intelligence - 10th International Conference, ICAART 2018, Revised Selected Papers |
Editors | Jaap van den Herik, Ana Paula Rocha |
Number of pages | 22 |
Place of Publication | Cham |
Publisher | Springer Verlag |
Publication date | 2019 |
Pages | 492-513 |
ISBN (Print) | 9783030054526 |
Publication status | Published - 2019 |
Externally published | Yes |
Event | 10th International Conference on Agents and Artificial Intelligence, ICAART 2018 - Funchal, Madeira, Portugal; Duration: 16 Jan 2018 → 18 Jan 2018 |
Conference
Conference | 10th International Conference on Agents and Artificial Intelligence, ICAART 2018 |
---|---|
Country/Territory | Portugal |
City | Funchal, Madeira |
Period | 16/01/2018 → 18/01/2018 |
Sponsor | Institute for Systems and Technologies of Information, Control and Communication (INSTICC) |
Series | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11352 LNAI |
ISSN | 0302-9743 |
Bibliographical note
Publisher Copyright: © Springer Nature Switzerland AG 2019.
Keywords
- Neural network
- Nominal classification
- Swedish
- Word embedding