Abstract
We study the presence of linguistically motivated information in word embeddings generated with statistical methods. The nominal aspects of uter/neuter, common/proper, and count/mass in Swedish are selected to represent, respectively, grammatical, semantic, and mixed types of nominal categories within languages. Our results indicate that typical grammatical and semantic features are easily captured by word embeddings. In our experiments, which are based on a single-layer feed-forward neural network, classifying semantic features required significantly fewer neurons than classifying grammatical features. However, semantic features also generated higher entropy in the classification output, despite their high accuracy. Furthermore, the count/mass distinction proved difficult for the model, even though the number of neurons was tuned almost to its maximum.
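The abstract's point that a classifier can be accurate yet produce high-entropy output can be illustrated with a minimal sketch. It assumes that "entropy" means the Shannon entropy of the network's softmax output distribution, which the abstract does not state; the logit values and the helper names `softmax` and `entropy` are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Turn raw output scores into a probability distribution."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical logits for two binary decisions (illustrative values only).
# Both pick class 0, so both count as correct if class 0 is the gold label,
# but the second decision is far less certain.
confident = softmax([4.0, -4.0])
hesitant = softmax([0.3, -0.3])

print(entropy(confident), entropy(hesitant))
```

Under this reading, accuracy only checks which class receives the highest probability, while entropy measures how spread out the probability mass is, so the two measures can diverge.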
Original language | English |
---|---|
Title | Agents and Artificial Intelligence - 10th International Conference, ICAART 2018, Revised Selected Papers |
Editors | Jaap van den Herik, Ana Paula Rocha |
Number of pages | 22 |
Place of publication | Cham |
Publisher | Springer Verlag |
Publication date | 2019 |
Pages | 492-513 |
ISBN (Print) | 9783030054526 |
DOI | |
Status | Published - 2019 |
Externally published | Yes |
Event | 10th International Conference on Agents and Artificial Intelligence, ICAART 2018 - Funchal, Madeira, Portugal. Duration: 16 Jan 2018 → 18 Jan 2018 |
Conference
Conference | 10th International Conference on Agents and Artificial Intelligence, ICAART 2018 |
---|---|
Country/Territory | Portugal |
City | Funchal, Madeira |
Period | 16/01/2018 → 18/01/2018 |
Sponsor | Institute for Systems and Technologies of Information, Control and Communication (INSTICC) |
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11352 LNAI |
ISSN | 0302-9743 |
Bibliographical note
Publisher Copyright: © Springer Nature Switzerland AG 2019.