Surprisal through Word Embeddings
- Asahara Masayuki
- NINJAL, Japan
Bibliographic Information
- Other Title
- 単語埋め込みに基づくサプライザル
- タンゴ ウメコミ ニ モトズク サプライザル
Description
The concept of surprisal was proposed by Hale as a psycholinguistic model of sentence processing cost based on information theory. Surprisal is the negative log probability of a word in its context and can be used to model the difficulty of processing a sentence. In Japanese, when this difficulty is measured by eye tracking, reading times are obtained in base phrase units, whereas word probabilities are estimated from the frequencies of morphemes or word units. This discrepancy in units makes it difficult to model surprisal in Japanese, and we introduced word embeddings to address it. The additive property of skip-gram word embeddings enabled us to compose a base phrase vector from the word vectors in the base phrase. We confirmed that the cosine similarity between two adjacent base phrase vectors can be used to model the contextual probability of a base phrase bigram, and found that the norm of the base phrase vector correlates with reading time in Japanese.
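The quantities named in the description can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the romanized tokens, and the toy three-dimensional vectors are all assumptions made up for illustration, and the sketch does not attempt to show how the paper relates cosine similarity to contextual probability.

```python
import math
from typing import Sequence


def surprisal(prob: float) -> float:
    """Surprisal in bits: the negative log2 probability of a word in context."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(prob)


def compose_base_phrase(word_vectors: Sequence[Sequence[float]]) -> list:
    """Additive composition: sum the word vectors of one base phrase."""
    return [sum(components) for components in zip(*word_vectors)]


def norm(vector: Sequence[float]) -> float:
    """Euclidean (L2) norm of a vector."""
    return math.sqrt(sum(x * x for x in vector))


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (norm(u) * norm(v))


# Hypothetical toy embeddings; real skip-gram vectors would have
# hundreds of dimensions and come from a trained model.
toy_embeddings = {
    "neko": [0.9, 0.1, 0.3],
    "ga": [0.1, 0.2, 0.1],
    "sakana": [0.8, 0.3, 0.2],
    "wo": [0.1, 0.1, 0.2],
}

# Two adjacent base phrases, each made of a content word and a particle.
phrase_a = compose_base_phrase([toy_embeddings["neko"], toy_embeddings["ga"]])
phrase_b = compose_base_phrase([toy_embeddings["sakana"], toy_embeddings["wo"]])

print("surprisal of p=0.25 (bits):", surprisal(0.25))
print("cosine similarity of adjacent phrases:", cosine_similarity(phrase_a, phrase_b))
print("norm of second phrase vector:", norm(phrase_b))
```

Summing the word vectors gives one vector per base phrase, so the cosine similarity and the norm are computed in the same unit in which reading times are recorded.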
Journal
- Journal of Natural Language Processing
Journal of Natural Language Processing 26 (3), 635-652, 2019-09-15
The Association for Natural Language Processing
Details
- CRID: 1390846609781749888
- NII Article ID: 130007761388
- NII Book ID: AN10472659
- ISSN: 21858314, 13407619
- NDL BIB ID: 029987979
- Text Lang: ja
- Article Type: journal article
- Data Source: JaLC, IRDB, NDL Search, Crossref, CiNii Articles, KAKEN
- Abstract License Flag: Disallowed