Selecting Reading Texts Suitable for Incidental Vocabulary Learning by Considering the Estimated Distribution of Acquired Vocabulary
Description
In second language acquisition, incidental vocabulary learning refers to the process by which a learner's vocabulary grows through activities whose main goal is not vocabulary growth. A typical example is extensive reading, in which learners naturally expand their vocabulary by reading many texts and guessing the meanings of unfamiliar words. Selecting texts suitable for incidental learning requires a personalized, fine-grained estimate of each learner's vocabulary. If a learner does not know enough of the words in a text, the learner cannot read through the text and guess the meanings of the unfamiliar words, so incidental learning does not occur. Conversely, if a learner knows all the words in a text, incidental learning cannot occur because there are no new words to learn. Consequently, the more a text could increase a learner's vocabulary, the higher the risk that reading fails and no words are guessed or learned. Learners should therefore be shown both the amount of vocabulary they can gain and the risk of reading failure, and be allowed to choose the text they wish to read. To this end, we require an algorithm that simultaneously calculates the amount of vocabulary that can be learned and the associated risk when a text is read. This paper presents such an algorithm together with preliminary experimental results. Specifically, we use findings from applied linguistics indicating that incidental learning occurs only when the percentage of words in a text that the learner knows is above a certain threshold. By modeling the estimated size of the vocabulary increase as a random variable, our method uses the variance of that estimate as a measure of the risk of reading failure. This allows a learner to select the text with the lowest risk among texts that have the same estimated vocabulary increase.
Experimental results demonstrate that our method can significantly aid learners in selecting "efficient" texts to read by identifying a handful of such texts among a library of 500 texts. The results also demonstrate that some texts are stable and efficient for many learners.
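
The selection idea described above can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes that each word has an independent, already estimated probability of being known by the learner, that every unknown word type is learned when coverage reaches the threshold, and that nothing is learned otherwise. The names `simulate_gain`, `lowest_risk_text`, and `p_known`, the 0.95 threshold, and the Monte Carlo estimation are all hypothetical choices made for illustration.

```python
import random
from collections import Counter
from statistics import mean, pvariance


def simulate_gain(tokens, p_known, threshold=0.95, n_samples=2000, seed=0):
    """Estimate (mean, variance) of the vocabulary gain for one learner and one text.

    tokens:    list of word tokens in the text
    p_known:   dict mapping a word to the assumed probability the learner knows it
    threshold: assumed minimum share of known tokens for incidental learning
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0, 0.0
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    gains = []
    for _ in range(n_samples):
        known_tokens = 0
        unknown_types = 0
        # Sample one possible vocabulary state for the words in this text.
        for word, freq in counts.items():
            if rng.random() < p_known.get(word, 0.0):
                known_tokens += freq
            else:
                unknown_types += 1
        coverage = known_tokens / total
        # Assumption: below the threshold, reading fails and nothing is learned.
        gains.append(unknown_types if coverage >= threshold else 0)
    return float(mean(gains)), float(pvariance(gains))


def lowest_risk_text(candidates, target_gain, tolerance=0.5):
    """Among (name, mean_gain, variance) triples whose mean gain is within
    `tolerance` of `target_gain`, return the name with the lowest variance."""
    close = [c for c in candidates if abs(c[1] - target_gain) <= tolerance]
    return min(close, key=lambda c: c[2])[0] if close else None
```

In this sketch the variance of the simulated gain plays the role of the risk measure: two texts with a similar expected gain can be compared by passing their `(name, mean_gain, variance)` triples to `lowest_risk_text`.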