Page-type and time-series variations of a newspaper's character occurrence rate
-
- HISANO MASAKI
- Department of Human Communications, Faculty of Electro-Communications, The University of Electro-Communications
Bibliographic Information
- Other Title
-
- 新聞の用字の面による変動と時系列変動
- シンブン ノ ヨウジ ノ メン ニ ヨル ヘンドウ ト ジケイレツ ヘンドウ
Abstract
Using the 1991-1997 Mainichi Shimbun newspaper CD-ROMs, which contain about 340 million characters, the nature of character use was explored. Significant differences in mean occurrence rates among 16 page types (e.g., editorial, sports, local) were observed in 69.2% of the 5,726 character types that covered all cases in the corpus except the space character. Similarly, 20.3% of those showed significant month-level (seasonal) variations in occurrence rates, and 43.9% showed significant year-level variations (trends). When the analysis was limited to the 2,732 frequent character types that each accounted for more than 0.001‰ of the corpus, these tendencies became clearer: the proportions of character types that showed significant variations in occurrence rates by page type, month, and year were 98.4%, 33.5%, and 76.0%, respectively. These results suggest that there could be a vast range of systematic variations in lexical use that have been overlooked when large corpora are simply summed up.
Journal
-
- Journal of Natural Language Processing
-
Journal of Natural Language Processing 7 (2), 45-61, 2000
The Association for Natural Language Processing
Details
-
- CRID
- 1390001204475435776
-
- NII Article ID
- 10008829628
-
- NII Book ID
- AN10472659
-
- ISSN
- 2185-8314
- 1340-7619
-
- NDL BIB ID
- 5437699
-
- Text Lang
- ja
-
- Data Source
-
- JaLC
- NDL
- Crossref
- CiNii Articles
-
- Abstract License Flag
- Disallowed