A Language Independent Method for Filtering Unsolicited Bulk E-Mails
- Sakaguchi Tetsuo (Graduate School of Library, Information and Media Studies, University of Tsukuba)
- Yu Jiafu (School of Library and Information Science, University of Tsukuba)
Bibliographic Information
- Other Title: 言語に依存しない迷惑メール選別手法 (ゲンゴ ニ イゾン シナイ メイワク メール センベツ シュホウ)
Abstract
The growth of unsolicited bulk e-mail (spam) is a serious problem for Internet e-mail. Many anti-spam tools, such as Bayesian filters, rely on automatic classification by machine learning. These tools are language dependent because they use a lexical analyzer to extract words from e-mails. However, spam is written in many languages, including English, Japanese, and Chinese. This paper proposes a language-independent method for filtering spam. The method classifies e-mails as spam or non-spam with a support vector machine (SVM) that uses the frequencies of substrings extracted from the e-mails. The paper also reports the results of testing the method on sample e-mails written in English, Japanese, Chinese, and several other languages, and discusses the results and future work.
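The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how substrings are extracted or which SVM formulation is used, so everything here is an assumption: fixed-length character substrings (n = 3), relative frequencies as features, and a bias-free linear SVM trained by sub-gradient descent on the hinge loss. The function names and the four toy messages are hypothetical and are not taken from the paper.

```python
from collections import Counter

def substring_frequencies(text, n=3):
    """Relative frequency of each length-n character substring of text.

    No tokenizer or lexical analyzer is involved, so the same code
    handles any language (assumption: fixed-length substrings).
    """
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return {}
    return {g: c / len(grams) for g, c in Counter(grams).items()}

def dot(w, x):
    """Sparse dot product between weight dict w and feature dict x."""
    return sum(w.get(g, 0.0) * v for g, v in x.items())

def train_linear_svm(samples, labels, epochs=50, lr=1.0, lam=0.001):
    """Bias-free linear SVM via sub-gradient descent on the
    L2-regularised hinge loss. Labels are +1 (spam) or -1 (non-spam)."""
    w = {}
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            violated = y * dot(w, x) < 1      # inside the margin?
            for g in w:                       # regularisation shrink
                w[g] *= 1 - lr * lam
            if violated:                      # hinge-loss sub-gradient step
                for g, v in x.items():
                    w[g] = w.get(g, 0.0) + lr * y * v
    return w

def predict(w, x):
    """Return +1 (spam) or -1 (non-spam)."""
    return 1 if dot(w, x) > 0 else -1

# Toy training set, invented for illustration only.
texts = [
    "cheap pills buy now",            # spam (English)
    "meeting agenda for monday",      # non-spam (English)
    "今すぐ無料で当選金を受け取る",      # spam (Japanese)
    "明日の会議の資料を送ります",        # non-spam (Japanese)
]
labels = [1, -1, 1, -1]
samples = [substring_frequencies(t) for t in texts]
w = train_linear_svm(samples, labels)

print(predict(w, substring_frequencies("buy cheap pills")))
print(predict(w, substring_frequencies("会議の資料")))
```

The same extractor is applied to the English and Japanese messages without any language-specific step, which is the property the abstract emphasizes. A real filter would need far more training data and, most likely, an established SVM library rather than this hand-written trainer.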
Journal
- Joho Chishiki Gakkaishi
Joho Chishiki Gakkaishi 15 (2), 53-56, 2005
Japan Society of Information and Knowledge
Details
- CRID: 1390001204423959040
- NII Article ID: 110003375657
- NII Book ID: AN10459774
- ISSN: 18817661, 09171436
- NDL BIB ID: 7432108
- Text Lang: ja
- Data Source: JaLC, NDL, Crossref, CiNii Articles, KAKEN
- Abstract License Flag: Disallowed