A Language Independent Method for Filtering Unsolicited Bulk E-Mails

  • Sakaguchi Tetsuo
    Graduate School of Library, Information and Media Studies, University of Tsukuba
  • Yu Jiafu
    School of Library and Information Science, University of Tsukuba

Bibliographic Information

Other Title
  • 言語に依存しない迷惑メール選別手法
  • ゲンゴ ニ イゾン シナイ メイワク メール センベツ シュホウ

Abstract

The growth of unsolicited bulk e-mail (spam) is a serious problem for Internet e-mail. Many anti-spam tools are based on automatic classification by learning, such as Bayesian filters. These tools depend on the language of the e-mails because they use a lexical analyzer to extract words from them. However, spam is written in various languages, such as English, Japanese, and Chinese. This paper proposes a language-independent method for filtering spam. In this method, e-mails are classified as spam or non-spam by a support vector machine (SVM) that uses the frequencies of substrings extracted from the e-mails. The paper also reports the results of testing the method on sample e-mails written in English, Japanese, Chinese, and several other languages, and discusses the results and future work.
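
The feature-extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how substrings are chosen, so the code below assumes fixed-length character n-grams over the decoded message text; the function names `substring_frequencies` and `to_vectors` and the parameter `n` are hypothetical and are not taken from the paper. No word segmentation is performed, so the same code applies to English, Japanese, or Chinese text.

```python
from collections import Counter


def substring_frequencies(text, n=2):
    """Return the relative frequency of every length-n character substring.

    Assumption: fixed-length n-grams stand in for the paper's substrings.
    """
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return {}
    counts = Counter(grams)
    total = len(grams)
    return {gram: count / total for gram, count in counts.items()}


def to_vectors(freq_dicts):
    """Align per-message frequency dicts on one shared, sorted vocabulary.

    The result is one dense row per message, the usual input shape for
    an SVM trainer (the trainer itself is outside this sketch).
    """
    vocab = sorted({gram for freqs in freq_dicts for gram in freqs})
    rows = [[freqs.get(gram, 0.0) for gram in vocab] for freqs in freq_dicts]
    return vocab, rows


# Hypothetical sample messages in two languages; no lexical analyzer is used.
messages = ["abab", "迷惑メール"]
vocab, rows = to_vectors([substring_frequencies(m) for m in messages])
```

For "abab" with n = 2 the substrings are "ab", "ba", "ab", so "ab" receives frequency 2/3 and "ba" receives 1/3. The rows produced by `to_vectors`, paired with spam and non-spam labels, would then be passed to an SVM implementation of the reader's choice.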
