Transfer Learning for Bibliographic Information Extraction

Takasu Atsuhiro, Quang Hong Vuong

doi:10.5220/0005283003740379

Description

This paper addresses the problems of analyzing title-page layouts and extracting bibliographic information from academic papers. Information extraction is an important task for making digital libraries easy to use. Sequence analyzers are commonly used to extract information from pages. Because new layouts arrive frequently and existing layouts often change, a mechanism for self-training a new analyzer is needed to achieve good extraction accuracy; such a mechanism also makes management easier. For example, when a new layout is input, the problem is how to learn automatically and efficiently to create a new analyzer. This paper focuses on learning a new sequence analyzer automatically by using a transfer learning approach. We evaluated the efficiency of the method by testing it on three academic journals. The results show that the proposed method is effective for self-training a new sequence analyzer.

Published in

Proceedings of the International Conference on Pattern Recognition Applications and Methods

Proceedings of the International Conference on Pattern Recognition Applications and Methods 374-379, 2015-01-01

SCITEPRESS - Science and Technology Publications

Details

CRID: 1871146593206536832

DOI: 10.5220/0005283003740379

Data source type

OpenAIRE

