An Abstract of the 1st International Workshop on NLP and XML with a Special Emphasis on ISO / TC37 / SC3 Standard of Multimodal Document

NOMURA Naoyuki, NAKABASAMI Chieko, Choi Key-Sun

XML, the universal structured data representation meta-language, has become the standard framework for publishing on the net, as well as the standard e-commerce language for building B2B and B2C Web services. A major concern in this scenario is the "point of creation" bottleneck: creating useful, well-structured XML data can consume an undue amount of time and effort. NLP should be able to relieve this bottleneck by automating the conversion of unstructured or semi-structured text into XML documents that make explicit the much richer structure hidden in the original natural-language descriptions. This is "NLP for XML," which can give some intelligence, or disambiguation capability, to XML-generating engines. Conversely, XML can help NLP research, especially annotated-corpus-based approaches, by providing knowledge representation frameworks for the morphological, syntactic, semantic, and/or pragmatic information structure of NL resources. In many cases, XML should be able to provide NLP with deeper semantic structure clues and thus enable much more robust, higher-precision NLP applications. This vision led to the 1st International Workshop on "NLP and XML," which is summarized in this paper. The ISO/TC37/SC3 standard for terminology markup is also briefly mentioned.
