Writeprints
- Ahmed Abbasi, The University of Arizona, Tucson, AZ
- Hsinchun Chen, The University of Arizona, Tucson, AZ
Bibliographic Information
- Other Title: A stylometric approach to identity-level identification and similarity detection in cyberspace
Description
One of the problems often associated with online anonymity is that it hinders social accountability, as substantiated by the high levels of cybercrime. Although identity cues are scarce in cyberspace, individuals often leave behind textual identity traces. In this study we proposed the use of stylometric analysis techniques to help identify individuals based on writing style. We incorporated a rich set of stylistic features, including lexical, syntactic, structural, content-specific, and idiosyncratic attributes. We also developed the Writeprints technique for identification and similarity detection of anonymous identities. Writeprints is a Karhunen-Loeve transforms-based technique that uses a sliding window and pattern disruption algorithm with individual author-level feature sets. The Writeprints technique and extended feature set were evaluated on a testbed encompassing four online datasets spanning different domains: email, instant messaging, feedback comments, and program code. Writeprints outperformed benchmark techniques, including SVM, Ensemble SVM, PCA, and standard Karhunen-Loeve transforms, on the identification and similarity detection tasks with accuracy as high as 94% when differentiating between 100 authors. The extended feature set also significantly outperformed a baseline set of features commonly used in previous research. Furthermore, individual-author-level feature sets generally outperformed use of a single group of attributes.
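As a rough illustration of the sliding-window idea mentioned in the abstract, the sketch below computes a few simple lexical features over overlapping token windows. It is not the Writeprints algorithm: the Karhunen-Loeve transform and pattern disruption steps are omitted, and the function names, the three features, and the window sizes are all assumptions chosen for the example.

```python
import re
from statistics import mean


def window_features(tokens):
    """Compute three illustrative lexical features for one non-empty window."""
    words = [t for t in tokens if t.isalpha()]
    return {
        # Mean length of alphabetic tokens (0.0 if the window has none).
        "avg_word_len": mean(len(w) for w in words) if words else 0.0,
        # Distinct lowercase words divided by total words.
        "type_token_ratio": len({w.lower() for w in words}) / len(words) if words else 0.0,
        # Share of tokens that are not alphanumeric (mostly punctuation).
        "punct_ratio": sum(1 for t in tokens if not t.isalnum()) / len(tokens),
    }


def sliding_windows(text, size=10, step=5):
    """Return one feature dict per window of `size` tokens, advancing by `step`.

    Hypothetical helper: tokens are words or single punctuation marks.
    Trailing tokens that do not fit a full step are not revisited.
    """
    tokens = re.findall(r"\w+|[^\w\s]", text)
    if not tokens:
        return []
    last_start = max(len(tokens) - size, 0)
    return [
        window_features(tokens[start:start + size])
        for start in range(0, last_start + 1, step)
    ]
```

Each window yields one feature vector, so a single message produces a sequence of vectors rather than one summary. A dimensionality-reduction step such as the one the abstract describes would operate on those vectors.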
Journal
- ACM Transactions on Information Systems
ACM Transactions on Information Systems 26 (2), 1-29, 2008-03
Association for Computing Machinery (ACM)
Details
- CRID: 1361699994879873536
- ISSN: 15582868, 10468188
- Data Source: Crossref