Stylometry
From Kook Science
Stylometry, or applied stylistics, is the statistical analysis of text (textometry) or other forms of data for the purpose of authorship recognition and attribution. It relies on authorial invariants (writeprints): patterns of style that an author uses unconsciously in their writing. One application is comparing anonymous messages with signed messages in order to determine an author's true identity.
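As a rough illustration of comparing an anonymous message against signed ones, the sketch below builds a simple "writeprint" from the relative frequencies of common function words and ranks candidate authors by cosine similarity. This is only one of many possible feature sets and distance measures, not the method of any particular paper listed under Reading. The word list and the names `FUNCTION_WORDS`, `profile`, `cosine` and `rank_candidates` are assumptions made up for this example.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list chosen for illustration.
# Real stylometric studies typically use much larger feature sets.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "it",
                  "is", "was", "i", "for", "on", "with", "but")


def profile(text):
    """Return the relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(anonymous_text, signed_texts):
    """Return (author, similarity) pairs, most similar candidate first.

    signed_texts maps an author name to a sample of their signed writing.
    """
    target = profile(anonymous_text)
    scores = {author: cosine(target, profile(sample))
              for author, sample in signed_texts.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
```

Calling `rank_candidates(message, {"alice": sample_a, "bob": sample_b})` returns the candidates ordered by similarity. With texts as short as single messages the ranking is unreliable, so a high score should be read as a weak hint rather than an identification.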
Reading
- Peng, Roger D.; Hengartner, Nicolas W. (August 2002), "Quantitative Analysis of Literary Styles", The American Statistician 56 (3): 175-185, http://www.biostat.jhsph.edu/~rpeng/papers/archive/authorship-tas2-final.pdf
- Li, Jiexun; Zheng, Rong; Chen, Hsinchun (April 2006), "From Fingerprint to Writeprint", Communications of the ACM 49 (4): 76-82, https://www.semanticscholar.org/paper/From-fingerprint-to-writeprint-Li-Zheng/3b0dca26ee4b773543412011fdfc827239979d7b/pdf
- Sun, Jianwen; Yang, Zongkai; Liu, Sanya (February 2012), "Applying Stylometric Analysis Techniques to Counter Anonymity in Cyberspace", Journal of Networks 7 (2), http://www.academia.edu/4234632/Applying_Stylometric_Analysis_Techniques_to_Counter_Anonymity_in_Cyberspace
- Brennan, Michael; Afroz, Sadia; Greenstadt, Rachel (November 2012), "Adversarial Stylometry: Circumventing Authorship Recognition to Preserve Privacy and Anonymity", ACM Transactions on Information and System Security 15 (3), https://www.cs.drexel.edu/~sa499/papers/adversarial_stylometry.pdf
- Orebaugh, Angela; Kinser, Jason; Allnutt, Jeremy (2014), "Visualizing Instant Messaging Author Writeprints for Forensic Analysis", ADFSL Conference on Digital Forensics, Security and Law: 191-214, http://proceedings.adfsl.org/index.php/CDFSL/article/view/38
Resources
- Potter, Jeff, A Stylometry Library for Python, github.com, https://github.com/jpotts18/stylometry