style_of_criticism

This repository contains the code and lexicons for the stylometric analysis of the "Style of criticism."
Because of copyright and privacy issues, we cannot share the corpora that we used for our analyses.
The main statistics can be found here:

                                     |  Social Reading  |  Paper Magazines  |  Scientific Journals
-------------------------------------|------------------|-------------------|-----------------------------------------------------------
Source                               |  aNobii          |  Sole 24 Ore      |  Between – Osservatorio critico della germanistica - OBLIO
Publication dates                    |                  |  2010-2011        |  1998-2016
Total number of tokens               |  646964          |  704242           |  655192
Length of shortest review (tokens)   |  1               |  32               |  234
Length of longest review (tokens)    |  2229            |  1840             |  3195
Mean length of reviews (tokens)      |  125.3174        |  526.5012         |  1197.275
Standard deviation of review length  |  169.9646        |  350.4949         |  459.4581

The "Extend_lexicons.py" file contains a script for expanding the lexicons in the "lexicons" folder through the fastText Italian Pre-trained word vector: https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md
The Style_of_criticism_step1.R file contains a script for the analysis of the texts in the "corpus" folder (NOTE: in this repository they are just empty files), through stylomety and word classification. It also generates files to be analyzed with the LIWC software: https://liwc.wpengine.com/
The Style_of_criticism_step2.R file contains an SVM classifier, that works on the results of the previous analysis and evaluate the quality of the classification.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

style_of_criticism

About

Uh oh!

Releases

Packages

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
corpus		corpus
lexicons		lexicons
Extend_lexicons.py		Extend_lexicons.py
README.md		README.md
Style_of_criticism_step1.R		Style_of_criticism_step1.R
Style_of_criticism_step2.R		Style_of_criticism_step2.R
corpus_statistics.csv		corpus_statistics.csv

SimoneRebora/style_of_criticism

Folders and files

Latest commit

History

Repository files navigation

style_of_criticism

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages