This repository provides materials for a session that is part of the I2DS Tools for Data Science workshop 2023 run at the Hertie School, Berlin in October 2023. The student-run workshop is part of the course Introduction to Data Science taught by Simon Munzert at the Hertie School, Berlin, in Fall 2023.
This session will introduce you to quanteda, an R package for quantitative text analysis. Much of the data we deal with every day comes as text, and this share will keep growing: social media and digitalization efforts in all parts of life are making large volumes of text data increasingly accessible. Quanteda processes large text data efficiently and lets you analyze it quantitatively. The package and its extension packages are well maintained and constantly evolving. In this session we will introduce the main objects and most important functions of quanteda. We will walk participants through the workflow of analyzing text data with quanteda and illustrate these steps with an analysis of speeches given by Barack Obama.
In the live tutorial you will apply basic quanteda functions yourself and ultimately analyze content from our Introduction to Data Science lectures. Follow this link for the HTML file, which contains some background information on quanteda and all the instructions you need to follow along with our tutorial or to teach yourself quanteda in your own time.
The goals of this session are to (1) equip you with conceptual knowledge about the quanteda package and the quantitative text analysis workflow, (2) show you the three key objects of the package, and (3) provide you with the most commonly used functions, practice material, and some further readings.
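As a rough orientation, the workflow can be sketched in a few lines of R. This is a minimal sketch and not the tutorial's own code: it assumes the three key objects are the corpus, tokens, and document-feature matrix (dfm), and it uses `data_corpus_inaugural`, the inaugural-address corpus that ships with quanteda, as a stand-in for the Obama speeches used in the session. The object names (`obama_corpus`, `obama_tokens`, `obama_dfm`, `top_terms`) are chosen for illustration only.

```r
library(quanteda)

# 1. Corpus: a collection of documents plus document-level variables.
#    Assumption: the built-in inaugural corpus stands in for the tutorial data.
obama_corpus <- corpus_subset(data_corpus_inaugural, President == "Obama")

# 2. Tokens: split each document into words, dropping punctuation and numbers,
#    then remove common English stopwords.
obama_tokens <- tokens(obama_corpus, remove_punct = TRUE, remove_numbers = TRUE)
obama_tokens <- tokens_remove(obama_tokens, pattern = stopwords("en"))

# 3. Document-feature matrix: one row per document, one column per term,
#    cells holding term counts (terms are lowercased by default).
obama_dfm <- dfm(obama_tokens)

# Inspect the ten most frequent terms across the selected speeches.
top_terms <- topfeatures(obama_dfm, n = 10)
print(top_terms)
```

Each step takes the previous object as input, so swapping in your own texts only requires changing how the corpus is built in step 1.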
🔍 Quanteda Tutorial on the Quanteda Website
🔍 Presentation by quanteda founder Kenneth Benoit at the University of Münster
🔍 A Beginner’s Guide to Text Analysis with quanteda (University of Virginia)
🔍 quanteda: An R package for the quantitative analysis of textual data, JOSS, 2018
🔍 An Introduction to Text as Data with quanteda (Penn State and Essex courses in "Text as Data")
🔍 Advancing Text Mining with R and quanteda: Methods Bites
The material in this repository is made available under the MIT license.
Killian Conyngham: Prepared the content of the live coding tutorial, set up the R Markdown file, and post-processed the recording.
Aranxa Marquez Ampudia: Prepared the use case example for the presentation and contributed to the presentation and the R Markdown file.
Luca Vellage: Prepared the presentation and contributed to the R Markdown file.