cahucadi/Python_Scrapy_UOC_GamesScraping

GamesScraping

This repository contains a Scrapy spider that collects user reviews from the Steam digital game platform.

Contributors

Name Mail
Carlos Humberto Carreño Díaz cahucadi@uoc.edu
David Barrera Montesdeoca dbarreram@uoc.edu

Installation

First, you will need a Python 3 virtualenv.

Clone the repository with:

git clone git@github.com:cahucadi/GamesScraping.git

Then install the Python requirements with:

pip install -r requirements.txt

Using the crawler

First, locate the game_scraping/game_url.txt file, which defines the Steam Community page URL you want to crawl.

This file must contain the game URL with the game's ID (APP_ID) and the language (LANGUAGE) of the reviews to collect (english, spanish, latam, etc.), in the following format:

https://steamcommunity.com/app/APP_ID/reviews/?browsefilter=mostrecent&snr=1_5_100010_&filterLanguage=LANGUAGE
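As an illustration of the format above, here is a minimal sketch that fills in APP_ID and LANGUAGE programmatically. The function name `build_review_url` is hypothetical and not part of this repository, and the app ID 440 is only an example value.

```python
from urllib.parse import urlencode

# URL template taken from the format documented above.
BASE_URL = "https://steamcommunity.com/app/{app_id}/reviews/"


def build_review_url(app_id, language):
    """Return a Steam Community reviews URL for one game and one language.

    Hypothetical helper: it only reproduces the documented URL format.
    """
    query = urlencode(
        {
            "browsefilter": "mostrecent",
            "snr": "1_5_100010_",
            "filterLanguage": language,
        }
    )
    return BASE_URL.format(app_id=app_id) + "?" + query


if __name__ == "__main__":
    # Example values only; replace them with the game and language you need.
    print(build_review_url(440, "english"))
```

The printed URL can then be pasted into game_scraping/game_url.txt.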

Start the crawl with:

scrapy crawl review_spider -o reviews.json

Next, you can generate a .csv file (semicolon-separated) with:

python main.py

Be aware that processing can take several hours.
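To make the JSON-to-CSV step concrete, the sketch below shows one way such a conversion could be written with the standard library. It is not the code in main.py: the function name `reviews_json_to_csv` is hypothetical, and it assumes reviews.json holds a JSON array of flat objects, which is what Scrapy's JSON feed export writes for a single run.

```python
import csv
import json


def reviews_json_to_csv(json_path, csv_path):
    """Convert a JSON array of review objects into a semicolon-separated CSV.

    Hypothetical sketch: the column names are taken from whatever keys
    appear in the JSON items, since the real field names are not listed here.
    Returns the number of reviews written.
    """
    with open(json_path, encoding="utf-8") as src:
        reviews = json.load(src)

    # Collect every key that appears, preserving first-seen order.
    fieldnames = []
    for review in reviews:
        for key in review:
            if key not in fieldnames:
                fieldnames.append(key)

    with open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames, delimiter=";")
        writer.writeheader()
        # Missing keys are written as empty cells (DictWriter's default restval).
        writer.writerows(reviews)

    return len(reviews)
```

The csv module quotes any field that itself contains a semicolon or a line break, so review text does not need to be escaped by hand.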

APP File description

Most important files:

  • main.py: generates the .csv file once you have the reviews.json file
  • game_scraping/
    • classes.py: defines the project's main item classes, which subclass scrapy.Item
    • functions.py: contains the project's main helper functions (formatting, parsing, cleaning)
    • functions.py: contains the Scrapy default configuration
  • game_scraping/
    • review_spider.py: contains the project's main spider, which subclasses scrapy.Spider
    • util_functions.py: contains the functions needed by the spider
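For readers unfamiliar with the kind of helper the list above mentions, here is a minimal sketch of a text-cleaning function. The name `clean_text` and its behavior are assumptions made for illustration; the actual helpers in this repository may differ.

```python
import re


def clean_text(raw):
    """Collapse runs of whitespace into single spaces and trim both ends.

    Hypothetical example of a cleaning helper; not taken from the repository.
    """
    return re.sub(r"\s+", " ", raw).strip()
```

Normalizing whitespace like this keeps each scraped review on a single line, which makes the later .csv export easier to inspect.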

DOI File

The dataset is available at Zenodo with DOI: 10.5281/zenodo.4244834

And published at: http://doi.org/10.5281/zenodo.4244834
