Python Web Scraper
- Scrapes all search results for a keyword passed as an argument.
- Results can be saved as .csv and .json.
- Also collects data on the users who uploaded the content included in the search results.
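As a rough illustration of the two output formats mentioned above, the following sketch writes a list of scraped records to both a .csv and a .json file using only the standard library. The function name `save_records` and the record layout are hypothetical and are not taken from this package; the package's own writer may work differently.

```python
import csv
import json
from pathlib import Path


def save_records(records, basename):
    """Write a list of flat dicts to <basename>.csv and <basename>.json.

    Hypothetical helper for illustration; not part of default_scraper.
    """
    # Collect the union of keys in first-seen order so that records
    # with missing fields still fit under one CSV header.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    csv_path = Path(f"{basename}.csv")
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        # restval fills in an empty cell when a record lacks a field.
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(records)

    json_path = Path(f"{basename}.json")
    with json_path.open("w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-ASCII captions readable in the file.
        json.dump(records, f, ensure_ascii=False, indent=2)

    return csv_path, json_path
```

Calling `save_records(rows, "results")` would produce `results.csv` and `results.json` in the current directory.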
pip install default-scraper
or
pip install git+https://github.com/Seongbuming/crawler.git
from default_scraper.instagram.parser import InstagramParser
USERNAME = ""
PASSWORD = ""
KEYWORD = ""
parser = InstagramParser(USERNAME, PASSWORD, KEYWORD, False)
parser.run()
Run the following command to scrape contents from Instagram:
python main.py --platform instagram --keyword {KEYWORD} [--output_file OUTPUT_FILE] [--all]
Use the --all or -a option to also scrape unstructured fields.
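To make the command-line options above concrete, here is a minimal `argparse` sketch that accepts the same flags. It is not the project's actual `main.py`; only the flag names come from this README, while the function name `build_arg_parser`, the help texts, and the restriction of `--platform` to the two platforms shown here are assumptions.

```python
import argparse


def build_arg_parser():
    """Build a parser for the flags shown in this README (illustrative only)."""
    parser = argparse.ArgumentParser(description="Scrape contents for a keyword.")
    # Assumption: only the two platforms mentioned in this README are valid.
    parser.add_argument("--platform", required=True,
                        choices=["instagram", "googleplay_review"])
    parser.add_argument("--keyword", required=True,
                        help="Search keyword, or app ID for Google Play reviews.")
    parser.add_argument("--output_file", default=None,
                        help="Optional path for the output file.")
    # store_true makes the flag False unless --all or -a is given.
    parser.add_argument("--all", "-a", action="store_true",
                        help="Also scrape unstructured fields.")
    return parser


args = build_arg_parser().parse_args(
    ["--platform", "instagram", "--keyword", "coffee", "-a"]
)
print(args.platform, args.keyword, args.output_file, args.all)
# prints: instagram coffee None True
```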
from default_scraper.googleplay.review.parser import GooglePlayReviewParser
APP_ID = ""
parser = GooglePlayReviewParser(APP_ID)
parser.run()
python main.py --platform googleplay_review --keyword {APP_ID} [--output_file OUTPUT_FILE]
- Structured fields (Instagram): pk, id, taken_at, media_type, code, comment_count, user, like_count, caption, accessibility_caption, original_width, original_height, images
- Some fields may be missing depending on Instagram's response data.
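Because some fields may be absent, downstream code should not assume every key exists. The sketch below shows one way to normalize a raw record so that every structured field listed above is present, with `None` standing in for missing values. The field names are copied from this README; the helper `normalize_post` and the sample record are hypothetical.

```python
# Field names copied from the Instagram structured-fields list in this README.
STRUCTURED_FIELDS = [
    "pk", "id", "taken_at", "media_type", "code", "comment_count", "user",
    "like_count", "caption", "accessibility_caption", "original_width",
    "original_height", "images",
]


def normalize_post(raw):
    """Return a dict containing every structured field, None where absent.

    Hypothetical helper for illustration; not part of default_scraper.
    """
    # dict.get returns None for missing keys instead of raising KeyError.
    return {field: raw.get(field) for field in STRUCTURED_FIELDS}


post = normalize_post({"pk": "123", "like_count": 10})
print(post["like_count"], post["caption"])
# prints: 10 None
```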
- Structured fields (Google Play reviews): review_id, author, review_text, rating, writed_time
- Support for scraping more platforms is planned.