add topic scraper to api #7

@sonicmax

it's not really feasible to run this as a script because of CORS restrictions (i.e. we can't fetch archives.endoftheinter.net from boards.endoftheinter.net, and we can't use the Location.assign() method to redirect the user). cors-anywhere won't work for this because it strips all cookies, for good reason. a chrome extension would let us bypass CORS restrictions, but that's a lot of extra boilerplate code to write (considering that i've already written most of the code required to do this). it would probably be easier to do this via the app and pass the scraped data directly to the database
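a rough sketch of why doing this from the app sidesteps the problem: CORS is enforced by browsers, so a server-side request can carry the session cookie itself. this only builds the request and doesn't send it. the function name and the cookie string are hypothetical, and Python is used purely for illustration since the app's language isn't stated here

```python
from urllib.request import Request


def build_archive_request(url: str, session_cookie: str) -> Request:
    """Build (but do not send) a server-side request carrying a session cookie.

    Assumption: `session_cookie` is a ready-made Cookie header value obtained
    from a logged-in session. The real cookie names are not given in this issue.
    """
    return Request(url, headers={"Cookie": session_cookie})


# Hypothetical cookie value, for illustration only.
req = build_archive_request("https://archives.endoftheinter.net/", "session=abc")
print(req.full_url, req.get_header("Cookie"))
```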

parameters should be something like

  • message: scrapes an individual message (for testing, i guess)
  • topic: scrapes posts from an individual topic. we can handle user filtering etc. internally
  • q: searches LUE for the query, scrapes the results for topics by the user, and then scrapes the posts in those topics
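the three parameters above could be dispatched with something like this. a minimal sketch only: the parameter names come from the list, but the class, the function, and the "exactly one parameter per request" rule are assumptions, not anything decided in this issue

```python
from dataclasses import dataclass
from typing import Mapping

# Parameter names from the list above; everything else here is hypothetical.
MODES = ("message", "topic", "q")


@dataclass(frozen=True)
class ScrapeRequest:
    mode: str   # one of MODES
    value: str  # message id, topic id, or search query


def parse_scrape_params(params: Mapping[str, str]) -> ScrapeRequest:
    """Pick the scrape mode from query parameters.

    Assumption: exactly one of message/topic/q must be present and non-empty.
    """
    given = [
        (name, params[name].strip())
        for name in MODES
        if params.get(name, "").strip()
    ]
    if len(given) != 1:
        raise ValueError("expected exactly one of: " + ", ".join(MODES))
    mode, value = given[0]
    return ScrapeRequest(mode, value)
```

so `parse_scrape_params({"topic": "123"})` gives a request with mode `topic`, while an empty or ambiguous parameter set raises `ValueError` before any scraping starts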

the username regex will have to be hardcoded depending on which server we run the code on. maybe in the future we can add an api to specify which users to scrape
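the hardcoded filter might look roughly like this. the pattern, the usernames in it, and the shape of a post (a dict with a `username` key) are all placeholders, since the issue doesn't specify any of them

```python
import re

# Hypothetical placeholder; the real pattern would be hardcoded per server.
USERNAME_RE = re.compile(r"^(alice|bob)$", re.IGNORECASE)


def filter_posts_by_user(posts):
    """Keep only posts whose 'username' field matches USERNAME_RE.

    Assumption: each post is a dict with a 'username' string.
    """
    return [p for p in posts if USERNAME_RE.search(p.get("username", ""))]
```

keeping the pattern in one module-level constant means a later api for choosing users would only need to replace that one value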
