
Clients may manage their crawling tasks not only in the FCS web application but also through the REST API. The available methods are specified below. All requests must be authenticated with an OAuth2 token.

POST add_task

    Creates a new task. Required request parameters:

        name - name of the task
        priority - task priority
        expire - date and time at which the task expires
        mime_type - whitespace-separated list of MIME types to be crawled
        start_links - whitespace-separated list of URLs used as the starting points of the crawl
        whitelist - regular expression matching URLs that should be crawled
        blacklist - regular expression matching URLs that should not be crawled
        max_links - maximum number of links that may be visited during the crawl

    Returns: A response with the new task’s ID if successful, or a response with an error message and code otherwise.

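A minimal sketch of an add_task call in Python using the requests library. The base URL, the exact parameter value formats (e.g. the expire datetime format), and the use of an Authorization: Bearer header for the OAuth2 token are assumptions, not part of this specification; adjust them to match the actual FCS deployment.

```python
import requests

BASE_URL = "https://fcs.example.com/api"   # hypothetical base URL
TOKEN = "YOUR_OAUTH2_ACCESS_TOKEN"         # OAuth2 token obtained beforehand

headers = {"Authorization": "Bearer " + TOKEN}   # assumed token-passing scheme

payload = {
    "name": "news-crawl",
    "priority": 5,
    "expire": "2014-12-31 23:59:59",                 # assumed datetime format
    "mime_type": "text/html application/pdf",        # whitespace-separated MIME types
    "start_links": "http://example.com http://example.org",
    "whitelist": r"http://example\.(com|org)/.*",    # regex of URLs to crawl
    "blacklist": r".*\.(jpg|png|gif)$",              # regex of URLs to skip
    "max_links": 1000,
}

response = requests.post(BASE_URL + "/add_task", data=payload, headers=headers)
if response.ok:
    print("Created task, server response:", response.text)   # new task's ID
else:
    print("Error", response.status_code, response.text)
```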

POST delete_task

    Finishes a task. Required request parameters:

        id - ID of the task to be deleted

    Returns: A confirmation response if successful, or a response with an error message and code otherwise.


POST pause_task

    Pauses a task. Required request parameters:

        id - ID of the task to be paused

    Returns: A confirmation response if successful, or a response with an error message and code otherwise.


POST resume_task

    Resumes a task. Required request parameters:

        id - ID of the task to be resumed

    Returns: A confirmation response if successful, or a response with an error message and code otherwise.

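A minimal sketch of the task-management calls. delete_task, pause_task and resume_task all take the same single id parameter, so one small helper covers all three; the base URL and the Bearer-token header are assumptions, as above.

```python
import requests

BASE_URL = "https://fcs.example.com/api"   # hypothetical base URL
TOKEN = "YOUR_OAUTH2_ACCESS_TOKEN"

def task_action(action, task_id):
    """POST to delete_task, pause_task or resume_task for the given task ID."""
    response = requests.post(
        BASE_URL + "/" + action,
        data={"id": task_id},
        headers={"Authorization": "Bearer " + TOKEN},   # assumed token-passing scheme
    )
    response.raise_for_status()   # raise on an error response
    return response.text          # confirmation message

task_action("pause_task", 42)     # pause task 42
task_action("resume_task", 42)    # resume it again
task_action("delete_task", 42)    # finish (delete) it
```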

POST get_data_from_crawler

    Downloads data gathered by the crawler. Required request parameters:

        id - ID of the task whose data is to be downloaded
        size - size of the requested data

    Returns: A response with the crawled content if successful, or a response with an error message and code otherwise.

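A minimal sketch of downloading crawled content. The interpretation of the size parameter (assumed here to be a limit in bytes) and the format of the returned content are assumptions, since they are not specified above.

```python
import requests

BASE_URL = "https://fcs.example.com/api"   # hypothetical base URL
TOKEN = "YOUR_OAUTH2_ACCESS_TOKEN"

response = requests.post(
    BASE_URL + "/get_data_from_crawler",
    data={"id": 42, "size": 1048576},               # up to ~1 MB of crawled data (assumed unit)
    headers={"Authorization": "Bearer " + TOKEN},   # assumed token-passing scheme
)
if response.ok:
    with open("crawled_data.bin", "wb") as f:
        f.write(response.content)    # save the returned content as-is
else:
    print("Error", response.status_code, response.text)
```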