Commands

Zim edited this page Jul 19, 2022 · 1 revision

Positional

URLs...

Link(s) to the OD(s) you would like to scrape, record, or download content from. Not needed if you are using -i, --input.


Options

General

-h, --help

Prints help information.

-V, --version

Prints version information.

-v, --verbose

Enables verbose output.


WebDriver

--driver

Type of webdriver to use. Choices: firefox,chrome,msedge. Default: firefox.

--headless

Activates headless mode, which runs the browser without its graphical user interface. Cannot be used with --download.

--all-certs

Accepts all certificates (Beware!)

Accepts all certificates, even invalid ones. Use this option at your own risk!

--compat-driver

The driver version you want Zyod to download & use. In case an incompatible driver was downloaded, use this option to specify the correct driver version. Default: auto.

Example: To install a driver for Google Chrome version 95.0.4638.69 use:

java -jar zyod.jar --compat-driver "95.0.4638.69" https://example.com/folder/images

Navigator

-d, --depth

Specify the maximum depth for recursive scraping. Default: 20. A depth of 1 is the current directory.
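As an illustration of how a depth limit of this kind behaves, here is a minimal Python sketch. It is not Zyod's code: the `collect` function and the nested-dict layout standing in for an OD are hypothetical, and only the "depth 1 is the current directory" rule and the default of 20 come from the description above.

```python
def collect(tree, max_depth=20, depth=1):
    """Return file names reachable within max_depth levels.

    Depth 1 is the starting directory, so max_depth=1 never descends.
    """
    files = []
    for name, child in tree.items():
        if isinstance(child, dict):
            # A folder: only descend while we are still above the limit.
            if depth < max_depth:
                files.extend(collect(child, max_depth, depth + 1))
        else:
            files.append(name)
    return files


# Hypothetical OD layout: dicts are folders, None marks a file.
od = {"a.txt": None, "sub": {"b.txt": None, "deeper": {"c.txt": None}}}
```

With this layout, `collect(od, max_depth=1)` only sees `a.txt`, while `max_depth=2` also picks up `b.txt` from `sub`.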


Scraper

-w, --wait

Wait a maximum number of seconds before scraping.

Most dynamic ODs load content on the page very slowly. This option lets Zyod wait up to the given number of seconds before scraping.

Default: 6

--random-wait

Randomize the amount of time to wait.

The time before scraping will vary between 0.5 * --wait,-w (inclusive) and 1.5 * --wait,-w (exclusive).
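The interval above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Zyod's implementation; the function name `randomized_wait` and the `rng` parameter are made up for the example. The same interval is described for --download-wait.

```python
import random


def randomized_wait(wait, rng=random):
    """Pick a wait time between 0.5 * wait (inclusive) and 1.5 * wait (exclusive).

    rng.random() returns a float in [0.0, 1.0), so the result is the lower
    bound plus a fraction of `wait`. Floating-point rounding can, in
    extremely rare cases, land exactly on the upper bound.
    """
    return 0.5 * wait + rng.random() * wait
```

With the default --wait of 6, this yields values from 3 up to (but not including) 9 seconds. A wait of 0 always yields 0.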


Downloading

--download

Enable downloading features.

By default, downloading is disabled. Use this option to allow Zyod to download files from ODs. Cannot be used with --headless.

--ddir, --download-dir

Directory path to store downloaded files.

The directory path to store downloaded files. Default: Downloads folder/Zyod.

--dwait, --download-wait

Wait a random amount of seconds before downloading.

Wait between 0.5 * --download-wait (inclusive) and 1.5 * --download-wait (exclusive) seconds before downloading. Default: 0.


Recording

-o, --output

The file path to store the scraped links.

The output file path to place all recorded links. Links are appended to the file! Default: ./output.txt.
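Because links are appended rather than overwritten, repeated runs against the same output file accumulate. A minimal Python sketch of append-style recording, assuming one link per line (the exact file layout Zyod writes is not stated here, and `record_links` is a hypothetical helper):

```python
def record_links(links, output_path="output.txt"):
    """Append each link on its own line; existing content is kept."""
    # Mode "a" creates the file if it is missing and never truncates it.
    with open(output_path, "a", encoding="utf-8") as out:
        for link in links:
            out.write(link + "\n")
```

If you want a fresh list for each run, point -o at a new file or remove the old one first.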

-i, --input

Read links from a file.

Read links from a file that lists a series of ODs. Each line must be a link to an OD. This option can be used together with the URLs... positional argument.

java -jar zyod.jar -i input.txt ""
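For reference, the one-link-per-line format can be read with a sketch like the following. It is an illustration only: `read_od_links` is a hypothetical name, and trimming whitespace and skipping blank lines are assumptions, not documented Zyod behavior.

```python
def read_od_links(text):
    """Return one link per non-empty line of the input text."""
    links = []
    for line in text.splitlines():
        link = line.strip()
        if link:  # assumption: blank lines are ignored
            links.append(link)
    return links
```

An input.txt containing two lines, each with a single OD link, would give a list of two links.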

--no-record

Disable recording features.

Recording is enabled by default. Use this option to disable recording.


Interactivity

--scroll

Activates scrolling feature.

Scroll down the page repeatedly until the last element is reached. Some dynamic ODs load only a fixed number of items (25, 50, etc.) at a time and load more when the bottom of the page is reached. This option allows Zyod to scroll in order to scrape and download more content from the OD.

--scroll-wait

Amount of seconds to wait before attempting to scroll again.

Default: 4.
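The scroll-and-wait cycle can be pictured with the Python sketch below. It is not Zyod's code: `scroll_until_done`, `FakePage`, and the "stop when no new items appear" condition are assumptions made for illustration. Only the idea of scrolling repeatedly with a pause of --scroll-wait seconds comes from the descriptions above.

```python
import time


def scroll_until_done(count_items, scroll_down, scroll_wait=4.0, sleep=time.sleep):
    """Scroll repeatedly until a scroll no longer loads new items.

    count_items and scroll_down are callables supplied by the caller.
    Returns the final item count.
    """
    seen = count_items()
    while True:
        scroll_down()
        sleep(scroll_wait)  # give the page time to load the next batch
        now = count_items()
        if now == seen:  # nothing new appeared: assume the end was reached
            return seen
        seen = now


class FakePage:
    """Hypothetical stand-in for an OD that loads 25 items per scroll."""

    def __init__(self, total, batch=25):
        self.total = total
        self.batch = batch
        self.loaded = min(batch, total)

    def count(self):
        return self.loaded

    def scroll(self):
        self.loaded = min(self.loaded + self.batch, self.total)
```

A slow OD may need a larger --scroll-wait, since a check made too early would see no new items and stop.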

--interact-wait

Amount of seconds to wait before/after a simulated interaction.

After performing an action (e.g., a right-click, click, or drag), the OD may go into a loading phase while it loads the next set of content for the page. Use this option to increase the wait time before/after performing a simulated interaction.

Default: 5.


Miscellaneous

--init-refresh

Refresh the FIRST page.

Refresh the first (initial) webpage when Zyod navigates to it.

--no-refresh

Do not refresh the page and try again upon a scrape failure.

Do not refresh the page when Zyod fails to navigate to a page or fails to locate elements on the page.

--init-page-wait

Amount of seconds to explicitly wait after navigating to the first (initial) webpage.

Default: 0.

--page-wait

Amount of seconds to implicitly wait for a page load to complete before throwing an error.

Default: 30.

--element-wait

Amount of seconds to implicitly wait for web elements to appear before throwing an error. If Zyod is taking too long to retrieve anything on the current page, try reducing the amount of seconds.

Default: 30.