This repo contains working Selenium examples for web scraping in Python and Node.js.
It covers driver setup, navigation, waits, element extraction, downloads, and network/proxy management, as well as scaling strategies (Selenium Grid) and async scraping.
## Table of Contents

- Requirements
- Project Structure
- Python Selenium Examples
- Node.js Selenium Examples
- Notes
- More Resources
## Requirements

- Python 3.8+
- pip
- Node.js 18+
- npm or yarn
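If you are setting up from scratch, the standard installs are below; see `01_install_selenium.md` in each folder for full setup details.

```bash
# Python bindings (Selenium 4+)
pip install selenium

# Node.js bindings
npm install selenium-webdriver
```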
## Project Structure

```
selenium-scraping/
│
├── python/
│   ├── 01_install_selenium.md
│   ├── 02_driver_initialization.py
│   ├── 03_context_manager.py
│   ├── 04_options_headless.py
│   ├── 05_prefs_and_cdp.py
│   ├── 06_navigation.py
│   ├── 07_locators_and_find.py
│   ├── 08_shadow_dom.py
│   ├── 09_data_extraction.py
│   ├── 10_lists_and_tables.py
│   ├── 11_export_csv_json.py
│   ├── 12_waits_and_sync.py
│   ├── 13_stale_handling.py
│   ├── 14_interactions.py
│   ├── 15_scrolling.py
│   ├── 16_tabs_frames_alerts.py
│   ├── 17_sessions_and_auth.py
│   ├── 18_downloads_and_monitoring.py
│   ├── 19_debugging_and_logging.py
│   ├── 20_network_and_proxy.py
│   ├── 21_grid_examples.py
│   ├── 22_scrapy_integration.py
│   └── 23_hasdata_async_example.py
│
├── nodejs/
│   ├── 01_install_selenium.md
│   ├── 02_launch_browser.js
│   ├── 03_options_headless.js
│   ├── 04_block_resources_cdp.js
│   ├── 05_navigation_and_locators.js
│   ├── 06_shadow_dom.js
│   ├── 07_data_extraction.js
│   ├── 08_tables_and_export.js
│   ├── 09_waits_and_retry.js
│   ├── 10_downloads_and_monitoring.js
│   └── 11_proxy_and_stealth.js
│
└── README.md
```
## Python Selenium Examples

All examples use Selenium 4+:
- `02_driver_initialization.py` – Chrome/Firefox/Edge/Safari drivers
- `03_context_manager.py` – auto-quit with a context manager
- `04_options_headless.py` – headless mode, window size, user agent
- `05_prefs_and_cdp.py` – browser prefs and CDP commands
- `06_navigation.py` – get, refresh, back, forward
- `07_locators_and_find.py` – the By API, find_element(s)
- `08_shadow_dom.py` – accessing a shadow root
- `09_data_extraction.py` – element.text normalization, get_attribute
- `10_lists_and_tables.py` – scraping lists and tables
- `11_export_csv_json.py` – CSV/JSON export
- `12_waits_and_sync.py` – implicit vs. explicit waits
- `13_stale_handling.py` – handling StaleElementReferenceException
- `14_interactions.py` – clicks, send_keys, forms
- `15_scrolling.py` – scrollIntoView, infinite scroll
- `16_tabs_frames_alerts.py` – tabs, iframes, alerts
- `17_sessions_and_auth.py` – login and cookies
- `18_downloads_and_monitoring.py` – auto-download, monitoring completion
- `19_debugging_and_logging.py` – screenshots, logs
- `20_network_and_proxy.py` – CDP, Selenium Wire, proxies
- `21_grid_examples.py` – Selenium Grid remote driver examples
- `22_scrapy_integration.py` – scrapy-selenium integration
- `23_hasdata_async_example.py` – async scraping example
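For orientation, here is a minimal sketch combining the patterns from the first few files: headless Chrome via a context manager, an explicit wait, and text extraction. The URL and the `h1` selector are placeholders; it assumes Chrome is installed locally (Selenium Manager, bundled with Selenium 4.6+, fetches the driver automatically).

```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

options = Options()
options.add_argument("--headless=new")  # headless mode (see 04_options_headless.py)

# The context manager calls driver.quit() on exit (see 03_context_manager.py)
with webdriver.Chrome(options=options) as driver:
    driver.get("https://example.com")  # placeholder URL
    # Explicit wait: poll up to 10 s for the element (see 12_waits_and_sync.py)
    heading = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.TAG_NAME, "h1"))
    )
    print(heading.text)
```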
## Node.js Selenium Examples

All examples use `selenium-webdriver`:
- `02_launch_browser.js` – launching Chrome/Firefox drivers
- `03_options_headless.js` – headless mode, options, window size
- `04_block_resources_cdp.js` – CDP commands to block images/fonts
- `05_navigation_and_locators.js` – get, click, locators
- `06_shadow_dom.js` – accessing the shadow DOM via JS execution
- `07_data_extraction.js` – extracting text and attributes
- `08_tables_and_export.js` – building CSV/JSON from scraped data
- `09_waits_and_retry.js` – implicit/explicit waits and retries
- `10_downloads_and_monitoring.js` – download handling
- `11_proxy_and_stealth.js` – proxies, stealth patterns
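As in the Python section, here is a minimal sketch of the same flow in `selenium-webdriver`: headless Chrome, an explicit wait, and text extraction. The URL and selector are placeholders; a `try/finally` stands in for Python's context manager.

```javascript
const { Builder, By, until } = require('selenium-webdriver');
const chrome = require('selenium-webdriver/chrome');

(async () => {
  // Headless mode (see 03_options_headless.js)
  const options = new chrome.Options().addArguments('--headless=new');
  const driver = await new Builder()
    .forBrowser('chrome')
    .setChromeOptions(options)
    .build();
  try {
    await driver.get('https://example.com'); // placeholder URL
    // Explicit wait: up to 10 s for the element (see 09_waits_and_retry.js)
    const heading = await driver.wait(until.elementLocated(By.css('h1')), 10000);
    console.log(await heading.getText());
  } finally {
    await driver.quit(); // always release the browser
  }
})();
```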
## Notes

These examples are for educational purposes only. Learn more about the legality of web scraping.
- Use context managers in Python to avoid leftover browser processes.
- CDP features work best with Chrome/Chromium.
- For large-scale scraping, consider Selenium Grid or Scrapy integration (a minimal remote-driver sketch follows this list).
- All code is ready-to-run: adapt snippets to your own scraping tasks.
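To illustrate the Grid note above, here is a minimal remote-driver sketch. It assumes a Selenium Grid hub is already running (see `21_grid_examples.py`); the `localhost:4444` URL is a placeholder for your own hub address.

```python
from selenium import webdriver

options = webdriver.ChromeOptions()

# Connect to a running Selenium Grid hub instead of launching a local browser
driver = webdriver.Remote(
    command_executor="http://localhost:4444",  # placeholder hub URL
    options=options,
)
try:
    driver.get("https://example.com")  # placeholder URL
    print(driver.title)
finally:
    driver.quit()  # release the Grid slot
```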
