@zees-dev zees-dev commented Jun 25, 2025

Description

This PR introduces the bless-crawl extension integration.
The bless-crawl plugin can be used in the QuickJS environment (for the Bless browser runtime).

A simple JS example is provided that shows how to use the bless-crawl plugin.

AI description

This pull request introduces the new BlessCrawl SDK for distributed web scraping on the BLESS Network, along with updates to the documentation, examples, and package configuration to support it.
The most significant changes include the addition of the BlessCrawl SDK implementation, updates to supported features, and new examples to demonstrate its usage.

New SDK Implementation:

  • Added the BlessCrawl SDK with core functions (scrape, map, and crawl) for web scraping, link discovery, and recursive crawling. Includes detailed configuration options and examples. (docs/bless-crawl.md, lib/index.ts, lib/bless-crawl)
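The difference between link discovery and recursive crawling can be sketched without touching the network. The snippet below is only an illustration under stated assumptions: it does not use the BlessCrawl API, and `LinkGraph`, `site`, `discoverLinks`, and `crawlSite` are hypothetical names operating on a made-up in-memory link graph.

```typescript
// Hypothetical in-memory "site": page URL -> outgoing links (made-up data).
type LinkGraph = Record<string, string[]>;

const site: LinkGraph = {
  "https://example.com/": ["https://example.com/a", "https://example.com/b"],
  "https://example.com/a": ["https://example.com/b", "https://example.com/c"],
  "https://example.com/b": ["https://example.com/"],
  "https://example.com/c": [],
};

// Link discovery for a single page: return its outgoing links, de-duplicated.
function discoverLinks(graph: LinkGraph, url: string): string[] {
  const unique: string[] = [];
  for (const link of graph[url] || []) {
    if (unique.indexOf(link) === -1) unique.push(link);
  }
  return unique;
}

// Recursive crawling as a breadth-first walk. The depth limit bounds the work,
// and the visited list ensures that cycles (b -> / above) terminate.
function crawlSite(graph: LinkGraph, start: string, maxDepth: number): string[] {
  const visited: string[] = [start];
  let frontier: string[] = [start];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const url of frontier) {
      for (const link of discoverLinks(graph, url)) {
        if (visited.indexOf(link) === -1) {
          visited.push(link);
          next.push(link);
        }
      }
    }
    frontier = next;
  }
  return visited;
}

// One hop from the root reaches /a and /b; a second hop additionally reaches /c.
const withinOneHop = crawlSite(site, "https://example.com/", 1);
const withinTwoHops = crawlSite(site, "https://example.com/", 2);
```

In this sketch a depth limit of 0 returns only the start page, which roughly corresponds to handling a single URL, while larger limits expand the set one hop at a time.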

Documentation Updates:

  • Updated README.md to include instructions for building and using the BlessCrawl example. (README.md)
  • Added a comprehensive guide for the BlessCrawl SDK, including architecture, installation, and usage examples. (docs/bless-crawl.md)

Example Projects:

  • Introduced a new example project (examples/crawl) with sample scripts (index.ts, scrape-stdin-example.ts) to demonstrate the BlessCrawl SDK's capabilities. (examples/crawl/index.ts, examples/crawl/scrape-stdin-example.ts, examples/crawl/README.md, examples/crawl/package.json)

Feature and Dependency Updates:

  • Extended the SUPPORTED_FEATURES array to include crawl and wasip1 features. (bundler/index.ts)
  • Added zod as a new dependency for schema validation. (package.json)

Miscellaneous:

  • Minor cleanup in lib/entry.ts to remove unnecessary blank lines. (lib/entry.ts)

@zees-dev zees-dev requested review from Copilot, michalzajda and uditdc and removed request for Copilot June 25, 2025 12:47
Copilot AI left a comment

Pull Request Overview

This PR introduces the BlessCrawl integration to the SDK, adding distributed web scraping functionality with support for both WASM and HTTP runtime modes along with comprehensive examples and documentation.

  • Adds the BlessCrawl SDK in lib/bless-crawl.ts and exports it via lib/index.ts.
  • Updates package.json to add the new "zod" dependency and a main entry.
  • Provides several example scripts and documentation to demonstrate scraping, mapping, and crawling.

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| package.json | Added main entry and zod dependency to support the new BlessCrawl integration. |
| lib/index.ts | Exports the new BlessCrawl module. |
| lib/bless-crawl.ts | Implements the BlessCrawl SDK with runtime detection, validation using zod schemas, and HTTP fallback. |
| examples/crawl/scrape-stdin-example.ts | Provides a stdin-driven example to execute scraping, mapping, and crawling operations. |
| examples/crawl/package.json | Configures the examples project with build and run scripts. |
| examples/crawl/index.ts | Demonstrates usage of the SDK in both WASM and HTTP modes with logging for testing. |
| examples/crawl/README.md | Documents usage, configuration, and examples of the new BlessCrawl integration. |
| README.md | Updates the SDK features list to include bless-crawl. |

Comments suppressed due to low confidence (1)

lib/bless-crawl.ts:440

  • Consider verifying the type for 'httpResult.code'. If the server may return a numeric code instead of a string, this comparison might lead to an unintended error; clarifying the expected type in both the code and API documentation would improve robustness.
      if (httpResult.code !== "200") {
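
One way to make the check above tolerant of both representations is to normalize the code before comparing. This is a minimal sketch, not the code merged in this PR: the actual type of `httpResult.code` is not confirmed here, and `HttpResultLike` and `isOkCode` are hypothetical names introduced for illustration.

```typescript
// Hypothetical response shape: assumes `code` may arrive as a string or a number.
interface HttpResultLike {
  code: string | number;
}

// Normalize to a trimmed string so that 200, "200", and " 200 " all count as
// success, while any other value (404, "500", "") is treated as a failure.
function isOkCode(code: string | number): boolean {
  return String(code).trim() === "200";
}

// Usage: branch on the normalized result instead of a strict string comparison.
function describeResult(result: HttpResultLike): string {
  return isOkCode(result.code) ? "ok" : "failed with code " + String(result.code);
}
```

Comparing normalized strings rather than calling `Number(...)` avoids accepting inputs such as "2e2", which would parse to 200.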

@zees-dev zees-dev merged commit 91afd59 into main Jul 8, 2025
@zees-dev zees-dev deleted the feat/bless-crawl branch July 8, 2025 04:32