feat(scys): add SCYS browser adapters #957

Open
warkcod wants to merge 9 commits into jackwener:main from warkcod:feat/scys-adapters-pr-20260411

Conversation

@warkcod
Contributor

warkcod commented Apr 11, 2026

Summary

  • add SCYS browser adapters for course, toc, read, feed, opportunity, activity, and article extraction
  • add shared SCYS extraction, course image download, and normalization helpers with adapter test coverage
  • add SCYS adapter docs, schema guidance, and regenerate the CLI manifest

Test Plan

  • npm run build
  • npm run typecheck
  • npm test
  • npx vitest run --project adapter
  • bash scripts/check-doc-coverage.sh --strict

warkcod force-pushed the feat/scys-adapters-pr-20260411 branch from fb477b7 to 3054aac on April 11, 2026 05:34
@Astro-Han
Contributor

There is a lot of work here, and the adapter itself looks substantial. My only review note is about deliverables around discoverability.

From what I can tell, the PR adds the SCYS adapter page and the adapters index entry, but not the other repo-level updates that usually ship with a new adapter, namely the VitePress sidebar and the command tables in README.md and README.zh-CN.md. I think it would be worth adding those before merge so the new adapter is discoverable from the main docs and README surfaces.

warkcod added 6 commits April 17, 2026 20:25
SCYS list extraction could return only third-party links even when the page
already exposed a stable topic id, and detail reads could succeed against the
shell frame before the post hydrated. This change prefers SCYS articleDetail
identity, backfills missing topic metadata from hydrated page/cache state, and
retries detail extraction past placeholder shells.

Constraint: SCYS exposes post identity across intercepted payloads, DOM cards, and client-side state with inconsistent hydration timing
Rejected: Increase fixed waits for every extractor | still nondeterministic and needlessly slows healthy runs
Confidence: medium
Scope-risk: moderate
Directive: Keep SCYS articleDetail URLs as the canonical identity whenever a topic id exists; external links belong in source_links/external_links
Tested: npx vitest run --project adapter clis/scys/*.test.js
Tested: npm run typecheck
Tested: npx tsx src/main.ts scys article 'https://scys.com/articleDetail/xq_topic/55522122288425554' --wait 6 -f json
Tested: npx tsx src/main.ts scys feed 'https://scys.com/?filter=essence' --wait 6 --limit 3 -f json
Tested: npx tsx src/main.ts scys opportunity 'https://scys.com/opportunity' --wait 6 --limit 3 -f json
Not-tested: Rebuilding and reinstalling the full local package under /Users/mac/.opencli-scys
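The identity rule in the directive above can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: `RawItem`, `NormalizedItem`, and `normalizeItem` are hypothetical names, and the `xq_topic` URL prefix is assumed from the article URL used in the test commands.

```typescript
// Hypothetical shapes for illustration; only topic_id and external_links
// are names taken from the commit message.
interface RawItem {
  topicId?: string;
  links: string[];
}

interface NormalizedItem {
  url: string | null;
  topic_id: string | null;
  external_links: string[];
}

// Assumed from the tested URL; other topic kinds may use a different path.
const ARTICLE_DETAIL_PREFIX = "https://scys.com/articleDetail/xq_topic/";

function normalizeItem(raw: RawItem): NormalizedItem {
  // Anything that is not a SCYS articleDetail URL is treated as external.
  const external = raw.links.filter((l) => !l.startsWith(ARTICLE_DETAIL_PREFIX));
  if (raw.topicId) {
    // A topic id always wins: the canonical URL is the articleDetail page.
    return {
      url: ARTICLE_DETAIL_PREFIX + raw.topicId,
      topic_id: raw.topicId,
      external_links: external,
    };
  }
  // Without a topic id, fall back to the first link found, if any.
  return { url: raw.links[0] ?? null, topic_id: null, external_links: external };
}
```

With this shape, a card that only exposes a third-party link still resolves to its SCYS URL as long as a topic id was recovered from some other signal.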
The page-cache backfill walks browser state to recover stable SCYS topic
identity, but some extension/page contexts deny direct storage access and were
failing before extraction could continue. This change treats storage access as
optional and keeps the fallback path alive when the page runs in a restricted
context.

Constraint: Browser command contexts can evaluate inside pages where localStorage/sessionStorage are unavailable or throw SecurityError
Rejected: Remove page-cache backfill entirely | would reintroduce the missing topic_id/articleDetail regression on SCYS lists
Confidence: high
Scope-risk: narrow
Directive: Any browser-state probe used during fallback extraction must tolerate storage access failures and continue with the remaining signals
Tested: npx vitest run --project adapter clis/scys/*.test.js
Tested: npm run typecheck
Not-tested: Parallel browser-command contention against the same daemon session
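A storage probe that tolerates restricted contexts might look like the sketch below. The names `StorageLike` and `readStorageSafely` are hypothetical. The accessor is passed as a function because merely reading `window.localStorage` can throw in some contexts.

```typescript
// Minimal structural type so the sketch does not depend on DOM typings.
interface StorageLike {
  getItem(key: string): string | null;
}

// Returns the stored value, or null when storage is missing or access throws
// (for example a SecurityError in a restricted page context).
function readStorageSafely(
  getStorage: () => StorageLike | undefined,
  key: string,
): string | null {
  try {
    const storage = getStorage();
    return storage ? storage.getItem(key) : null;
  } catch {
    // Treat storage as optional so the caller can continue with other signals.
    return null;
  }
}
```

In a page context this would be called as `readStorageSafely(() => window.localStorage, key)`, with the caller moving on to the remaining signals when it returns `null`.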
The previous list extraction still depended on tab toggles plus response capture.
On current SCYS pages that path is nondeterministic: switching away from the
active tab can trigger a request for the wrong filter, while switching back may
not issue a request at all. This change makes essence and opportunity fetch the
searchTopic payload directly with explicit authenticated parameters, keeping the
old capture path only as a fallback when direct auth is unavailable.

Constraint: Current SCYS frontend no longer exposes the expected list payload through reliable UI-triggered request timing on cold start
Rejected: Keep relying on interceptor + tab toggles | can return EMPTY_RESULT or silently capture the wrong column
Confidence: high
Scope-risk: moderate
Directive: For SCYS list extraction, prefer explicit authenticated API params over UI-state inference whenever the route maps cleanly to searchTopic
Tested: npx vitest run --project adapter clis/scys/*.test.js
Tested: npm run typecheck
Tested: npx tsx src/main.ts scys article 'https://scys.com/articleDetail/xq_topic/55522122288425554' --wait 6 -f json
Tested: npx tsx src/main.ts scys feed 'https://scys.com/?filter=essence' --wait 6 --limit 3 -f json
Tested: npx tsx src/main.ts scys opportunity 'https://scys.com/opportunity' --wait 6 --limit 3 -f json
Tested: /Users/mac/.opencli-scys/bin/opencli scys article 'https://scys.com/articleDetail/xq_topic/55522122288425554' --wait 6 -f json
Tested: /Users/mac/.opencli-scys/bin/opencli scys feed 'https://scys.com/?filter=essence' --wait 6 --limit 3 -f json
Tested: /Users/mac/.opencli-scys/bin/opencli scys opportunity 'https://scys.com/opportunity' --wait 6 --limit 3 -f json
Not-tested: Personal feed via an explicit API path (still uses the existing fallback extractor path)
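The route-to-request decision described above could be sketched as follows. `ListPlan`, `planListFetch`, and the `column` field are hypothetical; the real searchTopic parameters are not shown in this PR text, so only the route mapping from the tested URLs is illustrated.

```typescript
// Either call the list API directly for a known column, or fall back to
// the older interceptor/capture path.
type ListPlan =
  | { mode: "direct"; column: "essence" | "opportunity" }
  | { mode: "capture" };

function planListFetch(pageUrl: string, hasAuth: boolean): ListPlan {
  // Without usable auth the direct call is not possible.
  if (!hasAuth) return { mode: "capture" };
  const url = new URL(pageUrl);
  if (url.pathname === "/opportunity") {
    return { mode: "direct", column: "opportunity" };
  }
  if (url.pathname === "/" && url.searchParams.get("filter") === "essence") {
    return { mode: "direct", column: "essence" };
  }
  // Routes that do not map cleanly (e.g. the personal feed) keep the fallback.
  return { mode: "capture" };
}
```

The point of deciding from the URL alone is that the outcome no longer depends on which tab the page happened to have active.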
SCYS regressions still showed command failures even after the list extractors were
stabilized. The remaining failures came from the browser control plane: daemon
commands treated "Detached while handling command" as terminal, and CDP retries
only covered a narrower set of attach errors. This change promotes detached and
not-attached debugger errors into the transient-retry path and reuses the retry
logic across CDP commands instead of only Runtime.evaluate.

Constraint: The currently connected Browser Bridge can report healthy status while individual tab debugger sessions detach mid-command
Rejected: Rely on doctor/connection health alone | does not protect command execution when a specific tab loses debugger attachment
Confidence: medium
Scope-risk: moderate
Directive: Any new CDP command path should go through the shared retry helper so debugger-detach handling stays consistent
Tested: npx vitest run --project extension extension/src/cdp.test.ts
Tested: npx vitest run --project unit src/browser/errors-detach.test.ts
Tested: npx vitest run --project adapter clis/scys/*.test.js
Tested: npm run typecheck
Tested: npm --prefix extension run typecheck
Tested: npm --prefix extension run build
Not-tested: Reloading the currently connected external Browser Bridge v1.5.1 in Chrome to exercise the new extension dist live
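A shared retry helper of the kind the directive asks for might be shaped like this. It is a sketch under assumptions: `isTransientDebuggerError` and `withCdpRetry` are hypothetical names, the message patterns are guesses based on the error text quoted above, and a real helper would probably re-attach the debugger between attempts.

```typescript
// Assumed message patterns; the actual strings matched in the PR may differ.
const TRANSIENT_PATTERNS: RegExp[] = [
  /detached while handling command/i,
  /debugger is not attached/i,
];

function isTransientDebuggerError(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return TRANSIENT_PATTERNS.some((pattern) => pattern.test(message));
}

// Runs any CDP command through the same bounded retry policy.
async function withCdpRetry<T>(run: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await run();
    } catch (err) {
      // Non-transient failures propagate immediately.
      if (!isTransientDebuggerError(err)) throw err;
      lastError = err;
    }
  }
  throw lastError;
}
```

Routing every command through one wrapper keeps the classification in a single place, which is what makes the behavior consistent beyond Runtime.evaluate.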
The unpacked Browser Bridge can be brought up in contexts where the MV3 service
worker is evaluated without a fresh install/startup event. In that state the
bridge may never call connect(), leaving doctor green only after an older
extension reconnects. This change initializes eagerly on module load while
keeping the existing one-time guard.

Constraint: MV3 service workers are not guaranteed to re-enter through runtime.onInstalled or runtime.onStartup after every reload/profile restart
Rejected: Depend only on startup/install listeners | leaves newly loaded unpacked extensions idle until some later event happens
Confidence: medium
Scope-risk: narrow
Directive: Background bootstrap for the Browser Bridge should not depend on lifecycle events alone; keep module-load init idempotent
Tested: npx vitest run --project extension extension/src/background.test.ts extension/src/cdp.test.ts
Tested: npm --prefix extension run typecheck
Tested: npm --prefix extension run build
Tested: python3 /Users/mac/clawd/scys-report/scys_report.py build-bundle --date 2026-04-17 --out-dir /Users/mac/clawd/scys-report/runs/2026-04-17-verify-rerun --essence-limit 3 --opportunity-limit 3 --personal-limit 3 --strict
Tested: python3 /Users/mac/clawd/scys-report/scys_report.py build-bundle --date 2026-04-17 --out-dir /Users/mac/clawd/scys-report/runs/2026-04-17-verify-rerun-2 --essence-limit 3 --opportunity-limit 3 --personal-limit 3 --strict
Not-tested: Making doctor report the newly synced unpacked extension version instead of the stale cached version string
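The eager, idempotent bootstrap can be illustrated with a small sketch. Here `connect` is a stand-in that only counts calls, and `initialize` is a hypothetical name; in the extension the same function would presumably also be registered on `chrome.runtime.onInstalled` and `chrome.runtime.onStartup`.

```typescript
let initialized = false;
let connectCalls = 0;

// Stand-in for the bridge's real connect(); it only counts invocations here.
function connect(): void {
  connectCalls++;
}

// One-time guard: safe to call from module load and from lifecycle listeners.
function initialize(): void {
  if (initialized) return;
  initialized = true;
  connect();
}

// Eager init when the service worker module is evaluated, so the bridge
// does not sit idle waiting for an install/startup event.
initialize();
```

Because of the guard, a later lifecycle event that calls `initialize()` again is a no-op rather than a second connection.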
warkcod force-pushed the feat/scys-adapters-pr-20260411 branch from c3efe0d to 0c37f37 on April 17, 2026 12:29
warkcod added 3 commits April 18, 2026 22:17
The detail extractor was still failing on the same SCYS article URL even after
list extraction became deterministic. Two recoverable failure modes remained:
stale page identities after browser navigation and shell-only article loads that
became healthy on a subsequent full rerun. This change clears stale page
identity before retrying execs and gives `scys article` a bounded full-command
retry path on shell-only EMPTY_RESULT outcomes.

Constraint: SCYS article detail pages can hydrate after the shell is visible, and browser target identities can drift across repeated one-shot commands
Rejected: Increase a single fixed wait only | still left repeated empty shell states and stale-page failures in real runs
Confidence: medium
Scope-risk: narrow
Directive: Keep SCYS article retries bounded and targeted to retryable shell/identity errors; avoid broad retries for unrelated extraction failures
Tested: npx vitest run --project unit src/browser/page.test.ts src/browser/errors-detach.test.ts
Tested: npx vitest run --project adapter clis/scys/*.test.js
Tested: npm run typecheck
Tested: /Users/mac/.opencli-scys/bin/opencli scys article https://scys.com/articleDetail/xq_topic/14422288551185512 -f json (12/12 success after final retry tuning)
Not-tested: Full bundle strict run on 2026-04-18 dataset after the final article-only retry increase
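A bounded, targeted full-command retry might be sketched as below. `Outcome`, `shellOnly`, and `runArticleWithRetry` are hypothetical, and the sketch is synchronous for brevity even though the real command is presumably asynchronous.

```typescript
type Outcome =
  | { code: "OK"; body: string }
  | { code: "EMPTY_RESULT"; shellOnly: boolean }
  | { code: "ERROR"; message: string };

// Reruns the whole command only for shell-only EMPTY_RESULT outcomes,
// and never more than maxRetries extra times.
function runArticleWithRetry(
  runOnce: () => Outcome,
  maxRetries: number,
): { outcome: Outcome; attempts: number } {
  let outcome = runOnce();
  let attempts = 1;
  while (
    outcome.code === "EMPTY_RESULT" &&
    outcome.shellOnly &&
    attempts <= maxRetries
  ) {
    outcome = runOnce();
    attempts++;
  }
  return { outcome, attempts };
}
```

Unrelated extraction failures fall through on the first attempt, which keeps the retry from masking real errors.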
The remaining PR jackwener#957 CI failure was no longer in the SCYS extraction logic; it
was in the extension build workflow. `extension/package-lock.json` had drifted into
an invalid state, so `npm ci` failed before the extension build even started.
This refreshes the extension lockfile to a clean npm-ci-compatible state and
keeps the background test typings aligned with the current listener mocks.

Constraint: GitHub Actions build-extension job uses `npm ci` in extension/, so any lockfile drift hard-fails the PR before adapter checks matter
Rejected: Ignore the build-extension failure and rely on local verification only | PR remains red and cannot be merged safely
Confidence: high
Scope-risk: narrow
Directive: When extension dependencies change or a worktree leaks lockfile metadata, regenerate extension/package-lock.json from a clean extension/ install and re-check npm ci
Tested: cd extension && npm ci
Tested: cd extension && npm run typecheck
Tested: cd extension && npm run build
Tested: npx vitest run --project extension extension/src/background.test.ts extension/src/cdp.test.ts
Tested: npx vitest run --project adapter clis/scys/*.test.js
Tested: npx vitest run --project unit src/browser/page.test.ts src/browser/errors-detach.test.ts
Tested: npm run typecheck
Not-tested: Re-running the GitHub Actions build job remotely after push

The runtime only honored OPENCLI_CDP_ENDPOINT for Electron apps, so regular browser-backed adapters like douban still hard-failed on Browser Bridge even though the docs describe CDP as the fallback path for remote or no-GUI environments. This routes any browser-backed command through CDPBridge when a manual CDP endpoint is provided and locks the behavior with a focused runtime regression test.

Constraint: Headless and remote-server flows cannot rely on the Browser Bridge extension, so OPENCLI_CDP_ENDPOINT must work for normal browser adapters as documented
Rejected: Keep the override inside executeCommand only | would leave other browserSession entry points inconsistent and keep the docs/runtime mismatch
Confidence: high
Scope-risk: narrow
Directive: If browser factory selection changes again, keep OPENCLI_CDP_ENDPOINT as a top-level override for all browser-backed adapters, not just Electron targets
Tested: npx vitest run --project unit src/runtime.test.ts
Tested: npx vitest run --project unit src/runtime.test.ts src/browser/page.test.ts src/browser/errors-detach.test.ts
Tested: npm run typecheck
Tested: OPENCLI_CDP_ENDPOINT=http://127.0.0.1:18902 node dist/src/main.js scys article https://scys.com/articleDetail/xq_topic/14422288551185512 -f json
Tested: OPENCLI_CDP_ENDPOINT=http://127.0.0.1:18902 node dist/src/main.js douban top250 --limit 1 -f json
Tested: OPENCLI_CDP_ENDPOINT=http://127.0.0.1:18902 /Users/mac/.opencli-scys/bin/opencli douban top250 --limit 1 -f json
Not-tested: /Users/mac/.opencli-scys/bin/opencli scys article via CDP on the current 18902 profile because that browser session is not logged into scys
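The top-level override can be pictured with a small sketch. `BridgeKind` and `selectBridge` are hypothetical names, and trimming the value is an assumption; only the `OPENCLI_CDP_ENDPOINT` variable and the CDP/Browser Bridge split come from the commit message.

```typescript
type BridgeKind = "cdp" | "browser-bridge";

// Chooses the browser backend for any browser-backed command.
// A manual CDP endpoint overrides the default, regardless of target type.
function selectBridge(env: Record<string, string | undefined>): BridgeKind {
  const endpoint = env.OPENCLI_CDP_ENDPOINT?.trim();
  return endpoint ? "cdp" : "browser-bridge";
}
```

In the runtime this would be called with `process.env`, so that headless or remote runs such as the douban commands above take the CDP path without needing the extension.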
2 participants