feat: CLI migration + progressive disclosure redesign for ultimate-scraper by lukas-bekr · Pull Request #33 · apify/agent-skills

lukas-bekr · 2026-03-30T22:59:48Z

Summary

Major upgrade to the apify-ultimate-scraper skill: migrates from REST API scripts to Apify CLI, restructures the information architecture using progressive disclosure, and enriches all workflow guides with 58 research-backed data pipeline patterns.

Phase 1: CLI migration

Replaced 3 Node.js scripts (search_actors.js, run_actor.js, fetch_actor_details.js) with CLI commands (apify actors call --json, actors search, actors info, datasets get-items)
--json output as stable API contract - immune to upcoming CLI UI changes (Markdown default, colors)
OAuth-first authentication (apify login) with env var fallback. Fixed a security contradiction in the actorization skill: it was using apify login -t, which exposes tokens in shell history. Now aligned with PR #31 ("fix: migrate security fixes to actorization skill").
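As a rough illustration of treating --json output as the contract, the sketch below wraps the `apify actors call --json` command named above. The helper names (`build_call_command`, `parse_json_output`, `call_actor`) are hypothetical, and the shape of the returned JSON is not specified in this PR, so the code only checks that the output parses.

```python
import json
import subprocess


def build_call_command(actor_id: str) -> list[str]:
    """Build argv for `apify actors call <actor> --json` (command named in this PR)."""
    return ["apify", "actors", "call", actor_id, "--json"]


def parse_json_output(stdout: str):
    """Parse the CLI's --json output and fail loudly on anything that is not JSON."""
    try:
        return json.loads(stdout)
    except json.JSONDecodeError as exc:
        raise ValueError(f"expected JSON from the CLI, got: {stdout[:80]!r}") from exc


def call_actor(actor_id: str):
    """Run the Actor through the CLI and return the parsed JSON.

    Assumes the `apify` binary is on PATH and the user is already authenticated.
    """
    proc = subprocess.run(
        build_call_command(actor_id), capture_output=True, text=True, check=True
    )
    return parse_json_output(proc.stdout)
```

Keeping command construction and parsing separate from the subprocess call means the contract (JSON in, dict out) can be checked without a network call or a logged-in CLI.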

Phase 2: Progressive disclosure restructure

Replaced monolithic 400-line Actor index with hub-and-spoke architecture
SKILL.md (~109 lines) routes to lean actor-index (206 lines) + 14 workflow guides + gotchas (108 lines)
Simple task ("scrape Nike's Instagram") loads ~300 lines. Complex pipeline loads ~500. Neither loads the other 13 guides.
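The line budgets above can be roughly reproduced from the file sizes quoted in this PR. The sketch below assumes (this is not stated in the PR) that a simple task loads SKILL.md plus the actor index, that a complex pipeline additionally loads one workflow guide and the gotchas file, and that the 17 files are exactly those three plus the 14 guides, so the average guide length can be derived from the 1,597-line total.

```python
# Line counts quoted in the PR description (SKILL.md is approximate).
SKILL_MD = 109
ACTOR_INDEX = 206
GOTCHAS = 108
TOTAL_LINES = 1597
GUIDE_COUNT = 14

# Assumption: everything outside the three files above is workflow guides.
AVG_GUIDE_LINES = round((TOTAL_LINES - SKILL_MD - ACTOR_INDEX - GOTCHAS) / GUIDE_COUNT)


def loaded_lines(guides: int = 0, include_gotchas: bool = False) -> int:
    """Estimate lines loaded: the hub and index always, spokes only on demand."""
    total = SKILL_MD + ACTOR_INDEX
    total += guides * AVG_GUIDE_LINES
    if include_gotchas:
        total += GOTCHAS
    return total


simple_task = loaded_lines()
complex_pipeline = loaded_lines(guides=1, include_gotchas=True)
```

Under these assumptions the estimates come out near the ~300 and ~500 figures quoted above; the other 13 guides never enter the total.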

Phase 3: Research-driven workflow enrichment

4-workstream research: Notion internal use cases + AI research (Perplexity/Gemini/ChatGPT) + n8n template library scraping (85+ templates, 26 use Apify) + social media scraping
58 distinct workflow patterns mapped to Apify Actors, ranked by cross-source frequency
Every workflow guide now has 4-6 pipelines with explicit Actor chaining, data piping (results[].website -> startUrls), PPE cost estimates, and gotchas
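The data-piping notation (results[].website -> startUrls) can be sketched as a small transform between two Actor runs. The function name is hypothetical, and the `{"url": ...}` entry shape for startUrls is an assumption based on a common Apify input convention; individual Actors may expect something different.

```python
def pipe_websites_to_start_urls(results):
    """Map results[].website from one run into a startUrls input for the next.

    Skips items with a missing or empty website and drops duplicates while
    preserving first-seen order.
    """
    seen = set()
    start_urls = []
    for item in results:
        url = (item.get("website") or "").strip()
        if not url or url in seen:
            continue
        seen.add(url)
        start_urls.append({"url": url})  # assumed entry shape, see lead-in
    return {"startUrls": start_urls}


# Hypothetical dataset items from an upstream Actor run.
upstream = [
    {"name": "A", "website": "https://a.example"},
    {"name": "B", "website": None},
    {"name": "C"},
    {"name": "A again", "website": "https://a.example"},
    {"name": "D", "website": " https://b.example "},
]
next_input = pipe_websites_to_start_urls(upstream)
```

Filtering empty values and duplicates before the second run matters mostly for pay-per-event Actors, where every extra start URL has a cost.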

Phase 4: New content

4 new workflow categories: e-commerce price monitoring, contact enrichment, knowledge base/RAG, company research (covers 5,000+ Store Actors with previously zero workflow coverage)
Enriched gotchas with anti-bot guidance (Cloudflare, SPA, fingerprinting), platform rate limits, cost estimation protocols

By the numbers

17 files, 1,597 lines (was 13 files, 782 lines)
Token budget for simple tasks: ~300 lines (unchanged, progressive disclosure)
14 workflow guides with 4-6 pipelines each (was 10 with 1-4 each)
Design principles: Anthropic's "Lessons from Building Skills" - skip the obvious, gotchas are highest-signal, hub-and-spoke progressive disclosure, don't railroad

Scope

apify-ultimate-scraper skill only (full rewrite)
apify-actorization auth fix (aligned with PR #31, "fix: migrate security fixes to actorization skill")
apify-actor-development minor auth alignment (OAuth-first)
commands/create-actor.md auth alignment
Did NOT touch developer skill content (actor-development, actorization workflows) - Patrik's territory

- Standardize auth to OAuth-first across all skills
- Fix security contradiction in actorization (remove -t flag)
- Delete legacy Node.js scripts (replaced by CLI commands)
- Bump version to 2.0.0
- Add design spec and implementation plan

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Remove error handling table (moving non-obvious errors to gotchas.md), add 4 new routing rows for e-commerce, contact enrichment, knowledge base/RAG, and company research, and replace error section with a brief troubleshooting pointer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…low guides

Added 7 new pipelines across 3 files from combined-patterns research:
- brand-monitoring: Twitter/X real-time mention routing (P16), Reddit brand monitoring (P17), multi-platform social listening with sentiment (P18)
- review-analysis: competitor review intelligence (P21), Google Play app review monitoring (P22), multi-platform hospitality aggregation (P20)
- content-and-seo: SERP content brief generation (P23), sitemap content audit (P24), keyword rank tracking with alerts (P26), deep research agent (P54)

All pipelines include explicit pipe field paths, PPE cost estimates where applicable, and non-obvious gotchas only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…with research patterns

Added 3 new pipelines to lead-generation.md (Sales Navigator bulk, SERP discovery, Apollo icebreakers, Reddit lead mining), 3 to competitive-intel.md (website change detection, SERP position monitoring, feature benchmarking), and 3 to influencer-vetting.md (TikTok creator vetting, YouTube channel audit, cross-platform hashtag discovery). All entries include explicit field paths, cost estimates for PPE Actors, and per-pipeline gotchas.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…flow guides

Add 2 pipelines to each guide from research patterns:
- social: Instagram competitor analysis + LinkedIn company page analytics
- trend: Reddit trend mining + YouTube outlier discovery
- jobs: sales signal outreach + Upwork monitoring
- real estate: lead scoring/routing + construction discovery

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…esearch)

Adds workflow reference guides for the 4 new categories identified in combined-patterns.md research: e-commerce price monitoring (patterns 45-49), contact enrichment (50-52), knowledge base and RAG pipelines (53-55), and company research (56-58). Each guide follows the existing format with When/Pipeline/Output fields/Cost estimate/Gotcha sections.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

lukas-bekr and others added 11 commits March 30, 2026 14:29

refactor: replace monolithic actor index with lean lookup + add gotchas

9f0f3a0

feat: add workflow guides for lead-gen, competitive-intel, influencer, brand, reviews

6d6dbf8

feat: add workflow guides for content/SEO, social analytics, trends, jobs, real estate

f325c05

refactor: rewrite SKILL.md with three-layer progressive disclosure routing

36c25ce

feat: enrich gotchas with error recovery, anti-bot guidance, platform rate limits

0bba7c4

lukas-bekr wants to merge 11 commits into apify:main from lukas-bekr:feat/ultimate-scraper-cli-migration-and-workflow-upgrade
