@@ -4,6 +4,7 @@ The Agentic Scraper enables AI-powered browser automation for complex interactio

## 🚀 Quick Start

+ ### Basic Usage (No AI Extraction)
```javascript
import { agenticScraper, getAgenticScraperRequest } from 'scrapegraph-js';

@@ -15,26 +16,76 @@ const steps = [
  'click on login'
];

- // Submit automation request
+ // Submit automation request (basic scraping)
const response = await agenticScraper(apiKey, url, steps, true);
console.log('Request ID:', response.request_id);

// Check results
const result = await getAgenticScraperRequest(apiKey, response.request_id);
console.log('Status:', result.status);
+ console.log('Markdown Content:', result.markdown);
+ ```
+
+ ### AI Extraction Usage
+ ```javascript
+ import { agenticScraper, getAgenticScraperRequest } from 'scrapegraph-js';
+
+ const apiKey = 'your-api-key';
+ const url = 'https://dashboard.scrapegraphai.com/';
+ const steps = [
+   'Type email@gmail.com in email input box',
+   'Type test-password@123 in password input box',
+   'click on login',
+   'wait for dashboard to load'
+ ];
+
+ // Define extraction schema
+ const outputSchema = {
+   user_info: {
+     type: "object",
+     properties: {
+       username: { type: "string" },
+       email: { type: "string" },
+       dashboard_sections: { type: "array", items: { type: "string" } }
+     }
+   }
+ };
+
+ // Submit automation request with AI extraction
+ const response = await agenticScraper(
+   apiKey,
+   url,
+   steps,
+   true, // useSession
+   "Extract user information and available dashboard sections", // userPrompt
+   outputSchema, // outputSchema
+   true // aiExtraction
+ );
+
+ console.log('Request ID:', response.request_id);
+
+ // Check results
+ const result = await getAgenticScraperRequest(apiKey, response.request_id);
+ if (result.status === 'completed') {
+   console.log('Extracted Data:', result.result);
+   console.log('Raw Markdown:', result.markdown);
+ }
```
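
The quick-start examples call `getAgenticScraperRequest` right after submitting, so the request may not have finished by the time the status is read. A minimal polling sketch is shown here. The helper name `waitForCompletion`, its options, and the `'failed'` status are hypothetical (only `'completed'` appears in the examples above); `fetchStatus` stands in for any async function that returns an object with a `status` field.

```javascript
// Hypothetical helper: poll a status-fetching function until the request finishes.
// `fetchStatus` is any async function resolving to an object with a `status` field.
async function waitForCompletion(fetchStatus, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await fetchStatus();
    // 'completed' is the status used in the examples; 'failed' is an assumed terminal status.
    if (result.status === 'completed' || result.status === 'failed') {
      return result;
    }
    // Wait before the next attempt, except after the last one.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Request did not finish after ${maxAttempts} attempts`);
}
```

With the library, this might be called as `await waitForCompletion(() => getAgenticScraperRequest(apiKey, response.request_id))`.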

## 📚 API Reference

- ### `agenticScraper(apiKey, url, steps, useSession)`
+ ### `agenticScraper(apiKey, url, steps, useSession, userPrompt, outputSchema, aiExtraction)`

- Performs automated browser actions on a webpage.
+ Performs automated browser actions on a webpage with optional AI extraction.

**Parameters:**
- `apiKey` (string): Your ScrapeGraph AI API key
- `url` (string): The URL of the webpage to interact with
- `steps` (string[]): Array of automation steps to perform
- `useSession` (boolean, optional): Whether to use session management (default: true)
+ - `userPrompt` (string, optional): Prompt for AI extraction (required when aiExtraction=true)
+ - `outputSchema` (object, optional): Schema for structured data extraction (used with aiExtraction=true)
+ - `aiExtraction` (boolean, optional): Whether to use AI for data extraction (default: false)

**Returns:** Promise<Object> with `request_id` and initial `status`
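
Seven positional parameters are easy to pass in the wrong order. One possible way to guard against that is a small wrapper that accepts named options and produces the positional list. The helper `buildAgenticScraperArgs` is hypothetical and not part of the library; passing `null` for omitted optional values and rejecting an empty `steps` array are assumptions of this sketch.

```javascript
// Hypothetical helper: turn a named-options object into the positional
// argument list documented above, applying the documented defaults.
function buildAgenticScraperArgs({
  apiKey,
  url,
  steps,
  useSession = true,    // documented default
  userPrompt = null,    // assumption: null is acceptable when unused
  outputSchema = null,  // assumption: null is acceptable when unused
  aiExtraction = false, // documented default
}) {
  // Assumption: an empty step list is treated as a caller error.
  if (!Array.isArray(steps) || steps.length === 0) {
    throw new Error('steps must be a non-empty array of strings');
  }
  // The parameter list states that userPrompt is required when aiExtraction=true.
  if (aiExtraction && !userPrompt) {
    throw new Error('userPrompt is required when aiExtraction is true');
  }
  return [apiKey, url, steps, useSession, userPrompt, outputSchema, aiExtraction];
}
```

The result can then be spread into the real call, for example `agenticScraper(...buildAgenticScraperArgs({ apiKey, url, steps }))`.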

@@ -67,6 +118,150 @@ Retrieves the status or result of an agentic scraper request.

## 🎯 Use Cases

+ ### 1. **Basic Automation (No AI)**
+ Perfect for simple automation tasks where you just need the raw HTML/markdown content:
+ - **Login automation**: Automate login flows and capture the resulting page
+ - **Form submission**: Fill out forms and get confirmation pages
+ - **Navigation**: Navigate through multi-step workflows
+ - **Content scraping**: Get page content after performing actions
+
+ ### 2. **AI-Powered Data Extraction**
+ Ideal when you need structured data from the automated interactions:
+ - **Dashboard data extraction**: Login and extract user information, metrics, settings
+ - **E-commerce scraping**: Search products and extract structured product data
+ - **Form result parsing**: Submit forms and extract confirmation details, reference numbers
+ - **Content analysis**: Navigate to content and extract key information in structured format
+
+ ### 3. **Hybrid Approach**
+ Use both modes depending on your needs:
+ - **Development/Testing**: Start with basic mode to test automation steps
+ - **Production**: Add AI extraction for structured data processing
+ - **Fallback**: Use basic mode when AI extraction isn't needed
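
When both modes are in use, the code that consumes results has to cope with either shape. A minimal sketch of one way to do that follows, assuming (as in the quick-start examples) that extracted data arrives in `result.result` and raw content in `result.markdown`. The helper name `readScrapeResult` and the `kind` labels are hypothetical.

```javascript
// Hypothetical helper: prefer structured data, fall back to raw markdown.
function readScrapeResult(result) {
  // Anything other than 'completed' is treated as not ready yet.
  if (result.status !== 'completed') {
    return { kind: 'pending', data: null };
  }
  // `result.result` carries extracted data in the AI extraction example.
  if (result.result !== undefined && result.result !== null) {
    return { kind: 'structured', data: result.result };
  }
  // Basic mode: only the markdown content is available.
  return { kind: 'markdown', data: result.markdown ?? '' };
}
```

Downstream code can then branch on `kind` instead of checking fields ad hoc.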
+
+ ## 💡 AI Extraction Examples
+
+ ### E-commerce Product Search
+ ```javascript
+ const steps = [
+   'click on search box',
+   'type "wireless headphones" in search',
+   'press enter',
+   'wait for results to load',
+   'scroll down 2 times'
+ ];
+
+ const schema = {
+   products: {
+     type: "array",
+     items: {
+       type: "object",
+       properties: {
+         name: { type: "string" },
+         price: { type: "string" },
+         rating: { type: "number" },
+         availability: { type: "string" }
+       }
+     }
+   }
+ };
+
+ const response = await agenticScraper(
+   apiKey,
+   'https://example-store.com',
+   steps,
+   true,
+   'Extract product names, prices, ratings, and availability from search results',
+   schema,
+   true
+ );
+ ```
+
+ ### Contact Form with Confirmation
+ ```javascript
+ const steps = [
+   'type "John Doe" in name field',
+   'type "john@example.com" in email field',
+   'type "Product inquiry" in subject field',
+   'type "I need more information about pricing" in message field',
+   'click submit button',
+   'wait for confirmation'
+ ];
+
+ const schema = {
+   submission: {
+     type: "object",
+     properties: {
+       status: { type: "string" },
+       message: { type: "string" },
+       reference_number: { type: "string" },
+       response_time: { type: "string" }
+     }
+   }
+ };
+
+ const response = await agenticScraper(
+   apiKey,
+   'https://company.com/contact',
+   steps,
+   true,
+   'Extract form submission status, confirmation message, and any reference numbers',
+   schema,
+   true
+ );
+ ```
+
+ ### Social Media Data Extraction
+ ```javascript
+ const steps = [
+   'type "username" in username field',
+   'type "password" in password field',
+   'click login button',
+   'wait for dashboard',
+   'click on profile section'
+ ];
+
+ const schema = {
+   profile: {
+     type: "object",
+     properties: {
+       username: { type: "string" },
+       followers: { type: "number" },
+       following: { type: "number" },
+       posts: { type: "number" },
+       recent_activity: { type: "array", items: { type: "string" } }
+     }
+   }
+ };
+
+ const response = await agenticScraper(
+   apiKey,
+   'https://social-platform.com/login',
+   steps,
+   true,
+   'Extract profile information including username, follower counts, and recent activity',
+   schema,
+   true
+ );
+ ```
+
+ ## 🔧 Best Practices
+
+ ### When to Use AI Extraction
+ - ✅ **Use AI extraction when**: You need structured data, extraction of specific information, or data validation
+ - ❌ **Skip AI extraction when**: You just need raw content, are testing automation steps, or will process the content externally
+
+ ### Schema Design Tips
+ - **Be specific**: Define exact data types and required fields
+ - **Use descriptions**: Add description fields to guide AI extraction
+ - **Nested objects**: Use nested schemas for complex data structures
+ - **Arrays**: Use arrays for lists of similar items (products, comments, etc.)
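
The schema tips above can be illustrated with a small sketch. The `description` and `required` keywords follow JSON Schema conventions; whether the service honours them in exactly this form is an assumption. The lint helper `findUndescribed` is hypothetical and only reports which fields still lack a description.

```javascript
// Illustrative schema: `description` and `required` follow JSON Schema conventions;
// how the service interprets them is an assumption of this sketch.
const productSchema = {
  products: {
    type: 'array',
    description: 'Products listed in the search results',
    items: {
      type: 'object',
      required: ['name', 'price'],
      properties: {
        name: { type: 'string', description: 'Product title as displayed' },
        price: { type: 'string', description: 'Price including currency symbol' },
        rating: { type: 'number' },
      },
    },
  },
};

// Hypothetical lint helper: list dotted paths of schema fields lacking a description.
function findUndescribed(node, path = '') {
  const missing = [];
  for (const [key, value] of Object.entries(node)) {
    const here = path ? `${path}.${key}` : key;
    if (!value.description) missing.push(here);
    // Recurse into nested objects.
    if (value.properties) missing.push(...findUndescribed(value.properties, here));
    // Recurse into arrays of objects, marking the path with [].
    if (value.items && value.items.properties) {
      missing.push(...findUndescribed(value.items.properties, `${here}[]`));
    }
  }
  return missing;
}
```

Here `findUndescribed(productSchema)` returns `['products[].rating']`, pointing at the one field that still needs a description.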
+
+ ### Step Optimization
+ - **Wait steps**: Add wait steps after actions that trigger loading
+ - **Specific selectors**: Use specific element descriptions ("click on blue submit button")
+ - **Sequential actions**: Break complex actions into smaller, specific steps
+ - **Error handling**: Include steps to handle common UI variations
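
The wait-step advice above can be applied mechanically. The sketch below adds a wait step after actions that commonly trigger loading, unless the next step already waits. The helper `withWaitSteps`, the trigger pattern, and the default wait wording are all assumptions for illustration; a real step list may need a narrower pattern, since not every click loads new content.

```javascript
// Hypothetical helper: add a wait step after actions that usually trigger loading.
// The trigger pattern and the wait wording are assumptions for illustration.
const LOADING_TRIGGER = /\b(click|press enter|submit)\b/i;
const WAIT_STEP = /^\s*wait\b/i;

function withWaitSteps(steps, waitStep = 'wait for page to load') {
  const out = [];
  steps.forEach((step, i) => {
    out.push(step);
    const next = steps[i + 1];
    const nextIsWait = next !== undefined && WAIT_STEP.test(next);
    // Only insert a wait when the author has not already added one.
    if (LOADING_TRIGGER.test(step) && !nextIsWait) {
      out.push(waitStep);
    }
  });
  return out;
}
```

The returned array can be passed as the `steps` argument in place of the original list.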
+
### 🔐 Login Automation
```javascript
const loginSteps = [