 [![Python Support](https://img.shields.io/pypi/pyversions/scrapegraph-py.svg)](https://pypi.org/project/scrapegraph-py/)
 [![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
-[![Documentation Status](https://readthedocs.org/projects/scrapegraph-py/badge/?version=latest)](https://docs.scrapegraphai.com)
+[![Documentation Status](https://readthedocs.org/projects/scrapegraph-py/badge/?version=latest)](https://docs.scrapegraphai.com)

 <p align="left">
   <img src="https://raw.githubusercontent.com/VinciGit00/Scrapegraph-ai/main/docs/assets/api-banner.png" alt="ScrapeGraph API Banner" style="width: 70%;">
@@ -20,7 +20,7 @@ pip install scrapegraph-py

 ## 🚀 Features

-- 🤖 AI-powered web scraping
+- 🤖 AI-powered web scraping and search
 - 🔄 Both sync and async clients
 - 📊 Structured output with Pydantic schemas
 - 🔍 Detailed logging
@@ -40,21 +40,36 @@ client = Client(api_key="your-api-key-here")

 ## 📚 Available Endpoints

-### 🔍 SmartScraper
+### 🤖 SmartScraper

-Scrapes any webpage using AI to extract specific information.
+Extract structured data from any webpage or HTML content using AI.

 ```python
 from scrapegraph_py import Client

 client = Client(api_key="your-api-key-here")

-# Basic usage
+# Using a URL
 response = client.smartscraper(
     website_url="https://example.com",
     user_prompt="Extract the main heading and description"
 )

+# Or using HTML content
+html_content = """
+<html>
+    <body>
+        <h1>Company Name</h1>
+        <p>We are a technology company focused on AI solutions.</p>
+    </body>
+</html>
+"""
+
+response = client.smartscraper(
+    website_html=html_content,
+    user_prompt="Extract the company description"
+)
+
 print(response)
 ```

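Since the updated SmartScraper example accepts either `website_url` or `website_html`, a caller may want one place that decides which keyword to send. The sketch below is only an illustration of that idea: `build_smartscraper_kwargs` is a hypothetical helper, not part of `scrapegraph_py`, and the URL check (a plain `http://` / `https://` prefix test) is an assumption rather than anything the SDK is known to do.

```python
def build_smartscraper_kwargs(source: str, user_prompt: str) -> dict:
    """Pick website_url or website_html based on what `source` looks like.

    Hypothetical helper: the parameter names come from the README example,
    but the prefix-based detection is an assumption made for this sketch.
    """
    stripped = source.strip()
    # Treat anything that starts with an HTTP(S) scheme as a URL.
    if stripped.lower().startswith(("http://", "https://")):
        return {"website_url": stripped, "user_prompt": user_prompt}
    # Otherwise pass the text through untouched as raw HTML.
    return {"website_html": source, "user_prompt": user_prompt}


url_kwargs = build_smartscraper_kwargs(
    "https://example.com", "Extract the main heading and description"
)
html_kwargs = build_smartscraper_kwargs(
    "<html><body><h1>Company Name</h1></body></html>",
    "Extract the company description",
)

print(sorted(url_kwargs))
print(sorted(html_kwargs))
```

With a real client, the result could presumably be passed on as `client.smartscraper(**url_kwargs)`, keeping the URL-versus-HTML decision out of the call site.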
@@ -80,46 +95,56 @@ response = client.smartscraper(

 </details>

-### 📝 Markdownify
+### 🔍 SearchScraper

-Converts any webpage into clean, formatted markdown.
+Perform AI-powered web searches with structured results and reference URLs.

 ```python
 from scrapegraph_py import Client

 client = Client(api_key="your-api-key-here")

-response = client.markdownify(
-    website_url="https://example.com"
+response = client.searchscraper(
+    user_prompt="What is the latest version of Python and its main features?"
 )

-print(response)
+print(f"Answer: {response['result']}")
+print(f"Sources: {response['reference_urls']}")
 ```

-### 💻 LocalScraper
-
-Extracts information from HTML content using AI.
+<details>
+<summary>Output Schema (Optional)</summary>

 ```python
+from pydantic import BaseModel, Field
 from scrapegraph_py import Client

 client = Client(api_key="your-api-key-here")

-html_content = """
-<html>
-    <body>
-        <h1>Company Name</h1>
-        <p>We are a technology company focused on AI solutions.</p>
-        <div class="contact">
-            <p>Email: contact@example.com</p>
-        </div>
-    </body>
-</html>
-"""
+class PythonVersionInfo(BaseModel):
+    version: str = Field(description="The latest Python version number")
+    release_date: str = Field(description="When this version was released")
+    major_features: list[str] = Field(description="List of main features")
+
+response = client.searchscraper(
+    user_prompt="What is the latest version of Python and its main features?",
+    output_schema=PythonVersionInfo
+)
+```
+
+</details>

-response = client.localscraper(
-    user_prompt="Extract the company description",
-    website_html=html_content
+### 📝 Markdownify
+
+Converts any webpage into clean, formatted markdown.
+
+```python
+from scrapegraph_py import Client
+
+client = Client(api_key="your-api-key-here")
+
+response = client.markdownify(
+    website_url="https://example.com"
 )

 print(response)
@@ -177,7 +202,7 @@ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file
 ## 🔗 Links

 - [Website](https://scrapegraphai.com)
-- [Documentation](https://docs.scrapegraphai.com)
+- [Documentation](https://docs.scrapegraphai.com)
 - [GitHub](https://github.com/ScrapeGraphAI/scrapegraph-sdk)

 ---
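The new SearchScraper example in this diff reads `result` and `reference_urls` from the response. A minimal sketch of formatting such a response for display follows, assuming the response behaves like a plain dict with those two keys. `format_search_response` is a hypothetical helper, and `sample_response` is a hand-written stub, not real API output.

```python
def format_search_response(response: dict) -> str:
    """Render a SearchScraper-style response as an answer plus numbered sources.

    Assumes a dict with 'result' and 'reference_urls' keys, as the README
    example suggests; .get() keeps the helper tolerant of a missing key.
    """
    answer = response.get("result", "")
    sources = response.get("reference_urls", [])
    lines = [f"Answer: {answer}"]
    if sources:
        lines.append("Sources:")
        # Number the sources starting at 1 for readability.
        lines.extend(f"  {i}. {url}" for i, url in enumerate(sources, start=1))
    return "\n".join(lines)


# Stubbed response for illustration only.
sample_response = {
    "result": "placeholder answer",
    "reference_urls": ["https://example.com/a", "https://example.com/b"],
}

print(format_search_response(sample_response))
```

Keeping the formatting in a small function like this means the same code could handle either the sync or async client's result, provided both return the same shape.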