A Next.js application for extracting and processing web content with AI. It pulls the meaningful content out of a web page and then analyzes that content with AI according to your instructions, all through a simple web interface.
- Extract clean, meaningful content from any URL
- Automatically remove navigation bars, ads, footers, and other clutter
- Process the extracted content with AI based on custom instructions
- Simple, in-memory operations with no database dependencies
- Two-stage AI processing for efficient and high-quality results
- The copilot-instructions file (.github/copilot-instructions.md)
- The prompt variable in the route files
- The instructions input, which can, for example, ask for a translation
- Next.js (with App Router)
- TypeScript
- Tailwind CSS
- Google Generative AI SDK for AI processing
- ESLint
First, install the dependencies:
npm install
# or
yarn install
# or
pnpm install

Next, create a .env.local file in the root directory with your Google AI API key:
GOOGLE_AI_API_KEY=your-api-key-here
You can obtain an API key from Google AI Studio.
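
Next.js loads .env.local into process.env on the server, so a route can read the key from there. As a minimal sketch, a small guard can fail early with a readable message when the key is missing. The helper name requireApiKey is hypothetical and not part of this project:

```typescript
// Hypothetical helper (not in the project): returns the API key or throws
// a descriptive error. It takes an env-like record so it is easy to test.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env.GOOGLE_AI_API_KEY?.trim();
  if (!key) {
    throw new Error(
      "GOOGLE_AI_API_KEY is not set. Add it to .env.local and restart the dev server."
    );
  }
  return key;
}
```

In a route file this would be called as requireApiKey(process.env) before creating the AI client.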
Then, run the development server:
npm run dev
# or
yarn dev
# or
pnpm dev

Open http://localhost:3000 with your browser to see the result.
- Enter a valid URL in the input field.
- Enter instructions for the AI in the second input field (e.g., "Summarize this content", "Extract key facts").
- Click the "Analyze Content" button.
- The application will:
  - Fetch the webpage content
  - Use AI to extract the meaningful parts of the page
  - Process the extracted content according to your instructions
- Toggle between the extracted content and AI-processed results using the tabs.
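
The first step asks for a valid URL. One way to check that before sending a request is sketched below, assuming only http and https pages should be accepted. The function name isValidHttpUrl is hypothetical, and the project may validate input differently:

```typescript
// Hypothetical check (not taken from the project): accepts only absolute
// http/https URLs. The URL constructor throws on unparseable input.
function isValidHttpUrl(input: string): boolean {
  try {
    const url = new URL(input.trim());
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    return false;
  }
}
```

Restricting the protocol rejects inputs such as ftp: or javascript: URLs, which parse successfully but cannot be fetched as web pages.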
The application uses a two-stage AI process:
- Content Extraction: The first AI stage analyzes the raw HTML from the webpage and extracts only the meaningful content, removing navigation, ads, footers, and other clutter.
- Content Processing: The second AI stage takes the extracted content and processes it according to your specific instructions.
This approach provides cleaner, more focused results by allowing the AI to work with already filtered content.
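
The two stages described above can be sketched as a small pipeline. This is an illustration under stated assumptions, not the project's actual route code: the Generate type, the runTwoStage function, and the prompt wording are all hypothetical, and the model call is injected as a function so the sketch does not depend on any particular SDK.

```typescript
// Hypothetical text-in, text-out model call. In the real app this would
// wrap the AI SDK; here it is passed in as a parameter.
type Generate = (prompt: string) => Promise<string>;

interface PipelineResult {
  extracted: string; // output of stage 1
  processed: string; // output of stage 2
}

async function runTwoStage(
  html: string,
  instructions: string,
  generate: Generate
): Promise<PipelineResult> {
  // Stage 1: reduce the raw HTML to its meaningful content.
  const extracted = await generate(
    "Extract only the meaningful content from this HTML. " +
      "Remove navigation, ads, footers, and other clutter.\n\n" +
      html
  );

  // Stage 2: apply the user's instructions to the filtered text only,
  // so the second prompt never contains the raw HTML.
  const processed = await generate(`${instructions}\n\nContent:\n${extracted}`);

  return { extracted, processed };
}
```

Because the second call receives only the stage 1 output, its prompt stays short and free of markup, which is the benefit the two-stage design aims for.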
- src/app/page.tsx: The main page component with the UI for URL input, AI instructions, and content display.
- src/app/api/scrape/route.ts: The API route that handles both HTML scraping and initial AI extraction.
- src/app/api/process/route.ts: The API route that processes the extracted content with Google's Generative AI.
- public/: Static assets.
- .github/copilot-instructions.md: Instructions for GitHub Copilot on how to assist with this project.
This project is set up with TypeScript, Tailwind CSS, and ESLint for a modern development experience.
npm run build
# or
yarn build
# or
pnpm build

npm start
# or
yarn start
# or
pnpm start