AI-powered web scraping with self-healing capabilities.
- 🤖 AI-Powered - Uses LLMs to generate Nokogiri scraping code
- 💾 Cache First - Reuse generated scripts to minimize API costs
- 🔄 Self-Healing - Automatically retries with error feedback
- 🔒 Sandboxed - Secure execution with code sanitization
- 🔌 Multi-Provider - OpenAI, Anthropic, Gemini, or custom
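The self-healing idea can be sketched roughly like this (a toy illustration, not Goemon's actual internals — `self_healing_run` and the stubbed generator below are invented for the example):

```ruby
# Hypothetical retry loop: ask a generator for code, run it, and feed any
# error message back into the next generation attempt.
def self_healing_run(max_attempts: 3, &generate)
  error = nil
  max_attempts.times do
    code = generate.call(error)   # generator sees the previous error, if any
    begin
      return eval(code)           # run the generated script
    rescue StandardError => e
      error = e.message           # capture the failure for the next attempt
    end
  end
  raise "all #{max_attempts} attempts failed (last error: #{error})"
end

# Stubbed "generator": the first attempt is buggy, the retry is fixed.
attempts = 0
result = self_healing_run do |last_error|
  attempts += 1
  last_error.nil? ? "undefined_method_call" : "1 + 1"
end
result # => 2
```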
Add this line to your application's Gemfile:

```ruby
gem 'goemon'
```

Then configure an API key:

```ruby
require 'goemon'

Goemon.configure do |config|
  config.ai_api_key = ENV['OPENAI_API_KEY']
end
```
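Before the first scrape, it helps to see what "sandboxed" execution with code sanitization can mean in practice. A toy sketch (illustrative only — the patterns and the `safe_to_run?` helper are invented here, not Goemon's real checks):

```ruby
# Reject generated code that references obviously dangerous constructs
# before it is ever eval'd. A real sanitizer would be far more thorough.
FORBIDDEN = [/`/, /\bsystem\b/, /\bexec\b/, /\bFile\b/, /\bIO\b/, /\brequire\b/].freeze

def safe_to_run?(code)
  FORBIDDEN.none? { |pattern| code.match?(pattern) }
end

safe_to_run?("doc.css('h1').text") # => true
safe_to_run?("system('rm -rf /')") # => false
```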
```ruby
result = Goemon.scrape(
  html: '<html><h1>iPhone 15</h1><span class="price">$999</span></html>',
  schema: [:title, :price]
)

result.data   # => { title: "iPhone 15", price: "$999" }
result.script # => generated Ruby code (cache this!)
result.source # => :ai or :cache
```

Switching providers:

```ruby
# OpenAI (default)
Goemon.configure { |c| c.ai_provider = :openai }

# Anthropic Claude
Goemon.configure { |c| c.ai_provider = :anthropic }

# Google Gemini
Goemon.configure { |c| c.ai_provider = :gemini }
```
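When switching providers you will typically also swap the key. A sketch, assuming `ai_api_key` is the single key setting used regardless of provider:

```ruby
Goemon.configure do |c|
  c.ai_provider = :anthropic
  c.ai_api_key  = ENV['ANTHROPIC_API_KEY']
end
```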
```ruby
# Custom (Grok, Groq, etc.)
Goemon.configure do |c|
  c.ai_provider = {
    endpoint: "https://api.x.ai/v1/chat/completions",
    model: "grok-2",
    format: :openai
  }
end
```

Caching generated scripts:

```ruby
# First call - generates code via AI
result = Goemon.scrape(html: html, schema: schema)
save_to_db(result.script)

# Subsequent calls - uses cached script
cached = load_from_db
result = Goemon.scrape(html: html, schema: schema, cached_script: cached)
# result.source => :cache (no API call!)
```
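The cache-first workflow above can be sketched as a small wrapper (hypothetical — `ScriptCache` and its keying scheme are invented here; with Goemon you persist `result.script` yourself):

```ruby
require 'digest'

# Reuse a stored script when one exists; only fall back to generation on a miss.
class ScriptCache
  def initialize
    @store = {} # in production this would be a DB table, keyed per site/schema
  end

  def key_for(schema)
    Digest::SHA256.hexdigest(schema.sort.join(','))
  end

  def fetch(schema)
    key = key_for(schema)
    if (script = @store[key])
      [script, :cache]
    else
      script = yield          # generate (e.g. via Goemon.scrape) on a miss
      @store[key] = script
      [script, :ai]
    end
  end
end

cache  = ScriptCache.new
schema = [:title, :price]

script1, source1 = cache.fetch(schema) { "generated-script" } # miss: generates
script2, source2 = cache.fetch(schema) { raise "should not regenerate" }

source1 # => :ai
source2 # => :cache
```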
- Fork it
- Create your feature branch (`git checkout -b feature/amazing`)
- Run tests (`bundle exec rspec`)
- Commit your changes (`git commit -am 'Add amazing feature'`)
- Push (`git push origin feature/amazing`)
- Create a Pull Request
MIT