Skip to content

vuth-dogo/goemon

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Goemon 🥷

AI-powered web scraping with self-healing capabilities.

CI Gem Version

Features

  • 🤖 AI-Powered - Uses LLMs to generate Nokogiri scraping code
  • 💾 Cache First - Reuse generated scripts to minimize API costs
  • 🔄 Self-Healing - Automatically retries with error feedback
  • 🔒 Sandboxed - Secure execution with code sanitization
  • 🔌 Multi-Provider - OpenAI, Anthropic, Gemini, or custom

Installation

gem 'goemon'

Quick Start

require 'goemon'

Goemon.configure do |config|
  config.ai_api_key = ENV['OPENAI_API_KEY']
end

result = Goemon.scrape(
  html: '<html><h1>iPhone 15</h1><span class="price">$999</span></html>',
  schema: [:title, :price]
)

result.data    # => { title: "iPhone 15", price: "$999" }
result.script  # => Generated Ruby code (cache this!)
result.source  # => :ai or :cache

Providers

# OpenAI (default)
Goemon.configure { |c| c.ai_provider = :openai }

# Anthropic Claude
Goemon.configure { |c| c.ai_provider = :anthropic }

# Google Gemini
Goemon.configure { |c| c.ai_provider = :gemini }

# Custom (Grok, Groq, etc.)
Goemon.configure do |c|
  c.ai_provider = {
    endpoint: "https://api.x.ai/v1/chat/completions",
    model: "grok-2",
    format: :openai
  }
end

Caching (Cost Optimization)

# First call - generates code via AI
result = Goemon.scrape(html: html, schema: schema)
save_to_db(result.script)

# Subsequent calls - uses cached script
cached = load_from_db
result = Goemon.scrape(html: html, schema: schema, cached_script: cached)
# result.source => :cache (no API call!)

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b feature/amazing)
  3. Run tests (bundle exec rspec)
  4. Commit changes (git commit -am 'Add amazing feature')
  5. Push (git push origin feature/amazing)
  6. Create Pull Request

License

MIT

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages